Introduction

Methods

Results

Translocation detection in lung cancer
using mate-pair sequencing and iVIGS
Richard Meier & Stefan Graw
University of Kansas Medical Center

February 3, 2014

Content

Introduction
Methods
Results

Introduction

Structural variations

Chromosome 1

Chromosome 17

No variation
Deletion
Insertion
Translocation

Structural variations in mate pair mapping
[Figure: mate pairs from the cancer genome mapped to the reference genome]
• Insertion: reads map closer than expected
• Deletion: reads map farther away than expected
• Translocation: reads map to different chromosomes
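
The three signatures above can be sketched as a small classifier for a single mapped mate pair. This is only an illustration of the idea, not the logic used by delly or iVIGS: the `MatePair` type, the function name, and the expected distance range are hypothetical, and a real tool would cluster many pairs before calling a variant.

```python
from dataclasses import dataclass


@dataclass
class MatePair:
    """Mapping positions of both reads of one mate pair (hypothetical type)."""
    chrom1: str
    pos1: int
    chrom2: str
    pos2: int


def classify_pair(pair, expected_min, expected_max):
    """Label a mate pair by its mapping distance on the reference genome.

    expected_min and expected_max bound the separation expected from the
    library preparation; both are assumptions supplied by the caller.
    """
    if pair.chrom1 != pair.chrom2:
        return "translocation"      # reads map to different chromosomes
    distance = abs(pair.pos2 - pair.pos1)
    if distance < expected_min:
        return "insertion"          # reads map closer than expected
    if distance > expected_max:
        return "deletion"           # reads map farther away than expected
    return "no variation"
```

For example, `classify_pair(MatePair("chr1", 1000, "chr17", 5000), 2000, 4000)` yields `"translocation"` regardless of the distance bounds.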

Breakpoint resolution with split reads
Where are the breakpoints?
[Figure: two read clusters on the known reference]
Looking at soft clipping reads
[Figure: reads from the unknown sample aligned against the two clusters on the known reference]
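
As a rough sketch of how soft-clipped reads point at a breakpoint, the snippet below derives candidate positions from a read's 1-based SAM mapping position and its CIGAR string. The function name and the `min_clip` threshold are hypothetical, and neither tool is claimed to work exactly this way.

```python
import re

CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")
REF_CONSUMING = set("MDN=X")  # CIGAR operations that advance along the reference


def soft_clip_breakpoints(pos, cigar, min_clip=10):
    """Return candidate breakpoint positions implied by soft clipping.

    pos is the 1-based leftmost aligned reference position (SAM POS field).
    A leading soft clip suggests a breakpoint at the first aligned base,
    a trailing soft clip at the last aligned base.
    """
    ops = [(int(n), op) for n, op in CIGAR_OP.findall(cigar)]
    ops = [(n, op) for n, op in ops if op != "H"]  # ignore hard clips
    candidates = []
    if not ops:
        return candidates
    if ops[0][1] == "S" and ops[0][0] >= min_clip:
        candidates.append(pos)
    if len(ops) > 1 and ops[-1][1] == "S" and ops[-1][0] >= min_clip:
        ref_len = sum(n for n, op in ops if op in REF_CONSUMING)
        candidates.append(pos + ref_len - 1)
    return candidates
```

Collecting these candidate positions over all split reads in a cluster region gives the set of positions on which a breakpoint estimate can be based.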

Methods

Data

A set of mate-pair sequencing data from lung cancer patients was analysed.

35 samples were processed with the SV tool iVIGS.

32 samples were processed with the SV tool delly.

Mate-pair preprocessing

Translocation analysis: general strategy

Translocation analysis: tools and workflow

Comparison of the tools
Both tools
• cluster paired reads to find potential translocation regions
• use split reads to find potential breakpoint positions

delly
• re-assembles split reads
• re-maps the assembly to the cluster region

iVIGS (tool for identification of variations in genomic structure)
• is developed in our lab and is still a work in progress
• performs kernel density estimation on split-read mapping positions
• estimates the probability distribution of breakpoint positions
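
The kernel density estimation step can be illustrated with a minimal Gaussian-kernel sketch in plain Python. The slides do not state which kernel or bandwidth iVIGS uses, so the Gaussian kernel, the `bandwidth` parameter, and both function names are assumptions made for illustration only.

```python
import math


def gaussian_kde(positions, bandwidth):
    """Build a density function from split-read mapping positions.

    Each position contributes one Gaussian bump of width `bandwidth`;
    the bumps are averaged so that the density integrates to 1.
    """
    n = len(positions)
    norm = 1.0 / (n * bandwidth * math.sqrt(2.0 * math.pi))

    def density(x):
        return norm * sum(
            math.exp(-0.5 * ((x - p) / bandwidth) ** 2) for p in positions
        )

    return density


def most_likely_breakpoint(positions, bandwidth, step=1):
    """Return the grid position with the highest estimated density."""
    density = gaussian_kde(positions, bandwidth)
    grid = range(min(positions), max(positions) + 1, step)
    return max(grid, key=density)
```

With this approach, a tight group of split-read positions produces a sharp density peak, while scattered positions produce a broad or multi-modal distribution.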

Results

Effect of iVIGS quality control filtering
The separation distance distribution was similar for all samples
[Figure: Molina dataset, counts (0 to 4e+06) against distance between mate reads (0 to 6000), unfiltered vs. filtered]

Problems with delly
• Applying the iVIGS filter resulted in delly not reporting any translocations.
• Taking all reads and applying delly's internal filtering resulted in translocations being found.
• Coverage after strict iVIGS filtering was probably too inconsistent for assembly. The type of assembly used also matters (see next slides).
• The following delly results therefore refer to a workflow that uses the internal filtering method.

Selection of breakpoint distributions calculated by iVIGS
[Figure: six kernel density estimation panels, each plotting breakpoint density against base position, with rug marks along the position axis]

Model of translocation divergence due to cancer proliferation
• chromosomes with highly active regions
• breaking and translocation
• subsequent variations in daughter cells
• breakpoint and cluster distribution
[Figure: density against position]

Error sources
• Adapter contamination (before filtering, approximately 15% of all reads are contaminated)
• Ligation errors
• PCR bias
• Sequencing errors

General information
• Estimated breakpoint positions were highly variable (differing by up to several thousand base positions).
• Translocations almost always overlapped with potential deletion or insertion cluster regions (estimated by iVIGS).
• In most cases, around 35 translocations per sample were estimated.
[Figure: translocation discovery of iVIGS, density (0.00 to 0.04) against number of estimated translocations per sample (20 to 80)]

Typical translocation distribution observed in samples
[Figure: chromosome diagram labeled CHR1 to CHR22, CHRX and CHRY]

Genes altered by a translocation

Genes used for the diagram were altered in one or more samples
[Venn diagram: delly only 108, both tools 41, iVIGS only 158]

Potential gene fusions

Genes used for the diagram were altered in one or more samples
[Venn diagram: delly only 32, both tools 11, iVIGS only 73]

Potential gene to intergenic fusions

Genes used for the diagram were altered in one or more samples
[Venn diagram: delly only 106, both tools 27, iVIGS only 109]

Reproducibility: delly

Gene alterations
[Venn diagram: sample_A1 only 13, both samples 1, sample_A2 only 2]

Reproducibility: iVIGS

Gene alterations
[Venn diagram: sample_A1 only 11, both samples 15, sample_A2 only 10]
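
One way to put a number on the two reproducibility diagrams is the Jaccard index: shared gene alterations divided by all gene alterations found in either replicate. The slides do not report this metric. It is added here only as a sketch, and it assumes that the middle number in each diagram is the overlap between sample_A1 and sample_A2.

```python
def jaccard_from_venn(only_a, shared, only_b):
    """Jaccard index from the three regions of a two-set Venn diagram."""
    total = only_a + shared + only_b
    return shared / total if total else 0.0


# Counts read off the two reproducibility slides (middle value assumed shared).
delly_overlap = jaccard_from_venn(13, 1, 2)     # 1 / 16
ivigs_overlap = jaccard_from_venn(11, 15, 10)   # 15 / 36

print(f"delly: {delly_overlap:.3f}, iVIGS: {ivigs_overlap:.3f}")
```

Under that reading, about 6% of the delly gene alterations and about 42% of the iVIGS gene alterations are shared between the two replicates.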

Conclusion
• Results varied substantially between delly and iVIGS.
• Reproducibility of iVIGS seems promising.
• It is still unclear how strongly diversity among closely related cancer cells and library preparation errors influence the results.
• It is therefore still difficult to determine whether predictions are false positives (FP) or true positives (TP).

Plans for the future
• Apply adapter removal in preprocessing to improve mapping yield
• Pick a subset of potential gene fusions and validate them
• Examine other structural variation types (insertions, deletions, inversions)
• Find and use additional SV tools
