ChimeraScan
Chimeric transcript discovery by paired-end transcriptome sequencing.
AGENDA
• Overview: What is ChimeraScan?
• ChimeraScan Method (Algorithm)
• How to run ChimeraScan?
• ChimeraScan Results
• Limitations: What could be done better?
• Comparison with current software (deFuse, Trans-ABySS)
WHAT IS CHIMERASCAN?
• A tool for discovering chimeric transcripts
or fusions in sequencing data.
ChimeraScan Method
● ChimeraScan differs from other fusion finders (e.g. deFuse) in that it adds a
fragmentation step on top of the whole paired-end approach, which deFuse also uses.
Tell me more!!!!
ChimeraScan Algorithm
• Fragmentation
ChimeraScan Algorithm
Step 1: Prepare reads for alignment
ChimeraScan parses the FASTQ files and:
1) converts all quality scores to Sanger format (Phred+33)
2) converts the qname of each read from an arbitrarily long string to a number (1/1, 1/2 for PE reads)
• This reduces storage requirements for intermediate steps
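A minimal sketch of what this preparation step amounts to, not ChimeraScan's own code. It assumes the input qualities are Phred+64 encoded (the `input_offset` parameter is an assumption) and that reads arrive as `(qname, sequence, quality)` tuples; the function names are hypothetical.

```python
from typing import Iterable, Iterator, Tuple

# (qname, sequence, quality string); a simplified stand-in for a FASTQ record
FastqRecord = Tuple[str, str, str]


def to_sanger(quality: str, input_offset: int = 64) -> str:
    """Re-encode a quality string from `input_offset` to Sanger (Phred+33)."""
    return "".join(chr(ord(c) - input_offset + 33) for c in quality)


def normalize_mate(records: Iterable[FastqRecord], mate: int,
                   input_offset: int = 64) -> Iterator[FastqRecord]:
    """Rename reads to '<n>/<mate>' and convert qualities to Phred+33."""
    for n, (_qname, seq, qual) in enumerate(records, start=1):
        # The original, arbitrarily long qname is replaced by a running number.
        yield f"{n}/{mate}", seq, to_sanger(qual, input_offset)


if __name__ == "__main__":
    mate1 = [("INSTRUMENT:1:1101:1234:5678/1", "ACGT", "hhhh")]
    print(list(normalize_mate(mate1, mate=1)))  # [('1/1', 'ACGT', 'IIII')]
```

Replacing long read names with short numbers is what shrinks the intermediate files; a real implementation would also need to keep a lookup back to the original names if they are required later.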
ChimeraScan Algorithm
• The pysam package is used.
Step 3: Create a sorted/indexed BAM file
Enables fast lookup of original read alignments by genomic coordinates.
Step 4: Estimate insert size distribution
Only uniquely mapping reads are used to sample the insert
size distribution (used in future steps to help localize fusion
breakpoints).
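As a rough illustration of Step 4, the sketch below estimates the insert size distribution from uniquely mapping pairs only. The input representation (a list of `(insert_size, number_of_alignments)` tuples) and the `max_insert` cutoff are assumptions made for the example, not ChimeraScan's actual data structures or defaults.

```python
from statistics import mean, pstdev
from typing import Dict, Iterable, Tuple

# (insert_size, number_of_alignments) per read pair; hypothetical input format
PairObservation = Tuple[int, int]


def estimate_insert_size(pairs: Iterable[PairObservation],
                         max_insert: int = 1000) -> Dict[str, float]:
    """Summarize insert sizes using uniquely mapping pairs only."""
    sizes = [size for size, num_hits in pairs
             if num_hits == 1 and 0 < size <= max_insert]
    if not sizes:
        raise ValueError("no uniquely mapping pairs to sample from")
    return {"n": len(sizes), "mean": mean(sizes), "stdev": pstdev(sizes)}


if __name__ == "__main__":
    observed = [(200, 1), (220, 1), (180, 1), (5000, 1), (210, 3)]
    # The 5000 bp pair exceeds max_insert and the multi-mapping pair is skipped.
    print(estimate_insert_size(observed))
```

Restricting the sample to unique alignments keeps ambiguous placements from distorting the distribution that later steps rely on.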
ChimeraScan Algorithm
Step 5: Realign initially unmapped reads (fragmentation)
All of the initially unmapped reads are treated as single reads and realigned.
Additionally, the reads are trimmed such that only the sequences at the ends of the
fragment are aligned (default=25bp).
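The trimming can be pictured as follows. This sketch assumes each mate is sequenced inward from one end of the fragment, so that the first 25 bases of each mate lie at a fragment end; `trim_to_segment` is a hypothetical helper, and only the 25 bp default comes from the slide.

```python
from typing import Tuple


def trim_to_segment(seq: str, qual: str,
                    segment_length: int = 25) -> Tuple[str, str]:
    """Keep only the first `segment_length` bases (and qualities) of a read."""
    if len(seq) != len(qual):
        raise ValueError("sequence and quality must have the same length")
    return seq[:segment_length], qual[:segment_length]


if __name__ == "__main__":
    read_seq = "A" * 30 + "C" * 20   # a 50 bp read
    read_qual = "I" * 50
    seq, qual = trim_to_segment(read_seq, read_qual)
    print(len(seq), len(qual))  # 25 25
```

Each trimmed mate is then realigned on its own as a single read, which lets a pair whose full-length reads failed to align still be placed.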
Step 6: Discover discordant reads
ChimeraScan Algorithm
Step 7: Nominate chimeras (the fragment size distribution is used)
Step 8: Extract chimeric breakpoint sequences (from the genome FASTA file)
The bowtie indexer is used to create a new alignment index of these breakpoint sequences
Step 9: Nominate reads that could span breakpoints
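To make Step 8 concrete, here is a simplified sketch of building one breakpoint sequence by joining the end of the 5' partner to the start of the 3' partner. It assumes forward-strand coordinates, a genome held in memory as a dictionary, and a hypothetical `flank` parameter; ChimeraScan's own extraction works from the FASTA file and handles strand, which this example does not.

```python
from typing import Dict, Tuple


def breakpoint_sequence(genome: Dict[str, str],
                        five_prime: Tuple[str, int],
                        three_prime: Tuple[str, int],
                        flank: int = 50) -> Tuple[str, int]:
    """Return (sequence, breakpoint_offset) for a candidate chimera.

    five_prime  = (chrom, end)   : 0-based exclusive end of the 5' partner
    three_prime = (chrom, start) : 0-based start of the 3' partner
    """
    chrom5, end5 = five_prime
    chrom3, start3 = three_prime
    left = genome[chrom5][max(0, end5 - flank):end5]
    right = genome[chrom3][start3:start3 + flank]
    # The breakpoint sits between left and right, at offset len(left).
    return left + right, len(left)


if __name__ == "__main__":
    toy_genome = {"chrA": "AAAAACCCCC", "chrB": "GGGGGTTTTT"}
    print(breakpoint_sequence(toy_genome, ("chrA", 10), ("chrB", 0), flank=4))
    # ('CCCCGGGG', 4)
```

The collection of such sequences is what gets indexed so that reads can later be aligned across each candidate junction.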
ChimeraScan Algorithm
Step 10: Align against the breakpoint sequence database (created in Step 8)
Step 11: Assess breakpoint-spanning alignment results (min anchor > #homologous bases
between 5’->3’ at breakpoint + #mismatches allowed)
Reads that align to the breakpoint sequence index are discarded if the overlap is small
(less than anchor_min bases) or if the overlap is larger but contains mismatches (red reads).
Reads overlapping the breakpoint by more than anchor_length bases are retained (green read).
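One way to read the rule above is as a three-way decision on the overlap length. The sketch below is an interpretation, not ChimeraScan's code: `anchor_min` and `anchor_length` are the names used on the slide, while `max_anchor_mismatches` and the treatment of overlaps between the two thresholds are assumptions.

```python
def keep_spanning_read(overlap: int, mismatches_in_anchor: int,
                       anchor_min: int, anchor_length: int,
                       max_anchor_mismatches: int = 0) -> bool:
    """Decide whether a read is kept as breakpoint-spanning evidence."""
    if overlap < anchor_min:
        # Too few bases cross the breakpoint: discard.
        return False
    if overlap > anchor_length:
        # Long overlap: retained regardless of anchor mismatches.
        return True
    # Intermediate overlap: keep only if the anchor is clean enough.
    return mismatches_in_anchor <= max_anchor_mismatches


if __name__ == "__main__":
    # Threshold values here are illustrative only.
    for overlap, mm in [(3, 0), (6, 1), (6, 0), (12, 2)]:
        print(overlap, mm, keep_spanning_read(overlap, mm, anchor_min=4, anchor_length=8))
```

With the illustrative thresholds, only the last two reads are kept: the first is too short and the second has a mismatch in a short anchor.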
ChimeraScan Algorithm
Step 12: Filter chimeras
The user can specify a number of filters to minimize the number of false
positives (known-false-positives, filter-size-distribution, supporting reads).
Step 13: Produce a text output file (BEDPE file)
How to run ChimeraScan
STEP 1: Generate paired-read FASTQ files from merged BAM files
'bash /projects/trans_scratch/software/deFUSE/scripts/bam2fastq.converter.sh'
INPUT(S):
BAM_FILE_PATH(ABSOLUTE)
LIBRARY_ID
OUTPUT_DIRECTORY
How to run ChimeraScan
STEP 2: Submit ChimeraScan to the cluster
'python /projects/trans_scratch/chimerascan/chimerascan-0.4.5/bin/chimerascan_run.py'
INPUT(S):
-v: verbose (for logging and debugging)
-p: processors (tested with -p 8)
chimerascan_index (generated during ChimeraScan installation)
Fastq_1, Fastq_2 (both generated in Step 1)
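For scripting the submission, the command line can be assembled programmatically. This is a sketch under assumptions: the script path and the `-v` and `-p` options are taken from the slide, while the trailing output directory argument and all file names in the usage example are hypothetical and should be checked against the installed version's help text.

```python
import shlex
from typing import List

CHIMERASCAN_RUN = ("/projects/trans_scratch/chimerascan/"
                   "chimerascan-0.4.5/bin/chimerascan_run.py")


def build_chimerascan_command(index_dir: str, fastq_1: str, fastq_2: str,
                              output_dir: str, processors: int = 8,
                              verbose: bool = True,
                              script: str = CHIMERASCAN_RUN) -> List[str]:
    """Assemble the argument list for one ChimeraScan run (not executed here)."""
    cmd = ["python", script]
    if verbose:
        cmd.append("-v")
    cmd += ["-p", str(processors)]
    # Positional arguments; output_dir as the last one is an assumption.
    cmd += [index_dir, fastq_1, fastq_2, output_dir]
    return cmd


if __name__ == "__main__":
    cmd = build_chimerascan_command("chimerascan_index", "lib_1.fastq",
                                    "lib_2.fastq", "chimerascan_out")
    print(" ".join(shlex.quote(part) for part in cmd))
```

Building the command as a list avoids shell-quoting problems if the list is later passed to a job submission wrapper.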
How to run ChimeraScan
Combine steps 1 & 2:
'bash /projects/trans_scratch/chimerascan/chimerascan_setup.sh'
INPUT(S):
PATIENT_ID
LIBRARY_ID
BAM_FILE_PATH
PROJECT_DIRECTORY
OUTPUT(S): 'qsub_all_chimerascan.sh', a script that submits both
Steps 1 & 2 to the cluster. Jobs are run serially (the FASTQ files
are created before the ChimeraScan job is submitted).
ChimeraScan Results
Output(s):
ChimeraScan outputs a tabular chimeras.bedpe file.
The chimeras.bedpe file contains the chromosomal regions, transcript IDs, genes, and
statistics for each chimera. The file follows an adaptation of the BEDPE format for
representing paired intervals (courtesy of Aaron Quinlan and the BEDTools project).
The chimeras.bedpe file also contains the spanning and supporting reads (total score) for
each reported event.
Other intermediate files are also created during the run, but they are not needed
afterwards and can be deleted once the run is complete.
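A small reader for the output file might look like the sketch below. It relies only on the ten standard BEDPE columns (chrom1, start1, end1, chrom2, start2, end2, name, score, strand1, strand2) and keeps any further ChimeraScan-specific columns as an uninterpreted list, since their exact layout is not given here. It also assumes the score column is numeric and that header lines start with `#`.

```python
from typing import Dict, Iterable, Iterator

BEDPE_FIELDS = ["chrom1", "start1", "end1", "chrom2", "start2", "end2",
                "name", "score", "strand1", "strand2"]


def parse_bedpe(lines: Iterable[str]) -> Iterator[Dict[str, object]]:
    """Yield one dict per BEDPE record, skipping blank and '#' lines."""
    for line in lines:
        line = line.rstrip("\n")
        if not line or line.startswith("#"):
            continue
        cols = line.split("\t")
        if len(cols) < len(BEDPE_FIELDS):
            raise ValueError(f"expected at least 10 columns, got {len(cols)}")
        record: Dict[str, object] = dict(zip(BEDPE_FIELDS, cols))
        for key in ("start1", "end1", "start2", "end2"):
            record[key] = int(cols[BEDPE_FIELDS.index(key)])
        record["score"] = float(cols[BEDPE_FIELDS.index("score")])
        # Tool-specific columns beyond the standard ten are kept verbatim.
        record["extra"] = cols[len(BEDPE_FIELDS):]
        yield record


if __name__ == "__main__":
    sample = ["#header line\n",
              "chr1\t100\t200\tchr2\t300\t400\tC1\t7\t+\t-\tgeneA\tgeneB\n"]
    for rec in parse_bedpe(sample):
        print(rec["chrom1"], rec["chrom2"], rec["score"], rec["extra"])
```

In practice the function would be fed an open file handle, for example `parse_bedpe(open("chimeras.bedpe"))`.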
ChimeraScan Results
OUR VALIDATION:
Run settings: 8 cores, 8 parallel jobs
PROJECT    LIBRARY_ID   TOTAL TIME   TOTAL SPACE
MCF7       A37098       ~23 HRS      178 GB
UHR        Z01229       ~21 HRS      132 GB
COLO-829   A36972       ~20 HRS      157 GB
Limitations: What could be better?
Lack of an injective (one-to-one) mapping from chimeras.bedpe event types to
our current set of event types.
Translocation ---> {interchromosomal}
Duplication ---> {intrachromosomal_complex, adjacent_complex}
Deletion ---> {intrachromosomal, intrachromosomal_diverging, intrachromosomal_complex}
Inversion ---> {intrachromosomal_diverging}
Relies on an annotated set of genes (found in the reference index)
High sensitivity, but also a high number of false positives (trade-off?)
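The mapping problem can be checked mechanically. The sketch below takes the four arrows listed above, reading each as "our event type ---> ChimeraScan event types", inverts them, and reports every ChimeraScan type that would correspond to more than one of our types. The function name is hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, Set

# Our event type -> ChimeraScan event types, as listed on the slide.
OUR_TO_CHIMERASCAN: Dict[str, Set[str]] = {
    "Translocation": {"interchromosomal"},
    "Duplication": {"intrachromosomal_complex", "adjacent_complex"},
    "Deletion": {"intrachromosomal", "intrachromosomal_diverging",
                 "intrachromosomal_complex"},
    "Inversion": {"intrachromosomal_diverging"},
}


def ambiguous_types(mapping: Dict[str, Set[str]]) -> Dict[str, List[str]]:
    """Return ChimeraScan types that map back to more than one of our types."""
    inverse: Dict[str, Set[str]] = defaultdict(set)
    for ours, theirs in mapping.items():
        for chimerascan_type in theirs:
            inverse[chimerascan_type].add(ours)
    return {t: sorted(ours) for t, ours in inverse.items() if len(ours) > 1}


if __name__ == "__main__":
    for chimerascan_type, ours in sorted(ambiguous_types(OUR_TO_CHIMERASCAN).items()):
        print(chimerascan_type, "->", ours)
    # intrachromosomal_complex -> ['Deletion', 'Duplication']
    # intrachromosomal_diverging -> ['Deletion', 'Inversion']
```

Two ChimeraScan types are ambiguous under this listing, which is why a chimeras.bedpe event cannot always be assigned to a single one of our event types.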
Comparison with current software
MCF7 LIBRARY                 ChimeraScan   deFuse (filtered)   Trans-ABySS (1.4.8)
Total events                 629           503                 161
Validated events found       32/89         35/89               33/89
Validated events not found   57            54                  56
Novel events found           2             3                   4
89 events were listed in the publications
18 events out of 89 were novel events
71 events out of 89 were previously known events
Note: The events for Trans-ABySS were taken from the
'sense_fusions.tsv' tabular file.
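The MCF7 numbers can be turned into two simple ratios per tool: the fraction of the 89 published events that was recovered, and the fraction of each tool's total calls that matched a published event. The sketch below uses only the counts from the table; the function and key names are hypothetical. Calls that do not match a published event are not necessarily false positives, so the second ratio is only a rough indicator.

```python
from typing import Dict, Tuple

TOTAL_VALIDATED = 89

# tool -> (total events reported, validated events found), MCF7 library
MCF7_RESULTS: Dict[str, Tuple[int, int]] = {
    "ChimeraScan": (629, 32),
    "deFuse": (503, 35),
    "Trans-ABySS": (161, 33),
}


def summarize(total_events: int, validated_found: int,
              total_validated: int = TOTAL_VALIDATED) -> Dict[str, float]:
    """Compute recovery and matching ratios for one tool."""
    return {
        "recovered_fraction": validated_found / total_validated,
        "matched_fraction": validated_found / total_events,
        "missed": total_validated - validated_found,
    }


if __name__ == "__main__":
    for tool, (total, found) in MCF7_RESULTS.items():
        s = summarize(total, found)
        print(f"{tool}: recovered {s['recovered_fraction']:.1%}, "
              f"matched {s['matched_fraction']:.1%}, missed {s['missed']}")
```

The `missed` values reproduce the "Validated events not found" row (57, 54, 56), which is a quick consistency check on the table.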
Comparison with current software
(Validated Events)
Library: MCF7 (A37098)
Total Events Found: 45/89
Events unique to ChimeraScan: 2/89
Events unique to deFuse: 5/89
Events unique to Trans-ABySS: 5/89
[Venn diagram of Trans-ABySS, deFuse and ChimeraScan: unique regions 2, 5 and 5; remaining overlap regions 22, 3, 3 and 5]
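The overlap regions of this three-way Venn diagram can be cross-checked against the per-tool totals for MCF7 (32, 35 and 33 validated events for ChimeraScan, deFuse and Trans-ABySS). The sketch below assumes that 22 is the number of events found by all three tools, then solves for the three pairwise-only overlaps; the function name is hypothetical.

```python
from typing import Dict, Tuple


def pairwise_only_overlaps(totals: Dict[str, int], unique: Dict[str, int],
                           shared_by_all: int) -> Dict[Tuple[str, str], int]:
    """Solve for events shared by exactly two of three tools."""
    a, b, c = sorted(totals)
    # r[t]: events of tool t that are shared with exactly one other tool.
    r = {t: totals[t] - unique[t] - shared_by_all for t in (a, b, c)}
    # r[a] = ab + ac, r[b] = ab + bc, r[c] = ac + bc  =>  sum = 2 * (ab + ac + bc)
    twice = r[a] + r[b] + r[c]
    if twice % 2:
        raise ValueError("counts are inconsistent with a three-set Venn diagram")
    s = twice // 2
    return {(a, b): s - r[c], (a, c): s - r[b], (b, c): s - r[a]}


if __name__ == "__main__":
    totals = {"ChimeraScan": 32, "deFuse": 35, "Trans-ABySS": 33}
    unique = {"ChimeraScan": 2, "deFuse": 5, "Trans-ABySS": 5}
    overlaps = pairwise_only_overlaps(totals, unique, shared_by_all=22)
    print(overlaps)
    print(sum(unique.values()) + 22 + sum(overlaps.values()))  # 45
```

Under that assumption the pairwise-only overlaps come out as 5 (ChimeraScan and deFuse), 3 (ChimeraScan and Trans-ABySS) and 3 (deFuse and Trans-ABySS), and all regions sum to the 45 events reported as found.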
Comparison with current software
(All Events)
Library: MCF7 (A37098)
Total Events Found: 10,160
Events unique to ChimeraScan: 587/10,160
Events unique to deFuse: 502/10,160
Events unique to Trans-ABySS: 8,857/10,160
[Venn diagram of Trans-ABySS, deFuse and ChimeraScan: unique regions 587, 502 and 8,857; remaining overlap regions 15, 172, 8 and 19]
Comparison with current software
UHR LIBRARY                  ChimeraScan   deFuse (filtered)   Trans-ABySS (1.4.8)
Total events                 1304          192                 78
Validated events found       21/68         14/68               21/68
Validated events not found   47            54                  47
68 events were listed in the publications
14 events out of 68 were externally verified events
44 events out of 68 were previously known events
Note: The events for Trans-ABySS were taken from the
'sense_fusions.tsv' tabular file.
Comparison with current software
(Validated Events)
Library: UHR (Z01229)
Total Events Found: 28/68
Events unique to ChimeraScan: 4/68
Events unique to deFuse: 0/68
Events unique to Trans-ABySS: 5/68
[Venn diagram of Trans-ABySS, deFuse and ChimeraScan: unique regions 4, 0 and 5; remaining overlap regions 5, 9, 3 and 2]
Comparison with current software
(All Events)
Library: UHR (Z01229)
Total Events Found: 18,015
Events unique to ChimeraScan: 1,279/18,015
Events unique to deFuse: 154/18,015
Events unique to Trans-ABySS: 16,558/18,015
[Venn diagram of Trans-ABySS, deFuse and ChimeraScan: unique regions 1,279, 154 and 16,558; remaining labels 6, 9, 9 and 24]
Comparison with current software
(All Events)
Library: COLO-829 (A36972)
Total Events Found: 3,361
Events unique to ChimeraScan: 458/3,361
Events unique to deFuse: 225/3,361
Events unique to Trans-ABySS: 2,668/3,361
[Venn diagram of Trans-ABySS, deFuse and ChimeraScan: unique regions 458, 225 and 2,668; remaining overlap regions 0, 1, 6 and 3]
What's Next???
• Improve Runtime
• Find an injective mapping from chimeras.bedpe event types to our
current set of event types
Reference(s)
• Iyer MK, Chinnaiyan AM, Maher CA. ChimeraScan: a tool for
identifying chimeric transcription in sequencing data.
Bioinformatics. 2011;27(20):2903-2904.
doi:10.1093/bioinformatics/btr467.
• Weirather JL, Afshar PT, Clark TA, et al. Characterization of fusion
genes and the significantly expressed fusion isoforms in breast
cancer by hybrid sequencing. Nucleic Acids Research.
2015;43(18):e116. doi:10.1093/nar/gkv562.
ACKNOWLEDGEMENTS
Karen Mungall
Caleb Choo

BC-Cancer ChimeraScan Presentation
