Co-OP Presentation
My contribution to the Genome Sciences Centre
September 2015 to April 2016
OVERVIEW
• Pipelines
• Projects
• Validation(s)
• ChimeraScan
• Trinity
• Manta
• Development
• Additional Work
• What I learned
• What I can improve
• Moving forward
• Acknowledgments
Pipelines
• ABySS: Assemble short reads by a de novo, parallel, paired-end sequence assembler
• Trans-ABySS: Analyze assemblies for structural variants and splice variants
using a reference genome and annotations.
• Genome-Validator: Validate fusion and indel events from Trans-ABySS
against given BAM files and attempt to assign ‘tumourigenicity’ as ‘somatic’
or ‘germline’ to events when both a normal and a tumour genome are given.
• Delly: Discover and genotype structural variants in parallel sequencing
data using split-read and paired-end evidence.
• Microbial Detection Pipeline: Detect bacterial and/or viral sequences to
determine potential contamination or integration into the genome.
• Integration Site Pipeline: Detect putative integration sites of viral sequences
in human sequences.
• Probing Pipeline: Detect fusion and SNP mutations in genome and
transcriptome libraries.
• Compression and Transfer: Compress files and transfer them off scratch space
for archiving, reducing total usage of scratch space.
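The compression and transfer step can be illustrated with a minimal Python sketch. This is not the pipeline used at the Centre; the function name compress_and_transfer, the gzip format, and the checksum check are all assumptions made for illustration. It compresses a file into an archive directory, confirms the compressed copy decompresses to the same bytes, and only then removes the original from scratch.

```python
import gzip
import hashlib
import shutil
from pathlib import Path


def _sha256(stream, chunk_size=1 << 20):
    """Hash a binary stream in chunks so large files are never held in memory."""
    digest = hashlib.sha256()
    for chunk in iter(lambda: stream.read(chunk_size), b""):
        digest.update(chunk)
    return digest.hexdigest()


def compress_and_transfer(src: Path, archive_dir: Path) -> Path:
    """Hypothetical helper: gzip src into archive_dir, verify, then delete src."""
    archive_dir.mkdir(parents=True, exist_ok=True)
    dest = archive_dir / (src.name + ".gz")

    # Stream the file through gzip rather than reading it whole.
    with open(src, "rb") as fin, gzip.open(dest, "wb") as fout:
        shutil.copyfileobj(fin, fout)

    # Compare checksums of the original and the decompressed archive copy.
    with open(src, "rb") as original, gzip.open(dest, "rb") as archived:
        if _sha256(original) != _sha256(archived):
            dest.unlink()
            raise IOError(f"checksum mismatch while archiving {src}")

    # Only free the scratch space once the archived copy is known to be good.
    src.unlink()
    return dest
```

Deleting the original only after verification means a failed or interrupted transfer leaves the scratch copy in place.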
Projects
• TCGA LIHC
• TCGA MESO
• NCI HER2 BRCA
• GPH Lymphoma
• TCGA BLCA
• TCGA SARC
• WES CHOL
• TCGA UVM
• COLO-829
• Kaplan
• HCI HIV Cervical
• MCF7
• TCGA THYM
Validations
• ChimeraScan-0.4.5
• hg38 Annotations
• Trinity (partially)
• Manta
ChimeraScan-0.4.5
A software package that detects gene fusions in paired-end RNA
sequencing (RNA-Seq) datasets. It differs from other fusion finders (e.g. deFUSE)
in that it adds a fragmentation step to the paired-end approach that
deFUSE also uses.
Script(s):
• setup:
– /projects/trans_scratch/software/chimerascan/scripts/chimerascan_setup_final.sh
• checker:
– /projects/trans_scratch/software/chimerascan/scripts/chimerascan_checker.sh
• cleaner:
– /projects/trans_scratch/software/chimerascan/scripts/chimerascan_cleaner.sh
• binner:
– /projects/trans_scratch/software/chimerascan/scripts/binning_beta.py
• summarizer:
– /projects/trans_scratch/software/chimerascan/scripts/chimerascan.sum.sh
• report generator:
– /projects/trans_scratch/software/chimerascan/scripts/ChimeraScan.report.sh
Manta
Rapid detection of structural variants and indels for clinical sequencing
applications
Script(s):
• manta_sum.sh:
– /home/ewillie/tools/scripts/manta_sum.sh
• manta_delly_overlay.py:
– /home/ewillie/tools/scripts/manta_delly_overlay.py
• manta_gv2_overlay.py:
– /home/ewillie/tools/scripts/manta_gv2_overlay.py
• vcfToBedpe:
– /projects/trans_scratch/software/svtools-Manta2Bedpe/vcfToBedpe
Development
Overlay/Setup Script(s):
• manta_delly_overlay.py:
– /home/ewillie/tools/scripts/manta_delly_overlay.py
• manta_gv2_overlay.py:
– /home/ewillie/tools/scripts/manta_gv2_overlay.py
• transabyss_defuse_overlay.py:
– /home/ewillie/tools/scripts/transabyss_defuse_overlay.py
• trinity_setup.sh:
– /projects/trans_scratch/trinityrnaseq-2.1.1/trinity_setup.sh
Additional Work
• Assemblies: Run ABySS to assemble sample(s) for downstream analysis.
• Analyses: Run various analysis tools on data and compare their results by means of
overlays and/or visualization.
• Overlays: Compare results between different tools or different settings to find
similarities and differences. The overlays are done using appropriate scripts, and Venn
diagrams are generated to help illustrate similarities and/or differences.
• Testing Scripts: Test new scripts such as integration_pipeline.sh for potential
bugs and ease of use. Testing was done iteratively, with each iteration providing more
confidence.
• ChimeraScan Wiki: Create a comprehensive wiki with information regarding
validation and a detailed procedure for running the tool. It also covers the
installation procedure, resource requirements, interpretation of the
outputs, and debugging information.
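The overlay idea described above can be sketched in a few lines of Python. This is an illustrative sketch only, not the logic of manta_delly_overlay.py or the other overlay scripts: the Breakpoint type, the overlay function, and the 10 bp tolerance are assumptions. Two call sets are compared by chromosome and position, and the result is split into the three regions of a two-set Venn diagram.

```python
from typing import List, NamedTuple, Tuple


class Breakpoint(NamedTuple):
    """Hypothetical minimal representation of one structural variant call."""
    chrom: str
    pos: int


def overlay(
    calls_a: List[Breakpoint],
    calls_b: List[Breakpoint],
    tolerance: int = 10,
) -> Tuple[List[Breakpoint], List[Breakpoint], List[Breakpoint]]:
    """Return (only in A, shared, only in B).

    Two calls are treated as the same event when they are on the same
    chromosome and their positions differ by at most `tolerance` bases.
    Shared events are reported using the coordinates from set A.
    """
    shared_a = set()
    shared_b = set()
    for a in calls_a:
        for b in calls_b:
            if a.chrom == b.chrom and abs(a.pos - b.pos) <= tolerance:
                shared_a.add(a)
                shared_b.add(b)

    only_a = [a for a in calls_a if a not in shared_a]
    only_b = [b for b in calls_b if b not in shared_b]
    return only_a, sorted(shared_a), only_b
```

The lengths of the three returned lists are the counts a Venn diagram would display. The pairwise loop is quadratic, which is adequate for a sketch; larger call sets would benefit from sorting or an interval index.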
ChimeraScan Wiki
What I Learned
• Real world applications of bioinformatics.
• Problem solving including troubleshooting, debugging and querying the
literature.
• Bash scripting, including significant knowledge of terminal
commands.
• Writing scripts to save time and improve the efficiency of jobs. (Do a job manually
for > 2 hrs, or write a script to do it in a fraction of that time.)
• A greater attention to detail to help reduce rate of errors.
• Time management, task prioritization and meeting deadlines.
• Visualizing and analyzing structural variants using IGV.
What I could work on
Problem solving and troubleshooting skills.
Deeper understanding of the SVIA pipeline tools.
Clear and concise presentation of my results.
Minimizing my rate of error when performing tasks.
Verbal presentation skills.
Developing an appetite for personal projects.
Any suggestions?
Moving Forward
My interest in the algorithmic aspect of genomics has grown tremendously,
enticing me to take more applied algorithm courses.
Obtaining a genomics certificate as part of my degree to further develop my
interest in genomic sciences.
Since I am now aware of the qualities and skills that are needed to be successful
in this rapidly changing industry, I will be dedicating time to further develop
these qualities and sharpen these skills.
Improving my scripting abilities in both Python and Bash to build on the
experience I have already gained here during the last eight months.
Applying the knowledge and skills I have acquired here in order to be successful
in a different work environment.
Acknowledgement
Karen Mungall
Yussanne Ma
Caleb Choo
Caralyn Reisle
Dustin Bleile
Melika Bonakdar
Stuart Zong
Diana Palmquist
Gordon Robertson
Serena Chan
Karen Eddy
