Talk at BaseSpace Developer conference SF 2013

  • Figure 2. Graphical representation of the total number of downstream false positives expressed as a percentage of the true positive mutations detected following alignment with each aligner in single and paired-end mode across 67 groups of simulated Illumina reads. Each read group contained 20 SNPs and 13 INDELs. How to cite: Oliver GR. 2012 Considerations for clinical read alignment and mutational profiling using next-generation sequencing [v2; ref status: indexed, http://f1000r.es/NMpsFc] F1000Research 2012, 1:2 (doi: 10.12688/f1000research.1-2.v2)

    1. Bioinformatics|Software|Services
       NOVOALIGN BASESPACE APP
       Zayed Albertyn, Bioinformatics Director, Novocraft Technologies Sdn Bhd
       Illumina® BaseSpace Developer Conference, San Francisco, 9th December 2013
    2. Novocraft Technologies Sdn Bhd
       • Incorporated in 2008, BioNexus Status Company
       • Small team of Mathematicians, Biologists & Software Engineers
       • Develop Innovation & World Class Products
       • High-Performance Computing in growing Genomics Era
       • International Market & User Base
    3. Products
       • Novoalign – Illumina, 454
       • NovoalignCS – SOLiD
       • Novosort
       • Cluster Solutions – NovoalignMPI, NovoalignCSMPI
       • NGS WorkBench (web)
       • All running on standard commodity hardware
         – No special GPU/supercomputer required
         – Mac OS & Linux versions available
         – Open source operating system (Linux)
       • NGS Cloud computing HPC workflows
         – Amazon EC2/S3/EBS
    4. NGS Services
       • Automated pipelines
       • Consultation on NextGen projects
       • Illumina and other platforms
       • In-house/custom and open source software
       • Exome
       • Whole genome
       • SNV, Indel, Structural Variations (SVs)
       • RNASeq
       • ChIP-Seq
       • Methylome
       • Small RNA
       • De novo assembly
       • Cloud Solutions: packaged AMIs, containers
    5. Collaborations
       • Academic/research institutes
       • Industry
         – HPC providers
         – Pharma
         – Cloud solutions
       • Resellers
         – US and Global
    6. A few of our NOVOALIGN users
    7. User Examples
    8. NOVOALIGN
       • Hash-based aligner
       • Peer reviewed publications: 2009-present
       • Accuracy – SNPs and short Indels
       • Read length > 250 bp as of V3.X.X
    9. ROC Curves
       • True Positive vs False positive rate
       • Higher Y value: better at finding the “true” result
       • Lower X value: better at excluding “false” results
       http://lh3lh3.users.sourceforge.net/alnROC.shtml
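The two axes on slide 9 can be made concrete with a small threshold sweep. The sketch below is illustrative only and is not taken from the talk or from the linked benchmark: it assumes each read carries a confidence score (for example a mapping quality) and a flag saying whether its alignment is correct, and the function name `roc_points` is hypothetical.

```python
from typing import List, Sequence, Tuple


def roc_points(scores: Sequence[float], labels: Sequence[bool]) -> List[Tuple[float, float]]:
    """Return (false positive rate, true positive rate) pairs, one per distinct score threshold.

    Assumption: a read is "called" at threshold t when its score is >= t,
    and labels[i] is True when read i was aligned correctly.
    """
    if len(scores) != len(labels):
        raise ValueError("scores and labels must have the same length")
    positives = sum(1 for label in labels if label)
    negatives = len(labels) - positives
    if positives == 0 or negatives == 0:
        raise ValueError("need at least one correct and one incorrect read")

    # Walk from the strictest threshold to the most permissive one.
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    points = [(0.0, 0.0)]
    true_pos = false_pos = 0
    i = 0
    while i < len(ranked):
        threshold = ranked[i][0]
        # Reads sharing a score enter the called set together.
        while i < len(ranked) and ranked[i][0] == threshold:
            if ranked[i][1]:
                true_pos += 1
            else:
                false_pos += 1
            i += 1
        points.append((false_pos / negatives, true_pos / positives))
    return points


if __name__ == "__main__":
    # Made-up scores and labels, purely to exercise the function.
    for fpr, tpr in roc_points([60, 60, 40, 20, 0], [True, True, True, False, False]):
        print(f"FPR={fpr:.2f}  TPR={tpr:.2f}")
```

A curve that climbs quickly in Y while staying near X = 0 corresponds to the "better" aligner in the sense of the two bullets above.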
    10. The performance of various methods for mapping reads to reference repeats.
        Highnam G et al. Nucl. Acids Res. 2013;41:e32
    11. The performance of various methods for mapping reads to reference repeats.
        Highnam G et al. Nucl. Acids Res. 2013;41:e32
    12. Genome-in-a-bottle Consortium dataset
        http://www.bioplanet.com/gcat
        http://www.bioplanet.com/gcat/reports/112/variant-calls/ion-torrent-225bp-se-exome-30x/novoalign-gatk-ug/compare-183-119/group-read-depth
    13. “Our standard workflow uses novoalign based on its stringency in resolving large insertions and deletions. These results suggest equally good results using bwa mem, along with improved processing times”
        Courtesy of Brad Chapman & Oliver Hoffman, HSPH
        http://bcbio.wordpress.com
    14. Graphical representation of the total number of downstream false positives expressed as a percentage...
        Oliver GR. 2012 [http://f1000r.es/NMpsFc] F1000Research 2012, 1:2 (doi: 10.12688/f1000research.1-2.v2)
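The metric in this figure (spelled out in full in the slide note: false positives as a percentage of true positives, over 67 read groups of 20 SNPs and 13 INDELs each) is simple arithmetic. The sketch below shows one way to compute it; the function name is hypothetical, and the counts in the usage line are invented for illustration and are not results from the paper.

```python
GROUPS = 67
SNPS_PER_GROUP = 20
INDELS_PER_GROUP = 13

# Total simulated mutations across all read groups: 67 * (20 + 13).
TOTAL_SIMULATED_MUTATIONS = GROUPS * (SNPS_PER_GROUP + INDELS_PER_GROUP)


def false_positive_percentage(false_positives: int, true_positives: int) -> float:
    """Express downstream false positives as a percentage of detected true positives."""
    if true_positives <= 0:
        raise ValueError("true_positives must be positive")
    return 100.0 * false_positives / true_positives


if __name__ == "__main__":
    print(TOTAL_SIMULATED_MUTATIONS)
    # Invented counts, only to show the call.
    print(false_positive_percentage(5, 200))
```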
    15. Novosort comparison on Illumina reads
    16. Developing on BaseSpace
    17. Motivation
        • Reach out to more users
        • Enable seamless integration with the cloud
        • Establish BaseSpace Novoalign community
    18. What is the App?
        • Alignment
          – Alignment Quality Calibration
          – Multithreaded
          – Adaptor stripping
        • Sorting
          – Novosort
          – Multithreaded
        • Variant Calling
          – Freebayes
          – SNPs & Indels
    19. What is the App?
        • Novoalign
          – Paired-end
          – Human genome only, others later
          – Caveat: requires a machine with at least 8 GB RAM
        • Alignment coordinate-sorting
          – Novosort
        • Variant Calling
          – Freebayes (Erik Garrison & Gabor Marth)
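The three stages named on this slide (alignment, coordinate sorting, variant calling) can be sketched as a chain of command lines. This is a rough illustration and not the app's actual back-end: the file names are placeholders, the intermediate `samtools view` conversion step is an assumption that does not appear in the talk, and every flag shown should be checked against the documentation of the installed tool versions before use.

```python
import shlex
import subprocess
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Stage:
    name: str
    argv: List[str]
    stdout_path: Optional[str] = None  # file that captures stdout, if any


def plan_pipeline(index: str, read1: str, read2: str, reference: str, prefix: str) -> List[Stage]:
    """Build the stage list; all paths are caller-supplied placeholders."""
    sam, bam = f"{prefix}.sam", f"{prefix}.bam"
    sorted_bam, vcf = f"{prefix}.sorted.bam", f"{prefix}.vcf"
    return [
        # Paired-end alignment; SAM is written to stdout and captured to a file.
        Stage("align", ["novoalign", "-d", index, "-f", read1, read2, "-o", "SAM"], sam),
        # Assumed helper step (not in the talk): convert SAM to BAM.
        Stage("to_bam", ["samtools", "view", "-b", "-o", bam, sam]),
        # Coordinate sort.
        Stage("sort", ["novosort", "-o", sorted_bam, bam]),
        # SNP and indel calling.
        Stage("call", ["freebayes", "-f", reference, "-v", vcf, sorted_bam]),
    ]


def run_pipeline(stages: List[Stage], dry_run: bool = True) -> List[str]:
    """Render each stage as a shell-style line; execute it only when dry_run is False."""
    rendered = []
    for stage in stages:
        line = shlex.join(stage.argv)
        if stage.stdout_path:
            line += " > " + shlex.quote(stage.stdout_path)
        rendered.append(line)
        if dry_run:
            continue
        if stage.stdout_path:
            with open(stage.stdout_path, "wb") as out:
                subprocess.run(stage.argv, stdout=out, check=True)
        else:
            subprocess.run(stage.argv, check=True)
    return rendered


if __name__ == "__main__":
    plan = plan_pipeline("hg19.nix", "r1.fq", "r2.fq", "hg19.fa", "sample")
    for line in run_pipeline(plan, dry_run=True):
        print(line)
```

Keeping the plan separate from the runner makes it possible to inspect the commands in a dry run before committing compute time to a whole-genome alignment.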
    22. New-developer Challenges
        • The “Docker” way of doing things
          – Image vs Container
        • Front-end: JavaScript/CSS
        • Back-end: Algorithms/scripting
    23. [Diagram] Perl/C++/R/Python; Front-end process; Back-end process
    24. Back-end Development Process
        • Start the Native VM
          – VMware
          – Linux environment
        • Start your own Docker Repository
          – Create new IMAGE on Docker.io
          – Done automatically on your first push
        • Attach to your image
          – Docker run …
        • Make small test dataset
          – Illumina cancer panel read
          – Subset chr22 alignments
        • Develop the app back-end process
          – Automated script runs pipeline
          – Alignment -> sorting -> variant calling
        • Postprocess
          – Charting with R
          – ggplot2
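The "Subset chr22 alignments" step can be illustrated with a plain-text SAM filter. The talk does not say how the subset was produced, so this is only one possible approach: it assumes uncompressed SAM text and a contig named `chr22`, and the function name `subset_sam_by_contig` is hypothetical.

```python
from typing import Iterable, List


def subset_sam_by_contig(lines: Iterable[str], contig: str) -> List[str]:
    """Keep header lines and the alignment records whose RNAME equals `contig`.

    @SQ header lines for other contigs are dropped so the header matches the subset.
    """
    kept = []
    for raw in lines:
        line = raw.rstrip("\n")
        if not line:
            continue
        fields = line.split("\t")
        if line.startswith("@"):
            # Header line: drop sequence-dictionary entries for other contigs.
            if fields[0] == "@SQ" and f"SN:{contig}" not in fields[1:]:
                continue
            kept.append(line)
        elif len(fields) >= 11 and fields[2] == contig:
            # Alignment record: column 3 (RNAME) is the reference sequence name.
            kept.append(line)
    return kept


if __name__ == "__main__":
    example = [
        "@HD\tVN:1.4",
        "@SQ\tSN:chr21\tLN:100",
        "@SQ\tSN:chr22\tLN:200",
        "r1\t0\tchr21\t5\t60\t4M\t*\t0\t0\tACGT\tIIII",
        "r2\t0\tchr22\t7\t60\t4M\t*\t0\t0\tACGT\tIIII",
    ]
    print("\n".join(subset_sam_by_contig(example, "chr22")))
```

A small subset like this keeps each test run of the pipeline short while the back-end scripts are still changing.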
    25. Front-end Development Process
        • BaseSpace Developer tools
          – Code editor
          – Preview form inputs
        • Build Report form
          – Write Liquid/JS/HTML5
        • Initiate test runs
          – Send data to your back-end Native app
    26. App Screenshots
    31. Acknowledgements
        • Novocraft
          – Leadership: Colin Hercus, Haniza Hashim
          – Bioinformatics: Akzam Saidin, Kaamesh Kaamahalaran, Abdul Malik Ahmad
          – Software Development: Deepa Murugan, Sharon Chin, Laura Hamit
        • Illumina: Raymond Teckotzky, Mayank Tyagi
        • VT/GeneByGene: David Mittelman, Gareth Highnam, Nir Liebovich, Jason Wang
        • HSPH Bioinformatics Core: Oliver Hoffman, Brad Chapman
