Talk at BaseSpace Developer conference SF 2013
  • Figure 2. Graphical representation of the total number of downstream false positives, expressed as a percentage of the true positive mutations detected following alignment with each aligner in single and paired-end mode across 67 groups of simulated Illumina reads. Each read group contained 20 SNPs and 13 INDELs.
    How to cite: Oliver GR. 2012. Considerations for clinical read alignment and mutational profiling using next-generation sequencing [v2; ref status: indexed, http://f1000r.es/NMpsFc]. F1000Research 2012, 1:2 (doi: 10.12688/f1000research.1-2.v2)

Transcript

  • 1. Bioinformatics|Software|Services
    NOVOALIGN BASESPACE APP
    Zayed Albertyn, Bioinformatics Director, Novocraft Technologies Sdn Bhd
    Illumina® BaseSpace Developer Conference, San Francisco, 9th December 2013
  • 2. Novocraft Technologies Sdn Bhd
      • Incorporated in 2008, BioNexus Status Company
      • Small team of Mathematicians, Biologists & Software Engineers
      • Develops innovative, world-class products
      • High-performance computing for the growing genomics era
      • International market & user base
  • 3. Products
      • Novoalign: Illumina, 454
      • NovoalignCS: SOLiD
      • Novosort
      • Cluster Solutions: NovoalignMPI, NovoalignCSMPI
      • NGS WorkBench (web)
      • All running on standard commodity hardware
          – No special GPU/supercomputer required
          – Mac OS & Linux versions available
          – Open source operating system (Linux)
      • NGS Cloud computing HPC workflows
          – Amazon EC2/S3/EBS
  • 4. NGS Services
      • Automated pipelines
      • Consultation on NextGen projects
      • Illumina and other platforms
      • In-house/custom and open source software
      • Exome
      • Whole genome
      • SNV, Indel, Structural Variations (SVs)
      • RNASeq
      • CHIP-Seq
      • Methylome
      • Small RNA
      • de-novo assembly
      • Cloud Solutions: packaged AMIs, containers
  • 5. Collaborations
      • Academic/research institutes
      • Industry
          – HPC providers
          – Pharma
          – Cloud solutions
      • Resellers
          – US and Global
  • 6. A few of our NOVOALIGN users
  • 7. User Examples
  • 8. NOVOALIGN
      • Hash-based aligner
      • Peer reviewed publications: 2009-present
      • Accuracy
          – SNPs and short Indels
      • Read length > 250 bp as of V3.X.X
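To illustrate what "hash-based aligner" means in general terms, here is a minimal toy sketch of seed lookup through a k-mer hash table. It is not Novoalign's actual algorithm, which the slides do not describe; the function names, the reference string, and the value of k are all made up for illustration.

```python
from collections import defaultdict


def build_kmer_index(reference, k):
    """Hash every k-mer of the reference to the positions where it starts."""
    index = defaultdict(list)
    for pos in range(len(reference) - k + 1):
        index[reference[pos:pos + k]].append(pos)
    return index


def candidate_positions(read, index, k):
    """Let each k-mer hit vote for the read start position it implies.

    Returns (start, votes) pairs, best supported first.
    """
    votes = defaultdict(int)
    for offset in range(len(read) - k + 1):
        for hit in index.get(read[offset:offset + k], []):
            votes[hit - offset] += 1
    return sorted(votes.items(), key=lambda item: (-item[1], item[0]))


# Hypothetical toy data: a short reference and a read copied from position 10.
reference = "ACGTTGCATGCCGATAGCTAGGCTTAACG"
index = build_kmer_index(reference, k=4)
read = reference[10:22]
best_start, support = candidate_positions(read, index, k=4)[0]
print(best_start, support)
```

A real aligner would follow the seed stage with a scored alignment of the read against each candidate location; this sketch stops at candidate ranking.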
  • 9. ROC Curves
      • True positive rate vs false positive rate
      • Higher Y value: better at finding the “true” result
      • Lower X value: better at excluding “false” results
    http://lh3lh3.users.sourceforge.net/alnROC.shtml
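As a rough sketch of how such a curve can be derived for an aligner, the snippet below sweeps a mapping-quality threshold over simulated reads whose true origin is known and records the false positive rate (X) and true positive rate (Y) at each threshold. The rate definitions, the function name, and the eight example alignments are assumptions for illustration, not data from the talk or the linked page.

```python
def roc_points(alignments):
    """Compute ROC points from (mapq, is_correct) pairs.

    For each distinct mapping quality, keep alignments at or above it and
    return (threshold, false_positive_rate, true_positive_rate).
    """
    total_correct = sum(1 for _, ok in alignments if ok)
    total_wrong = len(alignments) - total_correct
    points = []
    for threshold in sorted({mapq for mapq, _ in alignments}, reverse=True):
        kept = [ok for mapq, ok in alignments if mapq >= threshold]
        true_pos = sum(kept)
        false_pos = len(kept) - true_pos
        fpr = false_pos / total_wrong if total_wrong else 0.0
        tpr = true_pos / total_correct if total_correct else 0.0
        points.append((threshold, fpr, tpr))
    return points


# Hypothetical simulated reads: (mapping quality, aligned to the true location?)
alignments = [
    (70, True), (70, True), (60, True), (60, False),
    (30, True), (30, False), (0, False), (0, True),
]
points = roc_points(alignments)
for threshold, fpr, tpr in points:
    print(f"MAPQ >= {threshold:2d}: FPR = {fpr:.2f}, TPR = {tpr:.2f}")
```

A curve that rises quickly in Y while staying low in X corresponds to an aligner whose high-quality alignments are almost all correct.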
  • 10. The performance of various methods for mapping reads to reference repeats. Highnam G et al. Nucl. Acids Res. 2013;41:e32
  • 11. The performance of various methods for mapping reads to reference repeats. Highnam G et al. Nucl. Acids Res. 2013;41:e32
  • 12. Genome-in-a-bottle Consortium dataset
    http://www.bioplanet.com/gcat
    http://www.bioplanet.com/gcat/reports/112/variant-calls/ion-torrent-225bp-se-exome-30x/novoalign-gatk-ug/compare-183-119/group-read-depth
  • 13. “Our standard workflow uses novoalign based on its stringency in resolving large insertions and deletions. These results suggest equally good results using bwa mem, along with improved processing times.”
    Courtesy Brad Chapman & Oliver Hoffman, HSPH. http://bcbio.wordpress.com
  • 14. Graphical representation of the total number of downstream false positives expressed as a percentage...
    Oliver GR. 2012 [http://f1000r.es/NMpsFc] F1000Research 2012, 1:2 (doi: 10.12688/f1000research.1-2.v2)
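The metric in this figure is simple arithmetic, sketched below. The group size (20 SNPs and 13 INDELs) and the number of groups (67) come from the figure caption; the true-positive and false-positive counts passed to the function are invented placeholders, not results from the paper.

```python
def false_positive_percentage(false_positives, true_positives):
    """Downstream false positives as a percentage of true positives detected."""
    if true_positives == 0:
        raise ValueError("no true positives detected; percentage is undefined")
    return 100.0 * false_positives / true_positives


# From the figure caption: 67 read groups, each with 20 SNPs and 13 INDELs.
GROUPS = 67
SNPS_PER_GROUP = 20
INDELS_PER_GROUP = 13
simulated_mutations = GROUPS * (SNPS_PER_GROUP + INDELS_PER_GROUP)

# Hypothetical counts for one aligner, for illustration only.
example = false_positive_percentage(false_positives=50, true_positives=2000)
print(simulated_mutations, example)
```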
  • 15. Novosort comparison on Illumina reads
  • 16. Developing on BaseSpace
  • 17. Motivation
      • Reach out to more users
      • Enable seamless integration with the cloud
      • Establish BaseSpace Novoalign community
  • 18. What is the App?
      • Alignment
          – Alignment Quality Calibration
          – Multithreaded
          – Adaptor stripping
      • Sorting
          – Novosort
          – Multithreaded
      • Variant Calling
          – Freebayes
          – SNPs & Indels
  • 19. What is the App?
      • Novoalign
          – Paired-end
          – Human genome only, others later
          – Caveat: requires a machine with at least 8 GB RAM
      • Alignment coordinate-sorting
          – Novosort
      • Variant Calling
          – Freebayes (Erik Garrison & Gabor Marth)
  • 22. New-developer Challenges
      • The “Docker” way of doing things
          – Image vs Container
      • Front-end: JavaScript/CSS
      • Back-end: Algorithms/scripting
  • 23. Perl/C++/R/Python; Front-end process; Back-end process
  • 24. Back-end Development Process
      • Start the Native VM
          – VMware
          – Linux environment
      • Start your own Docker Repository
          – Create new IMAGE on Docker.io
          – Done automatically on your first push
      • Attach to your image
          – Docker run …
      • Make small test dataset
          – Illumina cancer panel read
          – Subset chr22 alignments
      • Develop the app back-end process
          – Automated script runs pipeline
          – Alignment->sorting->variant calling
      • Postprocess
          – Charting with R
          – ggplot2
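The "automated script runs pipeline" step (alignment, then sorting, then variant calling) might look roughly like the sketch below. This is not the app's actual script. The file names are hypothetical, and the command lines are assumed typical invocations of novoalign, novosort and freebayes that should be checked against each tool's documentation; in particular, it assumes the installed novosort accepts SAM input. By default the sketch only prints the commands.

```python
import subprocess


def build_stages(index, reference, read1, read2, prefix, threads=4):
    """Return (name, argv, stdout_path) for each stage, in execution order.

    stdout_path is None when the tool writes its own output file.
    All flags below are assumptions; verify them against each tool's manual.
    """
    sam = f"{prefix}.sam"
    bam = f"{prefix}.sorted.bam"
    vcf = f"{prefix}.vcf"
    return [
        # Paired-end alignment, SAM written to stdout.
        ("align", ["novoalign", "-d", index, "-f", read1, read2, "-o", "SAM"], sam),
        # Coordinate sort and index; assumes this novosort build reads SAM.
        ("sort", ["novosort", "-c", str(threads), "-i", "-o", bam, sam], None),
        # SNP and indel calling, VCF written to stdout.
        ("call", ["freebayes", "-f", reference, bam], vcf),
    ]


def run_stages(stages, dry_run=True):
    """Run stages in order and stop at the first failing command."""
    executed = []
    for name, argv, stdout_path in stages:
        executed.append(name)
        if dry_run:
            redirect = f" > {stdout_path}" if stdout_path else ""
            print(f"[{name}] {' '.join(argv)}{redirect}")
            continue
        if stdout_path:
            with open(stdout_path, "w") as out:
                subprocess.run(argv, stdout=out, check=True)
        else:
            subprocess.run(argv, check=True)
    return executed


# Hypothetical inputs for a small chr22 test dataset.
stages = build_stages("hg19.nix", "hg19.fa", "panel_R1.fastq", "panel_R2.fastq", "chr22_test")
order = run_stages(stages, dry_run=True)
```

Keeping the stages as data makes it easy to run the same script on a small test subset inside the container before pointing it at full-size inputs.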
  • 25. Front-end Development Process
      • BaseSpace Developer tools
          – Code editor
          – Preview form inputs
      • Build Report form
          – Write Liquid/Js/HTML5
      • Initiate test runs
          – Send data to your backend Native app
  • 26. App Screenshots
  • 31. Acknowledgements
      • Novocraft
          – Leadership: Colin Hercus, Haniza Hashim
          – Bioinformatics: Akzam Saidin, Kaamesh Kaamahalaran, Abdul Malik Ahmad
          – Software Development: Deepa Murugan, Sharon Chin, Laura Hamit
      • Illumina: Raymond Teckotzky, Mayank Tyagi
      • VT/GeneByGene: David Mittelman, Gareth Highnam, Nir Liebovich, Jason Wang
      • HSPH Bioinformatics Core: Oliver Hoffman, Brad Chapman