THE GOBY FRAMEWORK: TOWARDS EFFICIENT NEXT-GENERATION SEQUENCING DATA ANALYSIS
http://goby.campagnelab.org
Applications of Next Generation Sequencing (McPherson J.D. Nat Methods. 2009)
Next Generation Sequencers (Metzker, M.L. Nat Rev Genet. 2010)

Platform                        NGS Chemistry             Avg Read Length (bp)   Run Time (days)   Gigabases/run   Million reads/run
Roche/454 GS FLX Titanium       Pyrosequencing            330                    0.35              0.45            1.36
Illumina/Solexa GA IIe          Reversible Terminators    75                     4                 18              240
Life Technologies SOLiD 3       Sequencing by ligation    50                     7                 30              600
Helicos BioSciences Heliscope   Reversible Terminators    32                     8                 37              1156
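As a quick consistency check on the platform figures above, the gigabases per run should be roughly the average read length multiplied by the number of reads per run. The sketch below uses only the numbers quoted on the slide; the dictionary layout and the function name are illustrative choices, not part of the original deck.

```python
# Figures quoted on the slide (Metzker, 2010):
# platform -> (avg read length in bp, million reads per run, reported gigabases per run)
PLATFORMS = {
    "Roche/454 GS FLX Titanium": (330, 1.36, 0.45),
    "Illumina/Solexa GA IIe": (75, 240, 18),
    "Life Technologies SOLiD 3": (50, 600, 30),
    "Helicos BioSciences Heliscope": (32, 1156, 37),
}


def estimated_gigabases(read_length_bp, million_reads):
    """Estimate yield in gigabases: total bases = read length x number of reads."""
    return read_length_bp * million_reads * 1e6 / 1e9


if __name__ == "__main__":
    for name, (length, reads, reported) in PLATFORMS.items():
        estimate = estimated_gigabases(length, reads)
        print(f"{name}: estimated {estimate:.2f} Gb, reported {reported} Gb")
```

For all four platforms the estimate agrees with the reported yield to within about one percent, so the table is internally consistent.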
Next Generation Sequence Data Formats
File Format Wish List
The Goby Software Framework
File Formats: reads, alignments, histograms
Low Level APIs (Java, C++, Python): Readers, Writers, Iterators
Tools/Utilities: File Format Conversions, Alignment Processing, Visualization
Applications: RNA-Seq Pipeline, IGV Plug-in
Structured non-ambiguous representation
Goby compact formats
Goby File Size Comparisons
MAQC sample B = Ambion Human Brain Reference RNA (HBRR or HBR, Catalog #6050), sequenced on four next-gen platforms
Alignment Iterator
RNA-Seq Pipeline
Sample RNA-Seq Results
Conclusion
Acknowledgements
http://goby.campagnelab.org
FDA/NCTR Leming Shi
Sequencing Quality Control Project (SEQC)
Helicos, Illumina, Life Technologies, Roche
 
Chambwe bosc2010


Editor's Notes

  1. Applications of NGS. The explosion of NGS. A gap exists between current sequence-generation capabilities and the data analysis capabilities needed to extract relevant biological insights.
  2. Several sequencing platforms are available on the market, each with unique chemistry and each producing data with different characteristics. Throughput varies and is very large.
  3. A preponderance of NGS data file formats exists to represent these data.
  4. Here is a list of characteristics we find desirable in an NGS file format. Transition: we developed file formats that meet these requirements. File formats alone are not sufficient, so we developed a framework to use these formats and create analysis tools.
  5. This is an outline of the Goby Software Framework
  6. Now I will discuss file formats.
  7. Protocol Buffers (PB): think XML, but better.
  8. Brief overview of how schemas are written using Protocol Buffers: a collection of messages of type readEntries.
  9. Transition: to achieve compression, we gzip collections of messages.
  10. Protocol Buffers do not support messages larger than a few megabytes. Goby's contribution is to implement Protocol Buffers in a way that removes the collection size limitation and scales to very large collections of messages. The limitation is overcome by splitting the collection into chunks: each chunk of a compact reads file holds 10,000 or fewer ReadEntry messages. Chunking supports semi-random access and is leveraged for parallel processing, because different servers can access chunks independently.
  11. Gzip and chunking meet the requirements for random access and streaming. Transition: how well do we do with respect to file sizes?
  12. Apples-to-apples comparison. Multiple alignments.
  13. The formats are compact. How can YOU use them? Low-level APIs.
  14. One practical example: printing the entries in an alignment file. Goby makes it easy to write code that iterates over the contents of multiple compact alignment files.
  15. Goby provides utilities to help build analysis pipelines.
  16. MAQC sample B = Ambion Human Brain Reference RNA (HBRR or HBR, Catalog #6050), sequenced on multiple platforms. Normalized gene expression counts (RPKM). Random hexamer priming results in a bias in nucleotide composition at the start of sequence reads (Hansen KD. et al. Nucleic Acids Res. 2010 Jul 1;38(12):e131. Epub 2010 Apr 14). The Hansen reweighting scheme to correct for that bias is implemented in Goby for genes.
  17. Ambion Human Brain Reference RNA (MAQC-II sample B). Different brain regions from 23 donors.
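
Note 10 describes splitting a large collection into chunks of at most 10,000 messages and compressing each chunk, so that a reader can stream the file or jump to one chunk without decoding everything before it. The following is a minimal Python sketch of that idea, not Goby's actual on-disk format: the 4-byte length prefixes, the function names, and the use of raw byte strings in place of Protocol Buffers messages are all assumptions made for illustration.

```python
import gzip
import io
import struct

CHUNK_SIZE = 10_000  # maximum number of records per chunk, as stated in the notes


def write_chunks(records, out, chunk_size=CHUNK_SIZE):
    """Write records (bytes) as independently gzip-compressed chunks.

    Hypothetical layout: each chunk is a 4-byte big-endian length followed
    by the gzip-compressed payload. Returns the byte offset of every chunk.
    """
    offsets = []
    for start in range(0, len(records), chunk_size):
        payload = io.BytesIO()
        for rec in records[start:start + chunk_size]:
            payload.write(struct.pack(">I", len(rec)))  # record length prefix
            payload.write(rec)
        compressed = gzip.compress(payload.getvalue())
        offsets.append(out.tell())
        out.write(struct.pack(">I", len(compressed)))  # chunk length prefix
        out.write(compressed)
    return offsets


def read_chunk(inp, offset):
    """Decode the single chunk that starts at a known offset."""
    inp.seek(offset)
    (size,) = struct.unpack(">I", inp.read(4))
    data = gzip.decompress(inp.read(size))
    records, pos = [], 0
    while pos < len(data):
        (n,) = struct.unpack_from(">I", data, pos)
        records.append(data[pos + 4:pos + 4 + n])
        pos += 4 + n
    return records


if __name__ == "__main__":
    reads = [f"read-{i}".encode() for i in range(25_000)]
    buf = io.BytesIO()
    offsets = write_chunks(reads, buf)
    print(len(offsets))                     # number of chunks written
    print(read_chunk(buf, offsets[2])[0])   # first record of the third chunk
```

Because each chunk is compressed on its own, different workers can be handed different offsets and decode their chunks independently, which is the property the notes rely on for parallel processing.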
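
Note 14 mentions iterating over the contents of several compact alignment files and printing each entry. The sketch below illustrates that pattern in plain Python under stated assumptions: `AlignmentEntry`, its field names, and `iterate_alignments` are hypothetical stand-ins and are not Goby's API, and in-memory lists take the place of alignment files.

```python
from typing import Iterable, Iterator, NamedTuple


class AlignmentEntry(NamedTuple):
    """Hypothetical alignment record; the field names are illustrative only."""
    query_index: int
    target: str
    position: int


def iterate_alignments(sources: Iterable[Iterable[AlignmentEntry]]) -> Iterator[AlignmentEntry]:
    """Yield every entry from each source in turn, one entry at a time."""
    for source in sources:
        yield from source


if __name__ == "__main__":
    # Two in-memory lists stand in for two compact alignment files.
    file_a = [AlignmentEntry(0, "chr1", 100), AlignmentEntry(1, "chr1", 250)]
    file_b = [AlignmentEntry(0, "chr2", 40)]
    for entry in iterate_alignments([file_a, file_b]):
        print(entry.query_index, entry.target, entry.position)
```

Using a generator keeps only one entry in memory at a time, which matters when the inputs are far larger than these toy lists.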
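
Note 16 refers to gene expression counts normalized as RPKM (reads per kilobase of transcript per million mapped reads). The sketch below shows the standard RPKM formula only; it is not Goby's implementation, it does not include the Hansen reweighting mentioned in the note, and the function name and the example numbers are made up for illustration.

```python
def rpkm(read_count: int, gene_length_bp: int, total_mapped_reads: int) -> float:
    """Reads per kilobase of transcript per million mapped reads.

    RPKM = read_count * 10^9 / (gene_length_bp * total_mapped_reads)
    """
    if gene_length_bp <= 0 or total_mapped_reads <= 0:
        raise ValueError("gene length and total mapped reads must be positive")
    return read_count * 1e9 / (gene_length_bp * total_mapped_reads)


if __name__ == "__main__":
    # Illustrative numbers: 500 reads on a 2,000 bp gene, 10 million mapped reads.
    print(rpkm(500, 2_000, 10_000_000))
```

Dividing by gene length and by sequencing depth is what makes values comparable across genes of different sizes and across runs with different numbers of mapped reads.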