Dr. Stickle

an EST-Database
The Goal


• Creating a pipeline for EST-Analysis
• Displaying the results via an online framework
wtf is a pipeline?


Different steps of analysis performed in an automated fashion
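
As a minimal sketch of that idea, assume every analysis step can be wrapped in a plain Python function. The step names `clean_reads` and `deduplicate` and the helper `run_pipeline` are hypothetical stand-ins, not the interfaces of the real tools. A pipeline is then just an ordered list of steps, each one consuming the previous step's output:

```python
from typing import Callable, List


def clean_reads(reads: List[str]) -> List[str]:
    """Hypothetical step: drop empty reads and normalise case."""
    return [r.strip().upper() for r in reads if r.strip()]


def deduplicate(reads: List[str]) -> List[str]:
    """Hypothetical step: remove identical reads, keeping first-seen order."""
    return list(dict.fromkeys(reads))


def run_pipeline(data, steps: List[Callable]):
    """Run every step in order, handing each result to the next step."""
    for step in steps:
        data = step(data)
    return data


result = run_pipeline(["acgt", "", "ACGT ", "ttga"], [clean_reads, deduplicate])
print(result)  # prints ['ACGT', 'TTGA']
```

Adding another analysis then means appending one more function to the list; no manual hand-over between tools is needed.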
Analysis:
MIRA
✓Assembly of EST-Reads into contigs
✓SNP-Detection


               But:
       ★Takes ages
       ★not well documented
       ★buggy
[Pipeline diagram, top to bottom: MIRA; Contigs; SNPs, ORFs, BLAST; PFAM, BLAST2GO]
BLAST
Basic Local Alignment Search Tool

• Standard for searching sequences against a database
• emphasizes speed over sensitivity
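
The slides do not say which BLAST output format was used. The sketch below assumes the standard 12-column tabular output (what BLAST+ writes with `-outfmt 6`) and shows how such hits could be turned into typed records. The function name `parse_blast_tabular` and the example hit line are made up for illustration.

```python
from typing import Dict, Iterable, List

# Column names of BLAST's standard 12-column tabular output.
COLUMNS = ["qseqid", "sseqid", "pident", "length", "mismatch", "gapopen",
           "qstart", "qend", "sstart", "send", "evalue", "bitscore"]
FLOAT_COLUMNS = ("pident", "evalue", "bitscore")
INT_COLUMNS = ("length", "mismatch", "gapopen", "qstart", "qend", "sstart", "send")


def parse_blast_tabular(lines: Iterable[str]) -> List[Dict[str, object]]:
    """Turn tab-separated BLAST hit lines into dictionaries with typed values."""
    hits = []
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):  # skip blank and comment lines
            continue
        fields = line.split("\t")
        if len(fields) != len(COLUMNS):
            raise ValueError(f"expected {len(COLUMNS)} columns, got {len(fields)}")
        hit: Dict[str, object] = dict(zip(COLUMNS, fields))
        for key in FLOAT_COLUMNS:
            hit[key] = float(hit[key])
        for key in INT_COLUMNS:
            hit[key] = int(hit[key])
        hits.append(hit)
    return hits


# Invented example line: one contig hitting one database sequence.
example = ["contig_1\tsubject_A\t98.50\t200\t3\t0\t1\t200\t5\t204\t1e-100\t365"]
hits = parse_blast_tabular(example)
print(hits[0]["sseqid"], hits[0]["evalue"])
```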
Tools

Blast2GO    gene ontology mapping
ORF         open reading frame prediction
PFAM        domain annotation
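
The slides do not name the program used for ORF prediction. As a rough illustration of what the step does, the sketch below scans the three forward reading frames for stretches that begin with ATG and end at the first in-frame stop codon. It is a naive approach under stated assumptions, not the method used in the project: it ignores the reverse strand, and the function name `find_orfs` and the `min_codons` threshold are hypothetical.

```python
from typing import List, Tuple

STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(seq: str, min_codons: int = 2) -> List[Tuple[int, int, str]]:
    """Return (start, end, sequence) for ATG...stop stretches on the forward strand.

    Coordinates are 0-based; `end` is exclusive and includes the stop codon.
    """
    seq = seq.upper()
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i  # open a candidate ORF at the first ATG
            elif start is not None and codon in STOP_CODONS:
                if (i + 3 - start) // 3 >= min_codons:
                    orfs.append((start, i + 3, seq[start:i + 3]))
                start = None  # look for the next ATG in this frame
    return orfs


print(find_orfs("ccATGaaatttTAGgg"))  # prints [(2, 14, 'ATGAAATTTTAG')]
```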
Ugly *.ace-output generated via MIRA
What we've got here:

•Different tools
•many different output-files

What we want:

a structured database containing all the information
How to parse

Data

Class «Parser»
 •Function BLAST-Parser
 •Function PFAM-Parser
 •Function FASTA-Parser
 •...

Script
 •read input
 •use parser
 •insert db

Database
>abc_123
agtagtacgtacgtggacgtatgact
>def_456
agtagtacgtacgtggacgtatgact
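
The read input, use parser, insert db flow can be sketched on the FASTA records above. The slides do not show the actual code or name the database engine, so everything here is an assumption: the `Parser.fasta` method, the in-memory SQLite database, and the `contigs` table are hypothetical choices made for illustration.

```python
import sqlite3
from typing import Iterable, Iterator, Tuple


class Parser:
    """Collection of parsers, one function per file format."""

    @staticmethod
    def fasta(lines: Iterable[str]) -> Iterator[Tuple[str, str]]:
        """Yield (name, sequence) pairs; sequences may span several lines."""
        name, chunks = None, []
        for line in lines:
            line = line.strip()
            if not line:
                continue
            if line.startswith(">"):
                if name is not None:
                    yield name, "".join(chunks)
                name, chunks = line[1:], []
            elif name is not None:
                chunks.append(line)
        if name is not None:  # emit the last record
            yield name, "".join(chunks)


# Script part: read input, use parser, insert into the database.
fasta_text = """>abc_123
agtagtacgtacgtggacgtatgact
>def_456
agtagtacgtacgtggacgtatgact
"""

conn = sqlite3.connect(":memory:")  # assumption: SQLite stands in for the real database
conn.execute("CREATE TABLE contigs (name TEXT PRIMARY KEY, sequence TEXT)")
conn.executemany("INSERT INTO contigs VALUES (?, ?)",
                 Parser.fasta(fasta_text.splitlines()))
conn.commit()

rows = conn.execute(
    "SELECT name, length(sequence) FROM contigs ORDER BY name").fetchall()
print(rows)  # prints [('abc_123', 26), ('def_456', 26)]
```

Keeping the parsers in one class and the database code in the script means a new output format only needs one more parser function, while the insert logic stays unchanged.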
Summary & Results

•created the pipeline
•analysed data
•started filling the database

        To be done


•wait for MIRA
•SNP-parser
thx to:


•Marvin, for «time till scooter» and sending us to Lothar
•Lothar, for always providing friendly and calm advice
•Suse, for actually having used MIRA at least once
•Andrew, for Andreas
•Andreas, for Andrew
•Bastien Chevreux, for not fixing those damn bugs in MIRA
k,thx,bai
Dr. Iglo
