Introduction to Bioinformatics: A tale of myths and legends [Freevector] June 16, 2011
“Bioinformatics is the field of science in which biology, computer science, and information technology merge to form a single discipline.” National Center for Biotechnology Information (NCBI)
Areas where bioinformatics is applied
  Genomics: genomic feature prediction, sequencing data analysis
  Proteomics: protein 3D structure modeling, drug design
  Systems biology: gene set enrichment, pathway analysis
  Phenotype: image analysis
  Integration
Approach (example in parentheses)
  1. Biological question (genes regulated by protein X)
  2. Generate data (ChIP-Seq data)
  3. Translate into a computer-solvable task (“Align reads and identify clusters in the genome”)
  4. Develop an algorithm (choose data structures)
  5. Implement the algorithm (write source code)
  6. Run the algorithm (align reads)
  7. Condense the result into human-readable form (write a script to summarize results genome-wide)
  8. Answer the biological question (report the protein’s binding sites)
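The "identify clusters in the genome" step of the example can be illustrated with a minimal sketch. It assumes the reads have already been aligned and reduced to start positions on a single chromosome. The function name `find_read_clusters` and the parameters `max_gap` and `min_reads` are hypothetical choices for illustration, not part of the slides, and real ChIP-Seq peak callers use considerably more elaborate statistics.

```python
def find_read_clusters(positions, max_gap=100, min_reads=3):
    """Group read start positions into clusters.

    Two consecutive reads belong to the same cluster when their start
    positions are at most `max_gap` bases apart. Clusters with fewer
    than `min_reads` reads are discarded as background.
    Returns a list of (start, end, read_count) tuples.
    """
    clusters = []
    current = []
    for pos in sorted(positions):
        # A large gap closes the cluster collected so far.
        if current and pos - current[-1] > max_gap:
            if len(current) >= min_reads:
                clusters.append((current[0], current[-1], len(current)))
            current = []
        current.append(pos)
    # Do not forget the last open cluster.
    if len(current) >= min_reads:
        clusters.append((current[0], current[-1], len(current)))
    return clusters


if __name__ == "__main__":
    # Made-up positions for illustration only.
    reads = [100, 120, 150, 5000, 9000, 9010, 9050, 9100]
    for start, end, count in find_read_clusters(reads):
        print(f"cluster {start}-{end}: {count} reads")
```

Even in this toy version, the two thresholds are analysis decisions that change the reported binding sites, which is the kind of choice the "develop an algorithm" step covers.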
The challenges in bioinformatics
  Acceptance by biological collaborators, when all that matters for the publication is the biology
  Retaining quality work: workflows are poorly annotated in papers, programs are poorly written, and there is no reproducibility
  Keeping up to date: new programs are published every week, new formats appear because there is no time to evaluate existing standards, and new databases appear because existing ones are full of noise
Bioinformatics: a mythical creature?
Christos Ouzounis, Head of the Computational Genomics Group at the European Molecular Biology Laboratory - European Bioinformatics Institute (EMBL-EBI), Cambridge, UK
Myth #1: Anybody can do it!
Assumption: Most bioinformatics analysis can be done using web applications and commercial programs with GUIs.
http://www.broadinstitute.org/cancer/software/genepattern/index.html
http://main.g2.bx.psu.edu/
Ouzounis C. Two or three myths about bioinformatics. Bioinformatics. 2000 Mar;16(3):187-9. PubMed PMID: 10869011.
Customized answers require expert input
Anyone can do predefined analysis with web pages and off-the-shelf programs; however:
  Using tools without understanding the methodology is dangerous. E.g. a program might produce output with a certain bias; not knowing this, the researchers could publish the artificial bias as a biological result.
  A standard bioinformatics tool that works well for general tasks does not exist. E.g. the local-alignment algorithm (1981) vs. PCR (1983) in NGS.
  Only novel tools/pipelines can provide customized answers. E.g. you might have to settle for comparing your features with known genes because the program is not able to compare them to novel transcripts.
Smith, Temple F.; Waterman, Michael S. (1981). "Identification of Common Molecular Subsequences". Journal of Molecular Biology 147: 195–197.
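For readers who have not seen the local-alignment algorithm cited above, the following is a minimal sketch of its scoring recurrence. It is a simplification under stated assumptions: the scoring values (`match`, `mismatch`, `gap`) are arbitrary illustrative defaults, a linear gap penalty is used, and only the best score is returned rather than the alignment itself. The function name `smith_waterman_score` is hypothetical.

```python
def smith_waterman_score(a, b, match=2, mismatch=-1, gap=-1):
    """Return the best local alignment score between strings a and b.

    Uses the dynamic-programming recurrence with a floor of zero, so an
    alignment can start and end anywhere in either sequence.
    Scoring values are illustrative assumptions, not recommended settings.
    """
    prev = [0] * (len(b) + 1)  # previous row of the score matrix
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            # Diagonal move: align a[i-1] with b[j-1].
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # Vertical or horizontal move: a gap in one of the sequences.
            curr[j] = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            best = max(best, curr[j])
        prev = curr
    return best


if __name__ == "__main__":
    print(smith_waterman_score("TTACGTTT", "GGACGGG"))
```

The result depends directly on the scoring parameters, which illustrates the slide's point that using a tool without understanding its methodology can bias the outcome.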
Myth #2: Bioinformatics is a service
Assumption: Bioinformatics merely supports the experimental research and can be a disconnected service.
[Diagram: Traditional Biology: Hypothesis, Experimental Design, Experiment, Eyeballing, Evaluation, all within Biology. High Throughput Biology (assumption): Hypothesis, Experimental Design, Experiment, Evaluation within Biology; Data analysis within Bioinformatics.]
Interdisciplinary analysis requires an interdisciplinary team throughout
Standard data analysis can be a service task; however:
  Having a service performed without knowing the methodology is dangerous. Repeating a statistical test for all genes requires an E-value to be calculated.
  Producing data not suitable for the planned analysis is wasteful. Comparing the distribution of mapped reads of runs with different read lengths will result in a difference that is due to the mapping bias of different read lengths.
[Diagram: High Throughput Biology: Hypothesis, Experimental Design, Experiment, Analysis, Evaluation]
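The remark about repeated statistical tests can be made concrete with a small sketch. It assumes the common approximation that the E-value is the expected number of false positives, that is, the per-test p-value multiplied by the number of tests. The function names and the Bonferroni-style threshold are illustrative assumptions, not something prescribed by the slides.

```python
def e_value(p_value, n_tests):
    """Expected number of tests reaching this p-value by chance alone.

    Assumes independent tests and uses the approximation E = p * N.
    """
    return p_value * n_tests


def significant_after_correction(p_values, alpha=0.05):
    """Flag tests whose E-value stays at or below alpha (Bonferroni-style)."""
    n_tests = len(p_values)
    return [e_value(p, n_tests) <= alpha for p in p_values]


if __name__ == "__main__":
    # A p-value of 0.0001 looks convincing for one gene, but across
    # 20,000 genes about two such hits are expected by chance.
    print(e_value(1e-4, 20000))
    print(significant_after_correction([0.01, 0.04, 0.0001]))
```

A result that is significant for a single gene may therefore be unremarkable once the test has been repeated genome-wide.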
Myth #4: Bioinformatics is quick
Assumption: Bioinformatics analysis can be done quickly because computers are involved.
http://www.ads-links.com/images/wp/keyboard-fast.gif
Bioinformatics analysis is a scientific experiment in itself
Bioinformatics is faster than manual work; however:
  Quick tasks, accumulated, take a long time.
    Task: map 15 million reads of 76 bp length against the complete human genome (hg18)
    Manual: a couple of decades
    Brute force: a couple of years
    BLAST (1990): a couple of days
    Modern aligners: BWA, ~4 h
  Bioinformatics is a proper scientific experiment in itself: it requires time for experimental design, development of controls, parameter tuning, evaluation, and summarizing.
Bao S, Jiang R, Kwan W, Wang B, Ma X, Song YQ. Evaluation of next-generation sequencing software in mapping and assembly. J Hum Genet. 2011 Apr 28. PubMed PMID: 21525877.
Myth #5: All dry-lab research does the same
Assumption: Bioinformatics is interchangeable with other dry-lab research areas because they all “analyze data”.
Assumption: All biological research areas are interchangeable because they all “work with samples”.
Three things to remember
  Bioinformatics requires dedication and continuity.
  Bioinformatics data analysis is a full research experiment in itself.
  We get the most out of our research if we work as an interdisciplinary research team throughout.
[Diagram: Hypothesis, Experimental Design, Experiment, Analysis, Evaluation]
Next week
Abstract: An introduction to second-generation sequencing will be given, with a focus on the production informatics. The basic approach of read mapping and feature extraction will be introduced, and the challenges associated with sequencing errors will be discussed.
http://web.qbi.uq.edu.au/labs/gseq/analysis/bioinformatics-seminar-series/
TIP
Put several images into one file:
  convert -adjoin unicorn.png unicorn.png unicorn.png adjoin.pdf
Join several images into one image:
  convert -append unicorn.png unicorn.png unicorn.png append.pdf

More Related Content

What's hot

Computational Biology and Bioinformatics
Computational Biology and BioinformaticsComputational Biology and Bioinformatics
Computational Biology and Bioinformatics
Sharif Shuvo
 
GENOMICS AND BIOINFORMATICS
GENOMICS AND BIOINFORMATICSGENOMICS AND BIOINFORMATICS
GENOMICS AND BIOINFORMATICS
sandeshGM
 
Genome Database Systems
Genome Database Systems Genome Database Systems
Genome Database Systems
Harindu Chathuranga Korala
 
Protein Database
Protein DatabaseProtein Database
Swiss prot database
Swiss prot databaseSwiss prot database
Swiss prot database
sagrika chugh
 
Bioinformatics, its application main
Bioinformatics, its application mainBioinformatics, its application main
Bioinformatics, its application main
KAUSHAL SAHU
 
Protein data bank
Protein data bankProtein data bank
Protein data bank
Alichy Sowmya
 
BIOLOGICAL SEQUENCE DATABASES
BIOLOGICAL SEQUENCE DATABASES BIOLOGICAL SEQUENCE DATABASES
BIOLOGICAL SEQUENCE DATABASES nadeem akhter
 
SWISS-PROT
SWISS-PROTSWISS-PROT
Primary and secondary database
Primary and secondary databasePrimary and secondary database
Primary and secondary database
KAUSHAL SAHU
 
Protein database
Protein databaseProtein database
Protein database
Khalid Hakeem
 
Databases pathways of genomics and proteomics
Databases pathways of genomics and proteomics Databases pathways of genomics and proteomics
Databases pathways of genomics and proteomics
Sachin Kumar
 
EMBL- European Molecular Biology Laboratory
EMBL- European Molecular Biology LaboratoryEMBL- European Molecular Biology Laboratory
Tools of bioinforformatics by kk
Tools of bioinforformatics by kkTools of bioinforformatics by kk
Tools of bioinforformatics by kk
KAUSHAL SAHU
 
Kegg databse
Kegg databseKegg databse
Kegg databse
Rashi Srivastava
 
Biological databases
Biological databasesBiological databases
Biological databases
Sucheta Tripathy
 
Computational biology
Computational biologyComputational biology
Computational biology
Zeina Abdelmoez
 
Bioinformatics
BioinformaticsBioinformatics
Bioinformaticsbiinoida
 
Sequence Submission Tools
Sequence Submission ToolsSequence Submission Tools
Sequence Submission Tools
RishikaMaji
 
Bioinformatics
BioinformaticsBioinformatics
Bioinformatics
Somdutt Sharma
 

What's hot (20)

Computational Biology and Bioinformatics
Computational Biology and BioinformaticsComputational Biology and Bioinformatics
Computational Biology and Bioinformatics
 
GENOMICS AND BIOINFORMATICS
GENOMICS AND BIOINFORMATICSGENOMICS AND BIOINFORMATICS
GENOMICS AND BIOINFORMATICS
 
Genome Database Systems
Genome Database Systems Genome Database Systems
Genome Database Systems
 
Protein Database
Protein DatabaseProtein Database
Protein Database
 
Swiss prot database
Swiss prot databaseSwiss prot database
Swiss prot database
 
Bioinformatics, its application main
Bioinformatics, its application mainBioinformatics, its application main
Bioinformatics, its application main
 
Protein data bank
Protein data bankProtein data bank
Protein data bank
 
BIOLOGICAL SEQUENCE DATABASES
BIOLOGICAL SEQUENCE DATABASES BIOLOGICAL SEQUENCE DATABASES
BIOLOGICAL SEQUENCE DATABASES
 
SWISS-PROT
SWISS-PROTSWISS-PROT
SWISS-PROT
 
Primary and secondary database
Primary and secondary databasePrimary and secondary database
Primary and secondary database
 
Protein database
Protein databaseProtein database
Protein database
 
Databases pathways of genomics and proteomics
Databases pathways of genomics and proteomics Databases pathways of genomics and proteomics
Databases pathways of genomics and proteomics
 
EMBL- European Molecular Biology Laboratory
EMBL- European Molecular Biology LaboratoryEMBL- European Molecular Biology Laboratory
EMBL- European Molecular Biology Laboratory
 
Tools of bioinforformatics by kk
Tools of bioinforformatics by kkTools of bioinforformatics by kk
Tools of bioinforformatics by kk
 
Kegg databse
Kegg databseKegg databse
Kegg databse
 
Biological databases
Biological databasesBiological databases
Biological databases
 
Computational biology
Computational biologyComputational biology
Computational biology
 
Bioinformatics
BioinformaticsBioinformatics
Bioinformatics
 
Sequence Submission Tools
Sequence Submission ToolsSequence Submission Tools
Sequence Submission Tools
 
Bioinformatics
BioinformaticsBioinformatics
Bioinformatics
 

Viewers also liked

Variant (SNPs/Indels) calling in DNA sequences, Part 2
Variant (SNPs/Indels) calling in DNA sequences, Part 2Variant (SNPs/Indels) calling in DNA sequences, Part 2
Variant (SNPs/Indels) calling in DNA sequences, Part 2
Denis C. Bauer
 
Variant (SNPs/Indels) calling in DNA sequences, Part 1
Variant (SNPs/Indels) calling in DNA sequences, Part 1 Variant (SNPs/Indels) calling in DNA sequences, Part 1
Variant (SNPs/Indels) calling in DNA sequences, Part 1
Denis C. Bauer
 
Functionally annotate genomic variants
Functionally annotate genomic variantsFunctionally annotate genomic variants
Functionally annotate genomic variants
Denis C. Bauer
 
Protein function and bioinformatics
Protein function and bioinformaticsProtein function and bioinformatics
Protein function and bioinformatics
Neil Saunders
 
Introduction to Bioinformatics.
 Introduction to Bioinformatics. Introduction to Bioinformatics.
Introduction to Bioinformatics.
Elena Sügis
 
100505 koenig biological_databases
100505 koenig biological_databases100505 koenig biological_databases
100505 koenig biological_databasesMeetika Gupta
 
The Seven Deadly Sins of Bioinformatics
The Seven Deadly Sins of BioinformaticsThe Seven Deadly Sins of Bioinformatics
The Seven Deadly Sins of Bioinformatics
Duncan Hull
 
Introduction to Data Mining / Bioinformatics
Introduction to Data Mining / BioinformaticsIntroduction to Data Mining / Bioinformatics
Introduction to Data Mining / Bioinformatics
Gerald Lushington
 
How to write bioinformatics software people will use and cite - t.seemann - ...
How to write bioinformatics software people will use and cite -  t.seemann - ...How to write bioinformatics software people will use and cite -  t.seemann - ...
How to write bioinformatics software people will use and cite - t.seemann - ...
Torsten Seemann
 
Introduction to second generation sequencing
Introduction to second generation sequencingIntroduction to second generation sequencing
Introduction to second generation sequencing
Denis C. Bauer
 
1.bioinformatics introduction 32.03.2071
1.bioinformatics introduction 32.03.20711.bioinformatics introduction 32.03.2071
1.bioinformatics introduction 32.03.2071RajDip Basnet
 
B.sc biochem i bobi u-1 introduction to bioinformatics
B.sc biochem i bobi u-1 introduction to bioinformaticsB.sc biochem i bobi u-1 introduction to bioinformatics
B.sc biochem i bobi u-1 introduction to bioinformatics
Rai University
 
Bioinformatics issues and challanges presentation at s p college
Bioinformatics  issues and challanges  presentation at s p collegeBioinformatics  issues and challanges  presentation at s p college
Bioinformatics issues and challanges presentation at s p college
SKUASTKashmir
 
Bioinformatics
BioinformaticsBioinformatics
Bioinformatics
Promila Sharan
 
Bioinformatics and Drug Discovery
Bioinformatics and Drug DiscoveryBioinformatics and Drug Discovery
Bioinformatics and Drug Discovery
Dr. Paulsharma Chakravarthy
 
Nucleic Acid Sequence databases
Nucleic Acid Sequence databasesNucleic Acid Sequence databases
Nucleic Acid Sequence databases
Pranavathiyani G
 
Biological Databases
Biological DatabasesBiological Databases
Biological Databases
Shweta Kagliwal
 
Bioinformatics
BioinformaticsBioinformatics
Bioinformatics
Nuno Barreto
 
Major databases in bioinformatics
Major databases in bioinformaticsMajor databases in bioinformatics
Major databases in bioinformatics
Vidya Kalaivani Rajkumar
 

Viewers also liked (20)

Variant (SNPs/Indels) calling in DNA sequences, Part 2
Variant (SNPs/Indels) calling in DNA sequences, Part 2Variant (SNPs/Indels) calling in DNA sequences, Part 2
Variant (SNPs/Indels) calling in DNA sequences, Part 2
 
Variant (SNPs/Indels) calling in DNA sequences, Part 1
Variant (SNPs/Indels) calling in DNA sequences, Part 1 Variant (SNPs/Indels) calling in DNA sequences, Part 1
Variant (SNPs/Indels) calling in DNA sequences, Part 1
 
Functionally annotate genomic variants
Functionally annotate genomic variantsFunctionally annotate genomic variants
Functionally annotate genomic variants
 
Protein function and bioinformatics
Protein function and bioinformaticsProtein function and bioinformatics
Protein function and bioinformatics
 
Introduction to Bioinformatics.
 Introduction to Bioinformatics. Introduction to Bioinformatics.
Introduction to Bioinformatics.
 
100505 koenig biological_databases
100505 koenig biological_databases100505 koenig biological_databases
100505 koenig biological_databases
 
Introduction to Bioinformatics
Introduction to BioinformaticsIntroduction to Bioinformatics
Introduction to Bioinformatics
 
The Seven Deadly Sins of Bioinformatics
The Seven Deadly Sins of BioinformaticsThe Seven Deadly Sins of Bioinformatics
The Seven Deadly Sins of Bioinformatics
 
Introduction to Data Mining / Bioinformatics
Introduction to Data Mining / BioinformaticsIntroduction to Data Mining / Bioinformatics
Introduction to Data Mining / Bioinformatics
 
How to write bioinformatics software people will use and cite - t.seemann - ...
How to write bioinformatics software people will use and cite -  t.seemann - ...How to write bioinformatics software people will use and cite -  t.seemann - ...
How to write bioinformatics software people will use and cite - t.seemann - ...
 
Introduction to second generation sequencing
Introduction to second generation sequencingIntroduction to second generation sequencing
Introduction to second generation sequencing
 
1.bioinformatics introduction 32.03.2071
1.bioinformatics introduction 32.03.20711.bioinformatics introduction 32.03.2071
1.bioinformatics introduction 32.03.2071
 
B.sc biochem i bobi u-1 introduction to bioinformatics
B.sc biochem i bobi u-1 introduction to bioinformaticsB.sc biochem i bobi u-1 introduction to bioinformatics
B.sc biochem i bobi u-1 introduction to bioinformatics
 
Bioinformatics issues and challanges presentation at s p college
Bioinformatics  issues and challanges  presentation at s p collegeBioinformatics  issues and challanges  presentation at s p college
Bioinformatics issues and challanges presentation at s p college
 
Bioinformatics
BioinformaticsBioinformatics
Bioinformatics
 
Bioinformatics and Drug Discovery
Bioinformatics and Drug DiscoveryBioinformatics and Drug Discovery
Bioinformatics and Drug Discovery
 
Nucleic Acid Sequence databases
Nucleic Acid Sequence databasesNucleic Acid Sequence databases
Nucleic Acid Sequence databases
 
Biological Databases
Biological DatabasesBiological Databases
Biological Databases
 
Bioinformatics
BioinformaticsBioinformatics
Bioinformatics
 
Major databases in bioinformatics
Major databases in bioinformaticsMajor databases in bioinformatics
Major databases in bioinformatics
 

Similar to Introduction to Bioinformatics

Bioinformatica 29-09-2011-t1-bioinformatics
Bioinformatica 29-09-2011-t1-bioinformaticsBioinformatica 29-09-2011-t1-bioinformatics
Bioinformatica 29-09-2011-t1-bioinformatics
Prof. Wim Van Criekinge
 
Data analysis & integration challenges in genomics
Data analysis & integration challenges in genomicsData analysis & integration challenges in genomics
Data analysis & integration challenges in genomics
mikaelhuss
 
Uses of Artificial Intelligence in Bioinformatics
Uses of Artificial Intelligence in BioinformaticsUses of Artificial Intelligence in Bioinformatics
Uses of Artificial Intelligence in Bioinformatics
Pragya Pai
 
Bioinformatics
BioinformaticsBioinformatics
Bioinformatics
Amna Jalil
 
Bioinformatics
BioinformaticsBioinformatics
Bioinformatics
Vidya Kalaivani Rajkumar
 
Explorations in bioinformatics
Explorations in bioinformaticsExplorations in bioinformatics
Explorations in bioinformatics
Douglas Joubert
 
Bioinformatics
BioinformaticsBioinformatics
Bioinformatics
Bivek Rai
 
2011-10-11 Open PHACTS at BioIT World Europe
2011-10-11 Open PHACTS at BioIT World Europe2011-10-11 Open PHACTS at BioIT World Europe
2011-10-11 Open PHACTS at BioIT World Europe
open_phacts
 
Bioinformatics—an introduction for computer scientists
Bioinformatics—an introduction for computer scientistsBioinformatics—an introduction for computer scientists
Bioinformatics—an introduction for computer scientistsunyil96
 
Bioinformatics workflows and study design
Bioinformatics workflows and study designBioinformatics workflows and study design
Bioinformatics workflows and study design
ElanaFertig
 
BIOINFO unit 1.pptx
BIOINFO unit 1.pptxBIOINFO unit 1.pptx
BIOINFO unit 1.pptx
rnath286
 
ISMB/ECCB 2013 Keynote Goble Results may vary: what is reproducible? why do o...
ISMB/ECCB 2013 Keynote Goble Results may vary: what is reproducible? why do o...ISMB/ECCB 2013 Keynote Goble Results may vary: what is reproducible? why do o...
ISMB/ECCB 2013 Keynote Goble Results may vary: what is reproducible? why do o...
Carole Goble
 
An analysis of recent advancements in computational biology and Bioinformatic...
An analysis of recent advancements in computational biology and Bioinformatic...An analysis of recent advancements in computational biology and Bioinformatic...
An analysis of recent advancements in computational biology and Bioinformatic...
Pubrica
 
Computational Genomics - Bioinformatics - IK
Computational Genomics - Bioinformatics - IKComputational Genomics - Bioinformatics - IK
Computational Genomics - Bioinformatics - IK
Ilgın Kavaklıoğulları
 
Chemistry made mobile – the expanding world of chemistry in the hand
Chemistry made mobile – the expanding world of chemistry in the handChemistry made mobile – the expanding world of chemistry in the hand
Chemistry made mobile – the expanding world of chemistry in the hand
US Environmental Protection Agency (EPA), Center for Computational Toxicology and Exposure
 
History and devolopment of bioinfomatics.ppt (1)
History and devolopment of bioinfomatics.ppt (1)History and devolopment of bioinfomatics.ppt (1)
History and devolopment of bioinfomatics.ppt (1)
Madan Kumar Ca
 
Introduction to Bioinformatics-1.pdf
Introduction to Bioinformatics-1.pdfIntroduction to Bioinformatics-1.pdf
Introduction to Bioinformatics-1.pdf
kigaruantony
 
Computational of Bioinformatics
Computational of BioinformaticsComputational of Bioinformatics
Computational of Bioinformatics
ijtsrd
 
Biocomputing
BiocomputingBiocomputing
Biocomputing
ijtsrd
 
An analysis of recent advancements in computational biology and Bioinformatic...
An analysis of recent advancements in computational biology and Bioinformatic...An analysis of recent advancements in computational biology and Bioinformatic...
An analysis of recent advancements in computational biology and Bioinformatic...
Pubrica
 

Similar to Introduction to Bioinformatics (20)

Bioinformatica 29-09-2011-t1-bioinformatics
Bioinformatica 29-09-2011-t1-bioinformaticsBioinformatica 29-09-2011-t1-bioinformatics
Bioinformatica 29-09-2011-t1-bioinformatics
 
Data analysis & integration challenges in genomics
Data analysis & integration challenges in genomicsData analysis & integration challenges in genomics
Data analysis & integration challenges in genomics
 
Uses of Artificial Intelligence in Bioinformatics
Uses of Artificial Intelligence in BioinformaticsUses of Artificial Intelligence in Bioinformatics
Uses of Artificial Intelligence in Bioinformatics
 
Bioinformatics
BioinformaticsBioinformatics
Bioinformatics
 
Bioinformatics
BioinformaticsBioinformatics
Bioinformatics
 
Explorations in bioinformatics
Explorations in bioinformaticsExplorations in bioinformatics
Explorations in bioinformatics
 
Bioinformatics
BioinformaticsBioinformatics
Bioinformatics
 
2011-10-11 Open PHACTS at BioIT World Europe
2011-10-11 Open PHACTS at BioIT World Europe2011-10-11 Open PHACTS at BioIT World Europe
2011-10-11 Open PHACTS at BioIT World Europe
 
Bioinformatics—an introduction for computer scientists
Bioinformatics—an introduction for computer scientistsBioinformatics—an introduction for computer scientists
Bioinformatics—an introduction for computer scientists
 
Bioinformatics workflows and study design
Bioinformatics workflows and study designBioinformatics workflows and study design
Bioinformatics workflows and study design
 
BIOINFO unit 1.pptx
BIOINFO unit 1.pptxBIOINFO unit 1.pptx
BIOINFO unit 1.pptx
 
ISMB/ECCB 2013 Keynote Goble Results may vary: what is reproducible? why do o...
ISMB/ECCB 2013 Keynote Goble Results may vary: what is reproducible? why do o...ISMB/ECCB 2013 Keynote Goble Results may vary: what is reproducible? why do o...
ISMB/ECCB 2013 Keynote Goble Results may vary: what is reproducible? why do o...
 
An analysis of recent advancements in computational biology and Bioinformatic...
An analysis of recent advancements in computational biology and Bioinformatic...An analysis of recent advancements in computational biology and Bioinformatic...
An analysis of recent advancements in computational biology and Bioinformatic...
 
Computational Genomics - Bioinformatics - IK
Computational Genomics - Bioinformatics - IKComputational Genomics - Bioinformatics - IK
Computational Genomics - Bioinformatics - IK
 
Chemistry made mobile – the expanding world of chemistry in the hand
Chemistry made mobile – the expanding world of chemistry in the handChemistry made mobile – the expanding world of chemistry in the hand
Chemistry made mobile – the expanding world of chemistry in the hand
 
History and devolopment of bioinfomatics.ppt (1)
History and devolopment of bioinfomatics.ppt (1)History and devolopment of bioinfomatics.ppt (1)
History and devolopment of bioinfomatics.ppt (1)
 
Introduction to Bioinformatics-1.pdf
Introduction to Bioinformatics-1.pdfIntroduction to Bioinformatics-1.pdf
Introduction to Bioinformatics-1.pdf
 
Computational of Bioinformatics
Computational of BioinformaticsComputational of Bioinformatics
Computational of Bioinformatics
 
Biocomputing
BiocomputingBiocomputing
Biocomputing
 
An analysis of recent advancements in computational biology and Bioinformatic...
An analysis of recent advancements in computational biology and Bioinformatic...An analysis of recent advancements in computational biology and Bioinformatic...
An analysis of recent advancements in computational biology and Bioinformatic...
 

More from Denis C. Bauer

Cloud-native machine learning - Transforming bioinformatics research
Cloud-native machine learning - Transforming bioinformatics research Cloud-native machine learning - Transforming bioinformatics research
Cloud-native machine learning - Transforming bioinformatics research
Denis C. Bauer
 
Translating genomics into clinical practice - 2018 AWS summit keynote
Translating genomics into clinical practice - 2018 AWS summit keynoteTranslating genomics into clinical practice - 2018 AWS summit keynote
Translating genomics into clinical practice - 2018 AWS summit keynote
Denis C. Bauer
 
Going Server-less for Web-Services that need to Crunch Large Volumes of Data
Going Server-less for Web-Services that need to Crunch Large Volumes of DataGoing Server-less for Web-Services that need to Crunch Large Volumes of Data
Going Server-less for Web-Services that need to Crunch Large Volumes of Data
Denis C. Bauer
 
How novel compute technology transforms life science research
How novel compute technology transforms life science researchHow novel compute technology transforms life science research
How novel compute technology transforms life science research
Denis C. Bauer
 
How novel compute technology transforms life science research
How novel compute technology transforms life science researchHow novel compute technology transforms life science research
How novel compute technology transforms life science research
Denis C. Bauer
 
VariantSpark: applying Spark-based machine learning methods to genomic inform...
VariantSpark: applying Spark-based machine learning methods to genomic inform...VariantSpark: applying Spark-based machine learning methods to genomic inform...
VariantSpark: applying Spark-based machine learning methods to genomic inform...
Denis C. Bauer
 
Population-scale high-throughput sequencing data analysis
Population-scale high-throughput sequencing data analysisPopulation-scale high-throughput sequencing data analysis
Population-scale high-throughput sequencing data analysis
Denis C. Bauer
 
Trip Report Seattle
Trip Report SeattleTrip Report Seattle
Trip Report Seattle
Denis C. Bauer
 
Allelic Imbalance for Pre-capture Whole Exome Sequencing
Allelic Imbalance for Pre-capture Whole Exome SequencingAllelic Imbalance for Pre-capture Whole Exome Sequencing
Allelic Imbalance for Pre-capture Whole Exome Sequencing
Denis C. Bauer
 
Centralizing sequence analysis
Centralizing sequence analysisCentralizing sequence analysis
Centralizing sequence analysis
Denis C. Bauer
 
Qbi Centre for Brain genomics (Informatics side)
Qbi Centre for Brain genomics (Informatics side)Qbi Centre for Brain genomics (Informatics side)
Qbi Centre for Brain genomics (Informatics side)
Denis C. Bauer
 
Differential gene expression
Differential gene expressionDifferential gene expression
Differential gene expression
Denis C. Bauer
 
Transcript detection in RNAseq
Transcript detection in RNAseqTranscript detection in RNAseq
Transcript detection in RNAseq
Denis C. Bauer
 
The missing data issue for HiSeq runs
The missing data issue for HiSeq runsThe missing data issue for HiSeq runs
The missing data issue for HiSeq runs
Denis C. Bauer
 
Deciphering the regulatory code in the genome
Deciphering the regulatory code in the genomeDeciphering the regulatory code in the genome
Deciphering the regulatory code in the genome
Denis C. Bauer
 
ReliF
ReliFReliF
STAR: Recombination site prediction
STAR: Recombination site predictionSTAR: Recombination site prediction
STAR: Recombination site prediction
Denis C. Bauer
 
SUMOylation site prediction
SUMOylation site predictionSUMOylation site prediction
SUMOylation site prediction
Denis C. Bauer
 

More from Denis C. Bauer (18)

Cloud-native machine learning - Transforming bioinformatics research
Cloud-native machine learning - Transforming bioinformatics research Cloud-native machine learning - Transforming bioinformatics research
Cloud-native machine learning - Transforming bioinformatics research
 
Translating genomics into clinical practice - 2018 AWS summit keynote
Translating genomics into clinical practice - 2018 AWS summit keynoteTranslating genomics into clinical practice - 2018 AWS summit keynote
Translating genomics into clinical practice - 2018 AWS summit keynote
 
Going Server-less for Web-Services that need to Crunch Large Volumes of Data
Going Server-less for Web-Services that need to Crunch Large Volumes of DataGoing Server-less for Web-Services that need to Crunch Large Volumes of Data
Going Server-less for Web-Services that need to Crunch Large Volumes of Data
 
How novel compute technology transforms life science research
How novel compute technology transforms life science researchHow novel compute technology transforms life science research
How novel compute technology transforms life science research
 
How novel compute technology transforms life science research
How novel compute technology transforms life science researchHow novel compute technology transforms life science research
How novel compute technology transforms life science research
 
VariantSpark: applying Spark-based machine learning methods to genomic inform...
VariantSpark: applying Spark-based machine learning methods to genomic inform...VariantSpark: applying Spark-based machine learning methods to genomic inform...
VariantSpark: applying Spark-based machine learning methods to genomic inform...
 
Population-scale high-throughput sequencing data analysis
Population-scale high-throughput sequencing data analysisPopulation-scale high-throughput sequencing data analysis
Population-scale high-throughput sequencing data analysis
 
Trip Report Seattle
Trip Report SeattleTrip Report Seattle
Trip Report Seattle
 
Allelic Imbalance for Pre-capture Whole Exome Sequencing
Allelic Imbalance for Pre-capture Whole Exome SequencingAllelic Imbalance for Pre-capture Whole Exome Sequencing
Allelic Imbalance for Pre-capture Whole Exome Sequencing
 
Centralizing sequence analysis
Centralizing sequence analysisCentralizing sequence analysis
Centralizing sequence analysis
 
Qbi Centre for Brain genomics (Informatics side)
Qbi Centre for Brain genomics (Informatics side)Qbi Centre for Brain genomics (Informatics side)
Qbi Centre for Brain genomics (Informatics side)
 
Differential gene expression
Differential gene expressionDifferential gene expression
Differential gene expression
 
Transcript detection in RNAseq
Transcript detection in RNAseqTranscript detection in RNAseq
Transcript detection in RNAseq
 
The missing data issue for HiSeq runs
The missing data issue for HiSeq runsThe missing data issue for HiSeq runs
The missing data issue for HiSeq runs
 
Deciphering the regulatory code in the genome
Deciphering the regulatory code in the genomeDeciphering the regulatory code in the genome
Deciphering the regulatory code in the genome
 
ReliF
ReliFReliF
ReliF
 
STAR: Recombination site prediction
STAR: Recombination site predictionSTAR: Recombination site prediction
STAR: Recombination site prediction
 
SUMOylation site prediction
SUMOylation site predictionSUMOylation site prediction
SUMOylation site prediction
 


Introduction to Bioinformatics

  • 1. Introduction to Bioinformatics: A tale of myths and legends [Freevector] June 16, 2011
  • 2. “Bioinformatics is the field of science in which biology, computer science, and information technology merge to form a single discipline.” National Center for Biotechnology Information (NCBI)
  • 3. Areas where bioinformatics is applied: Genomics (genomic feature prediction, sequencing data analysis); Proteomics (protein 3D structure modeling, drug design); Systems Biology (gene set enrichment, pathway analysis); Phenotype (image analysis); Integration.
  • 4. Approach, with example in parentheses: Biological question (genes regulated by protein X); generate data (ChIP-Seq data); translate into a computer-solvable task (“Align reads and identify clusters in the genome”); develop an algorithm (choose data structures); implement algorithm (write source code); run algorithm (align reads); condense result in human-readable form (write script to summarize results genome-wide); answer biological question (report protein’s binding sites).
  • 5. The challenges in bioinformatics: Acceptance by biological collaborators when all that matters for the publication is the biology. Retaining quality work: workflows poorly annotated in papers; programs poorly written; no reproducibility. Keeping up-to-date: new programs are published every week; new formats because no time to evaluate existing standards; new databases because existing ones full of noise.
  • 6. Bioinformatics: a mythical creature? Christos Ouzounis, Head of the Computational Genomics Group at the European Molecular Biology Laboratory - European Bioinformatics Institute (EMBL-EBI), Cambridge UK
  • 7. Myth #1: Anybody can do it! Assumption: Most bioinformatics analysis can be done by using web applications and commercial programs with GUIs. http://www.broadinstitute.org/cancer/software/genepattern/index.html http://main.g2.bx.psu.edu/ Ouzounis C. Two or three myths about bioinformatics. Bioinformatics. 2000 Mar;16(3):187-9. PubMed PMID: 10869011.
  • 8.
  • 9. Myth #2: Bioinformatics is a service. Assumption: Bioinformatics merely supports the experimental research and can be a disconnected service. [Diagram: Traditional Biology cycle with Hypothesis, Experimental Design, Experiment, Eyeballing, Evaluation (all Biology); High Throughput Biology (assumption) cycle with Hypothesis, Experimental Design, Experiment, Evaluation (Biology) and Data analysis (Bioinformatics)]
  • 10.
  • 11. Myth #4: Bioinformatics is quick. Assumption: bioinformatics analysis can be done quickly because computers are involved. http://www.ads-links.com/images/wp/keyboard-fast.gif
  • 12. Bioinformatics analysis is a scientific experiment in itself. Bioinformatics is faster than manual work; however, quick tasks accumulated take a long time. Task: map 15 million reads of 76 bp length against the complete human genome (hg18). Manual: couple of decades. Brute-force: couple of years. BLAST (1995): couple of days. Modern aligners: BWA ~ 4 h. Bioinformatics is a proper scientific experiment in itself: it requires time for experimental design, development of controls, parameter tuning, evaluation, and summarizing. Bao S, Jiang R, Kwan W, Wang B, Ma X, Song YQ. Evaluation of next-generation sequencing software in mapping and assembly. J Hum Genet. 2011 Apr 28. PubMed PMID: 21525877.
  • 13. Myth #5: All dry-lab research does the same. Assumption: bioinformatics is interchangeable with other dry-lab research areas because they all “analyze data”. Assumption: All biological research areas are interchangeable because they all “work with samples”.
  • 14. Three things to remember: Bioinformatics requires dedication and continuity. Bioinformatics data analysis is a full research experiment in itself. We get the most out of our research if we work as an interdisciplinary research team throughout. [Diagram: cycle of Hypothesis, Experimental Design, Experiment, Analysis, Evaluation]
  • 15. Next week: Abstract: An introduction to second-generation sequencing will be given, with a focus on the production informatics. The basic approach of read mapping and feature extraction will be introduced, and challenges associated with sequencing errors will be discussed. http://web.qbi.uq.edu.au/labs/gseq/analysis/bioinformatics-seminar-series/
  • 16. TIP: Puts several images in one file: convert -adjoin unicorn.png unicorn.png unicorn.png adjoin.pdf Joins several images into one image: convert -append unicorn.png unicorn.png unicorn.png append.pdf
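
The mapping task on slide 12 can be put in rough perspective with a back-of-envelope sketch. This is not taken from the slides: the genome size, the comparison rate, and the function names below are illustrative assumptions, and the result is not meant to reproduce the slide's figures. It only shows why trying every read at every genome position scales so badly.

```python
# Sketch only: names and the throughput figure are illustrative assumptions.

READS = 15_000_000             # number of reads (from slide 12)
READ_LENGTH = 76               # bp per read (from slide 12)
GENOME_LENGTH = 3_100_000_000  # rough human genome size in bp (assumption)
COMPARISONS_PER_SECOND = 1e9   # hypothetical rate of one CPU core (assumption)
SECONDS_PER_YEAR = 365 * 24 * 3600


def naive_find(read, reference):
    """Return every offset where `read` matches `reference` exactly."""
    hits = []
    for start in range(len(reference) - len(read) + 1):
        if reference[start:start + len(read)] == read:
            hits.append(start)
    return hits


def brute_force_comparisons(reads, read_length, genome_length):
    """Worst-case character comparisons: every read at every offset."""
    offsets = genome_length - read_length + 1
    return reads * offsets * read_length


if __name__ == "__main__":
    print(naive_find("ACG", "TACGACG"))  # [1, 4]
    total = brute_force_comparisons(READS, READ_LENGTH, GENOME_LENGTH)
    years = total / COMPARISONS_PER_SECOND / SECONDS_PER_YEAR
    print(f"worst case: {total:.2e} comparisons, about {years:.0f} years")
```

The estimate is a worst case: a real brute-force scan usually stops at the first mismatch, so far fewer characters are compared per offset, and the sketch ignores mismatches, the reverse strand, and parallel hardware. Indexed aligners such as BWA avoid the scan over every offset altogether, which is where the gap between years and hours comes from.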

Editor's Notes

  1. http://www.freevector.com/site_media/preview_images/FreeVector-Mythical-Creatures.jpg
  2. “An experiment is reproducible until another laboratory tries to repeat it.” (Alexander Kohn)
  3. Discuss some of the points he raised
  4. your research have more impact -> teaming up with a bioinf