FBW, 29-09-2011, Wim Van Criekinge
What is Bioinformatics?
Application of information technology to the storage, management and analysis of biological information (facilitated by the use of computers).
- Sequence analysis?
- Molecular modeling (HTX)?
- Phylogeny/evolution?
- Ecology and population studies?
- Medical informatics?
- Image analysis?
- Statistics? AI?
- Heavy-current or light-current engineering ("sterkstroom of zwakstroom")?
Promises of genomics and bioinformatics
Medicine (Pharma)
- Genome analysis allows the targeting of genetic diseases
- The effect of a disease or of a therapeutic on RNA and protein levels can be elucidated
- Knowledge of protein structure facilitates drug design
- Understanding of genomic variation allows the tailoring of medical treatment to the individual’s genetic make-up
The same techniques can be applied to crop (Agro) and livestock improvement (Animal Health)
Bioinformatics: What’s in a name?
Early 1990s: “Bio-informatics”
[Figure: computing power and GenBank size (log scale) plotted against time (years)]
Bioinformatics: What’s in a name?
Early 1990s: “Bio-informatics”
- Convergence of the explosive growth in biotechnology, paralleled by the explosive growth in information technology
- Not new: people have been using “computers” in biology for more than 30 years
- In silico biology, database biology, ...
Happy Birthday …
PCR + dye termination
Suddenly, a flash of insight caused him to pull the car off the road and stop. He awakened his friend dozing in the passenger seat and excitedly explained to her that he had hit upon a solution - not to his original problem, but to one of even greater significance. Kary Mullis had just conceived of a simple method for producing virtually unlimited copies of a specific DNA sequence in a test tube - the polymerase chain reaction (PCR).
Bioinformatics, a scientific discipline …
Fields in the diagram: Math, Theoretical Biology, Computer Science, (Molecular) Biology, Informatics, Computational Biology, Bioinformatics
Activities in the diagram: Algorithm Development; AI, Image Analysis; structure prediction (HTX); NP; Datamining; Interface Design; Expert Annotation; Sequence Analysis
Discovery Informatics – Computational Genomics
Course objective
More than an introduction to ...: the aim of the course is to provide insight into what lies behind the various techniques. Besides the use of recipes, which can be found in parts of the syllabus, an understanding of how databases and the underlying algorithms work makes it possible to apply changing interfaces to new problems.
Course schedule: Bioinformatics
- Thu 29-09-2011: 1* Bioinformatics (practical 8.30-11.00)
- Thu 06-10-2011: 2* Biological Databases (practical 9.00-11.30)
- Thu 20-10-2011: 3 Sequence Similarity (Scoring Matrices)
- Thu 27-10-2011: 4 Sequence Alignments
- Thu 10-11-2011: 5 Database Searching Fasta/Blast
- Thu 17-11-2011: 6 Phylogenetics
- Thu 24-11-2011: 7 Protein Structure
- Thu 01-12-2011: 8 Gene Prediction, Gene Ontologies & HMM
- Thu 08-12-2011: 9 ncRNA, Chip Data Analysis, AI
- Thu 15-12-2011: 10 Bio- & Cheminformatics in Drug Discovery (catch-up week)
Note: no class on Thu 13-10-2011 and Thu 3-11-2011
Exam
Theory
- Part on a publication of your own choice that relates to the course, e.g. from Bioinformatics or Computational Biology
- Three insight questions on the course (including  !!)
Practical (“open-book”)
- About four exercises, most of which require writing a program
Grading split: 50/50
Timeline: Margaret Dayhoff …
Nexus > FAQ > Bioinformatics Milestones
http://www.sciencemag.org/cgi/content/full/291/5507/1195
Printed version in the course notes
Nature: the Human genome
Setting the stage …
Genome Meters
- Genomes Online Database (GOLD 1.0): http://geta.life.uiuc.edu/~nikos/genomes.html
- http://www.ebi.ac.uk/research/cgg/genomes.html
- NCBI: http://www.ncbi.nlm.nih.gov/PMGifs/Genomes/bact.html
- INFOBIOGEN: http://www.infobiogen.fr/doc/data/complete_genome.html
Genome Size
- E. coli = 4.2 x 10^6
- Yeast = 18 x 10^6
- Arabidopsis = 80 x 10^6
- C. elegans = 100 x 10^6
- Drosophila = 180 x 10^6
- Human/Rat/Mouse = 3000 x 10^6
- Lily = 300 000 x 10^6
With ... : 99.9 %
To primates: 99 %
DOGS: Database Of Genome Sizes
Biological Research
Adapted from John McPherson, OICR
And this is just the beginning …
Next Generation Sequencing is here
Basics of the “old” technology
- Clone the DNA.
- Generate a ladder of labeled (colored) molecules that are different by 1 nucleotide.
- Separate mixture on some matrix.
- Detect fluorochrome by laser.
- Interpret peaks as string of DNA.
- Strings are 500 to 1,000 letters long.
- 1 machine generates 57,000 nucleotides/run.
- Assemble all strings into a genome.
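To put the throughput figure in perspective, here is a minimal sketch of the arithmetic, using the 57,000 nucleotides/run from this slide and the human genome size of 3000 x 10^6 from the Genome Size slide. The function name `runs_needed` and the 8x coverage value are illustrative assumptions, not figures from the course.

```python
def runs_needed(genome_size: int, bases_per_run: int, coverage: int = 1) -> int:
    """Number of machine runs needed to produce coverage * genome_size bases."""
    total_bases = genome_size * coverage
    # Ceiling division: a partially used run still counts as a full run.
    return -(-total_bases // bases_per_run)


HUMAN_GENOME = 3000 * 10**6  # letters, from the Genome Size slide
BASES_PER_RUN = 57_000       # nucleotides per run, from this slide

# One pass over the genome, ignoring overlaps and failed reads.
print(runs_needed(HUMAN_GENOME, BASES_PER_RUN))

# Assembly needs overlapping reads; 8x coverage is an assumed example value.
print(runs_needed(HUMAN_GENOME, BASES_PER_RUN, coverage=8))
```

Even at a single pass this comes to tens of thousands of runs on one machine, which is why the run counts matter more than the read length when comparing technologies.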
Basics of the “new” technology
- Get DNA.
- Attach it to something.
- Extend and amplify signal with some color scheme.
- Detect fluorochrome by microscopy.
- Interpret series of spots as short strings of DNA.
- Strings are 30-300 letters long.
- Multiple images are interpreted as 0.4 to 1.2 GB/run (1,200,000,000 letters/day).
- Map or align strings to one or many genomes.
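The last step, mapping short strings to a genome, can be sketched in its simplest form as exact string matching. This is a toy illustration only: the function name `map_read` and the sequences are made up, and real read mappers use indexed data structures and tolerate mismatches.

```python
from typing import List


def map_read(reference: str, read: str) -> List[int]:
    """Return every 0-based start position where read matches reference exactly."""
    positions = []
    start = reference.find(read)
    while start != -1:
        positions.append(start)
        # Continue one position further so overlapping hits are also found.
        start = reference.find(read, start + 1)
    return positions


# Hypothetical toy reference and reads.
reference = "ACGTTGCAACGTAGGCTA"
for read in ["ACGT", "GGCTA", "TTTT"]:
    print(read, map_read(reference, read))
```

A read that occurs more than once in the reference cannot be placed unambiguously, and a read that occurs nowhere stays unmapped.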
Next Generation Technologies: 454
- Emulsion PCR
- Polymerase
- Natural nucleotides
- 20-100Mb for 5-15k
- 1% error rate
- Homopolymers
One additional insight ...
Read Length is Not As Important For Resequencing
Jay Shendure
Two Short Read Technologies
- Illumina GA
- ABI SOLID
Technology Overview: Solexa/Illumina Sequencing
ABI SOLID
Dressman 2003
ABI SOLID
ABI SOLID
Paired End Reads are Important!
Figure labels: Known Distance; Read 1; Read 2; Repetitive DNA; Unique DNA
- Paired read maps uniquely
- Single read maps to multiple positions
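The point of this slide can be sketched with a toy example: a read that falls in repetitive DNA has several candidate positions, and the known distance to its partner read in unique DNA picks out one of them. The names `find_all` and `map_pair`, the sequences, and the distance are hypothetical, and the sketch assumes both reads lie on the same strand, which simplifies how real paired-end libraries work.

```python
from typing import List, Tuple


def find_all(reference: str, read: str) -> List[int]:
    """Return every 0-based start position where read matches reference exactly."""
    positions = []
    start = reference.find(read)
    while start != -1:
        positions.append(start)
        start = reference.find(read, start + 1)
    return positions


def map_pair(reference: str, read1: str, read2: str,
             distance: int, tolerance: int = 0) -> List[Tuple[int, int]]:
    """Return (pos1, pos2) placements where read2 starts `distance` letters
    after read1 starts, within `tolerance`."""
    hits1 = find_all(reference, read1)
    hits2 = find_all(reference, read2)
    return [(p1, p2) for p1 in hits1 for p2 in hits2
            if abs((p2 - p1) - distance) <= tolerance]


# Toy reference: the repeat GATTACA occurs twice, CCGGT occurs once.
reference = "GATTACA" + "TTTGC" + "GATTACA" + "AAC" + "CCGGT"

print(find_all(reference, "GATTACA"))                       # read 1 alone is ambiguous
print(map_pair(reference, "GATTACA", "CCGGT", distance=10)) # the pair has a single placement
```

With a larger `tolerance` the filter accepts a range of insert sizes, which is closer to practice, where the distance between the two reads is known only approximately.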
Single Molecule Sequencing
Adapted from: Barak Cohen, Washington University, Bio5488 http://tinyurl.com/6zttuq http://tinyurl.com/6k26nh
Figure labels: microscope slide; single DNA molecule; super-cooled TIRF microscope; primer; dNTP-Cy3
Helicos Biosciences Corp.
Introducing NXT GNT DXS: Next Generation Diagnostics
18th September 2009, Wim Van Criekinge
Develop, in the shortest time frame, the best assay for the most relevant clinical application
NXT GNT DXS
GNT
Dedicated Team & Network
Operational: Location
Professionalized
DXS
Content engine
Product 1 established
Pipeline for n+1
NXT
Workflow management
