FBW
24-09-2013
Wim Van Criekinge
Bioinformatics t1-introduction wim-vancriekinge_v2013
An introduction to Bioinformatica anno 2013

Published in: Education, Technology

  1. FBW 24-09-2013 Wim Van Criekinge
  2. What is Bioinformatics?
     • Application of information technology to the storage, management and analysis of biological information (facilitated by the use of computers)
       – Sequence analysis?
       – Molecular modeling (HTX)?
       – Phylogeny/evolution?
       – Ecology and population studies?
       – Medical informatics?
       – Image analysis?
       – Statistics? AI?
       – High-voltage or low-voltage engineering (“sterkstroom of zwakstroom”)?
  3. Promises of genomics and bioinformatics
     • Medicine (Pharma)
       – Genome analysis allows the targeting of genetic diseases
       – The effect of a disease or of a therapeutic on RNA and protein levels can be elucidated
       – Knowledge of protein structure facilitates drug design
       – Understanding of genomic variation allows the tailoring of medical treatment to the individual’s genetic make-up
     • The same techniques can be applied to crop (Agro) and livestock improvement (Animal Health)
  4. Bioinformatics: What’s in a name?
     • Early 1990s
     • “Bio-informatics”:
     [Figure: computing power and GenBank size (log scale) versus time (years)]
  5. Bioinformatics: What’s in a name?
     • Early 1990s
     • “Bio-informatics”:
       – Convergence of the explosive growth in biotechnology, paralleled by the explosive growth in information technology
     • Not new: people have used “computers” in biology for more than 30 years
     • In silico biology, database biology, ...
  6. Time (years)
  7. Happy Birthday …
  8. PCR + dye termination
     Suddenly, a flash of insight caused him to pull the car off the road and stop. He awakened his friend dozing in the passenger seat and excitedly explained to her that he had hit upon a solution - not to his original problem, but to one of even greater significance. Kary Mullis had just conceived of a simple method for producing virtually unlimited copies of a specific DNA sequence in a test tube - the polymerase chain reaction (PCR).
  9. Bioinformatics, a scientific discipline …
     [Diagram labels: Math; Informatics; Theoretical Biology; Computational Biology; (Molecular) Biology; Computer Science; Bioinformatics]
  10. Bioinformatics, a scientific discipline …
      [Diagram labels: Math; Algorithm Development; Informatics; Interface Design; AI, Image Analysis; structure prediction (HTX); Theoretical Biology; Sequence Analysis; Computational Biology; (Molecular) Biology; Expert Annotation; Computer Science; NP Datamining; Bioinformatics]
  11. Bioinformatics, a scientific discipline …
      [Diagram labels: Math; Algorithm Development; Informatics; Interface Design; AI, Image Analysis; structure prediction (HTX); Theoretical Biology; Sequence Analysis; Computational Biology; (Molecular) Biology; Expert Annotation; Computer Science; NP Datamining; Bioinformatics; Discovery Informatics – Computational Genomics]
  12. Aim of the course
      • More than an introduction to ...: the course is meant to give an underlying insight into the various techniques.
      • Beyond the use of recipes, which can be found in parts of the syllabus, an insight into
        – how databases work
        – and the underlying algorithms
      • makes it possible
        – to apply changing interfaces to new problems.
  13. Lesson contents: Bioinformatics
  14. Exam
      • Theory
        – A part on a publication of your own choice that is related to the course
          • E.g. Bioinformatics or Computational Biology
        – Three insight questions about the course (including !!)
      • Practical (“open-book”)
        – About four exercises, most of which require writing a program
      • Marks are split 50/50
  15. Course
      • 25 Euro
        – Syllabus
        – V|Podcasts
        – Weblems
        – Screencasts
        – Flash Drive
  16. 20 biobix wvcrieki biobix.be bioinformatics.be
  17. • Timeline: Margaret Dayhoff …
  18. Setting the stage … [Image text: nature, the Human genome]
  19. Genome Size
      DOGS: Database Of Genome Sizes
      E. coli = 4.2 × 10^6
      Yeast = 18 × 10^6
      Arabidopsis = 80 × 10^6
      C. elegans = 100 × 10^6
      Drosophila = 180 × 10^6
      Human/Rat/Mouse = 3000 × 10^6
      Lily = 300 000 × 10^6
      With ... : 99.9%
      To primates: 99%
  20. Biological Research (adapted from John McPherson, OICR)
  21. And this is just the beginning … Next Generation Sequencing is here
  22. Basics of the “old” technology
      • Clone the DNA.
      • Generate a ladder of labeled (colored) molecules that differ by 1 nucleotide.
      • Separate the mixture on some matrix.
      • Detect fluorochrome by laser.
      • Interpret peaks as a string of DNA.
      • Strings are 500 to 1,000 letters long.
      • 1 machine generates 57,000 nucleotides/run.
      • Assemble all strings into a genome.
  23. Basics of the “new” technology
      • Get DNA.
      • Attach it to something.
      • Extend and amplify signal with some color scheme.
      • Detect fluorochrome by microscopy.
      • Interpret series of spots as short strings of DNA.
      • Strings are 30-300 letters long.
      • Multiple images are interpreted as 0.4 to 1.2 GB/run (1,200,000,000 letters/day).
      • Map or align strings to one or many genomes.
  24. Next Generation Technologies
      • 454
        – Emulsion PCR
        – Polymerase
        – Natural nucleotides
      • 20-100 Mb for 5-15k
        – 1% error rate
        – Homopolymers
  25. One additional insight ...
  26. Read Length is Not As Important For Resequencing
      [Figure: % of paired k-mers with uniquely assignable location (0% to 100%) versus length of k-mer reads (8 to 20 bp), for E. coli and human. Credit: Jay Shendure]
  27. Two Short Read Technologies
      • Illumina GA
      • ABI SOLID
  28. Technology Overview: Solexa/Illumina Sequencing
  29. ABI Solid (Dressman 2003)
  30. ABI SOLID
  31. ABI SOLID
  32. Paired End Reads are Important!
      [Diagram labels: Repetitive DNA; Unique DNA; Single read maps to multiple positions; Paired read maps uniquely; Read 1; Read 2; Known Distance]
  33. Single Molecule Sequencing: Helicos Biosciences Corp.
      [Diagram labels: Microscope slide; Single DNA molecule; dNTP-Cy3; primer; Super-cooled TIRF microscope]
      Adapted from: Barak Cohen, Washington University, Bio5488 http://tinyurl.com/6zttuq http://tinyurl.com/6k26nh
  34. Introducing NXT GNT DXS: NextGenerationDiagnostics
  35. NXT GNT DXS
      • GNT
        – Dedicated Team & Network
        – Operational: Location
        – Professionalized
      • DXS
        – Content engine
        – Product 1 established
        – Pipeline for n+1
      • NXT
        – Workflow management
        – Bioinformatics
        – Epigenetics
  36. Next next generation sequencing / Third generation sequencing / Now sequencing
  37. Complete genomics
  38. Complete genomics
  39. Pacific Biosciences: A Third Generation Sequencing Technology (Eid et al 2008)
  40. Pacific Biosciences: A Third Generation Sequencing Technology
  41. Nanopore Sequencing
  42. NCBI (educational resources)
  43. Weblems
      • What?
        – Web-based problems (about the current lesson and/or in preparation for the next lesson)
      • When?
        – At the end of every lesson
      • How?
        – Solutions online via screencasts
        – Practical session
        – Preparation for the practical exam ... Not all problems necessarily require program code ...
  44. Weblems
      W1.1: To which phyla do the following species belong: (a) starfish (b) ginkgo tree (c) scorpion?
      W1.2: What are the common names for the following species: (a) Orycteropus afer (b) Beta vulgaris (c) Macrocystis pyrifera?
      W1.3: What species has the smallest known genome? And is genome size related to number of genes?
      W1.4: What are the 5 latest genomes published? How complete is “coverage”?
      W1.5: For approximately 10% of Europeans, the painkiller codeine is ineffective because the patients lack the enzyme that converts codeine into the active molecule, morphine. What is the most common mutation that causes this condition?
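
The claim on slide 26, that read length is not as important for resequencing, rests on how quickly short k-mers become unique within a genome as k grows. Below is a minimal Python sketch of that measurement, under assumptions that are not in the slides: it counts single (unpaired) k-mers on the forward strand only, ignores sequencing errors, and uses a uniformly random synthetic sequence instead of the E. coli or human genome. The function name `unique_kmer_fraction` and the demo parameters are hypothetical.

```python
import random
from collections import Counter


def unique_kmer_fraction(genome: str, k: int) -> float:
    """Fraction of k-mer start positions whose k-mer occurs exactly once."""
    if k < 1 or k > len(genome):
        raise ValueError("k must be between 1 and len(genome)")
    # Collect every overlapping k-mer on the forward strand only (assumption).
    kmers = [genome[i:i + k] for i in range(len(genome) - k + 1)]
    counts = Counter(kmers)
    # A position is "uniquely assignable" if its k-mer appears nowhere else.
    unique_positions = sum(1 for kmer in kmers if counts[kmer] == 1)
    return unique_positions / len(kmers)


if __name__ == "__main__":
    # Hypothetical stand-in genome: 100,000 random bases with a fixed seed.
    rng = random.Random(0)
    genome = "".join(rng.choices("ACGT", k=100_000))
    for k in range(8, 21, 2):  # same k range as the x-axis on slide 26
        fraction = unique_kmer_fraction(genome, k)
        print(f"k={k:2d}  unique fraction={fraction:.3f}")
```

On the toy sequence "ACGACG" the fraction rises from 0.5 at k=3 to 1.0 at k=4. Real genomes contain repeats, so at a given k they would be expected to score lower than a random sequence of the same length.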