Bioinformatics t1-introduction wim-vancriekinge_v2013
An introduction to bioinformatics as of 2013


Published in: Education, Technology
Transcript

  • 1. FBW 24-09-2013 Wim Van Criekinge
  • 2. What is Bioinformatics? • Application of information technology to the storage, management and analysis of biological information (facilitated by the use of computers) – Sequence analysis? – Molecular modeling (HTX)? – Phylogeny/evolution? – Ecology and population studies? – Medical informatics? – Image analysis? – Statistics? AI? – Heavy-current or light-current engineering?
  • 3. Promises of genomics and bioinformatics • Medicine (Pharma) – Genome analysis allows the targeting of genetic diseases – The effect of a disease or of a therapeutic on RNA and protein levels can be elucidated – Knowledge of protein structure facilitates drug design – Understanding of genomic variation allows the tailoring of medical treatment to the individual’s genetic make-up • The same techniques can be applied to crop (Agro) and livestock improvement (Animal Health)
  • 4. Bioinformatics: What’s in a name? • Early 1990s • “Bio-informatics”: [chart: computing power and GenBank size (log scale) versus time (years)]
  • 5. Bioinformatics: What’s in a name? • Early 1990s • “Bio-informatics”: – convergence of explosive growth in biotechnology, paralleled by the explosive growth in information technology • Not new: people have used “computers” in biology for more than 30 years • In silico biology, database biology, ...
  • 6. Time (years)
  • 7. Happy Birthday …
  • 8. PCR + dye termination: Suddenly, a flash of insight caused him to pull the car off the road and stop. He awakened his friend dozing in the passenger seat and excitedly explained to her that he had hit upon a solution - not to his original problem, but to one of even greater significance. Kary Mullis had just conceived of a simple method for producing virtually unlimited copies of a specific DNA sequence in a test tube - the polymerase chain reaction (PCR).
  • 9. Bioinformatics, a scientific discipline … [diagram labels: Math, Informatics, Theoretical Biology, Computational Biology, (Molecular) Biology, Computer Science, Bioinformatics]
  • 10. Bioinformatics, a scientific discipline … [diagram labels: Math, Algorithm Development, Informatics, Interface Design, AI, Image Analysis, structure prediction (HTX), Theoretical Biology, Sequence Analysis, Computational Biology, (Molecular) Biology, Expert Annotation, Computer Science, NP Datamining, Bioinformatics]
  • 11. Bioinformatics, a scientific discipline … [diagram labels: Math, Algorithm Development, Informatics, Interface Design, AI, Image Analysis, structure prediction (HTX), Theoretical Biology, Sequence Analysis, Computational Biology, (Molecular) Biology, Expert Annotation, Computer Science, NP Datamining, Bioinformatics, Discovery Informatics – Computational Genomics]
  • 12. Goal of the course • More than an introduction to ...: the course aims to give an underlying insight into the various techniques. • Beyond the use of recipes, which can be found in parts of the syllabus, an insight into – how databases work – and the underlying algorithms • makes it possible – to apply changing interfaces to new problems.
  • 13. Course contents: Bioinformatics lessons
  • 14. Exam • Theory – One part on a publication of your own choice that relates to the course • e.g. from Bioinformatics or Computational Biology – Three insight questions about the course (including  !!) • Practical (“open-book”) – About four exercises, most of which require writing a program • Marks are split 50/50
  • 15. Course • 25 Euro – Syllabus – V|Podcasts – Weblems – Screencasts – Flash Drive
  • 16. 20 biobix wvcrieki biobix.be bioinformatics.be
  • 17. • Timeline: Margaret Dayhoff …
  • 18. nature the Human genome Setting the stage …
  • 19. Genome Size (DOGS: Database Of Genome Sizes), in base pairs • E. coli = 4.2 x 10^6 • Yeast = 18 x 10^6 • Arabidopsis = 80 x 10^6 • C. elegans = 100 x 10^6 • Drosophila = 180 x 10^6 • Human/Rat/Mouse = 3000 x 10^6 • Lily = 300 000 x 10^6 • With ... : 99.9 % • To primates: 99%
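To put the figures on this slide in proportion, the following minimal Python sketch expresses each genome as a multiple of a reference genome. The values are copied from the slide as given and have not been re-checked against current databases; the names `GENOME_SIZES_BP` and `fold_vs` are illustrative assumptions, not part of the course material.

```python
# Genome sizes in base pairs, copied from the slide as given (not re-verified).
GENOME_SIZES_BP = {
    "E. coli": 4.2e6,
    "Yeast": 18e6,
    "Arabidopsis": 80e6,
    "C. elegans": 100e6,
    "Drosophila": 180e6,
    "Human/Rat/Mouse": 3000e6,
    "Lily": 300000e6,
}


def fold_vs(reference, sizes):
    """Return each genome size as a multiple of the reference genome's size."""
    ref = sizes[reference]
    return {name: size / ref for name, size in sizes.items()}


if __name__ == "__main__":
    folds = fold_vs("E. coli", GENOME_SIZES_BP)
    for name, fold in folds.items():
        size = GENOME_SIZES_BP[name]
        print(f"{name:16s} {size:>18,.0f} bp  {fold:>10,.1f} x E. coli")
```

On these numbers the lily genome is about a hundred times the size of the human one, so genome size alone is clearly not a measure of organism complexity.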
  • 20. Biological Research Adapted from John McPherson, OICR
  • 21. And this is just the beginning …. Next Generation Sequencing is here
  • 22. Basics of the “old” technology • Clone the DNA. • Generate a ladder of labeled (colored) molecules that are different by 1 nucleotide. • Separate mixture on some matrix. • Detect fluorochrome by laser. • Interpret peaks as string of DNA. • Strings are 500 to 1,000 letters long • 1 machine generates 57,000 nucleotides/run • Assemble all strings into a genome.
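The throughput figures on this slide can be turned into a rough back-of-the-envelope calculation. The sketch below only does arithmetic on the numbers quoted in the slides (57,000 nucleotides per run, reads of 500 to 1,000 letters, a human genome of 3000 x 10^6 letters); the function names and the `coverage` parameter are assumptions introduced for illustration, and real projects sequence each base several times over.

```python
import math

# Figures quoted on the slides.
READ_LENGTHS = (500, 1_000)        # letters per read
NT_PER_RUN = 57_000                # nucleotides per machine run
HUMAN_GENOME_BP = 3_000 * 10**6    # from the genome-size slide


def runs_needed(genome_bp, nt_per_run, coverage=1):
    """Machine runs needed to produce `coverage` times the genome in raw bases."""
    return math.ceil(genome_bp * coverage / nt_per_run)


def reads_per_run(nt_per_run, read_length):
    """Whole reads of the given length that fit into one run's output."""
    return nt_per_run // read_length


if __name__ == "__main__":
    print("runs for 1x of a human-sized genome:",
          runs_needed(HUMAN_GENOME_BP, NT_PER_RUN))
    for length in READ_LENGTHS:
        print(f"reads of {length} letters per run:",
              reads_per_run(NT_PER_RUN, length))
```

With these figures a single pass over a human-sized genome already takes more than 50,000 runs, before any redundancy for assembly is added.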
  • 23. Basics of the “new” technology • Get DNA. • Attach it to something. • Extend and amplify signal with some color scheme. • Detect fluorochrome by microscopy. • Interpret series of spots as short strings of DNA. • Strings are 30-300 letters long. • Multiple images are interpreted as 0.4 to 1.2 GB/run (1,200,000,000 letters/day). • Map or align strings to one or many genomes.
  • 24. Next Generation Technologies • 454 – Emulsion PCR – Polymerase – Natural Nucleotides • 20-100Mb for 5-15k – 1% error rate – Homopolymers
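The last step on this slide, mapping short strings to a genome, can be sketched as an exact-match lookup through a k-mer index. This is a minimal illustration only: real aligners tolerate mismatches and use compressed indexes. The function names, the toy genome and the seed length `k` are all assumptions made for the example.

```python
from collections import defaultdict


def build_index(genome, k):
    """Map every k-letter substring of the genome to its 0-based start positions."""
    index = defaultdict(list)
    for i in range(len(genome) - k + 1):
        index[genome[i:i + k]].append(i)
    return index


def map_read(read, genome, index, k):
    """Return all 0-based positions where the read matches the genome exactly.

    The first k letters of the read are looked up in the index (the "seed"),
    then each candidate position is verified against the full read.
    """
    if len(read) < k:
        return []
    candidates = index.get(read[:k], [])
    return [pos for pos in candidates if genome.startswith(read, pos)]


if __name__ == "__main__":
    toy_genome = "ACGTACGTTAGC"   # hypothetical toy sequence
    k = 4
    idx = build_index(toy_genome, k)
    for read in ("ACGTT", "ACGT", "GGGG"):
        print(read, "->", map_read(read, toy_genome, idx, k))
```

A read that returns more than one position is ambiguous, which is the situation the later slides on read length and paired ends address.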
  • 25. One additional insight ...
  • 26. Read Length is Not As Important For Resequencing [plot: % of paired k-mers with uniquely assignable location (0% to 100%) versus length of k-mer reads (8 to 20 bp), for E. coli and human; source: Jay Shendure]
  • 27. Two Short Read Technologies • Illumina GA • ABI SOLID
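The idea behind this plot can be approximated with a small Python sketch that counts, for a given k, what fraction of k-mer positions in a sequence carry a k-mer occurring exactly once. It is a simplification under stated assumptions: it uses single (not paired) k-mers, looks at the forward strand only, and runs on a random sequence rather than a real E. coli or human genome, so the numbers it prints are not those of the original figure. The function name is hypothetical.

```python
import random
from collections import Counter


def unique_kmer_fraction(genome, k):
    """Fraction of k-mer start positions whose k-mer occurs exactly once."""
    n_positions = len(genome) - k + 1
    if n_positions <= 0:
        return 0.0
    counts = Counter(genome[i:i + k] for i in range(n_positions))
    # A k-mer seen once contributes exactly one uniquely placeable position.
    unique_positions = sum(1 for c in counts.values() if c == 1)
    return unique_positions / n_positions


if __name__ == "__main__":
    random.seed(0)
    # Random stand-in sequence; a real genome has repeats and behaves differently.
    sequence = "".join(random.choice("ACGT") for _ in range(20_000))
    for k in range(8, 21, 2):
        print(f"k={k:2d}  unique fraction={unique_kmer_fraction(sequence, k):.3f}")
```

The general trend, that the unique fraction rises quickly with k and then levels off, is what makes short reads usable for resequencing against a known reference.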
  • 28. Technology Overview: Solexa/Illumina Sequencing
  • 29. ABI SOLID (Dressman et al. 2003)
  • 30. ABI SOLID
  • 31. ABI SOLID
  • 32. Paired End Reads are Important! [diagram: repetitive DNA next to unique DNA; a single read maps to multiple positions; a paired read (Read 1 and Read 2, separated by a known distance) maps uniquely]
  • 33. Single Molecule Sequencing (Helicos Biosciences Corp.) [diagram labels: microscope slide, single DNA molecule, dNTP-Cy3, primer, super-cooled TIRF microscope] Adapted from: Barak Cohen, Washington University, Bio5488 http://tinyurl.com/6zttuq http://tinyurl.com/6k26nh
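The point of this slide can be illustrated with a toy Python example: a read that falls in a repeat has several possible positions, but requiring its mate to lie at the known distance leaves only one. The sketch makes simplifying assumptions that are not in the slide: both reads are taken from the forward strand (real mates come from opposite strands), matching is exact, and the distance is measured between read start positions. All names and the toy sequence are hypothetical.

```python
def find_all(genome, read):
    """Return every 0-based position where the read occurs exactly in the genome."""
    positions = []
    start = genome.find(read)
    while start != -1:
        positions.append(start)
        start = genome.find(read, start + 1)
    return positions


def place_pair(genome, read1, read2, distance, tolerance=0):
    """Return (pos1, pos2) placements whose start-to-start spacing fits the known distance."""
    hits2 = find_all(genome, read2)
    return [
        (p1, p2)
        for p1 in find_all(genome, read1)
        for p2 in hits2
        if abs((p2 - p1) - distance) <= tolerance
    ]


if __name__ == "__main__":
    repeat = "ATTAGCA"
    # Toy genome: the same repeat twice, followed by a unique stretch.
    toy_genome = "CC" + repeat + "GGGGG" + repeat + "TCTCTCGA"
    print("read 1 alone:", find_all(toy_genome, repeat))
    print("read 2 alone:", find_all(toy_genome, "CTCGA"))
    print("as a pair   :", place_pair(toy_genome, repeat, "CTCGA", distance=10))
```

Read 1 on its own cannot be placed, yet the pair can, because only one of its candidate positions is consistent with the uniquely mapped mate.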
  • 34. Introducing NXT GNT DXS NextGenerationDiagnostics
  • 35. NXT GNT DXS • GNT – Dedicated Team & Network – Operational: Location – Professionalized • DXS – Content engine – Product 1 established – Pipeline for n+1 • NXT – Workflow management – Bioinformatics – Epigenetics
  • 36. Next next generation sequencing Third generation sequencing Now sequencing
  • 37. Complete genomics
  • 38. Complete genomics
  • 39. Pacific Biosciences: A Third Generation Sequencing Technology Eid et al 2008
  • 40. Pacific Biosciences: A Third Generation Sequencing Technology
  • 41. Nanopore Sequencing
  • 42. NCBI (educational resources)
  • 43. Weblems • What? – Web-based problems (on the current lesson and/or in preparation for the next lesson) • When? – At the end of every lesson • How? – Solutions online via screencasts – Practical session – Preparation for the practical exam ... Not all problems necessarily require program code ...
  • 44. Weblems W1.1: To which phyla do the following species belong? (a) starfish (b) ginkgo tree (c) scorpion W1.2: What are the common names for the following species? (a) Orycteropus afer (b) Beta vulgaris (c) Macrocystis pyrifera W1.3: What species has the smallest known genome? And is genome size related to number of genes? W1.4: What are the 5 latest genomes published? How complete is “coverage”? W1.5: For approximately 10% of Europeans, the painkiller codeine is ineffective because the patients lack the enzyme that converts codeine into the active molecule, morphine. What is the most common mutation that causes this condition?
