5. Sanger sequencing
• Human genome sequence published in 2003.
• Free access.
• $3 billion and 15 years of work.
• Starting point for future projects (ENCODE, 1000
Genomes…).
• Economic impact until 2011: $796 billion.
6. Sanger sequencing
• However:
1. The cost of a new genome is still high: $10,000 –
$100,000.
2. Hard lab work.
3. Development of new technologies is essential.
9. Next Generation Sequencing Technologies
                      Sanger                  NGS
Number of sequences   96 seqs/run             8 human genomes/run
Length                0.5-2 kilobases         Up to 40 kilobases
Accuracy              99%                     85% to 99%
Cost                  $200,000 per gigabase   $2,000 per gigabase
12. New challenges
• Bioinformatics approach: how to work with gigabytes of
sequences? Linux.
• Quality controls.
• Choice of sequencing platform and software.
• Constant change. The first NGS technologies (e.g. Roche
454) are already obsolete.
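The quality-control step on gigabytes of reads can be sketched as a streaming filter that never loads the whole file into memory. This is only a minimal illustration: it assumes four-line FASTQ records and Phred+33 quality encoding, and the function names and the mean-quality threshold of 20 are hypothetical choices, not part of the slides.

```python
import io
from typing import Iterator, TextIO, Tuple


def read_fastq(handle: TextIO) -> Iterator[Tuple[str, str, str]]:
    """Yield (name, sequence, quality) one record at a time.

    Assumes each record spans exactly four lines.
    """
    while True:
        header = handle.readline()
        if not header:          # end of file
            return
        if not header.strip():  # skip stray blank lines
            continue
        seq = handle.readline().rstrip("\n")
        handle.readline()       # '+' separator line, ignored
        qual = handle.readline().rstrip("\n")
        yield header.rstrip("\n")[1:], seq, qual


def mean_phred(qual: str, offset: int = 33) -> float:
    """Mean Phred score of a quality string (Phred+33 assumed)."""
    return sum(ord(c) - offset for c in qual) / len(qual)


def filter_reads(handle: TextIO, min_mean_q: float = 20.0):
    """Keep only reads whose mean quality reaches the threshold."""
    for name, seq, qual in read_fastq(handle):
        if qual and mean_phred(qual) >= min_mean_q:
            yield name, seq, qual


# Tiny in-memory example: one high-quality and one low-quality read.
demo = "@r1\nACGT\n+\nIIII\n@r2\nACGT\n+\n####\n"
kept = list(filter_reads(io.StringIO(demo)))
print([name for name, _, _ in kept])
```

Because the reader is a generator, the same functions could be fed a handle from `gzip.open(path, "rt")` for compressed files of any size.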
13. New challenges
• Number of samples.
• Are replicates necessary? Biological or technical?
• Type of reads.
• Number of reads / Coverage.
• Library construction and complexity.
• Reference genome. Gene annotation.
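The relation between number of reads and coverage can be made concrete with the usual estimate: coverage = (number of reads × read length) / genome size. The sketch below is a rough planning aid only; the 5 Mb genome and 100-base reads are hypothetical values, and it assumes reads are spread uniformly over the genome.

```python
import math


def coverage(n_reads: int, read_length: int, genome_size: int) -> float:
    """Average depth of coverage: total sequenced bases / genome size."""
    return n_reads * read_length / genome_size


def reads_needed(target_coverage: float, read_length: int, genome_size: int) -> int:
    """Smallest number of reads that reaches the target average coverage."""
    return math.ceil(target_coverage * genome_size / read_length)


# Hypothetical example: 5 Mb bacterial genome, 100-base reads, 30x target.
genome_size = 5_000_000
n = reads_needed(30, 100, genome_size)
print(n, coverage(n, 100, genome_size))
```

Real experiments usually need more reads than this estimate, since duplicates, low-quality reads and uneven library complexity all reduce the usable coverage.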
15. Some applications: Metagenomics
• Characterize the species (bacteria/viruses) present in an
environment.
• Soil, water, feces…
• Associate metagenomics results with the origin of the
sample (e.g. host, environment etc.).
• Sequencing of a specific region (e.g. 16S), not the whole
genome.
• Binning with described species.
• Specific software: Phymm, MetaPhlAn…
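The idea of binning reads against described species can be illustrated with a toy k-mer comparison: each read is assigned to the reference sequence with which it shares the most k-mers. This is not how Phymm or MetaPhlAn work internally; it is a simplified sketch, and the reference sequences, species names and parameter values are all hypothetical. Reverse-complement matches are ignored for brevity.

```python
from typing import Dict, Optional, Set


def kmers(seq: str, k: int) -> Set[str]:
    """Set of all substrings of length k in the sequence."""
    seq = seq.upper()
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def assign_read(read: str, references: Dict[str, str],
                k: int = 4, min_shared: int = 1) -> Optional[str]:
    """Return the reference sharing the most k-mers with the read.

    Returns None when no reference shares at least min_shared k-mers.
    """
    read_kmers = kmers(read, k)
    best_name, best_score = None, 0
    for name, ref_seq in references.items():
        score = len(read_kmers & kmers(ref_seq, k))
        if score > best_score:
            best_name, best_score = name, score
    return best_name if best_score >= min_shared else None


# Hypothetical marker sequences standing in for described species.
references = {
    "species_A": "ACGTACGTTTGACCA",
    "species_B": "GGGCCCGGGTTTAAACCC",
}
for read in ["CGTACGTT", "GGCCCGGG", "TTTTTTTT"]:
    print(read, assign_read(read, references))
```

Reads that match no reference are left unassigned, which mirrors the practical limit of binning: only species that have already been described can be recognized.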
18. Some applications: Population genomics
• The whole genome is not necessary in all cases.
• Identify neutral and adaptive regions.
• A reference genome is not necessary.
• Different approaches, based on restriction enzymes:
• Genotyping By Sequencing (GBS).
• Restriction site Associated DNA (RAD).
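Both approaches sequence only the DNA next to restriction sites, which is why they sample a reduced, repeatable fraction of the genome. A minimal in-silico digest shows the principle. The sketch uses the EcoRI recognition site (GAATTC, cut after the G) purely as an example enzyme; the input sequence and function names are hypothetical, and only the given strand is searched.

```python
from typing import List


def cut_positions(seq: str, site: str = "GAATTC", cut_offset: int = 1) -> List[int]:
    """Positions where the enzyme cuts, for every occurrence of the site."""
    seq = seq.upper()
    positions = []
    start = seq.find(site)
    while start != -1:
        positions.append(start + cut_offset)
        start = seq.find(site, start + 1)
    return positions


def digest(seq: str, site: str = "GAATTC", cut_offset: int = 1) -> List[str]:
    """Fragments produced by cutting the sequence at every site."""
    bounds = [0] + cut_positions(seq, site, cut_offset) + [len(seq)]
    return [seq[a:b] for a, b in zip(bounds, bounds[1:])]


# Hypothetical sequence containing two EcoRI sites.
sequence = "AAAGAATTCCCCGAATTCTT"
print(cut_positions(sequence))
print(digest(sequence))
```

In a real RAD or GBS library, reads start at these cut positions, so the same loci are recovered across individuals without needing a reference genome.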
21. Some applications: Transcriptomics
• Shows which parts of the DNA are actually expressed.
• A reference genome is not necessary.
• Annotation with described genes.
• Quantify gene expression.
22. Some applications: Transcriptomics
• Read mapping (alignment): place the short reads
on the genome
• Quantification:
• Assigning to genes
• Determining whether a gene is expressed
• Normalization
• Compare between samples
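The quantification steps above can be sketched with counts per million (CPM), one simple normalization that makes samples of different sequencing depth comparable. This is an illustration under stated assumptions: the gene names and read counts are invented, the threshold of 1 CPM for calling a gene expressed is a common rule of thumb rather than a fixed standard, and CPM does not correct for gene length.

```python
from typing import Dict, Set


def cpm(counts: Dict[str, int]) -> Dict[str, float]:
    """Scale raw read counts per gene to counts per million mapped reads."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sample has no assigned reads")
    return {gene: c * 1_000_000 / total for gene, c in counts.items()}


def expressed_genes(counts: Dict[str, int], min_cpm: float = 1.0) -> Set[str]:
    """Genes whose normalized expression reaches the threshold."""
    return {gene for gene, value in cpm(counts).items() if value >= min_cpm}


# Hypothetical counts: sample_b was sequenced twice as deeply as sample_a.
sample_a = {"geneA": 500, "geneB": 1500, "geneC": 0}
sample_b = {"geneA": 1000, "geneB": 3000, "geneC": 0}

print(cpm(sample_a))
print(cpm(sample_b))
print(expressed_genes(sample_a))
```

The raw counts differ by a factor of two, but after normalization both samples give the same values, so the difference is attributed to sequencing depth rather than to expression.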