This document outlines the structure and content of a three-part lecture series on the human genome taking place from October 12-16, 2014. Part I will provide an introduction and overview of genome sequencing technologies. Part II will discuss the human genome project and sequencing methods. Part III will cover genome assembly, annotation, outcomes including the number of genes and functional categories, and applications such as SNP analysis and genome-wide association studies. The overall goals are to understand principles of genome analysis and the impacts of the human genome project.
Genome Assembly, Annotation and Outcomes Explained
1. The Human Genome
HAGenetics.org
Dr. Hasan Alhaddad
Guest lecturer: Molecular Basis of Human Diseases
October 12th, 14th, 16th 2014
Room 244 (1 PM)
2. Lectures structure
• Part I (Sunday Oct 12th):
• The book of life (Matt Ridley’s analogy with
modifications).
• Introduction to the technologies at the time.
• Part II (Tuesday Oct 14th):
• Why sequencing genomes/the human genome?
• Genome war (public and private projects).
• Sequencing the genome.
• Part III (Thursday Oct 16th):
• Genome assembly revisited.
• Genome annotation.
• Genome outcome.
• The Genomic era.
3. AIMS (part III)
• Learn the basic principles and terminology of
genome assembly.
• Understand the importance of genome annotation.
• Become familiar with the outcomes of the human
genome.
• Understand the technologies and applications that
were developed due to the human genome project.
• Become familiar with the OMICS.
5. Genome Assembly Revisited
DNA sequence: the sequence reads produced by the
sequencing machine.
This can be considered the primary sequence of the
genome.
6. Genome Assembly Revisited
Sequence alignment: order and connect overlapping
sequence reads to form a Contig.
This is something you are likely to do when you
sequence a gene.
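The idea of connecting overlapping reads can be sketched in a few lines of Python. This is a toy illustration only: the function names and the `min_overlap` parameter are made up for this example, the reads are assumed to be error-free and on the same strand, and real assemblers use far more robust methods.

```python
def overlap_length(left: str, right: str, min_overlap: int = 3) -> int:
    """Length of the longest suffix of `left` that equals a prefix of `right`."""
    max_len = min(len(left), len(right))
    # Try the longest possible overlap first, then shrink.
    for size in range(max_len, min_overlap - 1, -1):
        if left.endswith(right[:size]):
            return size
    return 0


def merge_reads(left: str, right: str, min_overlap: int = 3) -> str:
    """Join two overlapping reads into one contig, or raise if they do not overlap."""
    size = overlap_length(left, right, min_overlap)
    if size == 0:
        raise ValueError("reads do not overlap")
    # Keep `left` whole and append only the part of `right` past the overlap.
    return left + right[size:]


# Two hypothetical reads that share the 4 bases "GTAC".
contig = merge_reads("ATGCGTAC", "GTACCTTA")
print(contig)  # ATGCGTACCTTA
```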
7. Genome Assembly Revisited
We can consider Contigs the secondary level of
genome assembly.
8. Genome Assembly Revisited
Scaffolds are the tertiary
level of genome assembly.
Scaffolds are also referred to
as Super Contigs.
Scaffolds are formed by
connecting ordered Contigs.
9. Genome Assembly Revisited
Scaffolds are formed by connecting ordered and
oriented Contigs. How?
11. Genome Assembly Revisited
Genome assembly quality is measured by Contig/scaffold N50 or
similar measures.
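As a worked example, N50 is the length of the contig (or scaffold) at which the running total, counted from the longest sequence downwards, first reaches half of the total assembly length. A minimal sketch, with a made-up function name and made-up lengths:

```python
def n50(lengths):
    """Return the N50 of a list of contig or scaffold lengths."""
    if not lengths:
        raise ValueError("need at least one length")
    total = sum(lengths)
    running = 0
    # Walk from the longest sequence to the shortest.
    for length in sorted(lengths, reverse=True):
        running += length
        # Stop at the first sequence that brings us to half the assembly.
        if running * 2 >= total:
            return length


# Hypothetical contig lengths (total = 290, half = 145).
print(n50([80, 70, 50, 40, 30, 20]))  # 70
```

A larger N50 means more of the assembly sits in long sequences, which is why it is used as a rough measure of contiguity.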
12. Genome Assembly Revisited
What affects the quality of genome assembly?
1. Repeat elements.
2. Variations between the individuals sequenced
(segmental duplications).
13. Genome Annotation
Genome annotation is very important to study the biology of an
organism.
Without a proper annotation, the sequence is useless.
Remember!
A book that cannot be read and understood is
useless knowledge
14. Genome Annotation
The genome sequence can be classified into different groups
based on the overall sequence composition and structure.
Genome
• Coding
  • Genes (Proteins or RNA)
• Non-coding
  • Introns
  • Regulators
  • Etc.
  • Repetitive DNA
    • Interspersed: SINE, LINE, LTR, Transposons
    • Tandem: Satellite, Minisatellite, Microsatellite
15. Genome Annotation
Genome annotation can be divided into two
approaches:
1. Structural annotation:
   1. Largely in silico.
   2. Utilizing the accumulated knowledge of genes and
      genomes to identify sequence signatures.
2. Functional annotation:
   1. Requires a lot of work and time.
   2. Studying the function of the book/code.
   3. Involves biochemical analyses of the genome.
   4. Gene expression and regulation.
16. Structural annotation
[Figure: gene structure from Start to End, showing the regulation sequence, promoter sequence, 5’ UTR, exons, introns, and 3’ UTR (UTR = Un-Translated Region).]
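One of the simplest in silico sequence signatures is an open reading frame: a start codon (ATG) followed, in the same frame, by a stop codon. The sketch below is a toy illustration under strong assumptions: it scans the forward strand only, ignores introns, and uses made-up names (`find_orfs`, `min_codons`). Real gene prediction in eukaryotes is much more involved.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(seq, min_codons=2):
    """Return (start, end) pairs of forward-strand ORFs (0-based, end exclusive)."""
    seq = seq.upper()
    orfs = []
    for frame in range(3):
        start = None
        # Read the sequence codon by codon in this reading frame.
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i  # first start codon opens an ORF
            elif start is not None and codon in STOP_CODONS:
                # Count codons before the stop, including the ATG.
                if (i - start) // 3 >= min_codons:
                    orfs.append((start, i + 3))
                start = None
    return sorted(orfs)


# Hypothetical sequence with one ORF: ATG AAA TTT TAG.
seq = "CCATGAAATTTTAGGG"
for start, end in find_orfs(seq):
    print(start, end, seq[start:end])  # 2 14 ATGAAATTTTAG
```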
29. Genome Outcome
GC content is correlated with gene density: GC-rich
regions tend to be gene-rich.
CpG islands in the promoter region can regulate
gene expression.
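Both quantities are easy to compute for a stretch of sequence. The sketch below uses made-up function names and a made-up sequence. It reports the GC fraction and the observed/expected CpG ratio, calculated as (number of CG dinucleotides × sequence length) / (number of C × number of G), a ratio commonly used when screening for CpG islands. The slides do not give thresholds, so none are applied here.

```python
def gc_content(seq):
    """Fraction of bases in `seq` that are G or C."""
    seq = seq.upper()
    if not seq:
        raise ValueError("empty sequence")
    return (seq.count("G") + seq.count("C")) / len(seq)


def cpg_obs_exp(seq):
    """Observed/expected CpG ratio: (#CG * length) / (#C * #G)."""
    seq = seq.upper()
    c_count = seq.count("C")
    g_count = seq.count("G")
    if c_count == 0 or g_count == 0:
        return 0.0  # no CpG is possible without both C and G
    return seq.count("CG") * len(seq) / (c_count * g_count)


# Hypothetical CpG-rich fragment.
fragment = "CGCGCGAT"
print(gc_content(fragment))   # 0.75
print(cpg_obs_exp(fragment))
```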
30. Genome Outcome
We are repeat elements with some genes :-)
31. Tandem Repeat elements
Microsatellite: Short Tandem Repeats (STR) – Simple Sequence
Repeats (SSR)
Repeat unit size = 2 - 6 base pairs
Examples shown: repeated 8 times; repeated 20 times
Minisatellite: Variable Number Tandem Repeats (VNTR)
Repeat unit size = hundreds of base pairs
Example shown: repeated 4 times
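A microsatellite can be found in a sequence by searching for a short unit that is repeated back to back. The sketch below uses Python's standard `re` module with a backreference. The function name, the default thresholds, and the example sequence are assumptions for illustration, and the pattern only finds perfect (uninterrupted) repeats.

```python
import re


def find_microsatellites(seq, min_unit=2, max_unit=6, min_repeats=4):
    """Return (start, unit, copies) for perfect tandem repeats in `seq`."""
    # Group 1 is the repeat unit (shortest unit tried first);
    # \1 then requires at least `min_repeats - 1` more copies right after it.
    pattern = re.compile(
        r"([ACGT]{%d,%d}?)\1{%d,}" % (min_unit, max_unit, min_repeats - 1)
    )
    hits = []
    for match in pattern.finditer(seq.upper()):
        unit = match.group(1)
        copies = len(match.group(0)) // len(unit)
        hits.append((match.start(), unit, copies))
    return hits


# Hypothetical sequence: the unit "CA" repeated 6 times, flanked by TT and GG.
print(find_microsatellites("TTCACACACACACAGG"))  # [(2, 'CA', 6)]
```

The number of copies is what varies between individuals, which is why these repeats are useful as markers.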
36. SNP as a marker
Single Nucleotide Polymorphism
1. Many are found throughout
the genome.
2. Found in nuclear and
mitochondrial DNA.
3. No need for a lot of DNA.
4. Can be used on degraded DNA.
5. Easy to detect – many
platforms.
6. Polymorphism lower than
microsatellites.
37. SNP as a marker
The SNPs identified by the human genome project
allowed the development of SNP arrays (SNP chips).
SNP arrays allow the genome to be surveyed for variation
between individuals easily and at low cost.
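Conceptually, an array reports one genotype per marker for each individual, so comparing two people amounts to comparing two tables. A minimal sketch, assuming genotypes are stored as two-letter strings keyed by marker name. The marker names and genotypes below are invented, and this is not the output format of any particular platform.

```python
def differing_markers(person_a, person_b):
    """Return the shared markers at which two individuals have different genotypes."""
    shared = sorted(set(person_a) & set(person_b))
    # Genotypes are unordered: "AG" and "GA" are the same call.
    return [m for m in shared if sorted(person_a[m]) != sorted(person_b[m])]


# Hypothetical genotype calls for two individuals.
person_a = {"snp1": "AA", "snp2": "AG", "snp3": "CC"}
person_b = {"snp1": "AA", "snp2": "GA", "snp3": "CT"}
print(differing_markers(person_a, person_b))  # ['snp3']
```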
39. SNP as a marker
Commercial uses of SNP markers to
learn about ancestry and health
40. SNP as a marker
Genome-wide
Association studies
(GWAS)
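At its core, a GWAS asks, one SNP at a time, whether an allele is more common in cases than in controls. One simple way to test this is a chi-square test on a 2x2 table of allele counts, sketched below. The function names and the counts are made up, and real studies also correct for testing very many SNPs and for population structure, which this sketch does not do.

```python
import math


def allelic_chi_square(case_a, case_b, control_a, control_b):
    """Pearson chi-square statistic (1 degree of freedom) for a 2x2 allele-count table."""
    n = case_a + case_b + control_a + control_b
    row_case = case_a + case_b
    row_control = control_a + control_b
    col_a = case_a + control_a
    col_b = case_b + control_b
    denom = row_case * row_control * col_a * col_b
    if denom == 0:
        raise ValueError("every row and column needs a non-zero total")
    return n * (case_a * control_b - case_b * control_a) ** 2 / denom


def p_value_1df(chi_square):
    """Upper-tail p-value of a chi-square statistic with 1 degree of freedom."""
    return math.erfc(math.sqrt(chi_square / 2))


# Hypothetical allele counts: cases 60 A / 40 G, controls 40 A / 60 G.
stat = allelic_chi_square(60, 40, 40, 60)
print(stat)               # 8.0
print(p_value_1df(stat))
```

Repeating this test for every SNP on the array and plotting the p-values along the chromosomes gives the familiar genome-wide picture.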
41. Beyond the genome
The ENCyclopedia Of
DNA Elements (ENCODE)
1. Transcripts
2. Regulatory elements
3. Enhancers
4. Silencers
5. Origins of replication
6. CpG islands
7. Histone modification sites
8. Open chromatin sites