Genome annotation
Paul Gardner
March 3, 2015
Paul Gardner Genome annotation
Medical genomics
Vicky Cameron & Anna Pilbrow at Otago are
identifying genetic variation and genes associated
with an increased risk of heart disease.
Mike Stratton at the Sanger Institute is hunting
for genetic variation that is associated with an
increased risk of cancer.
Rob Knight at UC Boulder is sequencing the
microbes that live on us and finding associations
between our health and microbial communities.
See Rob’s TEDTalk.
Agricultural genomics
Graeme Attwood at AgResearch is trying to stop
cows & sheep from emitting greenhouse gases by
studying their gut microbes. He has sequenced
two methanogenic archaeal genomes of
Methanobrevibacter sp.
Honour McCann at Massey University is trying
to determine how Pseudomonas syringae pv.
actinidiae (PSA) is killing kiwifruit.
Rebecca Ganley at SCION is investigating how
Phytophthora Taxon Agathis (PTA) is causing
kauri die-back disease and killing kauri trees.
Academic interest genomics
Tom Gilbert at the University of Copenhagen is
sequencing bird and giant squid genomes.
Elizabeth Murchison is sequencing Tasmanian
devils (and their transmissible cancers). See
Liz’s TEDTalk.
Neil Gemmell at the University of Otago is sequencing
the tuatara genome.
Annotate me!
TTACAGAGTACACAACATCCATGAAACGCATTAGCACCACCATTACCACCACCATCACCATTACCACAGGTAACGGTGCGGGCTGACGCGTACAGGAAA
CACAGAAAAAAGCCCGCACCTGACAGTGCGGGCTTTTTTTTTCGACCAAAGGTAACGAGGTAACAACCATGCGAGTGTTGAAGTTCGGCGGTACATCAG
TGGCAAATGCAGAACGTTTTCTGCGTGTTGCCGATATTCTGGAAAGCAATGCCAGGCAGGGGCAGGTGGCCACCGTCCTCTCTGCCCCCGCCAAAATCA
CCAACCACCTGGTGGCGATGATTGAAAAAACCATTAGCGGCCAGGATGCTTTACCCAATATCAGCGATGCCGAACGTATTTTTGCCGAACTTTTGACGG
GACTCGCCGCCGCCCAGCCGGGGTTCCCGCTGGCGCAATTGAAAACTTTCGTCGATCAGGAATTTGCCCAAATAAAACATGTCCTGCATGGCATTAGTT
TGTTGGGGCAGTGCCCGGATAGCATCAACGCTGCGCTGATTTGCCGTGGCGAGAAAATGTCGATCGCCATTATGGCCGGCGTATTAGAAGCGCGCGGTC
ACAACGTTACTGTTATCGATCCGGTCGAAAAACTGCTGGCAGTGGGGCATTACCTCGAATCTACCGTCGATATTGCTGAGTCCACCCGCCGTATTGCGG
CAAGCCGCATTCCGGCTGATCACATGGTGCTGATGGCAGGTTTCACCGCCGGTAATGAAAAAGGCGAACTGGTGGTGCTTGGACGCAACGGTTCCGACT
ACTCTGCTGCGGTGCTGGCTGCCTGTTTACGCGCCGATTGTTGCGAGATTTGGACGGACGTTGACGGGGTCTATACCTGCGACCCGCGTCAGGTGCCCG
ATGCGAGGTTGTTGAAGTCGATGTCCTACCAGGAAGCGATGGAGCTTTCCTACTTCGGCGCTAAAGTTCTTCACCCCCGCACCATTACCCCCATCGCCC
AGTTCCAGATCCCTTGCCTGATTAAAAATACCGGAAATCCTCAAGCACCAGGTACGCTCATTGGTGCCAGCCGTGATGAAGACGAATTACCGGTCAAGG
GCATTTCCAATCTGAATAACATGGCAATGTTCAGCGTTTCTGGTCCGGGGATGAAAGGGATGGTCGGCATGGCGGCGCGCGTCTTTGCAGCGATGTCAC
GCGCCCGTATTTCCGTGGTGCTGATTACGCAATCATCTTCCGAATACAGCATCAGTTTCTGCGTTCCACAAAGCGACTGTGTGCGAGCTGAACGGGCAA
TGCAGGAAGAGTTCTACCTGGAACTGAAAGAAGGCTTACTGGAGCCGCTGGCAGTGACGGAACGGCTGGCCATTATCTCGGTGGTAGGTGATGGTATGC
GCACCTTGCGTGGGATCTCGGCGAAATTCTTTGCCGCACTGGCCCGCGCCAATATCAACATTGTCGCCATTGCTCAGGGATCTTCTGAACGCTCAATCT
CTGTCGTGGTAAATAACGATGATGCGACCACTGGCGTGCGCGTTACTCATCAGATGCTGTTCAATACCGATCAGGTTATCGAAGTGTTTGTGATTGGCG
TCGGTGGCGTTGGCGGTGCGCTGCTGGAGCAACTGAAGCGTCAGCAAAGCTGGCTGAAGAATAAACATATCGACTTACGTGTCTGCGGTGTTGCCAACT
CGAAGGCTCTGCTCACCAATGTACATGGCCTTAATCTGGAAAACTGGCAGGAAGAACTGGCGCAAGCCAAAGAGCCGTTTAATCTCGGGCGCTTAATTC
GCCTCGTGAAAGAATATCATCTGCTGAACCCGGTCATTGTTGACTGCACTTCCAGCCAGGCAGTGGCGGATCAATATGCCGACTTCCTGCGCGAAGGTT
TCCACGTTGTCACGCCGAACAAAAAGGCCAACACCTCGTCGATGGATTACTACCATCAGTTGCGTTATGCGGCGGAAAAATCGCGGCGTAAATTCCTCT
ATGACACCAACGTTGGGGCTGGATTACCGGTTATTGAGAACCTGCAAAATCTGCTCAATGCAGGTGATGAATTGATGAAGTTCTCCGGCATTCTTTCTG
GTTCGCTTTCTTATATCTTCGGCAAGTTAGACGAAGGCATGAGTTTCTCCGAGGCGACCACGCTGGCGCGGGAAATGGGTTATACCGAACCGGACCCGC
GAGATGATCTTTCTGGTATGGATGTGGCGCGTAAACTATTGATTCTCGCTCGTGAAACGGGACGTGAACTGGAGCTGGCGGATATTGAAATTGAACCTG
TGCTGCCCGCAGAGTTTAACGCCGAGGGTGATGTTGCCGCTTTTATGGCGAATCTGTCACAACTCGACGATCTCTTTGCCGCGCGCGTGGCGAAGGCCC
GTGATGAAGGAAAAGTTTTGCGCTATGTTGGCAATATTGATGAAGATGGCGTCTGCCGCGTGAAGATTGCCGAAGTGGATGGTAATGATCCGCTGTTCA
AAGTGAAAAATGGCGAAAACGCCCTGGCCTTCTATAGCCACTATTATCAGCCGCTGCCGTTGGTACTGCGCGGATATGGTGCGGGCAATGACGTTACAG
CTGCCGGTGTCTTTGCTGATCTGCTACGTACCCTCTCATGGAAGTTAGGAGTCTGACATGGTTAAAGTTTATGCCCCCATGGTTAAAGTTTATGCCCCG
GCTTCCAGTGCCAATATGAGCGTCGGGTTTGATGTGCTCGGGGCGGCGGTGACACCTGTTGATGGTGCATTGCTCGGAGATGTAGTCACGGTTGAGGCG
GCAGAGACATTCAGTCTCAACAACCTCGGACGCTTTGCCGATAAGCTGCCGTCAGAACCACGGGAAAATATCGTTTATCA
Discussion
How should these researchers annotate their genomes (after
they have sequenced and assembled them)?
What are the fast and cheap methods?
What are the most accurate methods?
The data tsunami
Thanks to new sequencing technologies (recall Ant’s
teeny-tiny little sequencer), biologists no longer spend
years acquiring data.
The bottleneck for research is now the analysis phase.
Biologists with good mathematics skills and mathematicians
with an interest in biology are in high demand.
Figure: the research cycle (gather data; analyze and classify; hypotheses and predictions; experiment), illustrated with an example RNA sequence alignment.
We can use sequence analysis...
Genes leave a statistical signal in the genome...
Example: identify promoters, ribosome binding sites,
open reading frames (ORFs), terminators
In eukaryotes, CpG islands, splicing signals and poly-A tails may
also be incorporated
How reliable are these approaches? What are the main
weaknesses & strengths?
Figure from: http://zerocool.is-a-geek.net/?p=630
Sequence analysis: strengths and weaknesses
ORF prediction: Prodigal, GLIMMER
Strengths:
very fast
cheap
Weaknesses:
false positives (see AntiFam)
misses short peptides (e.g. toxin-antitoxin systems)
No ncRNAs, pseudogenes, recoding elements, ...
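A minimal sketch of the ORF signal, assuming a forward-strand scan for an ATG followed in frame by a stop codon. This is a toy, not how Prodigal or GLIMMER score genes; `find_orfs`, `translate` and the `min_codons` cutoff are names invented here for illustration. It is run on the first line of the "Annotate me!" sequence.

```python
# Toy ORF scanner (forward strand only). Illustrative sketch, not Prodigal or GLIMMER.
STOP_CODONS = {"TAA", "TAG", "TGA"}

# Standard genetic code, built from the conventional TCAG codon ordering.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(dna):
    """Translate a DNA string codon by codon (any trailing partial codon is ignored)."""
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna) - 2, 3))


def find_orfs(dna, min_codons):
    """Return (start, end) pairs; end is exclusive and includes the stop codon."""
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(dna) - 2, 3):
            codon = dna[i:i + 3]
            if start is None and codon == "ATG":
                start = i  # first start codon seen since the last stop
            elif start is not None and codon in STOP_CODONS:
                if (i - start) // 3 >= min_codons:  # length excludes the stop
                    orfs.append((start, i + 3))
                start = None
    return orfs


# First line of the "Annotate me!" sequence.
SEQ = "TTACAGAGTACACAACATCCATGAAACGCATTAGCACCACCATTACCACCACCATCACCATTACCACAGGTAACGGTGCGGGCTGACGCGTACAGGAAA"

permissive = find_orfs(SEQ, min_codons=10)  # hypothetical low cutoff
strict = find_orfs(SEQ, min_codons=30)      # hypothetical high cutoff
print(permissive)
print(translate(SEQ[20:83]))
print(strict)
```

With the low cutoff the scan reports a 21-codon ORF at positions 20 to 86 (0-based, end exclusive). Raising the cutoff to suppress spurious ORFs also discards it, which is the short-peptide weakness listed above.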
We can use homology...
Evolution tends to preserve functional genomic regions...
Example 1: Use an existing set of genes from related species
and map these onto your genome (e.g. RATT)
Example 2: Align two or more related genomes, look for
conserved regions, patterns of variation can be indicative of
function (e.g. QRNA, RNAz & RNAcode)
How reliable are these approaches? What are the main
weaknesses & strengths?
The QRNA approach...
Rivas et al. (2001) Computational identification of noncoding RNAs in E. coli by comparative genomics. Current
Biology.
DNA encodes Protein
# STOCKHOLM 1.0
#33 unique RNA sequences, 1 peptide sequence
#=GR PR1 G..A..D..V..T..H..P..P..A..G..D..
#=GR PR3 GlyAlaAspValThrHisProProAlaGlyAsp
platypus GGAGCAGACGTCACTCACCCCCCAGCCGGAGAT
opossum GGAGCAGATGTTACTCACCCTCCTGCTGGAGAT
sloth GGAGCAGACGTCACACACCCTCCCGCGGGGGAT
armadillo GGAGCAGACGTCACGCACCCTCCGGCAGGGGAT
tenrec GGGGCCGACGTCACGCACCCCCCTGCGGGCGAT
elephant GGAGCGGATGTCACACACCCGCCTGCGGGGGAT
shrew GGCGCAGATGTCACGCATCCTCCAGCAGGGGAC
hedgehog GGAGCAGATGTCACACACCCCCCAGCAGGAGAT
megabat GGAGCAGATGTCACACACCCTCCTGCAGGAGAT
microbat GGAGCAGATGTCACCCACCCCCCTGCAGGGGAC
dog GGAGCGGATGTCACACACCCCCCAGCCGGGGAC
cat GGAGCCGATGTCACGCACCCCCCAGCAGGGGAT
horse GGAGCGGATGTCACACACCCTCCGGCAGGGGAT
pika GGAGCAGATGTCACTCACCCTCCAGCTGGGGAT
rabbit GGTGCAGATGTCACACACCCCCCAGCTGGAGAT
squirrel GGAGCAGATGTCACTCACCCTCCAGCGGGAGAT
guinea_pig GGAGCAGATGTCACACACCCACCAGCGGGAGAT
mouse GGAGCAGATGTCACTCATCCGCCTGCTGGGGAC
rat GGAGCAGATGTCACTCATCCACCTGCTGGGGAT
kangaroo_rat GGAGCAGATGTTACACACCCTCCAGCAGGGGAT
tree_shrew GGCGCAGACGTCACGCACCCCCCGGCCGGGGAT
human GGAGCGGATGTCACACACCCCCCAGCAGGGGAT
tarsier GGTGCTGATGTCACACACCCCCCTGCAGGGGAT
marmoset GGAGCAGATGTCACACACCCACCAGCAGGGGAT
zebrafinch GGAGCAGATGTCACTCACCCTCCCGCCGGGGAT
green_anole GGGGCAGACGTCACTCACCCGCCAGCCGGGGAC
xenopus GGAGCAGATGTTACACACCCACCTGCTGGTGAT
pufferfish GGTGCGGATGTTACTCATCCTCCTGCTGGTGAT
fugu GGGGCTGATGTTACTCACCCTCCAGCTGGTGAT
stickleback GGTGCAGACGTCACACATCCTCCAGCGGGTGAT
medaka GGTGCCGATGTCACTCATCCTCCTGCCGGGGAC
zebrafish GGGGCAGATGTTACACACCCGCCGGCTGGTGAT
lamprey GGTGCCGATGTGACACACCCTCCAGCGGGAGAC
//
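The pattern of variation in this alignment can be checked with a short sketch. It translates five of the rows above and counts which codon positions vary. This is only a toy illustration of the kind of signal comparative methods exploit, not the QRNA or RNAcode algorithm; the function names are invented for the example.

```python
# Standard genetic code, built from the conventional TCAG codon ordering.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

# Five rows copied from the alignment above.
ALIGNMENT = {
    "platypus": "GGAGCAGACGTCACTCACCCCCCAGCCGGAGAT",
    "human": "GGAGCGGATGTCACACACCCCCCAGCAGGGGAT",
    "mouse": "GGAGCAGATGTCACTCATCCGCCTGCTGGGGAC",
    "zebrafish": "GGGGCAGATGTTACACACCCGCCGGCTGGTGAT",
    "lamprey": "GGTGCCGATGTGACACACCCTCCAGCGGGAGAC",
}


def translate(dna):
    """Translate a DNA string codon by codon."""
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna) - 2, 3))


def variable_columns_by_codon_position(seqs):
    """Count alignment columns with more than one base, per codon position (1st, 2nd, 3rd)."""
    counts = [0, 0, 0]
    for col in range(len(seqs[0])):
        if len({s[col] for s in seqs}) > 1:
            counts[col % 3] += 1
    return counts


proteins = {name: translate(seq) for name, seq in ALIGNMENT.items()}
counts = variable_columns_by_codon_position(list(ALIGNMENT.values()))
print(proteins)  # every row translates to the same peptide
print(counts)    # variation falls on the third codon position
```

In these five rows every substitution sits at a third codon position and leaves the peptide unchanged, the kind of evolutionary signature that suggests a protein-coding region.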
Figure: the genetic code wheel (1st, 2nd and 3rd codon positions), giving each amino acid's name, three- and one-letter codes, molecular weight, structure and class (basic, acidic, polar, nonpolar/hydrophobic), plus common modifications (S: sumo, M: methyl, P: phospho, U: ubiquitin, nM: N-methyl, oG: O-glycosyl, nG: N-glycosyl).
Image source: http://upload.wikimedia.org/wikipedia/en/d/d6/GeneticCode21-version-2.svg
DNA encodes RNA
Figure: a tRNA shown as (A) primary structure, (B) secondary structure and (C) tertiary structure, with the acceptor stem, D loop, anticodon loop, variable loop and TΨC loop labelled.
Primary structure: 5’ GCGGAUUUAGCUCAGDDGGGAGAGCGCCAGACUGAAYA.CUGGAGGUCCUGUGT.CGAUCCACAGAAUUCGCACCA 3’
Homology-based annotation: strengths and weaknesses
Example 1: map known genes onto genomes
Strengths: fast, cheap, ...
Weaknesses:
Inaccurate for divergent species (e.g. Graeme’s
Methanobrevibacter or GEBA genomes)
Requires manual correction of borderline results
Errors are propagated throughout the databases
Example 2: aligning genomes
Strengths:
“cheap” if genomes already exist
fast for small genomes
evolutionary support for all discoveries
Weaknesses:
Requires lots of powerful computers for large genomes
Inaccurate for divergent species (e.g. Neil’s tuatara or
Graeme’s Methanobrevibacter)
Requires manual correction of borderline results
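Example 1 can be sketched as a search for the best match of a known gene in a new genome, assuming ungapped matching and a fixed identity cutoff. Real tools such as RATT work quite differently; `best_ungapped_hit`, the flanking bases and the 0.9 cutoff are all made up for illustration. The two gene fragments are the platypus and human rows of the alignment shown earlier.

```python
def best_ungapped_hit(query, target):
    """Slide query along target and return (position, identity) of the best window."""
    best_pos, best_identity = -1, 0.0
    for pos in range(len(target) - len(query) + 1):
        window = target[pos:pos + len(query)]
        matches = sum(q == t for q, t in zip(query, window))
        identity = matches / len(query)
        if identity > best_identity:
            best_pos, best_identity = pos, identity
    return best_pos, best_identity


# Known gene fragment (platypus row of the earlier alignment).
REFERENCE_GENE = "GGAGCAGACGTCACTCACCCCCCAGCCGGAGAT"
# "New genome": the human row with made-up flanking bases on either side.
NEW_GENOME = "TTTT" + "GGAGCGGATGTCACACACCCCCCAGCAGGGGAT" + "AAAA"

MIN_IDENTITY = 0.9  # hypothetical cutoff for transferring an annotation unchecked

pos, identity = best_ungapped_hit(REFERENCE_GENE, NEW_GENOME)
status = "transfer annotation" if identity >= MIN_IDENTITY else "needs manual review"
print(pos, round(identity, 3), status)
```

The gene is found at the right place, but at 28 of 33 matching bases it falls below the cutoff. The more divergent the species, the more hits land in this borderline zone.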
Homology annotation: nucleotides are difficult to align
Figure: conservation of Xfam families in bacterial genomes. Conserved families (%) against phylogenetic distance (0.0 to 0.6) for Pfam (N=6671) and Rfam (N=331), with the frequency (Freq.) of RNA-seq species.
Lindgreen et al. (2014) Robust identification of noncoding RNA from transcriptomes requires
phylogenetically-informed sampling. PLOS Computational Biology.
We can use RNA detection methods...
Remember the central dogma of molecular biology
Example: sequence RNAs from multiple tissues,
developmental stages and environmental conditions
How reliable is this approach? What are the main weaknesses
& strengths?
Wang, Gerstein & Snyder (2009) RNA-Seq: a revolutionary tool for transcriptomics. Nature Reviews Genetics.
RNA-seq: strengths and weaknesses
RNA-seq
Strengths:
Experimental support for transcribed regions
Identifies untranslated regions (UTRs), ncRNAs, antisense
RNAs, ...
Identifies alternatively spliced and edited RNAs
Weaknesses:
Expensive & lots of work
RNA degradation and genomic contamination
Transcription does not prove translation
Will miss genes transcribed only in specific developmental stages,
tissues & environmental conditions (e.g. the lsy-6 microRNA)
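How RNA-seq reads become evidence for transcribed regions can be sketched as a per-base coverage count followed by a depth threshold. The read coordinates, genome length and `min_depth` values below are invented for illustration, and real pipelines involve read mapping, normalisation and much more.

```python
def coverage(genome_length, reads):
    """Per-base read depth; reads are (start, end) pairs with end exclusive."""
    depth = [0] * genome_length
    for start, end in reads:
        for i in range(start, end):
            depth[i] += 1
    return depth


def transcribed_regions(depth, min_depth):
    """Return (start, end) runs where depth stays at or above min_depth."""
    regions = []
    start = None
    for i, d in enumerate(depth):
        if d >= min_depth and start is None:
            start = i
        elif d < min_depth and start is not None:
            regions.append((start, i))
            start = None
    if start is not None:  # run extends to the end of the genome
        regions.append((start, len(depth)))
    return regions


# Hypothetical mapped reads on a 30-base toy genome.
READS = [(2, 10), (5, 12), (6, 14), (20, 25)]
depth = coverage(30, READS)

print(transcribed_regions(depth, min_depth=1))
print(transcribed_regions(depth, min_depth=2))
```

At a depth cutoff of 1 both regions are reported; at 2 the region supported by a single read disappears. A transcript that is rare in the sampled tissue or condition is lost in the same way.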
We can use protein detection methods...
Central dogma of molecular biology
Example: Protein mass spectrometry
How reliable is this approach? What are the main weaknesses
& strengths?
Figure from: http://en.wikipedia.org/wiki/Protein_mass_spectrometry
Protein mass spectrometry: strengths and weaknesses
Protein mass spectrometry
Strengths:
Experimental support for translated regions
Identifies alternative isoforms and post-translational
modifications (Ezkurdia et al. 2012)
Weaknesses:
Expensive & lots of work
Misses genes transcribed in specific developmental stages,
tissues & environmental conditions
Current technology generally detects only the most
abundant proteins
Ezkurdia et al. (2012) Comparative proteomics reveals a significant bias toward alternative protein isoforms with
conserved structure and function. Mol Biol Evol.
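One way mass spectrometry evidence is tied back to predicted genes is by comparing observed peptides with an in-silico digest of each predicted protein. The sketch below assumes a bottom-up workflow using trypsin, which cleaves after K or R unless the next residue is P. The slides do not specify a protocol, so the workflow, the example protein and the length cutoff are all assumptions made for illustration.

```python
def tryptic_digest(protein):
    """Cleave after K or R, except when the next residue is P."""
    peptides = []
    start = 0
    for i, residue in enumerate(protein):
        next_residue = protein[i + 1] if i + 1 < len(protein) else ""
        if residue in "KR" and next_residue != "P":
            peptides.append(protein[start:i + 1])
            start = i + 1
    if start < len(protein):  # trailing peptide without a cleavage site
        peptides.append(protein[start:])
    return peptides


PROTEIN = "MAGTKPLSRAEEKLVR"  # made-up example protein
MIN_LENGTH = 7                # hypothetical minimum length for a usable peptide

peptides = tryptic_digest(PROTEIN)
usable = [p for p in peptides if len(p) >= MIN_LENGTH]
print(peptides)
print(usable)
```

Only one of the three peptides passes the length filter here, so a short protein offers few chances of being observed even before abundance is considered.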
How cool is this?!
Schwanhäusser et al. (2011) Global quantification of mammalian gene expression control. Nature.
This is also kinda neat...
Lu et al. (2007) Absolute protein expression profiling estimates the relative contributions of transcriptional and
translational regulation. Nature Biotechnology
Relevant reading
Reviews:
Stein L (2001) Genome annotation: from sequence to biology.
Nature Reviews Genetics.
Reed JL et al. (2006) Towards multidimensional genome
annotation. Nature Reviews Genetics.
ORF finding:
Delcher AL et al. (2007) Identifying bacterial genes and
endosymbiont DNA with Glimmer. Bioinformatics.
Hyatt D et al. (2010) Prodigal: prokaryotic gene recognition
and translation initiation site identification. BMC
Bioinformatics.
RNA-seq (Ant’s lectures)
Wang, Gerstein & Snyder (2009) RNA-Seq: a revolutionary
tool for transcriptomics. Nature Reviews Genetics.
Proteomics (Sarah’s lectures)
Ezkurdia et al. (2012) Comparative proteomics reveals a
significant bias toward alternative protein isoforms with
conserved structure and function. Mol Biol Evol.
Homework: How to make a sequence alignment?
Play: http://phylo.cs.mcgill.ca
or even better, play Ribo: http://ribo.cs.mcgill.ca/
The End
Paul Gardner Genome annotation

More Related Content

What's hot

Protein fold recognition and ab_initio modeling
Protein fold recognition and ab_initio modelingProtein fold recognition and ab_initio modeling
Protein fold recognition and ab_initio modeling
Bioinformatics and Computational Biosciences Branch
 
gene prediction programs
gene prediction programsgene prediction programs
gene prediction programs
MugdhaSharma11
 
Transcriptomics and metabolomics
Transcriptomics and metabolomicsTranscriptomics and metabolomics
Transcriptomics and metabolomics
Sukhjinder Singh
 
Tech Talk: UCSC Genome Browser
Tech Talk: UCSC Genome BrowserTech Talk: UCSC Genome Browser
Tech Talk: UCSC Genome Browser
Hoffman Lab
 
Genome annotation
Genome annotationGenome annotation
Genome annotation
Shifa Ansari
 
Computational biology
Computational biologyComputational biology
Computational biology
Zeina Abdelmoez
 
Upgma
UpgmaUpgma
Upgma
MUSKANKr
 
Genome analysis2
Genome analysis2Genome analysis2
RNA-seq differential expression analysis
RNA-seq differential expression analysisRNA-seq differential expression analysis
RNA-seq differential expression analysis
mikaelhuss
 
Protein data bank
Protein data bankProtein data bank
Protein data bank
Yogesh Joshi
 
Basics of bioinformatics
Basics of bioinformaticsBasics of bioinformatics
Basics of bioinformatics
Abhishek Vatsa
 
Scoring schemes in bioinformatics (blosum)
Scoring schemes in bioinformatics (blosum)Scoring schemes in bioinformatics (blosum)
Scoring schemes in bioinformatics (blosum)
SumatiHajela
 
SAGE- Serial Analysis of Gene Expression
SAGE- Serial Analysis of Gene ExpressionSAGE- Serial Analysis of Gene Expression
SAGE- Serial Analysis of Gene Expression
Aashish Patel
 
Kegg
KeggKegg
Kegg
msfbi1521
 
Finding ORF
Finding ORFFinding ORF
Finding ORF
Sabahat Ali
 
methods for protein structure prediction
methods for protein structure predictionmethods for protein structure prediction
methods for protein structure prediction
karamveer prajapat
 
System's Biology
System's Biology System's Biology
System's Biology
Pritam Shil
 
Gene prediction method
Gene prediction method Gene prediction method
Gene prediction method
Nusrat Gulbarga
 
De novo genome assembly - T.Seemann - IMB winter school 2016 - brisbane, au ...
De novo genome assembly  - T.Seemann - IMB winter school 2016 - brisbane, au ...De novo genome assembly  - T.Seemann - IMB winter school 2016 - brisbane, au ...
De novo genome assembly - T.Seemann - IMB winter school 2016 - brisbane, au ...
Torsten Seemann
 
Role of ensembl in genome browsing
Role of ensembl in genome browsingRole of ensembl in genome browsing
Role of ensembl in genome browsing
Joydeep16
 

What's hot (20)

Protein fold recognition and ab_initio modeling
Protein fold recognition and ab_initio modelingProtein fold recognition and ab_initio modeling
Protein fold recognition and ab_initio modeling
 
gene prediction programs
gene prediction programsgene prediction programs
gene prediction programs
 
Transcriptomics and metabolomics
Transcriptomics and metabolomicsTranscriptomics and metabolomics
Transcriptomics and metabolomics
 
Tech Talk: UCSC Genome Browser
Tech Talk: UCSC Genome BrowserTech Talk: UCSC Genome Browser
Tech Talk: UCSC Genome Browser
 
Genome annotation
Genome annotationGenome annotation
Genome annotation
 
Computational biology
Computational biologyComputational biology
Computational biology
 
Upgma
UpgmaUpgma
Upgma
 
Genome analysis2
Genome analysis2Genome analysis2
Genome analysis2
 
RNA-seq differential expression analysis
RNA-seq differential expression analysisRNA-seq differential expression analysis
RNA-seq differential expression analysis
 
Protein data bank
Protein data bankProtein data bank
Protein data bank
 
Basics of bioinformatics
Basics of bioinformaticsBasics of bioinformatics
Basics of bioinformatics
 
Scoring schemes in bioinformatics (blosum)
Scoring schemes in bioinformatics (blosum)Scoring schemes in bioinformatics (blosum)
Scoring schemes in bioinformatics (blosum)
 
SAGE- Serial Analysis of Gene Expression
SAGE- Serial Analysis of Gene ExpressionSAGE- Serial Analysis of Gene Expression
SAGE- Serial Analysis of Gene Expression
 
Kegg
KeggKegg
Kegg
 
Finding ORF
Finding ORFFinding ORF
Finding ORF
 
methods for protein structure prediction
methods for protein structure predictionmethods for protein structure prediction
methods for protein structure prediction
 
System's Biology
System's Biology System's Biology
System's Biology
 
Gene prediction method
Gene prediction method Gene prediction method
Gene prediction method
 
De novo genome assembly - T.Seemann - IMB winter school 2016 - brisbane, au ...
De novo genome assembly  - T.Seemann - IMB winter school 2016 - brisbane, au ...De novo genome assembly  - T.Seemann - IMB winter school 2016 - brisbane, au ...
De novo genome assembly - T.Seemann - IMB winter school 2016 - brisbane, au ...
 
Role of ensembl in genome browsing
Role of ensembl in genome browsingRole of ensembl in genome browsing
Role of ensembl in genome browsing
 

Viewers also liked

Genome annotation 2013
Genome annotation 2013Genome annotation 2013
Genome annotation 2013
Karan Veer Singh
 
Gemome annotation
Gemome annotationGemome annotation
Gemome annotation
Tajammal Daultana
 
Bioalgo 2012-01-gene-prediction-stat
Bioalgo 2012-01-gene-prediction-statBioalgo 2012-01-gene-prediction-stat
Bioalgo 2012-01-gene-prediction-stat
BioinformaticsInstitute
 
Gene identification and discovery
Gene identification and discoveryGene identification and discovery
Gene identification and discovery
Amit Ruchi Yadav
 
Gene annotation games
Gene annotation gamesGene annotation games
Gene annotation games
Benjamin Good
 
Functional genomics
Functional genomicsFunctional genomics
Functional genomics
ajay301
 
TRAM THESIS POSTER
TRAM THESIS POSTERTRAM THESIS POSTER
TRAM THESIS POSTER
Tram Bui
 
Characterization of Drosophila Nucleobindin: An Evolutionarily Conserved Ca2+...
Characterization of Drosophila Nucleobindin: An Evolutionarily Conserved Ca2+...Characterization of Drosophila Nucleobindin: An Evolutionarily Conserved Ca2+...
Characterization of Drosophila Nucleobindin: An Evolutionarily Conserved Ca2+...
Vadivel Prabahar
 
Characterization in Dvilp 7 gene
Characterization in Dvilp 7 geneCharacterization in Dvilp 7 gene
Characterization in Dvilp 7 gene
Hunter Kelley
 
Drosophilla
DrosophillaDrosophilla
Drosophilla
erikaLane14
 
Trends in Annotation of Genomic Data
Trends in Annotation of Genomic DataTrends in Annotation of Genomic Data
Trends in Annotation of Genomic Data
biobase
 
Identification, annotation and visualisation of extreme changes in splicing w...
Identification, annotation and visualisation of extreme changes in splicing w...Identification, annotation and visualisation of extreme changes in splicing w...
Identification, annotation and visualisation of extreme changes in splicing w...
Mar Gonzàlez-Porta
 
DIYA: An annotation pipeline for any genomics lab
DIYA: An annotation pipeline for any genomics labDIYA: An annotation pipeline for any genomics lab
DIYA: An annotation pipeline for any genomics lab
Andrew Stewart
 
Web Apollo: A Web-based Genomic Annotation Editing Platform ISB2013
Web Apollo: A Web-based Genomic Annotation Editing Platform ISB2013Web Apollo: A Web-based Genomic Annotation Editing Platform ISB2013
Web Apollo: A Web-based Genomic Annotation Editing Platform ISB2013
Monica Munoz-Torres
 
Web Apollo: A Web-based Genomics Annotation Editing Platform. 13ArthGen
Web Apollo: A Web-based Genomics Annotation Editing Platform. 13ArthGenWeb Apollo: A Web-based Genomics Annotation Editing Platform. 13ArthGen
Web Apollo: A Web-based Genomics Annotation Editing Platform. 13ArthGen
Monica Munoz-Torres
 
Genome assembly: then and now — v1.0
Genome assembly: then and now — v1.0Genome assembly: then and now — v1.0
Genome assembly: then and now — v1.0
Keith Bradnam
 
Est database
Est databaseEst database
Est database
Amit Ruchi Yadav
 
The NCBI Eukaryotic Genome Annotation Pipeline and Alternate Genomic Sequences
The NCBI Eukaryotic Genome Annotation Pipeline and Alternate Genomic SequencesThe NCBI Eukaryotic Genome Annotation Pipeline and Alternate Genomic Sequences
The NCBI Eukaryotic Genome Annotation Pipeline and Alternate Genomic Sequences
Genome Reference Consortium
 
Introduction to Apollo for i5k
Introduction to Apollo for i5kIntroduction to Apollo for i5k
Introduction to Apollo for i5k
Monica Munoz-Torres
 
2 md2016 annotation
2 md2016 annotation2 md2016 annotation
2 md2016 annotation
Scott Dawson
 

Viewers also liked (20)

Genome annotation 2013
Genome annotation 2013Genome annotation 2013
Genome annotation 2013
 
Gemome annotation
Gemome annotationGemome annotation
Gemome annotation
 
Bioalgo 2012-01-gene-prediction-stat
Bioalgo 2012-01-gene-prediction-statBioalgo 2012-01-gene-prediction-stat
Bioalgo 2012-01-gene-prediction-stat
 
Gene identification and discovery
Gene identification and discoveryGene identification and discovery
Gene identification and discovery
 
Gene annotation games
Gene annotation gamesGene annotation games
Gene annotation games
 
Functional genomics
Functional genomicsFunctional genomics
Functional genomics
 
TRAM THESIS POSTER
TRAM THESIS POSTERTRAM THESIS POSTER
TRAM THESIS POSTER
 
Characterization of Drosophila Nucleobindin: An Evolutionarily Conserved Ca2+...
Characterization of Drosophila Nucleobindin: An Evolutionarily Conserved Ca2+...Characterization of Drosophila Nucleobindin: An Evolutionarily Conserved Ca2+...
Characterization of Drosophila Nucleobindin: An Evolutionarily Conserved Ca2+...
 
Characterization in Dvilp 7 gene
Characterization in Dvilp 7 geneCharacterization in Dvilp 7 gene
Characterization in Dvilp 7 gene
 
Drosophilla
DrosophillaDrosophilla
Drosophilla
 
Trends in Annotation of Genomic Data
Trends in Annotation of Genomic DataTrends in Annotation of Genomic Data
Trends in Annotation of Genomic Data
 
Identification, annotation and visualisation of extreme changes in splicing w...
Identification, annotation and visualisation of extreme changes in splicing w...Identification, annotation and visualisation of extreme changes in splicing w...
Identification, annotation and visualisation of extreme changes in splicing w...
 
DIYA: An annotation pipeline for any genomics lab
DIYA: An annotation pipeline for any genomics labDIYA: An annotation pipeline for any genomics lab
DIYA: An annotation pipeline for any genomics lab
 
Web Apollo: A Web-based Genomic Annotation Editing Platform ISB2013
Web Apollo: A Web-based Genomic Annotation Editing Platform ISB2013Web Apollo: A Web-based Genomic Annotation Editing Platform ISB2013
Web Apollo: A Web-based Genomic Annotation Editing Platform ISB2013
 
Web Apollo: A Web-based Genomics Annotation Editing Platform. 13ArthGen
Web Apollo: A Web-based Genomics Annotation Editing Platform. 13ArthGenWeb Apollo: A Web-based Genomics Annotation Editing Platform. 13ArthGen
Web Apollo: A Web-based Genomics Annotation Editing Platform. 13ArthGen
 
Genome assembly: then and now — v1.0
Genome assembly: then and now — v1.0Genome assembly: then and now — v1.0
Genome assembly: then and now — v1.0
 
Est database
Est databaseEst database
Est database
 
The NCBI Eukaryotic Genome Annotation Pipeline and Alternate Genomic Sequences
The NCBI Eukaryotic Genome Annotation Pipeline and Alternate Genomic SequencesThe NCBI Eukaryotic Genome Annotation Pipeline and Alternate Genomic Sequences
The NCBI Eukaryotic Genome Annotation Pipeline and Alternate Genomic Sequences
 
Introduction to Apollo for i5k
Introduction to Apollo for i5kIntroduction to Apollo for i5k
Introduction to Apollo for i5k
 
2 md2016 annotation
2 md2016 annotation2 md2016 annotation
2 md2016 annotation
 

Similar to BIOL335: How to annotate a genome

Abr 8(3) abstracts_v2
Abr 8(3) abstracts_v2Abr 8(3) abstracts_v2
Abr 8(3) abstracts_v2
Stephen Wawman
 
Genome project.pdf
Genome project.pdfGenome project.pdf
Genome project.pdf
ManchikantiDivya
 
Optimizing Grape Rootstock Production and Export of inhibitors of X. fastidio...
Optimizing Grape Rootstock Production and Export of inhibitors of X. fastidio...Optimizing Grape Rootstock Production and Export of inhibitors of X. fastidio...
Optimizing Grape Rootstock Production and Export of inhibitors of X. fastidio...
huyng
 
2013_CarterEtal_MultiplexPCR-Cronobacter_ AEM
2013_CarterEtal_MultiplexPCR-Cronobacter_ AEM2013_CarterEtal_MultiplexPCR-Cronobacter_ AEM
2013_CarterEtal_MultiplexPCR-Cronobacter_ AEM
Monica Pava-Ripoll
 
David Gardner
David GardnerDavid Gardner
David Gardner
AC Goatham & Son
 
Phytothreats: WP4 overview
Phytothreats: WP4 overviewPhytothreats: WP4 overview
Phytothreats: WP4 overview
Forest Research
 
Investigation of phylogenic relationships of shrew populations using genetic...
Investigation of phylogenic relationships  of shrew populations using genetic...Investigation of phylogenic relationships  of shrew populations using genetic...
Investigation of phylogenic relationships of shrew populations using genetic...
Juan Barrera
 
BIOL335: How to annotate a genome