1. The Genomic Revolution: How Sequencing Anything and Everything Is Changing the Way We Do Science
C. Titus Brown
4.
Reed College, BA in Mathematics;
Caltech, PhD and post-doctoral fellow in Biology;
Michigan State University, Assistant Professor in Biology and Computer Science.
My background
5.
I’m still confused by almost everything, but in
some cases I have a lot more detail to be
confused about.
So if you ask questions, I may say “I don’t
know!” (but may then guess).
Please! Ask questions!
On “Expertise”
6.
First, genetic investigation of
fetuses in utero;
Second, tracking hospital
infections;
Third, investigating global nutrient
cycling.
Three stories:
7. Genome sequencing!!
The ability to cheaply sequence
DNA is an extremely exciting and
fairly new technique; all three
stories used this extensively.
Why these stories?
8.
1. DNA, genomes, and sequencing;
2. Story 1: genetics of unborn fetuses;
3. Story 2: staph transmission in hospitals;
4. Story 3: global nutrient cycling in the oceans;
5. My research, briefly!
Outline
11. This means that for a string of 10
DNA bases, there are over 1 million
combinations!
AAAAAAAAAT
AAAAAAAATA
AAAAAAATAA
…
DNA is combinatorial.
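The "over 1 million" figure can be checked directly: each of the 10 positions holds one of 4 bases, so there are 4^10 possible strings. A minimal sketch in Python (the names `BASES`, `LENGTH`, and `all_sequences` are illustrative, not from the talk):

```python
from itertools import islice, product

BASES = "ACGT"   # the four DNA bases
LENGTH = 10      # string length used on the slide

# Each position is an independent choice among 4 bases.
n_combinations = len(BASES) ** LENGTH

def all_sequences(length):
    """Lazily yield every DNA string of the given length."""
    for combo in product(BASES, repeat=length):
        yield "".join(combo)

# Peek at the first few strings without building all of them.
first_few = list(islice(all_sequences(LENGTH), 3))

print(n_combinations)
print(first_few)
```

Every extra base multiplies the count by 4, which is why longer strings become impossible to list out very quickly.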
12.
This combinatorial property matters
because it means that DNA is an alphabet
and can be used as a language – you can
build “words” and “sentences” in it.
(Just in case you’re wondering, we still
don’t really understand the language in
detail, although we know a lot about it.)
DNA is a language.
13. Almost every cell in your body contains about 6
billion bases of DNA, in a particular
order.
This is your genome.
Almost every one of your cells contains
the same 6 billion bases of DNA.
…and it’s what you pass on to your
children.
DNA underlies heredity
14. Since your genome has 6 billion
bases of DNA, it would take up
about 1.5 million pages in a book.
This book would be the architectural
plans for you!
DNA is a language.
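As a quick sanity check on the book analogy, the two numbers on the slide imply a page density. A small sketch (the constant names are illustrative; the only inputs are the slide's own figures):

```python
GENOME_BASES = 6_000_000_000   # bases per genome, from the slide
BOOK_PAGES = 1_500_000         # pages in the imagined book, from the slide

# Letters per page implied by the two figures above.
bases_per_page = GENOME_BASES // BOOK_PAGES

def pages_needed(n_bases, per_page):
    """Pages required to print n_bases letters, rounding up."""
    return -(-n_bases // per_page)  # ceiling division

print(bases_per_page)
print(pages_needed(GENOME_BASES, bases_per_page))
```

So the analogy assumes a few thousand letters per page, which is a densely printed page.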
15.
Sequencing your genome is the same thing as
digitizing it – putting the sequences of bases
into a format that computers can read.
Analogy: scanning in old photos.
Important side note: just because you can
digitize it, doesn’t mean you understand it!
“Sequencing” the genome.
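To make "a format that computers can read" concrete: with only four letters, each base fits in 2 bits. The sketch below is an illustration under that assumption, not how real sequencing files are stored; the names `CODE`, `encode`, and `decode` are hypothetical.

```python
# Assumed 2-bit code for the four bases (any consistent assignment works).
CODE = {"A": 0, "C": 1, "G": 2, "T": 3}
LETTERS = "ACGT"

def encode(seq):
    """Pack a DNA string into one integer, 2 bits per base."""
    value = 0
    for base in seq:
        value = (value << 2) | CODE[base]
    return value

def decode(value, length):
    """Unpack an integer produced by encode() back into a DNA string."""
    bases = []
    for _ in range(length):
        bases.append(LETTERS[value & 3])  # read the lowest 2 bits
        value >>= 2
    return "".join(reversed(bases))

packed = encode("AAAAAAAAAT")
restored = decode(packed, 10)

# At 2 bits per base, 6 billion bases need this many bytes.
genome_bytes = 6_000_000_000 * 2 // 8
print(packed, restored, genome_bytes)
```

Under this encoding a whole genome is on the order of a gigabyte or two, which is why digitizing it is practical and understanding it is the hard part.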
16.
You can look for known words and sentences,
to diagnose disease susceptibility.
You can compare with other genomes, to find
out what words and sentences might be
responsible for disease.
Why is it useful to
sequence your genome?
17.
The first human genome cost between $300 million
and $3 billion. That was in ~2002.
Today, you can sequence your genome for
under $5000!
This decrease in cost lets us look at a lot more
genomes!
…and the price is dropping fast.
How much does it cost?!
18.
Knowing a particular genome sequence lets us
look for known disease susceptibility, as well as
helping us find “words” associated with
unknown diseases.
We can do this for around $5000 per person.
Summary of DNA:
32. Pregnant mother
+ Father
(+ Fetal cells)
Mother's blood plasma.
Sequence plasma, mother,
and father – then count.
* Complication: between 1/10 and 1/2 of cells are fetal.
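One way to picture the "then count" step, as a toy sketch only: this is not the method used in the study, and the function names, the A/B allele labels, and the 1% error rate are all assumptions for illustration. At a site where the mother is AA and the father is BB, every B read in the plasma must come from the fetus, so the B read fraction is about half the fetal fraction. At a site where the mother is AA and the father is AB, the fetus either inherited B (B reads appear at about half the fetal fraction) or did not (B reads appear only as sequencing errors).

```python
def fetal_fraction(b_reads, total_reads):
    """Estimate the fetal share of plasma DNA at a site where the
    mother is AA and the father is BB (the fetus must be AB)."""
    return 2.0 * b_reads / total_reads

def fetus_inherited_b(b_reads, total_reads, fetal_frac, error_rate=0.01):
    """At a site where the mother is AA and the father is AB, guess
    whether the fetus carries B by picking the closer expectation.
    error_rate is an assumed background rate of miscalled reads."""
    observed = b_reads / total_reads
    expected_if_inherited = fetal_frac / 2.0
    expected_if_not = error_rate
    return abs(observed - expected_if_inherited) < abs(observed - expected_if_not)

# Hypothetical read counts, made up for the example.
frac = fetal_fraction(b_reads=50, total_reads=1000)
call_yes = fetus_inherited_b(b_reads=48, total_reads=1000, fetal_frac=frac)
call_no = fetus_inherited_b(b_reads=8, total_reads=1000, fetal_frac=frac)
print(frac, call_yes, call_no)
```

The lower the fetal fraction, the closer the two expectations sit, so more reads are needed to tell them apart; that is the complication noted on the slide.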
34. So, it’s now possible (if not yet really
cheap!) to non-invasively figure out
the genotype of a fetus, by sampling
parents + blood.
Instead of one genome,
sequence three!
35. DNA sequencing shows a lot of promise
for diagnosing rare diseases.
http://www.forbes.com/sites/matthewherper/2011/01/05/the-first-child-saved-by-dna-sequencing/
38. Staph tends to attack soft tissue in
people who are already ill.
Correlation between staph
infections and hospitals/assisted
care.
Staph infections are a
problem!
41.
Does it spread within facilities?
or
Does it spread between facilities?
How does staph spread?
42.
Sequence staph strains from within hospitals.
If transmission is mainly within a hospital, the strains
from that hospital will look alike.
If transmission is mainly from outside, similar strains
will be spread across hospitals.
Approach:
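The comparison in this approach can be sketched with a simple distance measure. This is an illustration, not the analysis from the study: the sequences and hospital labels are made up, and counting mismatched positions (Hamming distance) is an assumed stand-in for whatever strain comparison was actually used.

```python
from itertools import combinations

def hamming(a, b):
    """Number of positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    return sum(x != y for x, y in zip(a, b))

def mean(values):
    """Arithmetic mean, or NaN for an empty list."""
    return sum(values) / len(values) if values else float("nan")

def mean_distances(samples):
    """samples: list of (hospital, sequence) pairs.
    Returns (mean within-hospital distance, mean between-hospital distance)."""
    within, between = [], []
    for (h1, s1), (h2, s2) in combinations(samples, 2):
        (within if h1 == h2 else between).append(hamming(s1, s2))
    return mean(within), mean(between)

# Made-up toy data: two strains from each of two hospitals.
toy_samples = [
    ("hospital_A", "ACGTACGTAC"),
    ("hospital_A", "ACGTACGTAT"),
    ("hospital_B", "TCGTTCGAAC"),
    ("hospital_B", "TCGTTCGAAT"),
]
within, between = mean_distances(toy_samples)
print(within, between)
```

In this toy data the within-hospital distance is much smaller than the between-hospital distance, which is the pattern expected if staph spreads inside facilities. Similar distances in both groups would point to strains arriving from outside.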
46. More than 80% of staph infections
were newly acquired from nonpatients!
Implications for prevention: focus on
isolation from outsiders, not just
patients.
Conclusions: mostly from outside.
49. The Great Plate Count Anomaly: most
microbes cannot be studied in the lab
http://schaechter.asmblog.org/schaechter/2010/07/the-uncultured-bacteria.html
51. Measurements + extrapolation suggest:
1/3 of cells in ocean are archaeal;
2/3 of cells in ocean are bacterial.
Approximately 20% of cells are from one group of
archaea.
Distribution of microbial archaea off
of Hawaii; why so many, so deep?
52.
Hints came from just sequencing “seawater” in
2004:
“an ammonium monooxygenase gene was found
on an archaeal-associated” section of genome.
What are all these archaea
doing!?
Venter et al., 2004. Science.
54. Current theory is that a majority of the
nitrification in the ocean (a driver of this CO2
sequestration pump) occurs via these archaeal
cells.
What are all these archaea
doing!?
55.
More emphasis on analysis rather than just
data gathering.
More exploration – “just sequence it”.
More unexpected results!
How is cheap sequencing
changing research?
56.
We can generate a lot of data quite
easily.
How do we interpret the data
correctly, and efficiently? How do we
correlate between data sets?
How can we do good biology in the
face of these technical challenges?
My research is:
58. Sea lamprey in the Great Lakes
Non-native
Parasite of medium to large fishes
Caused populations of host fishes to crash
Li Lab / Y-W C-D
59. Tail loss and notochord genes
a) M. oculata b) hybrid (occulta egg x oculata sperm) c) M. occulta
Notochord cells in orange
Swalla, B. et al. Science, Vol 274, Issue 5290, 1205-1208, 15 November 1996
60. You do!
Via:
National Science Foundation (NSF);
National Institutes of Health (NIH);
US Department of Agriculture (USDA);
US Department of Energy (DOE)
Who funds all this
research (including mine)?
Larvae live in stream bottoms for 3-6 years; parasitic adults move to the Great Lakes and feed for 12-20 months. 5-8 years. 40 lbs of fish consumed per lamprey over its life as a parasite. 98% of fish in the Great Lakes went away!
Notochord cells present, do not intercalate or extend