The Genomic Revolution: How
Sequencing Anything
and Everything Is Changing the
Way We Do Science

C. Titus Brown





Reed College, BA in Mathematics;
Caltech, PhD and post-doctoral fellow in
Biology;
Michigan State University, Assistant Professor
in Biology and Computer Science.

My background


I’m still confused by almost everything, but in
some cases I have a lot more detail to be
confused about.



So if you ask questions, I may say “I don’t
know!” (but may then guess).



Please! Ask questions!

On “Expertise”


First, genetic investigation of
fetuses in utero;



Second, tracking hospital
infections;



Third, investigating global nutrient
cycling.

Three stories:
Genome sequencing!!
The ability to cheaply sequence
DNA is an extremely exciting and
fairly new technique; all three
stories used this extensively.

Why these stories?
1. DNA, genomes, and sequencing
2. Story 1: genetics of unborn fetuses
3. Story 2: staph transmission in hospitals
4. Story 3: global nutrient cycling in the oceans
5. My research, briefly!

Outline
Bases: A, C, G, T

DNA.
AGTCCA is different from CCAAGT!

DNA is combinatorial.
This means that for a string of 10
DNA bases, there are over 1 million
combinations!
AAAAAAAAAT
AAAAAAAATA
AAAAAAATAA
…

DNA is combinatorial.
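
The count on this slide can be checked directly: with 4 possible bases at each of 10 positions there are 4^10 possible strings. A minimal sketch in Python; the names `BASES`, `count_sequences`, and `first_sequences` are illustrative and not from the talk.

```python
from itertools import islice, product

BASES = "ACGT"  # the four DNA bases


def count_sequences(length: int) -> int:
    """Number of distinct DNA strings of the given length."""
    return len(BASES) ** length


def first_sequences(length: int, n: int) -> list[str]:
    """First n DNA strings of the given length, in alphabetical order."""
    return ["".join(p) for p in islice(product(BASES, repeat=length), n)]


print(count_sequences(10))      # 1048576, i.e. just over 1 million
print(first_sequences(10, 3))   # the first few of those strings
```

Each extra base multiplies the count by 4, which is why the number of possible sequences grows so quickly with length.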


This combinatorial property matters
because it means that DNA is an alphabet
and can be used as a language – you can
build “words” and “sentences” in it.



(Just in case you’re wondering, we still
don’t really understand the language in
detail, although we know a lot about it.)

DNA is a language.
Every cell in your body contains about 6
billion bases of DNA, in a particular
order.
This is your genome.


Almost every one of your cells contains
the same 6 billion bases of DNA.
…and it’s what you pass on to your
children.


DNA underlies heredity
Since your genome has 6 billion
bases of DNA, it would take up
about 1.5 million pages in a book.
This book would be the architectural
plans for you!

DNA is a language.
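
As a quick check on the book analogy, the two numbers on the slide imply roughly 4,000 letters per page. That per-page figure is inferred here from the slide's numbers; it is not stated in the talk.

```python
GENOME_BASES = 6_000_000_000  # bases in your genome (from the slide)
BOOK_PAGES = 1_500_000        # pages in the book (from the slide)

# Letters per page implied by the two figures above.
letters_per_page = GENOME_BASES // BOOK_PAGES
print(letters_per_page)  # 4000
```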


Sequencing your genome is the same thing as
digitizing it – putting the sequences of bases
into a format that computers can read.



Analogy: scanning in old photos.



Important side note: just because you can
digitize it, doesn’t mean you understand it!

“Sequencing” the genome.
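
One way to picture "digitizing" is to store each base as a 2-bit number. This is only an illustrative sketch: the 2-bit encoding and the names `pack`, `unpack`, and `storage_bytes` are assumptions for this example, and real sequencing files also carry other information such as quality scores.

```python
CODE = {"A": 0, "C": 1, "G": 2, "T": 3}  # assumed 2-bit code per base
LETTERS = "ACGT"


def pack(seq: str) -> int:
    """Pack a DNA string into one integer, 2 bits per base."""
    value = 0
    for base in seq:
        value = (value << 2) | CODE[base]
    return value


def unpack(value: int, length: int) -> str:
    """Recover a DNA string of the given length from its packed form."""
    bases = []
    for _ in range(length):
        bases.append(LETTERS[value & 3])
        value >>= 2
    return "".join(reversed(bases))


def storage_bytes(n_bases: int) -> int:
    """Bytes needed to store n_bases at 2 bits per base."""
    return n_bases * 2 // 8


print(unpack(pack("AGTCCA"), 6))      # AGTCCA
print(storage_bytes(6_000_000_000))   # 1500000000, about 1.5 GB
```

Being able to store and round-trip the sequence says nothing about what it means, which is the point of the side note on this slide.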


You can look for known words and sentences,
to diagnose disease susceptibility.



You can compare with other genomes, to find
out what words and sentences might be
responsible for disease.

Why is it useful to
sequence your genome?
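
Both uses on this slide come down to simple operations on strings: finding a known "word", and comparing two sequences position by position. A toy sketch, with made-up sequences and hypothetical function names; real analyses use dedicated tools and much longer sequences.

```python
def find_word(genome: str, word: str) -> list[int]:
    """Start positions (0-based) of every occurrence of word in genome."""
    positions = []
    start = genome.find(word)
    while start != -1:
        positions.append(start)
        start = genome.find(word, start + 1)
    return positions


def differing_positions(a: str, b: str) -> list[int]:
    """Positions where two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    return [i for i, (x, y) in enumerate(zip(a, b)) if x != y]


print(find_word("AGTCCAAGTCCA", "CCA"))         # [3, 9]
print(differing_positions("AGTCCA", "AGTCGA"))  # [4]
```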


The first human genome cost between $300 million
and $3 billion. That was in ~2002.



Today, you can sequence your genome for
under $5000!



This decrease in cost lets us look at a lot more
genomes!



…and the price is dropping fast.

How much does it cost?!


Knowing a particular genome sequence lets us
look for known disease susceptibility, as well as
helping us find “words” associated with
unknown diseases.



We can do this for around $5000 per person.

Summary of DNA:
Questions at this point?
You have two
near-copies of
each string of
DNA, or
“chromosome”.

Inheritance of traits.
These two copies
are a bit different.

Inheritance of traits.
One copy may
carry a particular
trait – say,
albinism, or wet
earwax.

Inheritance of traits.
Non-albino

This trait may not
show up if you
have only one
copy (albinism).

Inheritance of traits.
Albino

But if it’s on both
copies, it may
have an effect.

Inheritance of traits.
Non-albino

Albino

Albinism occurs only with two copies of
albino trait.
OK

Very badly ill

…many diseases work the same way.
OK

Very badly ill

Can we diagnose fetuses?
Amniocentesis is invasive.
http://www.reproduccionasistida.org/evitar-amniocentesis/
Father’s genome

Mother’s genome

Child’s genome
Heredity and crossover.
Thomas Hunt Morgan, 1916
Mother’s genome

Father’s genome

Children’s genomes
(Only 1 in 4 will have trait.)
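
The "1 in 4" figure can be reproduced by listing every combination of one copy from each parent. This sketch assumes both parents carry exactly one copy of the trait (written `a`) and one normal copy (`A`), and it ignores crossover; the function name is illustrative.

```python
from fractions import Fraction
from itertools import product


def affected_fraction(mother: str, father: str, trait: str = "a") -> Fraction:
    """Fraction of possible children who inherit the trait on both copies."""
    children = list(product(mother, father))  # one copy from each parent
    affected = [c for c in children if c == (trait, trait)]
    return Fraction(len(affected), len(children))


print(affected_fraction("Aa", "Aa"))  # 1/4
```

If only one parent carries the trait (for example `"Aa"` and `"AA"`), the same function returns 0: no child gets two copies.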
Pregnant mother
+ Father

(+ Fetal cells)

Mother's blood plasma.

Sequence plasma, mother,
and father – then count.
* Complication: between 1/10 and ½ of cells are fetal.
Good accuracy!
Fan et al., 2012
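
To give a feel for the "then count" step, here is a deliberately simplified sketch. It is not the method of Fan et al.; it assumes the mother carries one copy each of two variants (A and B) at a site, that the fetal share of the plasma DNA is already known, and that there is no measurement noise. All names are hypothetical.

```python
def expected_fraction(fetal_copies: int, fetal_fraction: float) -> float:
    """Expected share of variant A in plasma when the mother is A/B.

    fetal_copies is how many copies of A the fetus has (0, 1, or 2).
    """
    maternal_part = (1 - fetal_fraction) * 0.5
    fetal_part = fetal_fraction * (fetal_copies / 2)
    return maternal_part + fetal_part


def call_fetal_copies(count_a: int, count_b: int, fetal_fraction: float) -> int:
    """Pick the fetal copy number whose expected share best matches the counts."""
    observed = count_a / (count_a + count_b)
    return min(
        (0, 1, 2),
        key=lambda k: abs(observed - expected_fraction(k, fetal_fraction)),
    )


# With a fetal share of 1/10, 550 A reads out of 1000 points to two fetal copies of A.
print(call_fetal_copies(550, 450, 0.1))  # 2
```

The smaller the fetal share, the closer the three expected values sit to one another, which is why the complication noted on the slide matters.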
So, it’s now possible (if not yet really
cheap!) to non-invasively figure out
the genotype of a fetus, by sampling
parents + blood.

Instead of one genome,
sequence three!
DNA sequencing shows a lot of promise
for diagnosing rare diseases.
http://www.forbes.com/sites/matthewherper/2011/01/05
/the-first-child-saved-by-dna-sequencing/
Questions?
Methicillin-resistant Staph
(“MRSA”)
Wikipedia
Staph tends to attack soft tissue in
people who are already ill.
Correlation between staph
infections and hospitals/assisted
care.


Staph infections are a
problem!
Alice

Enters hospital

Megan

Chris

Bob

Julia

Hypothesis 1: broad transmission
Jason
(carrier)

Visits hospital

Cathy

Health care worker

Bob
Megan

Alice

Julia

Chris

Transfers from another facility

Hypothesis 2: deep transmission


Does it spread within facilities?

or


Does it spread between facilities?

How does staph spread?


Sequence staph strains from within hospitals.



If transmission is within hospital, all the strains
will look alike.



If transmission is mainly from outside, strains
will be spread across hospitals.

Approach:
Tracking transmission by mutations in the genome

Ancestor

Present strains
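
The idea of tracking relatedness by mutations can be sketched by counting the positions at which two strains differ, then comparing strains from the same hospital with strains from different hospitals. This is a toy illustration with made-up sequences and hypothetical function names, not the analysis used in the study cited on the following slides.

```python
from itertools import combinations


def mutation_distance(a: str, b: str) -> int:
    """Number of positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    return sum(x != y for x, y in zip(a, b))


def mean(values: list[int]) -> float:
    """Average of a list, or NaN if the list is empty."""
    return sum(values) / len(values) if values else float("nan")


def mean_distances(strains: list[tuple[str, str]]) -> tuple[float, float]:
    """Mean distance for same-hospital pairs and for different-hospital pairs.

    strains is a list of (hospital, sequence) pairs.
    """
    within, between = [], []
    for (h1, s1), (h2, s2) in combinations(strains, 2):
        (within if h1 == h2 else between).append(mutation_distance(s1, s2))
    return mean(within), mean(between)


# Made-up example in which strains DO cluster by hospital.
toy = [("H1", "AAAAAA"), ("H1", "AAAAAT"), ("H2", "GGAAAA"), ("H2", "GGAAAT")]
print(mean_distances(toy))  # (1.0, 2.5)
```

If transmission were mostly within hospitals, same-hospital pairs would be much closer than different-hospital pairs, as in this toy data. Similar within and between distances would point to strains arriving from outside.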
Strain relatedness

Hospitals:

Do staph strains cluster by hospital?
Prosperi et al., 2013. Nature.
Strain relatedness

Hospitals:

Do staph strains cluster by hospital? No!
Prosperi et al., 2013. Nature.
More than 80% of staph infections
were newly acquired from nonpatients!
Implications for prevention: focus on
isolation from outsiders, not just
patients.

Conclusions: mostly from outside.
Questions?
Exploring the microbial
unknown!
The Great Plate Count Anomaly: most
microbes cannot be studied in the lab
http://schaechter.asmblog.org/schaechter/2010/07/the-uncultured-bacteria.html
Depth

Location

Distribution of microbial archaea off
of Hawaii; why so many, so deep?
Karner et al., Nature, 2001.
Measurements + extrapolation suggest:
1/3 of cells in ocean are archaeal;
2/3 of cells in ocean are bacterial.
Approximately 20% of cells are from one group of
archaea.

Distribution of microbial archaea off
of Hawaii; why so many, so deep?


Hints came from just sequencing “seawater” in
2004:

“an ammonium monooxygenase gene was found
on an archaeal-associated” section of genome.

What are all these archaea
doing!?
Venter et al., 2004. Science.
“Primary pump” – CO2
sequestration in deep ocean
Wikipedia
Current theory is that a majority of the
nitrification in the ocean (a driver of this CO2
sequestration pump) occurs via these archaeal
cells.

What are all these archaea
doing!?





More emphasis on analysis rather than just
data gathering.
More exploration – “just sequence it”.
More unexpected results!

How is cheap sequencing
changing research?


We can generate a lot of data quite
easily.



How do we interpret the data
correctly, and efficiently? How do we
correlate between data sets?



How can we do good biology in the
face of these technical challenges? 

My research is:
Great Prairie Grand Challenge:
sampling locations

2008
Sea lamprey in the Great Lakes

Non-native
Parasite of medium to large fishes
Caused populations of host fishes to crash


Li Lab / Y-W C-D
Tail loss and notochord
genes

a) M. oculata b) hybrid (occulta egg x oculata sperm) c) M. occulta
Notochord cells in orange
Swalla, B. et al. Science, Vol 274, Issue 5290, 1205-1208 , 15 November 1996
You do!
Via:
National Science Foundation (NSF);
National Institutes of Health (NIH);
US Department of Agriculture (USDA);
US Department of Energy (DOE)

Who funds all this
research (including mine)?
Titus Brown, ctb@msu.edu
(Just google me.)

Thanks!

2014 whitney-public-talk

Editor's Notes

  • #9 Please ask questions!
  • #28 Down’s syndrome
  • #29 Down’s syndrome
  • #35 Contact.
  • #53 Ammonia -> nitrite -> nitrate.
  • #59 Larvae/stream bottoms 3-6 years; parasitic adult -> great lakes, 12-20 months feeding. 5-8 years. 40 lbs of fish per life as parasite. 98% of fish in great lakes went away!
  • #60 Notochord cells present, do not intercalate or extend