The Story of
The Human Genome Project (HGP)
Presented By:
Punith Kumar. S
Department of Biotechnology
Bangalore University
Guided By:
Dr. Lakshmi. G
Department of Biotechnology
Bangalore University
Contents:
• Inception of the Human Genome Project.
• Objectives.
• Major contributions from NIH & DOE.
• Methods used for sequencing.
• Challenges of sequencing.
• Announcement of completion.
• Conclusion.
Inception of the Human Genome Project
Drumbeat of Discussions Leading Up to HGP
[Timeline: 1984, 1986, 1987, 1988, 1988, 1989]
“For the newly developing discipline of [genome] mapping/sequencing (including the analysis of the information), we have adopted the term GENOMICS…”
Genomics
Molecular Biology Revolution Set the Stage for the Human Genome Project (HGP)
• 1970s: DNA Cloning
• 1977: DNA Sequencing
• 1983: Polymerase Chain Reaction (PCR)
Initially Envisioned Plan for HGP
1. Expect to be a 15-year initiative
2. Gain experience with model (i.e., well-studied, experimental) organisms with smaller genomes before giving full attention to human genome
3. In each case, map (i.e., organize) DNA first and then sequence (i.e., read) DNA
4. Wait to sequence human genome until a new ‘revolutionary’ DNA sequencing method(s) becomes available – replacing Sanger DNA sequencing
5. Make generating the first sequence of the human genome the signature accomplishment of the HGP
Broad Objectives:
• To identify all the genes of the human genome.
• To sequence the ~3 billion nucleotides of the human genome.
• To identify all the disease-causing genes and understand their function.
• To develop databases to store this information and make it available to all researchers.
• To develop tools for processing and analysing data.
• To address ethical, legal, and social issues.
Contributions from the Department of Energy (DOE)
Contributions from the National Institutes of Health (NIH)
Genomes Organized by Chromosomes
[Figure panels: Fruit Fly, Yeast, Human, Nematode]
Clone-Based Physical Mapping
[Figure labels: Chromosome; Clone Contigs; Clones; Larger Clones (like ‘Chapters’); Smaller Clones (like ‘Pages’)]
Shotgun Sequencing
Shotgun sequencing is a laboratory technique for determining the DNA
sequence of an organism’s genome (or part of the genome). The method
involves randomly breaking up the DNA into small fragments that are
then sequenced individually. A computer program looks for overlaps in
the DNA sequences, using them to reassemble the fragments in their
correct order to determine the sequence of the starting DNA.
From NHGRI’s ‘Talking Glossary’
genome.gov/genetics-glossary
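The definition above can be illustrated with a toy sketch in Python. This is not the software used by the HGP; it is a minimal greedy overlap-merge under stated assumptions: the short `GENOME` string, the function names (`shotgun_reads`, `overlap`, `greedy_assemble`), the read length, and the minimum overlap are all hypothetical, reads are error-free and single-stranded, and fragment starts are evenly spaced (rather than truly random) so that neighbouring reads always overlap.

```python
import random


def shotgun_reads(genome, read_len, step):
    """Cut the genome into overlapping fixed-length reads, then shuffle them.

    Assumption: evenly spaced starts stand in for random fragmentation so that
    every neighbouring pair of reads is guaranteed to overlap.
    """
    starts = list(range(0, len(genome) - read_len, step)) + [len(genome) - read_len]
    reads = [genome[s:s + read_len] for s in starts]
    random.Random(0).shuffle(reads)  # the assembler must not rely on read order
    return reads


def overlap(a, b, min_len):
    """Length of the longest suffix of a equal to a prefix of b (0 if below min_len)."""
    for n in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:n]):
            return n
    return 0


def greedy_assemble(reads, min_len):
    """Repeatedly merge the pair of sequences with the largest overlap."""
    pieces = list(dict.fromkeys(reads))  # drop exact duplicates, keep order
    while len(pieces) > 1:
        best_n, best_i, best_j = 0, None, None
        for i, a in enumerate(pieces):
            for j, b in enumerate(pieces):
                if i != j:
                    n = overlap(a, b, min_len)
                    if n > best_n:
                        best_n, best_i, best_j = n, i, j
        if best_n == 0:
            break  # nothing overlaps any more: the pieces left are separate contigs
        merged = pieces[best_i] + pieces[best_j][best_n:]
        pieces = [p for k, p in enumerate(pieces) if k not in (best_i, best_j)]
        pieces.append(merged)
    return pieces


# Hypothetical 26-base "genome" chosen so that every 4-base window is unique.
GENOME = "ATGCGTACCTGAAGTTCAGGCTAACG"
reads = shotgun_reads(GENOME, read_len=8, step=3)
contigs = greedy_assemble(reads, min_len=4)
print(reads)
print(contigs)
```

Because every 4-base window in this toy genome is unique, any overlap of four or more bases is a genuine one and the reads merge back into a single contig. Real genomes contain repeats and sequencing errors, which is why assembly in practice is far harder than this sketch suggests.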
Sanger Sequencing (Chain-Termination Method)
Subclone Construction
[Figure: Clone DNA → Prepare Multiple Copies → Randomly Fragment → Subclone Fragments]
Shotgun Sequencing Strategy
1. Clone DNA
2. Subclones
3. Generate Shotgun Sequence Reads
4. Assemble Sequence Reads into Sequence Contigs
5. Deduce Sequence: ‘Working Draft’ Sequence
6. Sequence Finishing
7. Final Sequence
Restriction fragment length polymorphism
Example of Genome Sequence Assembly
First Eukaryotic Genomes Sequenced by HGP
Dividing Up Human Genome During HGP
For example…
Challenges of Sequencing the Human Genome
• Human Genome: ~3,000,000,000 nucleotides (bases or base pairs)
• Sanger DNA sequencing circa 1990: ~500-800 bases per read
• ‘Coverage’ (i.e., number of times each base is read) needed to be high (e.g., >30-fold) to attain high accuracy
• Roughly half of the human genome consists of repetitive DNA, much of it reflecting remnants of transposable elements
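The scale implied by these numbers can be worked out with a short calculation. The sketch below is only back-of-the-envelope arithmetic in Python: it takes the genome size, read lengths, and the 30-fold figure from the slide, and assumes (as a simplification) that reads are spread evenly and that total bases read equals coverage times genome size. The function name `reads_needed` is hypothetical.

```python
import math

GENOME_SIZE = 3_000_000_000  # ~3 billion bases, from the slide
COVERAGE = 30                # fold coverage; the slide says ">30-fold", so this is a lower bound


def reads_needed(genome_size, read_length, coverage):
    """Reads required so that total bases read = coverage x genome size."""
    return math.ceil(genome_size * coverage / read_length)


for read_length in (500, 800):  # Sanger read lengths circa 1990, from the slide
    n = reads_needed(GENOME_SIZE, read_length, COVERAGE)
    print(f"{read_length}-base reads: about {n / 1e6:.1f} million reads")
# prints:
# 500-base reads: about 180.0 million reads
# 800-base reads: about 112.5 million reads
```

Even under these idealised assumptions, the task amounts to well over a hundred million individual reads, before any allowance for failed reactions or repetitive regions.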
Whose Genome Was Sequenced by HGP?
• Buffalo, NY blood donors
• 93% of HGP’s human genome sequence from 11 donors
• 70% of HGP’s human genome sequence from 1 donor
Humorous Aside: Advocacy by some HGP researchers to select a ‘normal’ person and sequence their genome first – as if anyone knows what ‘normal’ means!
Bermuda Principles for Data Sharing
• Significant attention to release and sharing of HGP genome sequence data
• Two seminal meetings in Bermuda in 1996 and 1997
• Landmark agreement for rapid data release and public access to HGP genome sequence data
• Became known as the ‘Bermuda Principles’
• Among the most important legacies of the HGP
Purported ‘Race’ to Sequence Human Genome
Initial HGP Plan: ‘Clone-by-Clone Shotgun Sequencing’
VS
Venter/Celera Plan: ‘Whole-Genome Shotgun Sequencing’
Editorial Aside: Not really a fair ‘race’ since Celera had access to HGP data (but not vice versa)!
June 2000: Draft Sequence of Human Genome
February 2001: Papers Reporting
Draft Sequence of Human Genome
HGP Paper Venter/Celera Paper
Generating the First Human Genome Sequence
Initial HGP Plan + Venter/Celera Plan = Ultimate HGP Plan
April 25, 2003: HGP Completion
• National DNA Day established
• HGP completion & 50th anniversary of discovery of DNA’s double-helical structure
Highlight Features of HGP
• Completed ahead of schedule (13 years) and under budget
• Signature accomplishment was generation of an extremely high-quality sequence for >90% (‘near-complete’ or ‘essentially complete’) of the human genome
• Cost of generating first human genome sequence by HGP: ~$1 billion
• The ‘race’ between HGP and Venter/Celera melted away after announcement of the draft human genome sequence in 2000
• Similarly, the initial concerns about the HGP from some parts of the scientific community largely melted away
• HGP set the field of genomics on a trajectory of widespread dissemination across biology, medicine, and society
A Truly Complete Human Genome Sequence
• HGP produced a high-quality human genome sequence, but it only accounted for 92% of the human genome
• Remaining 8% was not ‘readable’ using the then-available methods for DNA sequencing, but those regions are important for structural (centromeres and telomeres) and medical reasons
• Several new ‘revolutionary’ methods for DNA sequencing have been developed over the last ~20 years
• These new methods plus better computational approaches set the stage for a new group of researchers to (finally) generate a truly complete sequence of the human genome in 2022
2022: A Truly Complete Human Genome Sequence
Recent Time Video
CONCLUSION
• HGP: 1990-2003
• HGP used a map-first, sequence-second strategy to study the human genome
• HGP used Sanger DNA sequencing – not a revolutionary new DNA sequencing method
• Sequencing the human genome was particularly difficult because of its large size, complexity, and extensive repetitive regions
• Genome sequence assembly was (and remains) a major challenge; repetitive regions present a particular obstacle to accurately assembling genome sequences
• Venter/Celera pursued a whole-genome shotgun sequencing strategy and tried to build a business selling access to their data; both efforts fell short of expectations
• Ultimately, the HGP completed the task of generating the first high-quality ‘essentially complete’ sequence of the human genome; 19 years later (in 2022), a truly complete (‘telomere-to-telomere’) human genome sequence was finally generated
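The point about repetitive regions can be made concrete with a small, idealised Python sketch. Everything in it is hypothetical: the two toy genomes, the 7-base `REPEAT`, and the helper `read_spectrum`, which treats every window of a given length as an error-free read. It shows that when reads are shorter than a repeat, two different genomes can yield exactly the same set of reads, so no assembler could tell them apart.

```python
from collections import Counter


def read_spectrum(genome, read_len):
    """Multiset of every read_len-base window: an idealised, error-free read set."""
    return Counter(genome[i:i + read_len] for i in range(len(genome) - read_len + 1))


REPEAT = "GATTACA"  # hypothetical 7-base repeat present three times in each genome

# The two genomes differ only in the order of the unique segments "TG" and "AC"
# that sit between copies of the repeat.
genome_1 = "CC" + REPEAT + "TG" + REPEAT + "AC" + REPEAT + "GG"
genome_2 = "CC" + REPEAT + "AC" + REPEAT + "TG" + REPEAT + "GG"

# Reads shorter than the repeat (5 bases) cannot span it from one side to the other.
same_with_short_reads = read_spectrum(genome_1, 5) == read_spectrum(genome_2, 5)

# Reads longer than the repeat (12 bases) can link the unique segments on both sides.
same_with_long_reads = read_spectrum(genome_1, 12) == read_spectrum(genome_2, 12)

print(same_with_short_reads, same_with_long_reads)
```

With 5-base reads the two read sets are identical even though the genomes differ, while 12-base reads distinguish them. This is one way to see why longer reads matter for resolving repeat-rich regions.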
Beyond the HGP
In reality, HGP was the end of one journey, but the beginning of another.
For example, HGP determined the sequence of most of the ~3 billion bases in the human genome, with the next phase focused on INTERPRETING the information encoded in that sequence – something that continues to the present time.
References:
• James D. Watson, Amy A. Caudy, Richard M. Myers, Jan A. Witkowski. Recombinant DNA: Genes and Genomes - A Short Course. Third edition. 2006. Pages 273-304.
• National Human Genome Research Institute. https://www.genome.gov
Thank you
