2. HUMAN GENOME PROJECT
• International research project
• Aimed at determining the sequence of base pairs in human DNA
• Identifying and mapping all of the genes of the human genome
• Study of the base sequences of the DNA molecules in a complete set of human chromosomes
• Launched in 1990
• Coordinated by the U.S. Department of Energy and the National Institutes of Health
• Completed in 2003
Bioinformatics
• Develops methods and software tools for understanding large and complex biological
data
• Software was developed to store and analyse the data generated by the project
3. Goals of HGP
• Identify all the approximately 20,000-25,000 genes in human DNA
• Determine the sequences of the 3 billion chemical base pairs that make up human DNA
• Store this information in databases
• Improve tools for data analysis
• Transfer related technologies to other sectors, such as industry
• Address the ethical, legal, and social issues (ELSI) that may arise
Many non-human model organisms, such as bacteria, yeast, Caenorhabditis elegans (a
free-living, non-pathogenic nematode), Drosophila, and plants (rice and Arabidopsis),
have also been sequenced.
4. METHODOLOGY: 2 approaches
ESTs (Expressed Sequence Tags)
Identifying all the genes that are expressed as RNA
Sequence Annotation
Sequencing the whole genome, including all the coding and non-coding sequences,
and later assigning functions to the different regions
5. HOST AND VECTORS USED IN HGP
HOST
Bacteria and Yeast
VECTOR
BAC (bacterial artificial chromosomes)
YAC (yeast artificial chromosomes)
Feature          BAC            YAC
Stability        Very stable    Less stable
Host             E. coli        Yeast
Insert length    200-300 kb     1000-2000 kb
Copy number      1-2            1
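The insert length determines how many clones a library needs. The sketch below is a back-of-the-envelope estimate only: it assumes the roughly 3 billion bp genome size quoted in the goals and two illustrative insert sizes (200 kb and 1000 kb) taken from within the ranges in the table. The function name clones_needed and the coverage parameter are hypothetical, introduced here for illustration.

```python
import math

GENOME_BP = 3_000_000_000  # approximate genome size quoted in the goals


def clones_needed(insert_kb: float, coverage: float = 1.0) -> int:
    """Smallest number of clones whose inserts total coverage x genome length.

    Assumes every clone carries an insert of exactly insert_kb kilobases,
    which is a simplification.
    """
    return math.ceil(GENOME_BP * coverage / (insert_kb * 1000))


# Illustrative insert sizes only; real libraries vary within a range.
for insert_kb in (200, 1000):
    print(f"{insert_kb} kb inserts: {clones_needed(insert_kb)} clones at 1x coverage")
```

In practice a library is built with several-fold redundancy so that neighbouring clones overlap; passing a larger coverage value models that.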
6. The sequence of chromosome 1 was completed only in May 2006
The first full DNA genome to be sequenced was that of bacteriophage φX174 in 1977
7. Isolation of DNA from a cell.
Cleavage of the DNA into small fragments.
Cloning and amplification of the fragments in suitable hosts using BAC and YAC vectors.
Sequencing the fragments using automated DNA sequencers (based on the method developed by Frederick Sanger).
Arranging the sequences based on the overlapping regions present in them, with the
help of computer programs and databases.
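The last step, arranging fragments by their overlapping regions, can be illustrated with a toy example. This is a minimal greedy sketch, not the software the project actually used: it assumes error-free reads from a single strand, and the names overlap, assemble and min_len are hypothetical.

```python
def overlap(a: str, b: str, min_len: int = 3) -> int:
    """Length of the longest suffix of a that equals a prefix of b."""
    for n in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:n]):
            return n
    return 0


def assemble(fragments: list[str], min_len: int = 3) -> str:
    """Repeatedly merge the pair of fragments with the largest overlap."""
    frags = list(dict.fromkeys(fragments))  # drop exact duplicates, keep order
    while len(frags) > 1:
        best_n, best_i, best_j = 0, -1, -1
        for i, a in enumerate(frags):
            for j, b in enumerate(frags):
                if i != j:
                    n = overlap(a, b, min_len)
                    if n > best_n:
                        best_n, best_i, best_j = n, i, j
        if best_n == 0:
            break  # nothing left that overlaps by at least min_len bases
        merged = frags[best_i] + frags[best_j][best_n:]
        frags = [f for k, f in enumerate(frags) if k not in (best_i, best_j)]
        frags.append(merged)
    return max(frags, key=len)  # longest assembled stretch


print(assemble(["ATGCGT", "CGTACC", "ACCTTGA"]))  # ATGCGTACCTTGA
```

Each fragment shares three bases with the next one, so the three pieces join into a single 13-base sequence. Real genomes contain repeats and sequencing errors, which is why much more elaborate programs are needed in practice.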
8. Salient Features of Human Genome
• The human genome contains 3164.7 million bp.
• The average gene consists of 3000 bases.
• The largest known human gene is dystrophin, at 2.4 million bases.
• The total number of genes is estimated at 30,000 (much lower than previous estimates
of 80,000 to 140,000 genes).
• Almost all (99.9 %) bases are exactly the same in all people.
• The functions of 50 % of the genes are not known.
• Less than 2 % of the genome codes for proteins.
• Repeated sequences make up a very large portion of the human genome.
• Chromosome 1 has most genes (2968), and Y has fewest (231).
• Scientists have identified about 1.4 million locations where single base DNA differences
(SNPs – single nucleotide polymorphism, pronounced as ‘snips’) occur in humans.
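Two of the figures above can be made concrete with a short sketch. The first function turns the 99.9 % identity figure into an approximate count of differing bases for a genome of 3164.7 million bp; the second lists the positions at which two already aligned sequences differ by a single base, which is the idea behind a SNP. The function names and the example sequences are hypothetical, and the arithmetic is a rough estimate, not a measured value.

```python
GENOME_BP = 3_164_700_000   # genome size from the list above
SHARED_FRACTION = 0.999     # 99.9 % of bases identical in all people


def expected_differences(genome_bp: int = GENOME_BP,
                         shared: float = SHARED_FRACTION) -> int:
    """Rough number of bases that differ if a fraction `shared` is identical."""
    return round(genome_bp * (1 - shared))


def snp_positions(seq_a: str, seq_b: str) -> list[int]:
    """1-based positions where two aligned, equal-length sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    return [i for i, (x, y) in enumerate(zip(seq_a, seq_b), start=1) if x != y]


print(expected_differences())                  # 3164700
print(snp_positions("ATGCCGTA", "ATGTCGTA"))   # [4]
```

The comparison assumes substitutions only; insertions and deletions would require a proper alignment step first.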
9. Applications of HGP
• Better understanding of biological systems.
• Study of diseases caused by alterations in specific genes.
• All the genes in a genome can be studied together.
• Helps to understand how tens of thousands of genes and proteins work
together in interconnected networks.
• Helps to diagnose and treat genetic diseases.