Human Comparative Genomics What is the sequence of the normal ...
Transcript

  • 1. Human Comparative Genomics
  • 2.
    • What is the sequence of the normal Human Genome?
    • What accounts for the genetic differences between individuals?
  • 3. Finding Segmental Duplications in the Human Genome Bailey et al (2002) Science 297:1003-07
  • 4. Segmental Duplications in the Human Genome Bailey et al (2002) Science 297:1003-07
  • 5. Polymorphism in Segmental Duplications Iafrate et al (2004) Nat Genet 36:949-51
  • 6. Polymorphism in Segmental Duplications
    • CGH studies find many copy number polymorphisms in segmental duplications (~12 per individual)
    • Rare and common polymorphisms
    • Many overlap coding regions
    • Critical for the interpretation of amplifications in cancers
    • Responsible for phenotypic differences between people?
  • 7. SNPs/HapMap
    • http://www.hapmap.org/
    • ~1 SNP/1000 bp
    The International HapMap Project is a multi-country effort to identify and catalog genetic similarities and differences in human beings. Using the information in the HapMap, researchers will be able to find genes that affect health, disease, and individual responses to medications and environmental factors. The Project is a collaboration among scientists and funding agencies from Japan, the United Kingdom, Canada, China, Nigeria, and the United States. All of the information generated by the Project will be released into the public domain.
  • 8. Questions
    • How many sub-populations best partition the data?
    • How strong is the evidence for the clusters?
    • Do the inferred clusters correspond to our notions of race, ethnicity, ancestry, or geography?
    • Given the inferred clusters, can we accurately classify new individuals?
    • Can we identify population admixture or migration events?
  • 9. Attempts to group humans by genotype
  • 10. π and Fst
    • π, average nucleotide diversity (~1 in 1000 bp)
    • Fst, proportion of genetic variation that can be ascribed to differences between populations (~10%)
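
To make the Fst definition above concrete, here is a minimal sketch using one common textbook form, Fst = (H_T - H_S) / H_T, where H_T is the expected heterozygosity of the pooled population and H_S is the mean expected heterozygosity within sub-populations. The slides do not say which estimator the cited studies use, so the formula choice, the equal-size assumption, the function names, and the example frequencies are all illustrative assumptions.

```python
def expected_heterozygosity(p):
    """Expected heterozygosity 2p(1-p) at a biallelic locus with allele frequency p."""
    return 2.0 * p * (1.0 - p)


def fst_biallelic(subpop_freqs):
    """Fst = (H_T - H_S) / H_T for one biallelic locus.

    subpop_freqs holds the frequency of one allele in each sub-population.
    Sub-populations are assumed to be equally sized (an assumption of this sketch).
    """
    if not subpop_freqs:
        raise ValueError("need at least one sub-population")
    n = len(subpop_freqs)
    p_bar = sum(subpop_freqs) / n                 # pooled allele frequency
    h_t = expected_heterozygosity(p_bar)          # total heterozygosity
    if h_t == 0.0:
        return 0.0                                # locus is monomorphic everywhere
    h_s = sum(expected_heterozygosity(p) for p in subpop_freqs) / n
    return (h_t - h_s) / h_t


# Hypothetical frequencies, not taken from the slides.
print(fst_biallelic([0.4, 0.6]))   # modest differentiation
print(fst_biallelic([0.0, 1.0]))   # sub-populations fixed for different alleles
```

With these made-up inputs, a frequency difference of 0.4 versus 0.6 gives an Fst of only 0.04, which illustrates how sub-populations can differ noticeably in allele frequency while Fst stays small.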
  • 11. Summary of Findings
    • π and Fst are small
    • Diversity within “African” populations is highest
    • Unsupervised clustering tends to support either 3 or 4 sub-populations, depending on the number and type of markers and individuals included in the study, but the composition of the groups often differs between studies
  • 12. A contradiction?
    • Although they differed on the extent and composition of sub-populations, so far all studies have found evidence of significant sub-structure in human populations
    • And yet, all studies agree that Fst is small (between 3% and 15%)
    See review by Jorde and Wooding (2004) Nature Genet. 36: S28-S33
  • 13. Small Fst does not imply lack of structure [Figure: diagram of allele labels A1, A2, B1, B2, C1, C2, D1, D2, E1, E2]
  • 14. Clustering human populations by genotype
    • K-means clustering of gene expression data
      • 1. Pick a number (k) of cluster centers
      • 2. Assign every gene to its nearest cluster center
      • 3. Move each cluster center to the mean of its assigned genes
      • 4. Repeat 2-3 until convergence
    • EM-based clustering of genotype data
      • 1. Pick a number (k) of sub-populations
      • 2. Assign every individual to a sub-population based on the allele frequencies in the sub-population
      • 3. Recalculate the allele frequencies in each sub-population
      • 4. Repeat 2-3 until convergence
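
The genotype-clustering steps above can be sketched roughly as follows. This is a hard-assignment variant (each individual goes to its single most likely sub-population), not a full probabilistic EM, and it is not the method of any cited study. The function names, the encoding of alleles as 1 or 2 with one allele per locus, the pseudocount, and the `init_labels` parameter are all assumptions made for illustration.

```python
import random


def estimate_freqs(genotypes, labels, k, pseudocount=0.5):
    """Per-cluster frequency of allele 1 at every locus.

    The pseudocount is an assumption of this sketch: it keeps frequencies
    away from exactly 0 or 1 and gives an empty cluster a frequency of 0.5.
    """
    n_loci = len(genotypes[0])
    freqs = []
    for c in range(k):
        members = [g for g, lab in zip(genotypes, labels) if lab == c]
        freqs.append([
            (sum(g[locus] == 1 for g in members) + pseudocount)
            / (len(members) + 2 * pseudocount)
            for locus in range(n_loci)
        ])
    return freqs


def genotype_likelihood(genotype, freqs):
    """Probability of a genotype given allele-1 frequencies, treating loci as independent."""
    prob = 1.0
    for allele, f in zip(genotype, freqs):
        prob *= f if allele == 1 else 1.0 - f
    return prob


def cluster_genotypes(genotypes, k, init_labels=None, max_iter=100, seed=0):
    """Alternate frequency estimation and reassignment until labels stop changing."""
    rng = random.Random(seed)
    # Step 1: start from the given labels, or from a random assignment.
    if init_labels is not None:
        labels = list(init_labels)
    else:
        labels = [rng.randrange(k) for _ in genotypes]
    for _ in range(max_iter):
        # Step 3: recalculate allele frequencies in each sub-population.
        freqs = estimate_freqs(genotypes, labels, k)
        # Step 2: assign each individual to its most likely sub-population.
        new_labels = [
            max(range(k), key=lambda c: genotype_likelihood(g, freqs[c]))
            for g in genotypes
        ]
        # Step 4: stop once no assignment changes.
        if new_labels == labels:
            break
        labels = new_labels
    return labels, estimate_freqs(genotypes, labels, k)
```

Ties are broken in favor of the lowest-numbered cluster, so an unlucky starting assignment can collapse everyone into one group; in practice one would likely try several random starts.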
  • 15. An Example
    • 12 individuals genotyped at three different independent biallelic loci
    • I1 = (A1, B1, C2), I2 = (A1, B1, C2), I3 = (A1, B2, C2), I4 = (A2, B2, C1)
    • I5 = (A1, B1, C1), I6 = (A1, B1, C2), I7 = (A1, B1, C2), I8 = (A2, B2, C2)
    • I9 = (A1, B2, C1), I10 = (A2, B1, C2), I11 = (A2, B2, C2), I12 = (A2, B2, C2)
  • 16. k1 k3 k2
    • I1 = (A1, B1, C2), I2 = (A1, B1, C2), I3 = (A1, B2, C2), I4 = (A2, B2, C1)
    • I5 = (A1, B1, C1), I6 = (A1, B1, C2), I7 = (A1, B1, C2), I8 = (A2, B2, C2)
    • I9 = (A1, B2, C1), I10 = (A2, B1, C2), I11 = (A2, B2, C2), I12 = (A2, B2, C2)
    • F(A1)k1 = 0.75, F(B1)k1 = 0.5, F(C1)k1 = 0.25
    • F(A1)k2 = 0.75, F(B1)k2 = 0.75, F(C1)k2 = 0.25
    • F(A1)k3 = 0.25, F(B1)k3 = 0.25, F(C1)k3 = 0.25
    • Consider individual I1 = (A1, B1, C2)
      • P(I1 in k1) = (.75)(.5)(.75) = 0.28
      • P(I1 in k2) = (.75)(.75)(.75) = 0.42
      • P(I1 in k3) = (.25)(.25)(.75) = 0.047
    • Therefore reassign I1 to k2
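
The arithmetic on this slide can be checked with a short script. The starting assignment used here (I1-I4 in k1, I5-I8 in k2, I9-I12 in k3) is an assumption: the slide's figure is not recoverable from the transcript, but this grouping reproduces the allele frequencies quoted on the slide. Alleles are encoded as 1 or 2 at loci A, B, C, and all names in the code are illustrative.

```python
# Genotypes from the slide, as (allele at A, allele at B, allele at C).
GENOTYPES = [
    (1, 1, 2),  # I1
    (1, 1, 2),  # I2
    (1, 2, 2),  # I3
    (2, 2, 1),  # I4
    (1, 1, 1),  # I5
    (1, 1, 2),  # I6
    (1, 1, 2),  # I7
    (2, 2, 2),  # I8
    (1, 2, 1),  # I9
    (2, 1, 2),  # I10
    (2, 2, 2),  # I11
    (2, 2, 2),  # I12
]

# Assumed starting clusters (indices into GENOTYPES); inferred, not stated on the slide.
ASSUMED_CLUSTERS = {
    "k1": [0, 1, 2, 3],
    "k2": [4, 5, 6, 7],
    "k3": [8, 9, 10, 11],
}


def allele1_freqs(members):
    """Frequency of allele 1 at loci A, B, C among the given individuals."""
    n = len(members)
    return tuple(
        sum(GENOTYPES[i][locus] == 1 for i in members) / n for locus in range(3)
    )


def genotype_probability(genotype, freqs):
    """Product over loci of the frequency of the allele the individual carries."""
    prob = 1.0
    for allele, f in zip(genotype, freqs):
        prob *= f if allele == 1 else 1.0 - f
    return prob


freqs = {name: allele1_freqs(members) for name, members in ASSUMED_CLUSTERS.items()}
probs = {name: genotype_probability(GENOTYPES[0], f) for name, f in freqs.items()}
best = max(probs, key=probs.get)

print(freqs)
print(probs)
print("Reassign I1 to", best)
```

The unrounded products are 0.28125, 0.421875, and 0.046875, so I1 moves to k2, matching the slide's conclusion.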
  • 17. An example Bamshad et al (2003) Am. J. Hum. Genet. 72:578-89
  • 18. But… Bamshad et al (2003) Am. J. Hum. Genet. 72:578-89
  • 19. Genes mirror geography in Europe Novembre et al (2008) Nature 456:98-101
  • 20. Pharmacogenomics
    • Many drugs never reach the market because of side effects in a small minority of patients
    • Many drugs on the market are efficacious in only a small fraction of the population
    • This variation is (in part) due to genetic determinants
      • Iressa → EGFR mutations
      • Codeine → cytochrome P450 alleles
  • 21. Question: Is race, ancestry, ethnicity, geography, or genetic substructure a reasonable proxy for genotype at alleles relevant for drug metabolism?
    • Answer: So far… no. It still looks as if we will have to genotype the relevant loci before making any guesses.
  • 22. Population genetic structure of variable drug response. Wilson et al (2001) Nat Genet. 29:265-269
    • [Figure: groups A, B, and C shown for the loci CYP1A2, GSTM1, CYP2C19, DIA4, NAT2, CYP2D6]
    • A = African, B = European, C = Asian
  • 23. Evidence for Archaic Asian Ancestry on the Human X Chromosome Garrigan et al (2005) Mol. Biol. Evol. 22:189-192
    • Pseudogene on the X-chromosome
    • 18 substitutions between human-chimp
    • 15 substitutions between two human alleles
    • Assuming a molecular clock, the split between the two human alleles occurred about 2 million years ago
    • Both alleles found in southern Asia, only one allele found in Africa
    • Only human gene tree to “root” in Asia
  • 24. Garrigan et al (2005) Mol. Biol. Evol. 22:189-192
  • 25. Garrigan et al (2005) Mol. Biol. Evol. 22:189-192
  • 26. Human evolution in a nutshell [Figure: tree of chimps, H. sapiens, H. ergaster, H. erectus, H. neanderthalensis; time labels 5-6 mya, 1 mya, 0.5 mya, 0.2 mya]
  • 27. Human evolution in a nutshell [Figure: same tree and time labels as slide 26, with a "?" added]
  • 28. So what happened?
    • Strong selection for the Asian allele in southern Asia
      • Not likely, since this is a pseudogene locus
      • Fails Tajima’s D test
    • Gene flow between H. sapiens and H. erectus in southern Asia
      • Branch lengths are about right for 2 million years of divergence
      • H. erectus was in southern Asia until 18,000 years ago (Morwood et al. and Brown et al. in Nature (2004) vol 431)
      • Supporting evidence from genetic analysis of lice and other human parasites (Reed et al (2004) PLoS Biol. 2:1972-83)