Sequencing Sardinian genomes at CRS4
(Il sequenziamento dei genomi sardi al CRS4)


Francesco Cucca
INN-CNR


Francesco Cucca (University of Sassari and INN-CNR) at CRS4, presenting the Sardinian genome sequencing program (24 March 2010).

Published in: Health & Medicine

  1. Sequencing Sardinian genomes at CRS4. Francesco Cucca, INN-CNR
  2. • Humans and other living organisms all contain a digital blueprint: a linear sequence of combinations of 4 small chemical compounds, named nucleotides, which together constitute their DNA.
     • Particular combinations of nucleotides specify the key qualitative and quantitative instructions for the synthesis of essential structural and functional components of the cell, which are built from different combinations of 20 molecules named amino acids.
     • In turn, amino acids are linked to each other to form more complex molecules named proteins.
  3. UUU Phe   UCU Ser   UAU Tyr    UGU Cys
     UUC Phe   UCC Ser   UAC Tyr    UGC Cys
     UUA Leu   UCA Ser   UAA STOP   UGA STOP
     UUG Leu   UCG Ser   UAG STOP   UGG Trp
     CUU Leu   CCU Pro   CAU His    CGU Arg
     CUC Leu   CCC Pro   CAC His    CGC Arg
     CUA Leu   CCA Pro   CAA Gln    CGA Arg
     CUG Leu   CCG Pro   CAG Gln    CGG Arg
     AUU Ile   ACU Thr   AAU Asn    AGU Ser
     AUC Ile   ACC Thr   AAC Asn    AGC Ser
     AUA Ile   ACA Thr   AAA Lys    AGA Arg
     AUG Met   ACG Thr   AAG Lys    AGG Arg
     GUU Val   GCU Ala   GAU Asp    GGU Gly
     GUC Val   GCC Ala   GAC Asp    GGC Gly
     GUA Val   GCA Ala   GAA Glu    GGA Gly
     GUG Val   GCG Ala   GAG Glu    GGG Gly
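As a minimal sketch of how the codon table on slide 3 is applied, the snippet below translates an RNA string into amino acids. It uses one-letter amino acid codes (with `*` for STOP) rather than the three-letter codes on the slide; the names `CODON_TABLE` and `translate`, and the example sequence, are illustrative assumptions and not part of the talk.

```python
from itertools import product

BASES = "UCAG"
# One-letter amino acid codes in the order UUU, UUC, UUA, UUG, UCU, ... GGG.
# "*" marks the three STOP codons (UAA, UAG, UGA).
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

# Map each of the 64 codons to its amino acid.
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(mrna: str) -> str:
    """Translate an mRNA string codon by codon, stopping at the first STOP."""
    protein = []
    for i in range(0, len(mrna) - 2, 3):
        aa = CODON_TABLE[mrna[i:i + 3]]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


# Hypothetical example sequence: AUG UUU GGA UAA
example_protein = translate("AUGUUUGGAUAA")
```

Building the table from the base order U, C, A, G keeps the 64 entries compact; an explicit dictionary would be equally valid and easier to read against the slide.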
  4. • While the basic composition of both DNA and protein building blocks, and the system that translates one chemical language into the other, are conserved, there is wide variation in the order of these building blocks in different organisms and individuals.
     • This is because DNA and the protein products derived from it are not static entities. Instead, DNA is subject to a variety of heritable changes known as mutations.
     • Mutations often arise as copying errors during DNA replication. Although the fidelity of DNA replication is strikingly high, misincorporation occurs at a given frequency, known as the mutation rate.
  5. • Modern humans originated ~100,000 years ago from pre-modern humans and represent a relatively homogeneous species which has experienced a dramatic expansion during its recent evolutionary history.
     • Two unrelated human individuals on our planet are identical across about 99.9% of their DNA and thus differ in about 0.1% of it.
     • This means that there is approximately one change every 1000 nucleotides when comparing the DNA from two unrelated individuals (our genome contains two copies of about 3.3 billion nucleotides each).
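The arithmetic behind slide 5 can be made explicit. This is only a back-of-the-envelope sketch using the rounded figures quoted on the slide; the variable names are illustrative.

```python
# Rounded figures from the slide.
HAPLOID_GENOME_SIZE = 3_300_000_000   # nucleotides in one copy of the genome
SITES_PER_DIFFERENCE = 1_000          # about one change every 1000 nucleotides

# Expected number of differing positions when one genome copy from each of
# two unrelated individuals is compared.
differences_per_copy = HAPLOID_GENOME_SIZE // SITES_PER_DIFFERENCE

# The same quantity expressed as a fraction of the genome (about 0.1%).
difference_fraction = differences_per_copy / HAPLOID_GENOME_SIZE
```

Under these rounded assumptions, two unrelated individuals differ at roughly 3.3 million positions per genome copy.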
  6. This genetic variation has important medical consequences:
     In simple Mendelian traits, the relationship between the causal genetic variant (genotype) and the disease state is deterministic.
     In a complex trait such as multiple sclerosis (MS), the disease state results from interactions between multiple genotypes and the environment. The influence of any individual causal allele tends to be modest, and the relationship between the causal variant and the disease state is probabilistic.
  7. Qualitative trait / Quantitative trait
  8. Theoretical framework: R. A. Fisher, 1890-1962
  9. Possible meanings of an association:
     • Primary association with the causal variant
     • Secondary association due to physical proximity
     • Spurious association due to population substructure
  10. Why a sequencing project?
  11. The imperfect genome-wide search
      • All gene chip arrays contain SNPs chosen on the basis of linkage disequilibrium (LD) observed in the HapMap populations. HapMap is a catalogue of ~3 million SNPs genotyped in Europeans, Asians, and Africans.
  12. The imperfect genome-wide search
      • All gene chip arrays contain SNPs chosen on the basis of linkage disequilibrium (LD) observed in the HapMap populations. HapMap is a catalogue of ~3 million SNPs genotyped in Europeans, Asians, and Africans.
      • Studying a subset of 500,000 or 1 million SNPs is limiting.
  13. The imperfect genome-wide search
      • All gene chip arrays contain SNPs chosen on the basis of linkage disequilibrium (LD) observed in the HapMap populations. HapMap is a catalogue of ~3 million SNPs genotyped in Europeans, Asians, and Africans.
      • Studying a subset of 500,000 or 1 million SNPs is limiting.
      • Power to detect disease associations at a locus depends on the r² between the typed (tested) and untyped (causative) SNPs: the lower the r², the lower the power. (Diagram labels: causative variant, tested variant.)
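To make the role of r² on slide 13 concrete, here is a small sketch that computes r² between two biallelic SNPs from haplotype and allele frequencies, and applies the commonly used approximation that testing a tag SNP instead of the causal SNP requires about 1/r² times as many samples for the same power. The function names and the example frequencies are hypothetical; none of the numbers come from the talk.

```python
import math


def ld_r2(p_ab: float, p_a: float, p_b: float) -> float:
    """r^2 between two biallelic SNPs.

    p_ab: frequency of the haplotype carrying allele A at SNP 1 and B at SNP 2
    p_a, p_b: frequencies of allele A and allele B
    """
    d = p_ab - p_a * p_b                      # linkage disequilibrium coefficient D
    return d * d / (p_a * (1 - p_a) * p_b * (1 - p_b))


def inflated_sample_size(n_direct: int, r2: float) -> int:
    """Approximate sample size needed when the causal SNP is only tagged.

    Assumes the standard n / r^2 approximation, where n_direct is the sample
    size that would suffice if the causal SNP were typed directly.
    """
    return math.ceil(n_direct / r2)


# Hypothetical example: both alleles at 50%, shared haplotype at 40%.
example_r2 = ld_r2(p_ab=0.4, p_a=0.5, p_b=0.5)
example_n = inflated_sample_size(1000, example_r2)
```

A poorly tagged causal variant (low r²) therefore demands a much larger study, which is one argument for sequencing rather than relying on a fixed SNP panel.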
  14. Why a sequencing project in Sardinia? (Figure labels: Croatia, Ukraine, Sardinia, Hungary, Poland, Catalonia, Basque Country, Georgia, Andalusia, Corsica, Sicily, North-Central Italy, Albania, Calabria, Greece, Lebanon, Turkey)
  15. Why a sequencing project in Sardinia? (Figure labels: Saami, Udmurt, Mari, Czech and Slovakian, Dutch, Ukrainian, French, Polish, Hungarian, Croatian, Georgian, Central-Northern Italian, Macedonian, Albanian, Spanish Basques, Calabrian, Turkish, Syrian, Andalusian, Greek, Lebanese, Morocco)
  16. What samples to sequence in Sardinia?
      • ProgeNIA study
      • Case-control studies
      • Future work
  17. ProgeNIA: 6,148 volunteers (Map labels: Arzana, Elini, Ilbono, Lanusei)
  18. ProgeNIA/SardiNIA project
      • 6,148 individuals, aged 14-102 years
      • 95% are known to have all grandparents born in Sardinia
      • 711 pedigrees, up to 5 generations deep
      • Largest family: 625 phenotyped individuals
      • >34,000 relative pairs
      Pilia et al., PLoS Genet. 2006
  19. > 150 quantitative traits
      • Anthropometric measurements: height, weight, hip, waist, BMI
      • Blood chemistry components: LDL, HDL, TG, insulin, RBC, MCH, MCV, bilirubin, hsCRP, MCP-1, IL-6, etc.
      • Cardiovascular traits: HR, SBP, DBP, PP, PWV, IMT, QT, etc.
      • Personality facets: Neuroticism, Extraversion, Openness, Agreeableness, Conscientiousness, etc.
      • New traits will be added soon (immunological traits): cytokines
  20. Case-control samples
      • The special case of autoimmune diseases
  21. (Figure with numeric labels; adapted from EURODIAB.)
  22. (Figure with numeric labels; Pugliatti et al. (EBC), Eur J Neurol 2006.)
  23. (Three chart panels, each with a y-axis from 0 to 70, comparing patients and controls.)
  24. How many samples to sequence?
      • Is it necessary to sequence all the people analysed?
  25. • Observed genotypes
      • Inferred stretches of DNA sharing along the chromosome
      • Inferred missing genotypes according to chromosome sharing
      Chen and Abecasis, AJHG 2008; Burdick et al., Nat. Genet. 2006
  26. 1) Identify Match Among Reference

      Individuals in study sample
      . . A A . . . . . . . . A . . . . A . . .
      . . G A . . . . . . . . C . . . . A . . .

      Observed HapMap Chromosomes
      C G A G A T C T C C T T C T T C T G T G C
      C G A A A T C T C C C G A C C T C A T G G
      C C A A G C T C T T T T C T T C T G T G C
      C G A A G C T C T T T T C T T C T G T G C
      C G A G A C T C T C C G A C C T T A T G C
      T G G A A T C T C C C G A C C T C A T G G
      C G A G A T C T C C C G A C C T T G T G C
      C G A G A C T C T T T T C T T T T A T A C
      C G A G A C T C T C C G A C C T C G T G C
      C G G A G C T C T T T T C T T C T G T G C
  30. 2) Phase Chromosome

      Individuals in study sample
      . . A A . . . . . . . . A . . . . A . . .
      . . G A . . . . . . . . C . . . . A . . .

      Observed HapMap Chromosomes
      C G A G A T C T C C T T C T T C T G T G C
      C G A A A T C T C C C G A C C T C A T G G
      C C A A G C T C T T T T C T T C T G T G C
      C G A A G C T C T T T T C T T C T G T G C
      C G A G A C T C T C C G A C C T T A T G C
      T G G A A T C T C C C G A C C T C A T G G
      C G A G A T C T C C C G A C C T T G T G C
      C G A G A C T C T T T T C T T T T A T A C
      C G A G A C T C T C C G A C C T C G T G C
      C G G A G C T C T T T T C T T C T G T G C
  31. 3) Impute Missing Genotypes

      Individuals in study sample
      C G A A A T C T C C C G A C C T C A T G G
      C G G A G C T C T T T T C T T T T A T G C

      Observed HapMap Chromosomes
      C G A G A T C T C C T T C T T C T G T G C
      C G A A A T C T C C C G A C C T C A T G G
      C C A A G C T C T T T T C T T C T G T G C
      C G A A G C T C T T T T C T T C T G T G C
      C G A G A C T C T C C G A C C T T A T G C
      T G G A A T C T C C C G A C C T C A T G G
      C G A G A T C T C C C G A C C T T G T G C
      C G A G A C T C T T T T C T T T T A T A C
      C G A G A C T C T C C G A C C T C G T G C
      C G G A G C T C T T T T C T T C T G T G C
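The match, phase, and impute steps in slides 26 to 31 can be illustrated with a deliberately simplified sketch. It assumes the study haplotypes are already phased, and it models each one as a mosaic copied from the reference chromosomes, choosing the copying path with the lowest total cost (a fixed penalty per mismatch at an observed site and per switch between reference chromosomes). This is a toy stand-in for the hidden Markov models used by real imputation software, not the method of the cited papers; `impute`, `PANEL`, and both penalty values are assumptions made for illustration.

```python
from typing import List

MISSING = "."

# The ten reference ("Observed HapMap") chromosomes from the slides.
PANEL: List[str] = [
    "CGAGATCTCCTTCTTCTGTGC",
    "CGAAATCTCCCGACCTCATGG",
    "CCAAGCTCTTTTCTTCTGTGC",
    "CGAAGCTCTTTTCTTCTGTGC",
    "CGAGACTCTCCGACCTTATGC",
    "TGGAATCTCCCGACCTCATGG",
    "CGAGATCTCCCGACCTTGTGC",
    "CGAGACTCTTTTCTTTTATAC",
    "CGAGACTCTCCGACCTCGTGC",
    "CGGAGCTCTTTTCTTCTGTGC",
]


def impute(target: str, panel: List[str],
           switch_cost: float = 1.0, mismatch_cost: float = 2.0) -> str:
    """Fill missing alleles in `target` by copying from a least-cost mosaic
    of reference haplotypes (dynamic programming over sites)."""
    n_sites, n_ref = len(target), len(panel)

    def emit(j: int, s: int) -> float:
        # Cost of copying reference j at site s: free if the site is
        # unobserved or agrees with the observed allele.
        allele = target[s]
        return 0.0 if allele == MISSING or allele == panel[j][s] else mismatch_cost

    cost = [emit(j, 0) for j in range(n_ref)]
    back: List[List[int]] = []
    for s in range(1, n_sites):
        best_prev = min(range(n_ref), key=cost.__getitem__)
        new_cost, pointers = [], []
        for j in range(n_ref):
            stay, jump = cost[j], cost[best_prev] + switch_cost
            prev, base = (j, stay) if stay <= jump else (best_prev, jump)
            new_cost.append(base + emit(j, s))
            pointers.append(prev)
        cost = new_cost
        back.append(pointers)

    # Trace the cheapest path back from the last site to the first.
    j = min(range(n_ref), key=cost.__getitem__)
    path = [j]
    for pointers in reversed(back):
        j = pointers[j]
        path.append(j)
    path.reverse()

    # Keep observed alleles; copy the rest from the chosen reference.
    return "".join(
        target[s] if target[s] != MISSING else panel[path[s]][s]
        for s in range(n_sites)
    )


# The two sparsely typed study haplotypes from the slides.
study_1 = impute("..AA........A....A...", PANEL)
study_2 = impute("..GA........C....A...", PANEL)
```

For the first study haplotype, only the second reference chromosome agrees at all four typed sites, so the sketch copies it throughout and reproduces the first imputed sequence shown in step 3. The second haplotype needs at least one switch between reference chromosomes, and several equally cheap mosaics exist, so the sketch's output there depends on tie-breaking and need not match the slide.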
  32. Recent updates
      • We used whole-genome sequences of 52 Europeans available from the 1000 Genomes Project to infer ~6.6 million markers in individuals typed with the higher-density chip...
      • ...then, using imputation, we extended the 6.6 million markers to all individuals and performed a GWAS.
      • This:
        - provides fine mapping for previously discovered loci
        - may reveal new loci that were poorly tagged by the previous set of SNPs
  33. GWAS findings
      Almost all of the loci detected by GWAS explain only a small fraction of the heritability.

      Trait    Heritability   Explained so far
      HbF      ~60%           ~17%
      Height   ~80%           ~4%
      BMI      ~40%           ~1%

      The smaller the effect size, the larger the sample size required to maintain adequate power.
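The gap described on slide 33 is easier to compare across traits when expressed as the share of each trait's heritability that known loci account for. The sketch below simply divides the two approximate percentages quoted on the slide; the dictionary and function names are illustrative.

```python
# Approximate values quoted on the slide, in percent of trait variance:
# (heritability, variance explained so far by GWAS loci)
TRAITS = {
    "HbF": (60.0, 17.0),
    "Height": (80.0, 4.0),
    "BMI": (40.0, 1.0),
}


def share_of_heritability(heritability: float, explained: float) -> float:
    """Fraction of the heritable variance accounted for by known loci."""
    return explained / heritability


shares = {
    trait: share_of_heritability(h2, explained)
    for trait, (h2, explained) in TRAITS.items()
}

for trait, share in shares.items():
    print(f"{trait}: {share:.1%} of heritability explained")
```

With these rounded inputs, HbF is the outlier, with more than a quarter of its heritability explained, while height and BMI sit at a few percent.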
  34. Shankar Balasubramanian, David Klenerman
  36. ProgeNIA Team, Lanusei-Cagliari
      Manuela Uda, Monica Lai, Serena Sanna, Anna Cau, Eleonora Porcu, Barbara Deiana, Ilenia Zara, Monica Balloi, Carlo Sidore, Maria Grazia Piras, Maristella Steri, Gianluca Usala, Marco Masala, Antonella Mulas, Gianmauro Cuccuru, Andrea Maschio, Angelo Scuteri, Fabio Busonero, Marco Orrù, Sandra Lai, Maria Grazia Pilia, Mariano Dei, Danilo Fois, Liana Ferreli, Laura Crisponi, Francesco Loi, Silvia Naitza, Caterina Flore, Simona Foddi
      Giuseppe Pilia, creator and founder of the ProgeNIA project
  37. Acknowledgements: Paolo Zanella, Chris Jones, Roman Tirler, Antonio Cao, Giuseppe Pilia, David Schlessinger, Goncalo Abecasis, John Todd
