GLOBAL VARIATION IN COPY NUMBER IN THE HUMAN GENOME
Ada Uzo-Okereke, Patricia Dennis
HUMAN GENETIC DIVERSITY
Nearly 99.9% of an individual’s genetic information is estimated to be identical to the information of any given stranger.
Most of the remaining differences have been linked to small, submicroscopic alterations in DNA sequence.
A few rare mutations are large enough to be seen at the microscopic level.
Large-scale duplications, deletions, translocations, and aneuploidy.
HUMAN GENETIC DIVERSITY Supplementary Fig. 3
WHAT ARE CNVS AND WHY ARE THEY IMPORTANT?
A copy number variant (CNV) is a DNA segment of 1 kb or larger that is present at a different copy number than in a reference genome.
CNVs: deletions, duplications, insertions, inversions, etc.
CNVs can overlap genes and disrupt their expression or alter their dosage.
Why Study CNVs?
CNVs may influence gene expression and adaptation, and can give insight into the complexity of normal phenotypic variation and disease.
THE HISTORY OF CNVS
Are CNVs something new?
Rearrangement, structure, and variation in chromosome copy number have been linked to disease.
About 70 years ago, duplication of the Bar gene in Drosophila melanogaster was shown to cause the Bar eye phenotype; this was one of the first known associations between a phenotype and a CNV.
Down syndrome: trisomy of chromosome 21.
The complexity and nature of CNVs are now being investigated, and the resources for such detailed analysis are available. Understanding CNVs can advance studies of normal human genetic variation and of disease.
THE BIGGER PICTURE
Further investigation of copy number variation could lead to a better understanding of human genetic variation and disease.
Aim of the study: to provide one of the first CNV maps of the human genome by screening individuals from populations of different ancestry, to demonstrate the map's utility as a resource for future genetic disease studies, and to provide a framework for studies of human genetic variation.
DNA of 270 people from four populations with distinct ancestry was collected.
The groups comprised two sets of 30 parent-offspring trios and two sets of 45 unrelated individuals.
DNA came from lymphoblastoid cell lines, which were screened for obvious inconsistencies in structure and composition.
Checks covered chromosome number, hybridization with blood DNA, and somatic deletions within SNP genotypes.
Parameters were set so that the number of CNV detections likely to be false positives could be estimated.
This was accomplished by validating a number of known CNVs.
The presence and location of CNVs were determined using two distinct platforms.
Comparative Genome Hybridization with Whole Genome TilePath Array
Comparative Intensity Analysis with Affymetrix 500k Early Access SNP Chip.
Data from both platforms were compiled and compared to reach the final conclusions.
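As a quick check on the study design described above, the cohort sizes can be added up. This is only a sketch of the arithmetic; the group labels are placeholders, and it assumes each trio contributes three people (two parents and one offspring).

```python
# Cohort sizes as stated on the slides; group labels are placeholders.
TRIO_SIZE = 3  # assumption: two parents plus one offspring per trio

groups = {
    "trio_group_1": 30 * TRIO_SIZE,
    "trio_group_2": 30 * TRIO_SIZE,
    "unrelated_group_1": 45,
    "unrelated_group_2": 45,
}

total_individuals = sum(groups.values())
print(total_individuals)  # 270
```

The total matches the 270 individuals reported for the four populations.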
AFFYMETRIX GENECHIP HUMAN MAPPING 500K
Used in whole-genome association studies; genotypes SNPs via hybridization.
This system relies on gene chips that represent identified SNPs.
For each SNP there are 25-40 different 25-nucleotide probes.
Millions of probes are placed on a Variant Detector array (VDA).
Genomic DNA is fragmented with restriction enzymes, amplified, purified, labeled and hybridized with the array.
The signal intensity from probes that match the DNA perfectly is compared with the signal from probes that carry a mismatch at the probe's central position.
AFFYMETRIX GENECHIP HUMAN MAPPING 500K Gibson 175
RESULTS OBTAINED FROM AFFYMETRIX
The average number of CNVs detected per experiment was 24.
Median size of CNVs detected was 81kb and the mean size was 206kb.
980 CNV regions (CNVRs, formed by merging overlapping CNVs) were detected and their genomic locations were mapped.
Overall, this platform was better at detecting smaller CNVs.
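The gap between the median (81 kb) and the mean (206 kb) suggests a right-skewed size distribution, where a few very large CNVs pull the mean upward. The sketch below illustrates that effect with made-up sizes; the numbers are invented for illustration and are not data from the study.

```python
from statistics import mean, median

# Hypothetical CNV sizes in kb, invented for illustration only.
cnv_sizes_kb = [20, 35, 50, 80, 90, 150, 1200]

size_median = median(cnv_sizes_kb)  # middle value, robust to the outlier
size_mean = mean(cnv_sizes_kb)      # pulled upward by the single 1200 kb call

print(f"median = {size_median} kb, mean = {size_mean:.0f} kb")
```

With one large call in the list, the mean ends up well above the median, which is the same pattern the reported summary statistics show.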
COMPARATIVE GENOME HYBRIDIZATION WITH WHOLE GENOME TILEPATH ARRAY
DNA microarrays, best known for measuring gene expression, can also be used to measure DNA copy number.
The WGTP array comprises 26,574 large-insert clones, which represent 93.7% of the euchromatic portion of the human genome.
Test DNA is cloned, cleaved with restriction enzymes, purified, labeled.
Labeled test DNA and normal reference DNA are hybridized together
Hybridization is detected by different fluorophores
The tiling array platform used required mechanically spotting probes on microarray chips.
Excitation with laser beam and Microarray scanner
CNVs appear as changes in the ratio of the two fluorescence intensities along the target chromosome.
A dye swap was used to account for dye bias.
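The ratio-based readout and the dye swap can be sketched in a few lines. This is a minimal illustration, not the calling method used in the study: the function names, the intensities, and the 0.3 threshold are all assumptions. It assumes a dye bias adds the same offset to log2(dye A / dye B) in both hybridizations, so the offset cancels when the swapped ratio is subtracted from the forward ratio.

```python
import math

CALL_THRESHOLD = 0.3  # hypothetical log2-ratio cutoff, not taken from the study


def combined_log2_ratio(fwd_a, fwd_b, swp_a, swp_b):
    """Combine a forward and a dye-swapped hybridization for one clone.

    Forward: test DNA carries dye A, reference DNA carries dye B.
    Swapped: reference DNA carries dye A, test DNA carries dye B.
    Returns an estimate of log2(test / reference) with the dye bias cancelled.
    """
    forward = math.log2(fwd_a / fwd_b)
    swapped = math.log2(swp_a / swp_b)
    return (forward - swapped) / 2


def call_clone(ratio, threshold=CALL_THRESHOLD):
    """Label a clone as a copy number gain, loss, or normal."""
    if ratio > threshold:
        return "gain"
    if ratio < -threshold:
        return "loss"
    return "normal"


# Hypothetical intensities (fwd A, fwd B, swap A, swap B); dye A reads 20% high.
clones = {
    "clone_1": (1200.0, 1000.0, 1200.0, 1000.0),  # equal copy number
    "clone_2": (1800.0, 1000.0, 1200.0, 1500.0),  # test has 1.5x the reference
    "clone_3": (600.0, 1000.0, 1200.0, 500.0),    # test has 0.5x the reference
}

calls = {name: call_clone(combined_log2_ratio(*vals)) for name, vals in clones.items()}
print(calls)
```

Without the swap, clone_1 would show a forward log2 ratio of about 0.26 from dye bias alone; averaging against the swapped hybridization brings it back to zero.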
WHOLE GENOME ARRAY TILING
RESULTS OBTAINED FROM TILEPATH
Detected an average of 70 CNVs per experiment.
The median size of CNVs was 228kb, and the mean size was 341kb.
The large size of CNVs was attributed to the platform's tendency to report the entire cloned insert as the CNV, rather than the small fraction of the clone that actually contains it.
This platform proved to be more effective at identifying CNVs in duplicated regions.
913 CNVRs were detected on this platform.
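The region counts reported for each platform refer to CNV regions, which group overlapping CNV calls from different individuals into one locus. The sketch below shows one simple way to do that by merging overlapping intervals. It is an illustration under stated assumptions: the function name, the tuple layout, and the coordinates are hypothetical, and the study's actual merging rules may differ.

```python
def merge_into_cnvrs(cnv_calls):
    """Merge overlapping (chrom, start, end) CNV calls into CNV regions."""
    regions = []
    for chrom, start, end in sorted(cnv_calls):
        last = regions[-1] if regions else None
        if last and last[0] == chrom and start <= last[2]:
            # Overlaps the previous region on the same chromosome: extend it.
            regions[-1] = (chrom, last[1], max(last[2], end))
        else:
            regions.append((chrom, start, end))
    return regions


# Hypothetical calls pooled across individuals (coordinates in bp).
calls = [
    ("chr1", 100_000, 250_000),
    ("chr1", 200_000, 400_000),
    ("chr1", 900_000, 950_000),
    ("chr2", 100_000, 180_000),
]

cnvrs = merge_into_cnvrs(calls)
covered_bp = sum(end - start for _, start, end in cnvrs)
print(cnvrs)
print(covered_bp)
```

Summing the merged region lengths and dividing by the genome length is also how a genome-wide coverage fraction can be estimated from a set of regions.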
PROCEDURAL OUTLINE Fig. 1
Between both platforms:
1,447 different CNVRs were found, covering 12% of the human genome.
No large stretches of the genome are free of CNVs.
Between 6% and 19% of any given chromosome is susceptible to CNV.
CNVs were classified as deletions, duplications, deletions and duplications at the same locus, multi-allelic loci, and complex loci, which gives insight into CNV formation.
CNVRs are preferentially located outside of genes and outside of highly conserved areas.
Genes in cell signaling, proliferation, and kinase- and phosphorylation-related categories are under-represented in CNVRs.
Thousands of functional sequences have been found to flank or fall into CNVRs.
Genes that were found to contain CNVs are involved in:
Diseases that are either polymorphic or follow Mendelian patterns of inheritance.
Linkage disequilibrium was observed to be stronger around SNPs than around CNVs.
Some CNVs are poorly predicted by the SNPs that were considered to mark their locations in previous studies.
Variation in CNVs between populations was similar to that found in SNP studies.
GENOMIC DISTRIBUTION OF CNVRS Fig. 4
LOOKING INTO THE FUTURE WITH CNVS
CNVs may offer insight into evolutionary studies.
Chimpanzee vs. human: most genetic differences have been identified as CNVs, with one third unique to human beings.
CNVs add another variable for predicting the likelihood of complex phenotypes.