Sequencing Cancer Genomes - Chemical Engineering at Texas A


Published in: Technology, Education
  • (All notes obtained from Circos Website. Accessed April 30, 2010.)

    The human genome comprises 22 pairs of autosomes (chromosomes 1-22) and one pair of sex chromosomes (X, Y).

    This graphic shows the chromosomes arranged in a circular orientation, shown as wedges, marked with a length scale. Data placed outside of the chromosome ring represents degree of small- and large- scale variation in the genome at a given position found between different populations.
    Data placed on top of the chromosome ring highlights positions of genes implicated in disease, such as cancer, diabetes, and glaucoma.

    Data placed inside the ring links disease-related genes found in the same biochemical pathway (grey) and the degree of similarity for a subset of the genome (colored).
  • (All notes obtained from Circos Website. Accessed April 30, 2010.)

    (C) mapping and sequencing tracks / chromosome band (ideogram)
    (A) variation and repeats / snps (v126). The histogram shows the number of SNPs per 1 Mb.
    (F) variation and repeats / segmental duplication. A small subset of segmental duplications are drawn, filtered by locations on chromosomes 2, 3, 7, 9. The choice of locations was motivated by the need for a visually balanced set of links.
    (B) variants in genome structure catalogued by the TCAG database
    (D) locations of genes implicated in disease. Gene-to-disease mappings were done using the OMIM database.

    The graphic shows the human genome annotated with data related to genes implicated in disease, regions of variation found in various populations, and regions of similarity between chromosomes.

    The 24 individual chromosomes (1..22 [each present in pairs in the genome], X, Y) are arranged circularly (C), and represented by labeled (C3) ideograms on which the distance scale is displayed (C1).

    Some chromosomes are shown at different physical scales to illustrate the rich pattern of the data (chr2 3x; chrs 18,19,20,21,22 2x; chrs 3,7,17 10x). Within each ideogram, cytogenetic bands are shown (C2). These are large-scale features used in cytogenetics to locate and reference gross changes.

    On the outside of the ideograms, genomic variation between individuals and populations is represented by tracks (A) and (B). The number of catalogued locations at which single base pair changes have been observed within populations is shown as a histogram (A). Large regions which have been seen to vary in size and copy number between individuals are marked in (B).

    Locations of genes associated with disease are superimposed on the ideograms (D). (D3) shows the location of genes implicated in cancer (very dark red), other disease (dark red) and all other genes (red). (D2) shows locations of genes implicated in lung, ovarian, breast, prostate, pancreatic, and colon cancer, colored in progressively darker shades of red. (D1) marks gene positions implicated in other diseases such as ataxia, epilepsy, glaucoma, heart disease, and neuropathy, colored in progressively darker shades of red, as well as diabetes (orange), deafness (green), and Alzheimer's disease (blue).

    Grey lines (E) connect positions on ideograms associated with genes that participate in the same biochemical pathways. The shade of the link reflects the character of the gene: dark grey indicates that the gene is implicated in cancer, grey in disease, and light grey for all other genes. Colored links (F) connect a subset of genomic region pairs that are highly similar and illustrate the deep level of similarity between genomic regions (about 50% of the genome is in so-called repeat regions, which appear in the genome multiple times and in a variety of locations).
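Track (A) in these notes is a histogram of SNP counts per 1 Mb. A minimal sketch of that binning step follows. The SNP coordinates are invented for illustration, and the function name `snp_density` is hypothetical; it is not part of Circos.

```python
from collections import Counter

BIN_SIZE = 1_000_000  # 1 Mb windows, matching the description of track (A)

def snp_density(positions, bin_size=BIN_SIZE):
    """Count SNPs per fixed-size window; returns {window_start: count}."""
    counts = Counter((pos // bin_size) * bin_size for pos in positions)
    return dict(sorted(counts.items()))

# Made-up SNP coordinates on a single chromosome (illustrative only).
example_positions = [120_500, 870_001, 1_000_000, 1_450_000, 3_999_999]
density = snp_density(example_positions)
print(density)
```

Windows with no SNPs are simply absent from the result; a plotting step would need to fill them with zero.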
  • Each line points to a specific gene
  • Sequencing Cancer Genomes - Chemical Engineering at Texas A

    1. 1. John A Pack
    2. 2.  Introduction  DNA Sequencing  Circos Plot  IDH1  Sequencing Genomes  Examples of Sequenced Cancer Genomes  Sequencing Disagreements  Sequencing Proponents  Small Scale Projects  Impact  Conclusions  Future Research
    3. 3. • Heart Disease: 631,636 • Cancer: 559,888 • Stroke (cerebrovascular diseases): 137,119 • Chronic lower respiratory disease: 124,583 • Accidents (unintentional injuries): 121,599 • Diabetes: 72,449 Data for 2006 obtained from Centers for Disease Control and Prevention (CDC)
    4. 4. ure01626.html
    5. 5.  DNA  Made up of 3 billion chemical building blocks (A, T, C, and G)  DNA Sequencing  Process of determining the exact order of the building blocks that make up the DNA of the 24 different human chromosomes  Revealed the estimated 20,000- 25,000 human genes within our DNA as well as the regions controlling them basics-of-molecular-biology-explained/
    6. 6.
    7. 7.  Circos is a software package for visualizing data and information  Used for identification and analysis of similarities and differences arising from comparisons of genomes Ledford, Heidi. “The Cancer Genome Challenge”.
    8. 8. http://mkweb.bcgsc.ca/circos/
    9. 9.
    10. 10.
    11. 11.  Mutation in the gene IDH1 found in 2006 study of 35 colorectal cancers  Not expected to be of importance  Changed only a lowly housekeeping enzyme involved in metabolism  13,000 other genes sequenced from each of 300 more samples
    12. 12. • IDH1 mutation also found in 12% of glioblastoma multiforme (a type of brain cancer) samples • and in 8% of acute myeloid leukaemia samples .asp?NavID=92 and-acute-myeloid-leukaemia
    13. 13. • Studies showed the mutation changed the activity of isocitrate dehydrogenase • Caused a cancer-promoting metabolite to accumulate in cells • Pharmaceutical companies hunting for a drug to stop the process • IDH1 mutation is the inconspicuous needle found in a veritable haystack of cancer-associated mutations thanks to high-powered genome sequencing.
    14. 14. • Labs around the world are teaming up to sequence DNA from thousands of tumors as well as healthy cells from the same person • Nearly 75 cancer genomes have at least begun to be sequenced and published • By the end of 2010, researchers expect to have over 100 cancer genomes fully sequenced rnships/index.html
    15. 15. • The further the research goes, the larger the “haystack” • Comparison of tumor cell to healthy cell reveals dozens of single-letter changes, or point mutations • Comparison also reveals repeated, deleted, swapped, or inverted sequences html
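The tumor-versus-healthy comparison on this slide can be illustrated with a toy example. The sketch below only finds single-letter changes and assumes the two sequences are already aligned and equal in length; the sequences and the function name `point_mutations` are made up for illustration, and real pipelines work from aligned sequencing reads rather than two clean strings.

```python
def point_mutations(normal, tumor):
    """Return (position, normal_base, tumor_base) for each mismatch.

    Assumes both sequences are already aligned and of equal length, so
    repeated, deleted, swapped, or inverted sequences are out of scope.
    Positions are 0-based.
    """
    if len(normal) != len(tumor):
        raise ValueError("sequences must be aligned to the same length")
    return [(i, n, t) for i, (n, t) in enumerate(zip(normal, tumor)) if n != t]

# Toy sequences, not real genomic data.
normal_seq = "ATCGGCTA"
tumor_seq = "ATCAGCTT"
changes = point_mutations(normal_seq, tumor_seq)
print(changes)
```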
    16. 16. • “The difficulty is going to be figuring out how to use the information to help people rather than to just catalogue lots and lots of mutations.” – Bert Vogelstein, Johns Hopkins University • Clinically, tumors can look the same, but most differ genetically oct2003/sw_sept-oct2003_page1.htm
    17. 17.  Drivers – mutations that cause and accelerate cancers  Passengers – Accidental by-products and thwarted DNA-repair mechanisms  Distinguishing between the drivers and passengers is not always trivial 3scc_subaru_us_rally_team_petter_solberg/ph oto_02.html
    18. 18.  Mutations that pop up again and again  Identify key pathways that are mutated at different points  Finding more questions than answers  How do researchers decide which mutations are worthy of follow up and functional analysis? l/ch2-mutations.asp
    19. 19.  The International Cancer Genome Consortium Pilot Project  11 Countries to sequence DNA  20 cancer types  500 tumor samples for each  Cost to sequence each cancer type = US$20 Million 00904
    20. 20. Ledford, Heidi. “The Cancer Genome Challenge”.
    21. 21. UNITED STATES OF AMERICA • More than 6 types of cancer being sequenced • Ovarian Cancer • Brain Cancer: Glioblastoma Multiforme (IDH1 mutation found in 12%) • Lung Cancer: Adenocarcinoma • Acute Myeloid Leukaemia (IDH1 mutation found in 8%) • Colon Cancer: Adenocarcinoma • Others BRITAIN • Breast Cancer: ER-, PR-, HER- • Breast Cancer: Lobular • Breast Cancer: ER+, HER- • European Union Sponsored http://www.medicstravel.c anada/usa_and_canada.htm
    22. 22. FRANCE • Breast Cancer: HER2 overexpressing • Liver Cancer: Alcohol-associated • Renal-cell carcinoma • European Union Sponsored AUSTRALIA • Pancreatic Cancer: Ductal adenocarcinoma • Ovarian Cancer
    23. 23. CANADA • Pancreatic Cancer: Ductal adenocarcinoma CHINA • Gastric Cancer GERMANY • Pediatric Brain Cancer: Medulloblastoma, Pilocytic Astrocytoma INDIA • Oral Cancer: Gingivobuccal http://www.theco YearbookHomeInte rnal/138389/ http://www.oie ning/chinaMap. html http://geology.c om/world/germ any-satellite- image.shtml
    24. 24. ITALY • Rare Pancreatic Cancers: Enteropancreatic endocrine, Pancreatic exocrine JAPAN • Liver Cancer: Virus-associated SPAIN • Chronic lymphocytic leukaemia
    25. 25. • The International Cancer Genome Consortium (ICGC), est. 2008, combined two older, large-scale projects • The Cancer Genome Project • Over 100 partial genomes and roughly 15 whole genomes sequenced. Intends to tackle over 2,000 more in the next 5-7 years • The US National Institutes of Health’s Cancer Genome Atlas (TCGA) • Aims to sequence up to 500 tumors for each of 20 cancers over the next 5 years
    26. 26.  The two groups in the TCGA are collaborating to sequence a subset of tumor samples (about 100) from each cancer type  The most promising areas of the genome will then be sequenced in the remaining 400 samples
    27. 27. p?processStyle=image
    28. 28.  Larger sample numbers could provide driver mutations like the one in IDH1  Knowledge and study of these mutations could lead to developing new cancer therapies according to researchers guide/complementary-therapies/complementary- therapies.html
    29. 29. -Michael Stratton (Co-Director of the Cancer Genome Project) _profiles/2750.shtml
    30. 30. • IDH1 was first overlooked on the basis of the colorectal cancer data alone • Search expanded to other cancers before its importance was revealed • Some drivers are mutated at very low frequency (less than 1% of cancers) • Heavy sampling is needed to find these low-frequency drivers • Sequencing 500 samples per cancer should reveal mutations present in as few as 3% of tumors, which may still hold important biological lessons • Need to know these in order to understand the overall genomic landscape of cancer
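The case for heavy sampling can be made concrete with a little arithmetic. The sketch below assumes a simple binomial model in which each tumor independently carries a given mutation with a fixed frequency; that model and the function names are assumptions made for illustration, not something stated in the slides. Seeing a mutation at least once is also a much weaker bar than recognizing it as a driver, which needs it to recur.

```python
def expected_carriers(freq, n_samples):
    """Expected number of tumors carrying a mutation of frequency `freq`."""
    return freq * n_samples

def prob_seen_at_least_once(freq, n_samples):
    """Chance that at least one of `n_samples` tumors carries the mutation,
    assuming independent tumors (a simplifying assumption)."""
    return 1.0 - (1.0 - freq) ** n_samples

# 500 samples and a 3% mutation: the figures quoted on the slide.
print(expected_carriers(0.03, 500))
print(prob_seen_at_least_once(0.03, 500))

# A 1% mutation in a 35-sample study, the size of the 2006 colorectal study.
print(prob_seen_at_least_once(0.01, 35))
```

Under these assumptions, a 3% mutation is expected in about 15 of 500 tumors, while a 1% mutation would appear at all in a 35-sample study less than a third of the time.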
    31. 31.  Look for mutations that cluster in a pathway  In an analysis of 24 pancreatic cancers  12 identified signaling pathways had been altered  Very difficult approach  Pathways overlap and boundaries not clear  Many pathways that are obtained using data from different animals or cell types do not always match up with what’s found in human tissue
    32. 32.  Distinguishing between drivers and passengers gets increasingly harder as researchers are beginning to sequence entire tumor genomes  Only a fraction of the existing cancer genomes have been completely sequenced Articles/Archive/sabl/2007/Jan/breast- cancer-genome.html
    33. 33.  Most cancer genome sequences are only covering the exome  Keep costs low  Directly codes for protein (easiest to interpret)  Importance of mutations found in the non-protein coding depths  More challenging  Scientists don’t know what function these regions usually serve  Majority of mutations nother-example.html
    34. 34. • Some full genomes have been sequenced • Small-cell lung carcinoma (type of lung cancer) • Metastatic melanoma (type of skin cancer) • Basal-like breast cancer (type of breast cancer) • Only the exome has been sequenced • Glioblastoma multiforme (type of brain cancer) urorad/internal.asp?NavID=92 http://www.mydoc cancer.php http://www.asa3.org/ASA/topics/Youth%20page/index.html
    35. 35.  Sequenced: full genome  Source: NCI-H209 cell line  Point mutations: 22,910  Point mutations in gene regions: 134  Genomic rearrangements: 58  Copy-number changes: 334  Highlights: Duplication of the CHD7 gene confirmed in two other small-cell lung carcinoma cell lines 9/12/16/skin-and-lung-cancer-genomes-are- truly-groundbreaking/
    36. 36.  Sequenced: full genome  Source: COLO-829 cell line  Point mutations: 33,345  Point mutations in gene regions: 292  Genomic rearrangements: 51  Copy-number changes: 41  Highlights: Patterns of mutation reflect damage by ultraviolet light Ledford, Heidi. “The Cancer Genome Challenge”.
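The counts on the two cell-line slides above show how small a share of point mutations land in gene regions, which is part of why separating drivers from passengers is hard. The sketch below only does that division using the numbers quoted on the slides; the dictionary layout and the function name `gene_region_fraction` are illustrative choices.

```python
# Counts copied from the NCI-H209 and COLO-829 slides.
genomes = {
    "NCI-H209": {"point_mutations": 22_910, "in_gene_regions": 134},
    "COLO-829": {"point_mutations": 33_345, "in_gene_regions": 292},
}

def gene_region_fraction(counts):
    """Share of all point mutations that fall in gene regions."""
    return counts["in_gene_regions"] / counts["point_mutations"]

fractions = {name: gene_region_fraction(c) for name, c in genomes.items()}
for name, frac in fractions.items():
    print(f"{name}: {frac:.2%} of point mutations fall in gene regions")
```

In both cell lines the share is under 1%.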
    37. 37. • Sequenced: full genome • Source: primary tumor, brain metastasis, and tumors transplanted into mice • Point mutations: 27,173 in primary; 51,710 in metastasis; 109,078 in transplant • Point mutations in gene regions: 200, 225, 328 • Genomic rearrangements: 34 • Copy-number changes: 155, 101, 97 • Highlights: Patterns of mutation reflect damage by ultraviolet light Ledford, Heidi. “The Cancer Genome Challenge”.
    38. 38.  Sequenced: exome (no complete Circos plot)  Source:  7 patient tumors  15 tumors transplanted into mice  Genes containing at least one protein altering mutation: 685  Genes containing at least one protein altering point mutation: 644  Copy-number changes: 281  Highlights: Mutations in the active site of IDH1 have been found in 12% of patients /neurorad/internal.asp?Na vID=92
    39. 39. • Very important to find all mutations, even in non-protein-coding regions • Perhaps none of these mutations contribute to causing cancer • Some could • Only way to find out is to systematically investigate them
    40. 40. • Some researchers argue against fully sequencing genomes • Cost of projects outweighs the benefits • Prices will drop due to technology advances in the next few years, so why not wait? • In the meantime • Mutations that affect how many copies of a gene are found in a genome • Cheaper to assess • Provide more intuitive insight into biological processes 2585059/stock-photo-costs-outweigh- benefits.html
    41. 41.  Changes in genome copy number detection  Array-based technology  Fast and relatively inexpensive  Sequencing  Higher-resolution snapshot of regions  The higher resolution can provide  More precision in mapping boundaries  Ability to catch tiny duplications or deletions that an array may not detect
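One common way sequencing data is used for copy-number detection is to compare read depth between tumor and matched normal samples in fixed windows. The slides do not describe a specific method, so the sketch below is an assumption-laden illustration: the depths are invented, the ±0.3 log2 thresholds are arbitrary, and the function names are hypothetical.

```python
import math

def log2_ratios(tumor_depth, normal_depth):
    """Per-window log2(tumor / normal) read-depth ratio.

    Assumes both lists cover the same windows and all depths are non-zero.
    """
    if len(tumor_depth) != len(normal_depth):
        raise ValueError("depth lists must cover the same windows")
    return [math.log2(t / n) for t, n in zip(tumor_depth, normal_depth)]

def call_windows(ratios, gain=0.3, loss=-0.3):
    """Label each window; thresholds are arbitrary illustrative values."""
    calls = []
    for r in ratios:
        if r >= gain:
            calls.append("gain")
        elif r <= loss:
            calls.append("loss")
        else:
            calls.append("neutral")
    return calls

# Invented average read depths for six consecutive windows.
tumor = [30, 31, 60, 62, 15, 29]
normal = [30, 30, 30, 31, 30, 30]
ratios = log2_ratios(tumor, normal)
calls = call_windows(ratios)
print(calls)
```

Smaller windows give finer boundaries, which is the resolution advantage the slide attributes to sequencing over arrays, at the price of noisier depth estimates per window.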
    42. 42. comparative-genomic-41020
    43. 43.  A lot of small scale hospitals are investing millions of dollars into cancer sequencing projects  (e.g.) St. Jude Children’s Research Hospital  Proponents don’t want to wait  The real work starts after the sequencing is over  Determining what these mutations are doing  Old-fashioned biology and experimental analysis extoid=f2bfab46cb118010VgnVCM1000000e2015 acRCRD http://www.the- 12misc/careers.htm
    44. 44. • Two 2-year projects • Develop high-throughput methods • Test how the mutations identified by the TCGA pilot project affect cell function • Aim to pull needles from the haystack and make sense of them (like the IDH1 mutation) fighting-foods-spices/
    45. 45. • Dana-Farber Cancer Institute (Boston) • Systematically amplify and reduce the expression of genes of interest in cell cultures • Cold Spring Harbor Laboratory (New York) • Study cancer-associated mutations using tumors transplanted into mice na%20Farber-job.htm
    46. 46. • Assess effects of deleting each gene in the mouse genome • Learn more about the normal function of genes that are mutated in cancer nimals-set-7.html
    47. 47.  Global  Cancer is a world-wide disease  Cancer Patients  New Technology  New Treatment Processes  Researchers  More grants to make new advances
    48. 48.  Sequencing tumor DNA genomes can lead to finding cancer-causing gene mutations  Very challenging to pinpoint gene mutations that are cancer-causing  Very high sample numbers  Sampling and sequencing full cancer genomes is extremely expensive  Some opponents think the cost outweighs the benefits right now  A lot of people think the cost is worth it, because there is a lot more work to do after sequencing, so we should not wait for prices to come down
    49. 49. • Better technology for making the sequencing equipment, to bring costs down • New technology to detect mutations • Complete full-genome sequences for all cancers • Developing ways to stop or kill cells carrying these mutations while leaving healthy cells unharmed • Nanotechnology (nanopharmaceuticals could have an impact here)
    50. 50. • Ledford, Heidi. “The Cancer Genome Challenge”. Nature. Vol 464. 15 April 2010. p. 972-974. Macmillan Publishers Limited. 2010 • Human Genome Project Information. Facts About Genome Sequencing. Accessed: April 29, 2010. Last modified: September 19, 2008. eqfacts.shtml. • Krzywinski, M. et al. Circos: an Information Aesthetic for Comparative Genomics. Genome Res (2009) 19:1639-1645 • Francis S. Collins, et al. “A vision for the future of genomics research”. Nature Publishing Group. 2010. Accessed April 30, 2010. e01626.html • Circos Website. Accessed April 30, 2010.