DNA Guide - Tech Summary Mapping Genomes w GIS
Upcoming SlideShare
Loading in...5
×
 

Like this? Share it with your network

Share

DNA Guide - Tech Summary Mapping Genomes w GIS

on

  • 5,357 views

Genetic Testing, DNA Guide, Genome Mapping, Privacy, GIS, Personalized Medicine, Health IT, GIS, ESRI, Gene Patents, Big Data, Molecular Diagnostics, Alice Rathjen, Mike Hargreaves, Electronic ...

Genetic Testing, DNA Guide, Genome Mapping, Privacy, GIS, Personalized Medicine, Health IT, GIS, ESRI, Gene Patents, Big Data, Molecular Diagnostics, Alice Rathjen, Mike Hargreaves, Electronic Health Records, Next Generation Sequencing, Point of Care, iPAD, Mobile

Statistics

Views

Total Views
5,357
Views on SlideShare
3,158
Embed Views
2,199

Actions

Likes
4
Downloads
19
Comments
2

75 Embeds 2,199

http://dnaguide.blogspot.com 995
http://dnaguide.blogspot.co.uk 209
http://dnatimes.blogspot.com 201
http://dnatimes.blogspot.in 112
http://dnaguide.blogspot.in 104
http://dnaguide.blogspot.se 84
http://dnatimes.blogspot.com.br 75
http://dnaguide.blogspot.ca 36
http://dnaguide.blogspot.ru 34
https://twitter.com 30
http://dnaguide.blogspot.com.br 20
http://dnaguide.blogspot.fr 17
http://dnatimes.blogspot.co.uk 16
http://dnaguide.blogspot.com.es 13
http://dnaguide.blogspot.fi 12
http://dnatimes.blogspot.ae 12
http://dnaguide.blogspot.com.au 10
http://dnaguide.blogspot.jp 10
http://news.google.com 10
http://eventifier.co 10
http://dnatimes.blogspot.ru 9
http://dnaguide.blogspot.be 9
http://dnaguide.blogspot.no 8
http://dnaguide.blogspot.de 8
https://dnaguide.blogspot.com 8
http://dnaguide.blogspot.ae 7
http://dnatimes.blogspot.sg 7
http://dnaguide.blogspot.kr 7
http://dnaguide.blogspot.sg 6
http://dnaguide.blogspot.pt 6
http://dnaguide.blogspot.ch 6
http://dnaguide.blogspot.tw 6
http://dnaguide.blogspot.dk 5
http://dnaguide.blogspot.it 5
http://dnatimes.blogspot.com.ar 5
http://dnatimes.blogspot.it 5
http://eventifier.com 5
http://dnaguide.blogspot.mx 4
http://dnaguide.blogspot.co.nz 4
http://dnaguide.blogspot.com.ar 4
http://dnatimes.blogspot.cz 4
http://www.yatedo.com 4
http://dnaguide.blogspot.co.at 4
http://dnaguide.blogspot.ie 4
http://dnaguide.blogspot.hk 4
http://dnatimes.blogspot.kr 3
http://dnaguide.blogspot.nl 3
http://kred.com 3
http://dnaguide.blogspot.hu 3
http://dnatimes.blogspot.ro 3
More...

Accessibility

Categories

Upload Details

Uploaded via as Adobe PDF

Usage Rights

© All Rights Reserved

Report content

Flagged as inappropriate Flag as inappropriate
Flag as inappropriate

Select your reason for flagging this presentation as inappropriate.

Cancel
  • Full Name Full Name Comment goes here.
    Are you sure you want to
    Your message goes here
    Processing…
  • Alice, nice job, clear and logical.
    Are you sure you want to
    Your message goes here
    Processing…
  • #1 Preparing for the Genomic Data Deluge of Personalized Medicine
    #2 Soon health IT networks will be required to integrate patient genetic data as part of the deployment of personalized medicine. This presentation explains how to use mapping software, GIS, to managing this vast amount of information.
    #3 Currently working with genetic data is very problematic.
    Most molecular diagnostics are delivered as static .pdf files making the info difficult to search or update. There are lot of concerns over privacy. There’s a huge problems of researchers hoarding data. Even those who want to collaborate find it hard to do so.
    Those mapping the genome are struggling with the same issues that the earth mapping community struggled with early on AND SOLVED, using GIS.
    #4 Genetic data is location based information – and it’s not that hard to map. Here’s a sample genome map showing breast cancer markers. In the next slides, I’ll show what you how this map was built.
    #5 I’m covering very basic info with the goal of getting GIS experts over some genetic jargon hurdles. We start with 23 pairs of chromosomes (with one of pair coming from the mother and the second the father).
    #6 Genetic data comes in large text files full of A’s, T’s, C’s and G’s.
    # 7 The double helix has two strands. A’s bind with Ts and C’s bind with G’s.
    #8 The genome .txt files are for just one side of the strand. Keeping track of what strand was used to create the data is critical when data is later combined from different sources
    #9 Here’s a sample dataset showing where this person has points in their genome that vary from the reference genome. These points are called SNPs and have a unique id that starts with RS… Each RS has the chromosome number and position… which can be used to create an X and Y coordinate You can see the genotype lists two letters.
    #10 Often genetic data is shown as pairs of letters.
    So a person could have an A from their mother at this location and a T from their Father for a certain location on a chromosome.
    #11 This genome layout was designed to fit well on mobile platforms and make calculation of coordinates simple. Actual chromosome layouts could be anything (ie. circular, registered raster image from electron microscope). To create an x coordinate, we simply take the position on the chromosome and divide it by 100.
    #12 To create the Y coordinate we multiply the chromosome number by -200,000 to find the midline for a chromosome. Since the chromosomes come in pairs… we then add .01 and subtract .01 from the midline to create the Y coordinates for each chromosome.
    # 13 Here’s some good resources for figuring out coordinates for various genetic markers. Keeping track of which human reference build was used to create each data source is key.
    There’s also about 30,000 genes that could have molecular diagnostic information associated with them. The HUGO site will have the start and end points for each gene for every human reference build. And this third source DBSNP is important – because it will have the coordinates for every RSID number for each human reference build. The number of points for a patients genome that vary from the reference genome is between 4-6 million.
    So you can think of a patients dataset as being 4-6 million rows. Only about 25,000 points of data per patients currently have any sort of meaningful annotation associated with them and the percentage of points in the genome with actionable data is far smaller.
    #14 Here’s some types of diagnostics that’s are being used for personalized medicine. Although genomes are massive, the actual amount of data physicians and patients will need to view is quite small. Most molecular diagnostics are a query to less than a dozen locations on the genome and can be automated using GIS.
    #15 Genetic data has many competing interests in the marketplace. One solution is to carve out areas of turf where stakeholders can be compensated for reviewing and sharing information. For example, the scientific community could review the quality of the data, with pathologists then providing a rating for medical utility and genetic counselors for viewing risk
    Finally, an important benefit of GIS is that it places genetic data in a format that allows for patient engagement.
    For many reasons, it is good practice to setup real time patient consent upfront – and apply various rules for patients accessing annotation. By rating anotation for quality of science, medical utility and viewing risk, that the proper level of guidance could be triggered the moment any user went to access part of a genome.
    The genomic data deluge can be managed using GIS.
    Are you sure you want to
    Your message goes here
    Processing…
Post Comment
Edit your comment

DNA Guide - Tech Summary Mapping Genomes w GIS Presentation Transcript

  • 1. Esri UC2013 . 2013 Esri International User Conference July 8–12, 2013 | San Diego, California Preparing for the Genomic Data Deluge of Personalized Medicine Alice Rathjen, CEO, Founder DNA Guide Mike Hargreaves, R&D DNA Guide, DNA Guide, Inc, © 2013
  • 2. Esri UC2013 . Preparing for Personalized Medicine • Deluge vs Managed Mapping Solutions • Recognizing the First Waves of Data • Addressing Needs of All Stakeholders • Genomic Coordinate Sources • Genome Maps for the Masses Genomic Data Deluge of Personalized Medicine, Alice Rathjen, DNA Guide, Inc.
  • 3. Esri UC2013 . Genomic Data Deluge of Personalized Medicine, Alice Rathjen, DNA Guide, Inc. Fear and Hoarding Deluge
  • 4. Esri UC2013 . Secure Sharing of Genomic Maps Genomic Data Deluge of Personalized Medicine, Alice Rathjen, DNA Guide, Inc. Mapping Software:  Proven Technology  Rapid Deployment  Easy Adoption  Dynamic Reports/Maps  Secure  Scales Indefinitely
  • 5. Esri UC2013 . Waves of Patient Genetic Data • Cancer Tumor Analysis (200 genes) • Carrier Testing (1-50 points of data) • Drug Response (1-15 points of data) • Disease Risk (1 to 60 points of data per disease)
  • 6. Esri UC2013 . Needs of Stakeholders • Quality of Science (A-F, Scientific Peer Review) • Medical Utility (5 stars rating - Pathologists, AMA) • Viewing Risk (E=Everyone, PG = Physician Guidance, R = Restricted –via genetic counselors) Annotation Turf Wars Patient Engagement • Real Time Consent (fine grain control for sharing info)
  • 7. Esri UC2013 . Genomic Coordinate Sources • Human Reference Genome Builds http://www.ncbi.nlm.nih.gov/projects /genome/assembly/grc/human • Human Genes HUGO www.genenames.org • SNPs DBSNP www.ncbi.nlm.nih.gov/SNP
  • 8. Esri UC2013 . Genome Maps for the Masses Fast, Simple, Mobile Genomic Data Deluge of Personalized Medicine, Alice Rathjen, DNA Guide, Inc. • Chromosomes 1-22 • XX = Female • XY = Male • MT = Mitochondria
  • 9. Esri UC2013 . DNA Bases: A, T, C, G’s. A T C G DNA Mapping Genomes consist of large .txt files full of A, T, C and G’s Genomic Data Deluge of Personalized Medicine, Alice Rathjen, DNA Guide, Inc.
  • 10. Esri UC2013 . Double Helix (A’s & T’s, C’s & G’s) T A G C A T C G DNA Mapping A’s and T’s bond together, C’s and G’s bond together. Genomic Data Deluge of Personalized Medicine, Alice Rathjen, DNA Guide, Inc.
  • 11. Esri UC2013 . Top Strand is the Plus (+) Strand T+ A+ G+ C+ A- T- C- G- Bottom Strand is the Minus(-) Strand DNA Mapping The data is .txt files is for one side of the DNA Strand. Genomic Data Deluge of Personalized Medicine, Alice Rathjen, DNA Guide, Inc.
  • 12. Esri UC2013 . DNA Mapping Sample Dataset Genomic Data Deluge of Personalized Medicine, Alice Rathjen, DNA Guide, Inc. • RSID number is the research ID for the marker. • The Genotype “AG” means the person has an “A” at this location on one of their Chromosome 1s and a G at this location on their second Chromosome 1 # Human Reference Build 36 , + orientation #rsid chromosome position genotype rs3094315 1 742429 AG rs3131972 1 742584 AG rs11240777 1 788822 AG rs6681049 1 789870 CC rs4970383 1 828418 CC rs4475691 1 836671 CC rs7537756 1 844113 AA rs13302982 1 851671 GG rs1110052 1 863421 GT rs2272756 1 871896 AG rs3748597 1 878522 CC rs13303106 1 881808 AG rs28415373 1 883844 CC rs13303010 1 884436 AA
  • 13. Esri UC2013 . Patient data often shown in pairs of letters AT+ CC+ CG+ TT- (One letter for each chromosome pair) DNA Mapping Points in genome where humans vary are called SNPs Genomic Data Deluge of Personalized Medicine, Alice Rathjen, DNA Guide, Inc.
  • 14. Esri UC2013 . DNA Mapping Calculating x Coordinate Genomic Data Deluge of Personalized Medicine, Alice Rathjen, DNA Guide, Inc. • Longest Chromosome , Chr 1, has 250,000,000 base pairs • Using WebMercator, 1 base pair = .01 map units • X Coordinate = genetic marker position divided by 100 • Example coordinate/position (1234567 = 12345.67)
  • 15. Esri UC2013 . DNA Mapping Calculating y Coordinate* Genomic Data Deluge of Personalized Medicine, Alice Rathjen, DNA Guide, Inc. 23 Chromosome Pairs • Chromosome number times - 200.000.00 = y coordinate (i.e. chr 22 = -4,400.000.00) • Chromosome X coordinate = -4,600,000.00 • Chromosome Y coordinate = -4,800,000.00 *midline, for diploid browser add .01 and subtract .01 for each chromosome pair
  • 16. Esri UC2013 . Thank You ESRI Alice Rathjen CEO, Founder, DNA Guide alice@dnaguide.com Key Contributors: Mike Hargreaves Saw Yu Wai Richard Couch