Toward Meaningful Whole-Genome
   Interpretation with Open Access Tools
   From the Genome Commons
   BioIT World Expo
   2010-04-22

   Reece Hart, Ph.D.
   Chief Scientist, Genome Commons
   QB3 / Center for Computational Biology
   UC Berkeley
   reece@berkeley.edu
                                            1
2010-04-22 11:43
What did we learn from their genomes?




             Not much.
                                        2
Can we agree to disagree? Probably not.




    Heart Attack Risk Prediction
    from Experimental Man, DE Duncan
    Gene              Marker       Risk Allele   Genotype   Risk   Company
    CELSR2/PSRC1      rs599839     G             AG         0.86   deCODEme
    CDKN2A/CDKN2B?    rs10116277   T             GT         1      deCODEme
    CDKN2A/CDKN2B?    rs1333049    C             CC         1.72   Navigenics
    MTHFD1L           rs6922269    A             AA         1.53   Navigenics
    CDKN2A/CDKN2B?    rs2383207    G             GG         1.22   23andMe
                                                                                3
Trouble for direct-to-consumer testing.




                      http://blog.navigenics.com/articles/comments/an_open_letter_to_nature/   4
There's lots of good news, too.

➢   Disease diagnosis & prognosis

➢   Drug dosing and side effects

➢   Disease variant/gene identification

➢   Technological advances




                                           5
The Genome Commons seeks to build
open access, open source tools that
maximize the predictive, preventative,
and personalized value of genomic data.

  ●   Technical – organize data and streamline
       tools
  ●   Scientific – improve predictive accuracy
  ●   Clinical – engage clinicians and counselors
  ●   ELSI – address ineluctable ethical, legal, and
       social dilemmas

                                                       6
Collect data
in one place.


                7
Database isolation impedes effective use.

   Data are studied, compiled, and stored gene-wise.
   That makes sense for collection, but not for genome-wide use.



   [Figure: isolated data sources: OMIM, GeneTests/GeneReviews, LSDBs, NHGRI GWAS,
   PharmGKB, dbSNP, Literature]

   1177 Locus-Specific Databases, 935 genes
   Source: http://www.hgvs.org/dblist/glsdb.html on Oct 15.
   Some genes have multiple LSDBs.

                                                                                                      8
GCdb will be a repository of variants and traits.
   [Diagram: Genome Commons Database linking variants to phenotypes]

   Sources:                  dbSNP, OMIM from dbSNP, LSDBs, GeneTests, PharmGKB, ...
   Ontologies/vocabularies:  GO, ICD-10, UMLS
   Loading:                  Automated bulk loading of structured data
   Content:                  Curated, high-quality, and traceable association data

 ➢   Genotypes in standard coordinates
 ➢   Phenotype ontologies
 ➢   Associations with likelihood, confidence, evidence, and severity
 ➢   Up-to-date
 ➢   Quality-controlled
 ➢   Open access
 ➢   Based on Unison
                                                                              9
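A minimal sketch of what one GCdb association record might look like, assuming the attributes named on the slide above (standard coordinates, a phenotype ontology term, likelihood, confidence, evidence, severity). The class names, field names, types, and sample values are all hypothetical; the slides do not specify a schema.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass(frozen=True)
class Variant:
    """A variant in standard genomic coordinates (hypothetical field names)."""
    assembly: str  # name of the reference assembly
    chrom: str
    pos: int       # 1-based position (an assumption)
    ref: str
    alt: str


@dataclass
class Association:
    """One variant-to-phenotype link with the attributes listed on the slide."""
    variant: Variant
    phenotype_id: str   # term from a phenotype ontology (made-up ID below)
    likelihood: float   # scale is an assumption
    confidence: float   # scale is an assumption
    severity: str
    evidence: List[str] = field(default_factory=list)  # citations or source IDs


def is_traceable(assoc: Association) -> bool:
    """An association is traceable here if it cites at least one source."""
    return len(assoc.evidence) > 0


# Illustrative values only; none of these describe a real variant.
example = Association(
    variant=Variant("example-assembly", "1", 12345, "A", "G"),
    phenotype_id="EXAMPLE:0001",
    likelihood=0.5,
    confidence=0.9,
    severity="unknown",
    evidence=["example-source-1"],
)
print(is_traceable(example))
```

Keeping the evidence list on every record is one way to meet the "traceable" goal: an association without a cited source can be flagged before it is loaded.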
Make genomic data
usable and useful.


                     10
The Navigator will integrate data and tools.


   [Diagram: Genome Commons Navigator, drawing on the Genome Commons Database and
   External Data and Tools]

   Inputs:
     Genotypes (e.g., by hybridization)
     Whole Genome/Exome Sequences

   Components:
     Assembler/Aligner, Variant Caller: Assemble genome sequence and call variants
       (separately or jointly)
     Imputer: Infer variants in LD with typed markers
     Remapper: Align variants to specified genome
     Variants: Phased, aligned variants, from genotyping, imputation, or sequencing
     Annotator: Identify variants with known phenotypic impact
     Impact Predictor: Infer effect of unclassified genetic variants
     Variant Annotation Integrator: Integrate and reconcile all classified variants
       into a comprehensive report

   Facile user interfaces for basic research, clinical application, drug development,
   epidemiology, and other uses.



                                                                                                                                                               11
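The Navigator stages described above could be chained roughly as follows. This is only a sketch of the data flow: every function name, dictionary key, and sample value is hypothetical, each stage is a stand-in that does no real genomics, and the assembly and imputation steps are omitted.

```python
def remap(variants, assembly):
    """Stand-in for the Remapper: tag each variant with the target assembly."""
    return [{**v, "assembly": assembly} for v in variants]


def annotate(variants, known_impacts):
    """Stand-in for the Annotator: attach a known impact, or None if unclassified."""
    return [{**v, "impact": known_impacts.get(v["id"])} for v in variants]


def predict_impact(variants, predictor):
    """Stand-in for the Impact Predictor: fill in unclassified variants only."""
    out = []
    for v in variants:
        if v["impact"] is None:
            out.append({**v, "impact": predictor(v), "source": "predicted"})
        else:
            out.append({**v, "source": "known"})
    return out


def integrate(variants):
    """Stand-in for the Variant Annotation Integrator: one report keyed by variant id."""
    return {v["id"]: (v["impact"], v["source"]) for v in variants}


def navigate(variants, assembly, known_impacts, predictor):
    """Run the stages in order: remap, annotate, predict, integrate."""
    annotated = annotate(remap(variants, assembly), known_impacts)
    return integrate(predict_impact(annotated, predictor))


# Made-up inputs: one variant with a known impact, one without.
report = navigate(
    [{"id": "var1"}, {"id": "var2"}],
    assembly="example-assembly",
    known_impacts={"var1": "known-impact"},
    predictor=lambda v: "unknown",
)
print(report)
# {'var1': ('known-impact', 'known'), 'var2': ('unknown', 'predicted')}
```

Recording whether each impact is known or predicted keeps curated annotations distinguishable from inferred ones in the final report.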
Improve variant
impact predictions.


                      12
CAGI – Critical Assessment of Genome Interpretation
A community assessment of the state of the art in phenotype prediction.


➢   Follow the successful CASP framework
     ●   Solicit unpublished data
     ●   Collect blind predictions from participants
     ●   Assess against revealed annotations,
          mechanisms, and phenotypes

➢   Prediction Domains:
     ●   Molecular phenotype
     ●   Cellular phenotype
     ●   Organismal phenotype




                                                With John Moult & Steven Brenner   13
MTHFR and Methylation
   [Pathway diagram; labels: exogenous folate, fol3, met13]




5,10-Methylene tetrahydrofolate (TH4) is required for the synthesis of nucleic acids, while 5-methyl TH4
is required for the formation of methionine from homocysteine. Methionine, in the form of
S-adenosylmethionine, is required for many biological methylation reactions, including DNA methylation.
Methylene TH4 reductase is a flavin-dependent enzyme required to catalyze the reduction of
5,10-methylene TH4 to 5-methyl TH4.
                                                                               Linus Pauling Institute
                                                                               http://lpi.oregonstate.edu 14
Sequencing 18 Genes of Folate Pathway
                   Guthrie-Spot Sequencing Protocol

➢   250 NTD children and 250 case-matched
    controls

➢   Protocol
    ●   2 mm punch
    ●   Isolate genomic DNA
    ●   Amplification
    ●   Purification
    ●   Sequencing by JGI

➢   Variant calls of 238 exons in 18 genes
    ●   Analysis
    ●   Curate
    ●   QC
                                                      Jasper Rine 15
MTHFR variants exhibit 3 classes of effects.
   [Figure: S. cerevisiae growth with MTHFR knock-in mutants. Growth curves (OD600 vs. time,
   0 to 60 hours) at 50 µg/ml and 25 µg/ml folinic acid; curves labeled MTHFR, met13, and
   the mutants below.]

   Severely Impaired:  e.g., R134C
   Folate Remedial:    e.g., M110I, D223N
   No Effect:          e.g., R519C
                                                                                                                                                          Jasper Rine 16
Step 1: Collect predictions.




mutation     Team 1       Team 2
M110I        No Effect    Remediable
R134C        Impaired     Remediable
D223N        Remediable
R519C        No Effect    No Effect




                                          17
Step 2: Assess predictions.




mutation     Team 1       Team 2       Experiment
M110I        No Effect    Remediable   Remediable
R134C        Impaired     Remediable   Impaired
D223N        Remediable                Remediable
R519C        No Effect    No Effect    No Effect




                                                    18
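The assessment step can be sketched as a simple tally using the predictions and experimental results shown on the slide. Exact-match counting is an assumption made here for illustration; the slides do not say how CAGI scores predictions. The function and variable names are hypothetical.

```python
# Experimental classes and team predictions, as shown on the slide.
experiment = {
    "M110I": "Remediable",
    "R134C": "Impaired",
    "D223N": "Remediable",
    "R519C": "No Effect",
}
predictions = {
    "Team 1": {"M110I": "No Effect", "R134C": "Impaired",
               "D223N": "Remediable", "R519C": "No Effect"},
    # Team 2 submitted no prediction for D223N.
    "Team 2": {"M110I": "Remediable", "R134C": "Remediable",
               "R519C": "No Effect"},
}


def score(pred, truth):
    """Count exact matches; report how many mutations the team actually predicted."""
    submitted = [m for m in truth if m in pred]
    correct = sum(pred[m] == truth[m] for m in submitted)
    return {"correct": correct, "submitted": len(submitted), "total": len(truth)}


scores = {team: score(pred, experiment) for team, pred in predictions.items()}
for team, s in scores.items():
    print(team, s)
# Team 1 {'correct': 3, 'submitted': 4, 'total': 4}
# Team 2 {'correct': 2, 'submitted': 3, 'total': 4}
```

Reporting submitted and total separately keeps a missing prediction distinct from a wrong one.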
Step 3: Celebrate and learn.
               It's not whether you win or lose...




mutation     Team 1       Team 2       Experiment
M110I        No Effect    Remediable   Remediable
R134C        Impaired     Remediable   Impaired
D223N        Remediable                Remediable
R519C        No Effect    No Effect    No Effect




                                                                  19
Be clinically relevant.


                          20
Sequencing identifies clinically important associations.


   [Figure; axis labels: Concurrence among cases, Intersection among databases]




                                                             21
Do it
ethically.


             22
A few ineluctable ethical issues.

➢   How to fairly acknowledge aggregated
    data?
➢   Should scientifically suggestive results be
    used for clinical care?
➢   What is the balance between openness and
    preventing misinterpretation?
➢   What happens to confidentiality
    agreements during bankruptcy?
➢   How do we balance personal privacy with
    opportunities for public health advances?


                                             Bernard Lo 23
The Genome Commons




Jasper Rine    Steven Brenner   Bernie Lo   Robert Nussbaum




                                                              24
Nature. 2008 Mar 13;452(7184):151. 25

Integrating Public and Private Data: Lessons Learned from UnisonIntegrating Public and Private Data: Lessons Learned from Unison
Integrating Public and Private Data: Lessons Learned from Unison
 
Unison: An Integrated Platform for Computational Biology Discovery
Unison: An Integrated Platform for Computational Biology DiscoveryUnison: An Integrated Platform for Computational Biology Discovery
Unison: An Integrated Platform for Computational Biology Discovery
 
Mining for Novel TNF Ligands
Mining for Novel TNF LigandsMining for Novel TNF Ligands
Mining for Novel TNF Ligands
 

Bio-IT 2010 Genome Commons

  • 1. Toward Meaningful Whole-Genome Interpretation with Open Access Tools From the Genome Commons. BioIT World Expo, 2010-04-22. Reece Hart, Ph.D., Chief Scientist, Genome Commons; QB3 / Center for Computational Biology, UC Berkeley; reece@berkeley.edu
  • 2. What did we learn from their genomes? Not much.
  • 3. Can we agree to disagree? Probably not. Heart Attack Risk Prediction, from Experimental Man, DE Duncan:
       Gene             Marker      Risk Allele  Genotype  Risk  Company
       CELSR2/PSEC1     rs599839    G            AG        0.86  deCodeMe
       CDKN2A/CDKN2B?   rs10116277  T            GT        1     deCodeMe
       CDKN2A/CDKN2B?   rs1333049   C            CC        1.72  Navigenics
       MTHFD1L          rs6922269   A            AA        1.53  Navigenics
       CDKN2A/CDKN2B?   rs2383207   G            GG        1.22  23andme
  • 4. Trouble for direct-to-consumer testing. http://blog.navigenics.com/articles/comments/an_open_letter_to_nature/
  • 5. There's lots of good news, too. ➢ Disease diagnosis & prognosis ➢ Drug dosing and side effects ➢ Disease variant/gene identification ➢ Technological advances
  • 6. The Genome Commons seeks to build open access, open source tools that maximize the predictive, preventative, and personalized value of genomic data. ● Technical – organize data and streamline tools ● Scientific – improve predictive accuracy ● Clinical – engage clinicians and counselors ● ELSI – address ineluctable ethical, legal, and social dilemmas
  • 8. Database isolation impedes effective use. Data are studied, compiled, and stored gene-wise. That makes sense for collection, but not for genome-wide use. Diagram labels: OMIM, GeneTests/GeneReviews, 935 genes, LSDBs (1177 Locus-Specific Databases), NHGRI GWAS, PharmGKB, Literature, dbSNP. Source: http://www.hgvs.org/dblist/glsdb.html on Oct 15. Some genes have multiple LSDBs.
  • 9. GCdb will be a repository of variants and traits. Diagram labels: OMIM, dbSNP, LSDBs, GeneTests, PharmGKB, ⋮; Genome Commons Database (variants, phenotypes); GO, ICD-10, UMLS. Automated bulk loading of structured data. Curated, high-quality, and traceable association data. ➢ Genotypes in standard coordinates ➢ Up-to-date ➢ Quality-controlled ➢ Phenotype ontologies ➢ Open access ➢ Associations with likelihood, confidence, evidence, and severity ➢ Based on Unison
  • 10. Make genomic data usable and useful.
  • 11. The Navigator will integrate data and tools. Facile user interfaces for basic research, clinical application, drug development, epidemiology, and other uses. Genome Commons Navigator inputs: Genotypes (e.g., by hybridization); Whole Genome/Exome Sequences. Components: Imputer (infer variants in LD with typed markers); Remapper (align variants to specified genome); Annotator (identify variants with known phenotypic impact); Assembler/Aligner and Variant Caller (assemble genome sequence and call variants, separately or jointly); Variants (phased, aligned variants, from genotyping, imputation, or sequencing); Variant Impact Predictor (infer effect of unclassified genetic variants); Annotation Integrator (integrate and reconcile all classified variants into a comprehensive report). Also shown: External Data and Tools; Genome Commons Database.
  • 13. CAGI – Critical Assessment of Genome Interpretation. A community assessment of the state-of-the-art in phenotype prediction. ➢ Follow the successful CASP framework ● Solicit unpublished data ● Collect blind predictions from participants ● Assess against revealed annotations, mechanisms, and phenotypes ➢ Prediction Domains: Molecular phenotype, Cellular phenotype, Organismal phenotype. With John Moult & Steven Brenner
  • 14. MTHFR and Methylation. Diagram labels: exogenous folate, fol3, met13. 5,10-Methylene tetrahydrofolate (TH4) is required for the synthesis of nucleic acids, while 5-methyl TH4 is required for the formation of methionine from homocysteine. Methionine, in the form of S-adenosylmethionine, is required for many biological methylation reactions, including DNA methylation. Methylene TH4 reductase is a flavin-dependent enzyme required to catalyze the reduction of 5,10-methylene TH4 to 5-methyl TH4. Linus Pauling Institute, http://lpi.oregonstate.edu
  • 15. Sequencing 18 Genes of Folate Pathway. Guthrie-Spot Sequencing Protocol ➢ 250 NTD children and 250 case matched controls ➢ Protocol ● 2mm punch ● Isolate genomic DNA ● Amplification ● Purification ● Sequencing by JGI ➢ Variant calls of 238 exons in 18 genes ● Analysis ● Curate ● QC. Jasper Rine
  • 16. MTHFR variants exhibit 3 classes of effects. S. cerevisiae growth with MTHFR knock-in mutants. Severely Impaired (e.g., R134C); Folate Remedial (e.g., M110I, D223N); No Effect (e.g., R519C). [Figure: growth curves of OD600 versus time in hours (0 to 60) at folinic acid concentrations of 50 µg/ml and 25 µg/ml; curves labeled MTHFR, met13, M110I, D223N, R134C, and R519C.] Jasper Rine
  • 17. Step 1: Collect predictions. Predictions by mutation (Team 1, Team 2): M110I (No Effect, Remediable); R134C (Impaired, Remediable); D223N (Remediable; only one prediction shown); R519C (No Effect, No Effect).
  • 18. Step 2: Assess predictions. Results by mutation (Team 1, Team 2, Experiment): M110I (No Effect, Remediable, Remediable); R134C (Impaired, Remediable, Impaired); D223N (one prediction shown: Remediable; Experiment: Remediable); R519C (No Effect, No Effect, No Effect).
  • 19. Step 3: Celebrate and learn. It's not whether you win or lose... Results by mutation (Team 1, Team 2, Experiment): M110I (No Effect, Remediable, Remediable); R134C (Impaired, Remediable, Impaired); D223N (one prediction shown: Remediable; Experiment: Remediable); R519C (No Effect, No Effect, No Effect).
  • 21. Sequencing identifies clinically important associations. Concurrence among cases. Intersection among databases.
  • 23. A few ineluctable ethical issues. ➢ How to fairly acknowledge aggregated data? ➢ Should scientifically suggestive results be used for clinical care? ➢ What is the balance between openness and preventing misinterpretation? ➢ What happens to confidentiality agreements during bankruptcy? ➢ How do we balance personal privacy with opportunities for public health advances? Bernard Lo
  • 24. The Genome Commons: Jasper Rine, Steven Brenner, Bernie Lo, Robert Nussbaum
  • 25. Nature. 2008 Mar 13;452(7184):151.
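
The CAGI steps on slides 17 to 19 (collect blind predictions, then assess them against the experimental result) can be illustrated with a short scoring sketch. This is only an illustration, not the assessment method CAGI actually uses: the names `EXPERIMENT`, `PREDICTIONS`, `score`, and `SCORES` are hypothetical, simple per-mutation agreement is an assumed metric, and D223N is left out because the transcript does not show which team made the single prediction listed for it.

```python
# Experimental classes for the MTHFR variants, as listed on slides 16 and 18.
# D223N is omitted: the transcript does not show which team predicted it.
EXPERIMENT = {
    "M110I": "Remediable",
    "R134C": "Impaired",
    "R519C": "No Effect",
}

# Blind predictions collected in step 1 (slide 17).
PREDICTIONS = {
    "Team 1": {"M110I": "No Effect", "R134C": "Impaired", "R519C": "No Effect"},
    "Team 2": {"M110I": "Remediable", "R134C": "Remediable", "R519C": "No Effect"},
}


def score(predictions, experiment):
    """Return (correct, assessed), counting only mutations that have
    both a prediction and an experimental result."""
    assessed = [m for m in experiment if m in predictions]
    correct = sum(predictions[m] == experiment[m] for m in assessed)
    return correct, len(assessed)


# Step 2: assess every team against the revealed experimental classes.
SCORES = {team: score(p, EXPERIMENT) for team, p in PREDICTIONS.items()}

for team, (correct, assessed) in SCORES.items():
    print(f"{team}: {correct}/{assessed} correct")
```

Exact-match agreement treats every misclassification equally; a real assessment might weight confusions differently (for example, Impaired versus Remediable), which is a design choice this sketch does not address.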