Adventures in Translational Bioinformatics




Harry Hochheiser

University of Pittsburgh School of Medicine
Department of Biomedical Informatics
harryh@pitt.edu


Harry Hochheiser, harryh@pitt.edu                                                                                Texas A&M Health Sciences Center February 6, 2013
                                    © 2010 Harry Hochheiser
                                    Licensed under Creative Commons Attribution-NonCommercial-NoDerivs license
Biomedical Informatics
  
         The use of informatics for the improvement of biomedical
        research and clinical care




Friedman, “A ‘fundamental theorem’ of biomedical informatics,” JAMIA 2009
Shortliffe & Blois, “The Computer Meets Medicine and
Biology: Emergence of a Discipline,” in Shortliffe & Cimino, eds.,
“Biomedical Informatics: Computer Applications in Health Care and Biomedicine”

What is “Translational” research?
 ●
     My research is translational...
                 ●
                      ...because I want to get funding?




                             http://www.ncats.nih.gov/research/cts/cts.html



Co-Clinical Trials
 Nardella, Lunardi, Patnaik, and Cantley
 The APL Paradigm
 and the “Co-Clinical Trial” Project
 Cancer Discovery 2011
  Acute Promyelocytic Leukemia
   ●



   ●
       13 years of work – cloning, description, transgenics, trials
   ●
       Successful therapy now approved for clinical use.
   ●
       Co-clinical trials – synchronize human and animal trials.




Measuring Success

 ●
     How is success of translational science measured?
 ●
     Cures? Therapies?
 ●
     Citations mapping model organism studies to T2, T3, T4
      success?
 ●
     How do we assess impact that doesn't play out in the usual
      3-5 year grant cycle?




FaceBase
 http://www.facebase.org


 National Institute of Dental and Craniofacial Research
 Five-year initial phase 2009-2014
 “..systematically compile the biological instructions to construct the middle
     region of the human face and precisely define the genetics underlying its
     common developmental disorders, such as cleft lip and palate”

 10 Projects: U-flavor contracts
 Data Management and Coordination Hub
                 – “One-stop access to craniofacial research data”
                 –     “Allow scientists to more rapidly and effectively generate
                         hypotheses and accelerate the pace of their research”.



FaceBase Projects
Hochheiser, et al. 2011




FaceBase Data
 ●
     10 projects
 ●
       4 organisms: mouse, zebrafish, chick, human
 ●
       Varying developmental time points
 ●
     Data types
                 ●
                      Expression: microarray, ChIP-Seq, miRNA
                 ●
                      Images: 3D MRI, OPT, microMRI, microCT, 3D human facial
                        mesh
                 ●
                      Sequences: Genotypes, GWAS, CNV
                 ●
                      Software: 3D facial image analysis
                 ●
                      Strains: Cre recombinase drivers
 ●
     1 Data management and coordination hub
What does it mean to effectively
 share data?


Diversity of data


Diversity of projects, goals, etc.

Challenges are both technical and social/organizational




FaceBase Hub Technical Challenges
                                    Sequences
   Data Diversity                                                Images..
                                     miRNA      microarrays

                                     Anatomy              Phenotype

     Integration                    UCSC Genome Browser        Between datasets

      Tools                         Searching         Browsing

                                        Metadata
                                      Genotypes               RNA-Seq
      Data Volume

    “Controlled Access” Human Data                   Genotypes      Facial Images

Metadata: Theory

 ●
       Well-defined metadata fields/attributes
 ●
       Controlled vocabularies provide consistent terminology for
       each field
                 ●
                      Link to appropriate resources as needed: NCBI, MGI, etc.
 ●
       Additional attributes specific to each data type


 ●
       Consistent metadata supports search, navigation...




Metadata in practice




“What's the difference between mutation and Genotype?”

“Uhh. I'll have to get back to you on that.”

“Strain name? Is ‘C57B6J’ the same as ‘C57Bl/6J’?”

“We're lucky to have any information at all when it comes to
  these mice...”
“The official name of the strain is ____, but everybody
  calls it ____”
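
One way to handle the strain-name question above is to check submitted values against a controlled vocabulary that records known informal variants. The following is a minimal sketch under stated assumptions: the vocabulary table, the `normalize_strain` function, and the matching rule (ignore case and punctuation) are hypothetical illustrations, not part of FaceBase or of any official nomenclature resource.

```python
import re

# Hypothetical controlled vocabulary: canonical strain name -> informal variants.
# The single entry is illustrative, not an authoritative nomenclature list.
STRAIN_VOCAB = {
    "C57BL/6J": ["C57B6J", "C57Bl/6J"],
}


def _key(name: str) -> str:
    """Reduce a name to a comparison key: uppercase letters and digits only."""
    return re.sub(r"[^A-Z0-9]", "", name.upper())


# Map every comparison key (canonical and variant) to the canonical name.
LOOKUP = {}
for canonical, variants in STRAIN_VOCAB.items():
    for name in [canonical, *variants]:
        LOOKUP[_key(name)] = canonical


def normalize_strain(name: str):
    """Return the canonical strain name, or None to flag the value for curation."""
    return LOOKUP.get(_key(name))
```

Values that come back as `None` are the ones that still need a human decision, which is where the curation effort discussed later comes in.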
Biomedical Ontologies might solve
 some terminology and structure
 problems
      Human Anatomy: EHDAA, FMA
      Mouse Anatomy: EMAP, MA
      Mouse/Human Phenotypes:
             MP/HPO
      Genes: GO
      Data Models: MIAME, MiXX,
          ISA-TAB, OBI




Metadata Possibilities:
    Ontologies & Data Models



Ontologies, Data models, etc.
                                                                                             User Practice




                                       Grand Canyon NPS
                                           http://www.fotopedia.com/items/flickr-7553734530




Implications

 Good metadata is vital for search, retrieval, data sharing,
   reuse
 Curation effort can compensate for some inadequacies, but
                 – E_C = curation effort, M_q = metadata quality
                 – E_C = f(1/M_q)
                 – Possibly non-linear
 This doesn't scale


Search tools

 ●
     Search + Link: current practice in bioinformatics
      repositories
                 ●
                      Text search over metadata
                 ●
                      Linear results list
                 ●
                      Results pages with many links
 ●
     Advantages
                 ●
                      Easy, fast
 ●
     Disadvantages
                 ●
                      Relationships between items in result list are not clear
                 ●
                      No overviews to help with higher-level meta-understanding
The Ontology of Craniofacial
Development and Malformation
Extend existing ontologies to provide richer models of
  craniofacial anatomy, development, and malformation
Foundational Model of Anatomy (FMA), Human Phenotype
  Ontology (HPO), Mouse Anatomy (MA)...
Add human developmental description
Human-Mouse links
https://www.facebase.org/content/ocdm

But... better models don't guarantee usage




The tool gap

  “Tools have advanced to the point of being able to support
    users fairly successfully in finding and reading off data
    (e.g. to classify and find multidimensional relationships of
    interest) but not in being able to interactively explore
    these complex relationships in context to infer causal
    explanations and build convincing biological stories amid
    uncertainty.”
  B. Mirel Supporting cognition in systems biology analysis: findings on users' processes and design
      implications (2009) J. Biomedical Discovery & Collaboration




  • Need: Tools that effectively leverage combination of
    human intellect and computing power



Information Visualization
 ●
     Interactive displays of high-dimensional data sets
 ●
     Coordinated views facilitate comparison across dimensions and
 ●
     Multi-scale
                 ●
                      Overview
                 ●
                      Zoom/filter
                 ●
                      Full details on requested items
 ●
     Rapid, incremental, reversible queries
 ●
     Avoid 0-hit or million-hit queries.
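
The overview, zoom/filter, and "avoid 0-hit queries" points can be illustrated with query previews: before a filter is applied, the interface shows how many items each choice would return. This is a minimal sketch, assuming a flat list of dataset records; the field names, values, and function names are hypothetical and do not describe the FaceBase Hub implementation.

```python
from collections import Counter

# Hypothetical dataset records; field names and values are illustrative only.
DATASETS = [
    {"organism": "mouse", "data_type": "microarray"},
    {"organism": "mouse", "data_type": "ChIP-Seq"},
    {"organism": "zebrafish", "data_type": "images"},
    {"organism": "human", "data_type": "images"},
]


def apply_filters(records, filters):
    """Keep records that match every (field, value) pair in filters."""
    return [r for r in records if all(r.get(f) == v for f, v in filters.items())]


def preview_counts(records, filters, field):
    """Count hits per value of `field` under the current filters.

    Showing these counts before the user commits to a choice lets the
    interface grey out options that would return zero results.
    """
    return Counter(r[field] for r in apply_filters(records, filters) if field in r)


# Overview: how the whole collection breaks down by organism.
overview = preview_counts(DATASETS, {}, "organism")
# Zoom/filter: data types available once the user has selected mouse.
zoomed = preview_counts(DATASETS, {"organism": "mouse"}, "data_type")
```

Because each filter only narrows the previous result, the same function supports incremental and reversible queries: removing a filter simply recomputes the counts.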




Toward integrative tools
                                                  How do we build
                                                  connections?
   Genotypes

                                               How do we get
Morphometry                                    users to think
                                               differently
                                               about the data?
           Images


  Expression


     Sequence

                                      Human   Mouse              Zebrafish
Technology Probe:
         Connecting related data items

                                    • Single search, multiple
                                    data types
                                    • Links between items
                                    indicate connections
                                    • Images or other details
                                    on demand for direct
                                    access to data
                                    • Rapid response supports
                                    interactivity, exploration,
                                    serendipitous discovery




Technology Probe: Gene Atlas

                                    • Genes displayed
                                    on timeline,
                                    indicating when
                                    active
                                    • Images display
                                    expression
                                    localization
                                    • Large image for
                                    detailed views.
                                    • Mouse-over
                                    coordinated links


Vision vs. reality

 ●
     Data is still coming in
 ●
     Gene atlas requires coordinated analysis...
 ●
     What can we do with data that is available?




Current Search Interface




Timeline view
Approximate alignment of datasets on common timeline




3D Facial Meshes

 Two datasets
                 – Weinberg & Marazita:
                    Caucasian facial norms
                 – Spritz: Tanzanian

 Identifiable data: access only
   upon DAC approval
 Query tool: identify subsets
   for further analysis.




UCSC Genome Browser Mirror

●
    Genomebrowser.facebase.org
●
    > 30 tracks, mouse mm9
               ●
                    RNA-seq
               ●
                    ChIP-seq
               ●
                    Various anatomic
                      locations, time
                      points, etc.
●
    Human CNV hg19




Imaging

●
    microMRI
●
    microCT
●
    OPT
●
    Interactive Browsers?




Fishface


Atlas of zebrafish
   craniofacial
   development
 C. Kimmel, et al.




Current Status

 As of 4 February 2013
       > 200 datasets
       > 100 in queue awaiting curation or investigator approval
 Continuing to refine metadata and data organization
 Expecting large influx of data over the next year.




Toward Greater Integration
Currently: links via metadata
    Comparable organism, stage, gene, etc.

Ideally: links via data
    Commonly expressed genes, sequences, pathways
    “Easy” for similar data: expression, sequences
    Harder for diverse data: images?
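
The contrast between linking via metadata and linking via data can be sketched as two small functions. This is an illustration under stated assumptions: the dataset identifiers, metadata fields, and gene symbols are placeholders, and the functions are hypothetical rather than a description of how the FaceBase Hub links datasets.

```python
from itertools import combinations

# Hypothetical datasets: metadata fields plus the set of genes found in the data.
# Identifiers, stages, and gene symbols are placeholders for illustration only.
DATASETS = {
    "ds1": {"meta": {"organism": "mouse", "stage": "E10.5"}, "genes": {"GeneA", "GeneB"}},
    "ds2": {"meta": {"organism": "mouse", "stage": "E10.5"}, "genes": {"GeneC"}},
    "ds3": {"meta": {"organism": "zebrafish", "stage": "48hpf"}, "genes": {"GeneB", "GeneC"}},
}


def metadata_links(datasets, fields):
    """Pairs of datasets whose metadata agree on all of the given fields."""
    return [
        (a, b)
        for a, b in combinations(sorted(datasets), 2)
        if all(datasets[a]["meta"].get(f) == datasets[b]["meta"].get(f) for f in fields)
    ]


def data_links(datasets):
    """Pairs of datasets that share at least one gene, with the shared genes."""
    links = {}
    for a, b in combinations(sorted(datasets), 2):
        shared = datasets[a]["genes"] & datasets[b]["genes"]
        if shared:
            links[(a, b)] = shared
    return links


via_metadata = metadata_links(DATASETS, ["organism", "stage"])
via_data = data_links(DATASETS)
```

In this toy example the metadata links only connect the two mouse datasets, while the data links connect each mouse dataset to the zebrafish one. Set intersection works here because both sides are gene lists; no equally simple comparison is assumed for images.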
FaceBase as a model

 ●
     Diverse researchers and projects
 ●
     Focus: specific clinical domain
 ●
     Goal: collect and coordinate data – community resource
 ●
     Outstanding challenges?
 ●
       Is this even the right idea?




Build it and they (may) come?

 ●
     Speculative claim: assemble good data and it will be of use
 ●
     How to evaluate this?
 ●
     How to compare to n R01s that might otherwise have been
      funded? (question asked by program officer)
 ●
     ENCODE controversy
     “..the lesson I learned from ENCODE is that projects like
      ENCODE are not a good idea.”
                      M. Eisen http://www.michaeleisen.org/blog/?p=1179




Three key factors


                        Tools                  Incentives




                                    Outreach


Tools
 Desperately needed!


 Search tools – finding data


 Annotation Tools – describing data
                 – Using ontologies
                 – Common data models



   Minimize the effort required to provide the high-quality
   metadata needed for translational bioinformatics.

Perhaps the worst
 bioinformatics data management tool




Perhaps the best
 bioinformatics data management tool




Pros and Cons of Spreadsheets

Cons:
              – Ad hoc data models
              – Ad hoc data fields
              – No consistency, provenance...

Pros:
              – Ubiquitous
              – Simple
              – Flexible
              – Familiar

Alternatives:
              – Analytic tools: more rigor, less generality
              – Experiment metadata
Usability and Ontologies
 Can we at least get the terms right?




                                        Maguire et al. 2012
Beyond Autocomplete

 ●
     Substring autocomplete –
      state of art for finding
      terms in biomedical
      ontologies
 ●
     Cons: no context, room
      for specialization or
      generalization?
 ●
     Can we build a tool that
      will easily support both
      search and navigation?
                                    Foundational Model Explorer
                                    http://fme.biostr.washington.edu/FME/index.html
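
As a rough illustration of combining search and navigation, the sketch below extends substring autocomplete so that each suggestion carries its parent and children, giving the user somewhere to generalize or specialize. Everything here is an assumption for illustration: the tiny hierarchy, the term names, and the `suggest` function are hypothetical and are not taken from the FMA or from the Foundational Model Explorer.

```python
# Hypothetical fragment of an anatomy hierarchy: term -> parent term.
# The terms are illustrative and not drawn from any real ontology release.
PARENT = {
    "face": "head",
    "mouth": "face",
    "upper lip": "face",
    "lower lip": "face",
    "palate": "mouth",
}


def children(term):
    """Terms whose immediate parent is `term`, in alphabetical order."""
    return sorted(t for t, p in PARENT.items() if p == term)


def suggest(query):
    """Substring autocomplete that also returns each match's neighbors."""
    q = query.lower()
    all_terms = sorted(set(PARENT) | set(PARENT.values()))
    return [
        {"term": t, "parent": PARENT.get(t), "children": children(t)}
        for t in all_terms
        if q in t
    ]
```

A call such as `suggest("lip")` still behaves like plain autocomplete, but each hit also names its parent, so a user unsure between two terms can step up to the broader one.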

Why not let the curators do it?

 ●
     Expense
 ●
     Accuracy
 ●
     Speed
 ●
     Scientists know their data best

  Claim – To be sustainable, data sharing must be self-curated.


  But... why should anyone bother?


Scientific incentives...
                                                       Where does
Traditionally....                                      Data sharing fit in?


                             Research
                              Results




        Funding                             Papers    Data
                                                     Sharing

                                        ?
How to promote sharing..

 1. Better tools → reduced effort
         E_annotation < E_collection + epsilon

 2. Recognize effort: alternative models of academic credit
         ImpactStory...
         Altmetrics: Value all research products
         (H. Piwowar, Nature 1/9/13, doi:10.1038/493159a)

       “For all new grant applications from 14 January, the US National Science Foundation
       (NSF) asks a principal investigator to list his or her research “products” rather than
       “publications” in the biographical sketch section.”




Related Efforts

 ●
     Genomics Research in Alpha-1 Antitrypsin Deficiency and
      Sarcoidosis (GRADS) – microbiome, expression, phenotype
      sharing
 ●
     LAMHDI/MONARCH – User tools for ontologically-driven
      discovery of model genes for human diseases
 ●
     Research Social Network Collaboration Finding tool – using
      VIVO and related networking tools to find collaborators.




Acknowledgments
 FaceBase: Mary Marazita, Jeff Murray, Yang Chai, Bruce Aronow, Kristin Artinger, Teri
     Beaty, Jim Brinkley, David Clouthier, Michael Cunningham, Michael Dixon, Leah Rae
     Donahue, Scott E. Fraser, Benedikt Hallgrimsson, Junichi Iwata, Ophir Klein, Stephen
     Murray, Fernando Pardo-Manuel de Villena, John Postlethwait, Steven Potter, Linda
     Shapiro, Richard Spritz, Axel Visel, Seth Weinberg, Paul Trainor

 U. Pittsburgh: Mike Becich, Becky Boes, Chuck Borromeo, Lance Kennelty, Annette Krag-
     Jensen, Tom Maher, Johnson Paul, Linda Schmandt, Shiyi Shen, Bill Shirey, Cristy Spino,
     Mike Stefanko, Justin Stickel

 OCDM: James Brinkley; Jose Leonardo Mejino; Landon Detwiler; Ravensara Travillian; Melissa
    Clarkson; Timothy Cox; Carrie Heike; Michael Cunningham; Linda Shapiro

 NIDCR: Steven Scholnick, Emily Harris, Jeannine Helm, Nadya Lumelsky, Lillian Shum

 Support: NIH Grants U01 DE020057, 3U01DE020050-03S1 





More Related Content

What's hot

Enabling Real-time Genome Data Research with In-memory Database Technology (S...
Enabling Real-time Genome Data Research with In-memory Database Technology (S...Enabling Real-time Genome Data Research with In-memory Database Technology (S...
Enabling Real-time Genome Data Research with In-memory Database Technology (S...
Matthieu Schapranow
 
Ontology-enabled Healthcare Applications exploiting Physical-Cyber-Social Big...
Ontology-enabled Healthcare Applications exploiting Physical-Cyber-Social Big...Ontology-enabled Healthcare Applications exploiting Physical-Cyber-Social Big...
Ontology-enabled Healthcare Applications exploiting Physical-Cyber-Social Big...
Amit Sheth
 

What's hot (13)

Enabling Real-time Genome Data Research with In-memory Database Technology (S...
Enabling Real-time Genome Data Research with In-memory Database Technology (S...Enabling Real-time Genome Data Research with In-memory Database Technology (S...
Enabling Real-time Genome Data Research with In-memory Database Technology (S...
 
Developing a Clinical Decision Support System with Grakn
Developing a Clinical Decision Support System with GraknDeveloping a Clinical Decision Support System with Grakn
Developing a Clinical Decision Support System with Grakn
 
Utilize Digital and Social Media Data to Inform Your Research in Novel Ways
Utilize Digital and Social Media Data to Inform Your Research in Novel WaysUtilize Digital and Social Media Data to Inform Your Research in Novel Ways
Utilize Digital and Social Media Data to Inform Your Research in Novel Ways
 
PhRMA Some Early Thoughts
PhRMA Some Early ThoughtsPhRMA Some Early Thoughts
PhRMA Some Early Thoughts
 
Ontology-enabled Healthcare Applications exploiting Physical-Cyber-Social Big...
Ontology-enabled Healthcare Applications exploiting Physical-Cyber-Social Big...Ontology-enabled Healthcare Applications exploiting Physical-Cyber-Social Big...
Ontology-enabled Healthcare Applications exploiting Physical-Cyber-Social Big...
 
Finding and Accessing Human Genomics Datasets
Finding and Accessing Human Genomics DatasetsFinding and Accessing Human Genomics Datasets
Finding and Accessing Human Genomics Datasets
 
Digital Scholar Webinar: Clinicaltrials.gov Registration and Reporting Documents
Digital Scholar Webinar: Clinicaltrials.gov Registration and Reporting DocumentsDigital Scholar Webinar: Clinicaltrials.gov Registration and Reporting Documents
Digital Scholar Webinar: Clinicaltrials.gov Registration and Reporting Documents
 
Bookman.GIRLeadInstitute.2016.v3.distro
Bookman.GIRLeadInstitute.2016.v3.distroBookman.GIRLeadInstitute.2016.v3.distro
Bookman.GIRLeadInstitute.2016.v3.distro
 
Career Trends Transferring Skills
Career Trends Transferring SkillsCareer Trends Transferring Skills
Career Trends Transferring Skills
 
Leveraging Medical Health Record Data for Identifying Research Study Particip...
Leveraging Medical Health Record Data for Identifying Research Study Particip...Leveraging Medical Health Record Data for Identifying Research Study Particip...
Leveraging Medical Health Record Data for Identifying Research Study Particip...
 
Developing a Replicable Methodology for Automated Identification of Emerging ...
Developing a Replicable Methodology for Automated Identification of Emerging ...Developing a Replicable Methodology for Automated Identification of Emerging ...
Developing a Replicable Methodology for Automated Identification of Emerging ...
 
Semantic Web for 360-degree Health: State-of-the-Art & Vision for Better Inte...
Semantic Web for 360-degree Health: State-of-the-Art & Vision for Better Inte...Semantic Web for 360-degree Health: State-of-the-Art & Vision for Better Inte...
Semantic Web for 360-degree Health: State-of-the-Art & Vision for Better Inte...
 
Engaging Diverse Communities in Cancer Conversations Through Creation of Stru...
Engaging Diverse Communities in Cancer Conversations Through Creation of Stru...Engaging Diverse Communities in Cancer Conversations Through Creation of Stru...
Engaging Diverse Communities in Cancer Conversations Through Creation of Stru...
 

Viewers also liked

Transformational-Generative Grammar
Transformational-Generative GrammarTransformational-Generative Grammar
Transformational-Generative Grammar
Ruth Ann Llego
 
Transformational Grammar by: Noam Chomsky
Transformational Grammar by: Noam ChomskyTransformational Grammar by: Noam Chomsky
Transformational Grammar by: Noam Chomsky
Shiela May Claro
 
Transformational generative grammar
Transformational  generative grammarTransformational  generative grammar
Transformational generative grammar
Baishakhi Amin
 

Viewers also liked (6)

Baobab Health, Cognitive Walkthrough
Baobab Health, Cognitive WalkthroughBaobab Health, Cognitive Walkthrough
Baobab Health, Cognitive Walkthrough
 
How novel compute technology transforms life science research
How novel compute technology transforms life science researchHow novel compute technology transforms life science research
How novel compute technology transforms life science research
 
Transformational generative grammar
Transformational generative grammarTransformational generative grammar
Transformational generative grammar
 
Transformational-Generative Grammar
Transformational-Generative GrammarTransformational-Generative Grammar
Transformational-Generative Grammar
 
Transformational Grammar by: Noam Chomsky
Transformational Grammar by: Noam ChomskyTransformational Grammar by: Noam Chomsky
Transformational Grammar by: Noam Chomsky
 
Transformational generative grammar
Transformational  generative grammarTransformational  generative grammar
Transformational generative grammar
 

Similar to Adventures in Translational Bioinformatics

Translational Data Sharing: Informatics Challenges and Opportunities
Translational Data Sharing: Informatics Challenges and OpportunitiesTranslational Data Sharing: Informatics Challenges and Opportunities
Translational Data Sharing: Informatics Challenges and Opportunities
Harry Hochheiser
 
Khoury ashg2014
Khoury ashg2014Khoury ashg2014
Khoury ashg2014
muink
 
[DSC Europe 23] Ivana Sesic - Use of AI in Public Health.pptx
[DSC Europe 23] Ivana Sesic - Use of AI in Public Health.pptx[DSC Europe 23] Ivana Sesic - Use of AI in Public Health.pptx
[DSC Europe 23] Ivana Sesic - Use of AI in Public Health.pptx
DataScienceConferenc1
 
Bigdatapdi2015 150112111012-conversion-gate02
Bigdatapdi2015 150112111012-conversion-gate02Bigdatapdi2015 150112111012-conversion-gate02
Bigdatapdi2015 150112111012-conversion-gate02
soniamra
 
i need an answer for each discussion.discussion 1Even though I.docx
i need an answer for each discussion.discussion 1Even though I.docxi need an answer for each discussion.discussion 1Even though I.docx
i need an answer for each discussion.discussion 1Even though I.docx
evontdcichon
 

Similar to Adventures in Translational Bioinformatics (20)

The Learning Health System: Thinking and Acting Across Scales
The Learning Health System: Thinking and Acting Across ScalesThe Learning Health System: Thinking and Acting Across Scales
The Learning Health System: Thinking and Acting Across Scales
 
Towards an Environmental Health Sciences Ontology: CHEAR to HHEAR and Beyond
Towards an Environmental Health Sciences Ontology:CHEAR to HHEAR and BeyondTowards an Environmental Health Sciences Ontology:CHEAR to HHEAR and Beyond
Towards an Environmental Health Sciences Ontology: CHEAR to HHEAR and Beyond
 
Translational Data Sharing: Informatics Challenges and Opportunities
Translational Data Sharing: Informatics Challenges and OpportunitiesTranslational Data Sharing: Informatics Challenges and Opportunities
Adventures in Translational Bioinformatics

  • 1. Adventures in Translational Bioinformatics Harry Hochheiser University of Pittsburgh School of Medicine Department of Biomedical Informatics harryh@pitt.edu Harry Hochheiser, harryh@pitt.edu Texas A&M Health Sciences Center February 6, 2013 © 2010 Harry Hochheiser Licensed under Creative Commons Attribution-NonCommercial-NoDerivs license
  • 2. Biomedical Informatics: The use of informatics for the improvement of biomedical research and clinical care. Friedman, “A ‘fundamental theorem’ of biomedical informatics,” JAMIA 2009. Shortliffe & Blois, “The Computer Meets Medicine and Biology: Emergence of a Discipline,” in Shortliffe & Cimino, eds., “Biomedical Informatics: Computer Applications in Health Care and Biomedicine”
  • 3. What is “Translational” research? ● My research is translational... ● ..because I want to get funding? http://www.ncats.nih.gov/research/cts/cts.html
  • 4. Co-Clinical Trials Nardella, Lunardi, Patnaik, and Cantley, “The APL Paradigm and the ‘Co-Clinical Trial’ Project,” Cancer Discovery 2011 ● Acute Promyelocytic Leukemia ● 13 years of work – cloning, description, transgenics, trials ● Successful therapy now approved for clinical use ● Co-clinical trials – synchronize human and animal trials.
  • 5. Measuring Success ● How is success of translational science measured? ● Cures? Therapies? ● Citations mapping model organism studies to T2, T3, T4 success? ● How do we assess impact that doesn't play out in the usual 3-5 year grant cycle?
  • 6. FaceBase http://www.facebase.org National Institute of Dental and Craniofacial Research Five-year initial phase 2009-2014 “..systematically compile the biological instructions to construct the middle region of the human face and precisely define the genetics underlying its common developmental disorders, such as cleft lip and palate” 10 Projects: U-flavor contracts Data Management and Coordination Hub – “One-stop access to craniofacial research data” – “Allow scientists to more rapidly and effectively generate hypotheses and accelerate the pace of their research”.
  • 7. FaceBase Projects Hochheiser, et al. 2011
  • 8. FaceBase Data ● 10 projects ● 4 organisms: mouse, zebrafish, chick, human ● Varying developmental time points ● Data types ● Expression: microarray, ChIP-Seq, miRNA ● Images: 3D MRI, OPT, microMRI, microCT, 3D human facial mesh ● Sequences: Genotypes, GWAS, CNV ● Software: 3D facial image analysis ● Strains: Cre recombinase drivers ● 1 Data management and coordination hub
  • 9. What does it mean to effectively share data? Diversity of data. Diversity of projects, goals, etc. Challenges are both technical and social/organizational.
  • 10. FaceBase Hub Technical Challenges Sequences Data Diversity Images.. miRNA microarrays Anatomy Phenotype Integration UCSC Genome Browser Between datasets Tools Searching Browsing Metadata Genotypes RNA-Seq Data Volume “Controlled Access” Human Data Genotypes Facial Images
  • 11. Metadata: Theory ● Well-defined metadata fields/attributes ● Controlled vocabularies provide consistent terminology for each field ● Link to appropriate resources as needed: NCBI, MGI, etc. ● Additional attributes specific to each data type ● Consistent metadata supports search, navigation...
  • 12. Metadata in practice “What's the difference between mutation and Genotype?” “Uhh. I'll have to get back to you on that.” “Strain name? Is ‘C57B6J’ the same as ‘C57Bl/6J’?” “We're lucky to have any information at all when it comes to these mice...” “The official name of the strain is ____, but everybody calls it ____”
  • 13. Biomedical Ontologies might solve some terminology and structure problems Human Anatomy: EHDAA, FMA Mouse Anatomy: EMAP, MA Mouse/Human Phenotypes: MP/HPO Genes: GO Data Models: MIAME, MiXX, ISA-TAB, OBI
  • 14. Metadata Possibilities: Ontologies & Data Models Ontologies, Data models, etc. User Practice Grand Canyon NPS http://www.fotopedia.com/items/flickr-7553734530
  • 15. Implications Good metadata is vital for search, retrieval, data sharing, reuse. Curation effort can compensate for some inadequacies, but – E_C: curation effort, M_q: metadata quality – E_C = f(1/M_q) – Possibly non-linear – This doesn't scale
  • 16. Search tools ● Search + Link: current practice in bioinformatics repositories ● Text search over metadata ● Linear results list ● Results pages with many links ● Advantages ● Easy, fast ● Disadvantages ● Relationships between items in result list are not clear ● No overviews to help with higher-level meta-understanding
  • 17. The Ontology of Craniofacial Development and Malformation Extend existing ontologies to provide richer models of craniofacial anatomy, development, and malformation: Foundational Model of Anatomy (FMA), Human Phenotype Ontology (HPO), Mouse Anatomy (MA)... Add human developmental description. Human-Mouse links. https://www.facebase.org/content/ocdm But... better models don't guarantee usage
  • 18. The tool gap “Tools have advanced to the point of being able to support users fairly successfully in finding and reading off data (e.g. to classify and find multidimensional relationships of interest) but not in being able to interactively explore these complex relationships in context to infer causal explanations and build convincing biological stories amid uncertainty.” B. Mirel, “Supporting cognition in systems biology analysis: findings on users' processes and design implications,” J. Biomedical Discovery & Collaboration (2009) • Need: Tools that effectively leverage the combination of human intellect and computing power
  • 19. Information Visualization ● Interactive displays of high-dimensional data sets ● Coordinated views facilitate comparison across dimensions and ● Multi-scale ● Overview ● Zoom/filter ● Full details on requested items ● Rapid, incremental, reversible queries ● Avoid 0-hit or million-hit queries.
  • 20. Toward integrative tools How do we build connections? How do we get users to think differently about the data? (Diagram labels: Genotypes, Morphometry, Images, Expression, Sequence; Human, Mouse, Zebrafish)
  • 21. Information Visualization ● Interactive displays of high-dimensional data sets ● Coordinated views facilitate comparison across dimensions and ● Multi-scale ● Overview ● Zoom/filter ● Full details on requested items ● Rapid, incremental, reversible queries ● Avoid 0-hit or million-hit queries.
  • 22. Technology Probe: Connecting related data items • Single search, multiple data types • Links between items indicate connections • Images or other details on demand for direct access to data • Rapid response supports interactivity, exploration, serendipitous discovery
  • 23. Technology Probe: Gene Atlas • Genes displayed on timeline, indicating when active • Images display expression localization • Large image for detailed views. • Mouse-over coordinated links
  • 24. Vision vs. reality ● Data is still coming in ● Gene atlas requires coordinated analysis... ● What can we do with data that is available?
  • 25. Technology Probe: Gene Atlas • Genes displayed on timeline, indicating when active • Images display expression localization • Large image for detailed views. • Mouse-over coordinated links
  • 26. Current Search Interface
  • 27. Timeline view Approximate alignment of datasets on common timeline
  • 28. 3D Facial Meshes Two datasets – Weinberg & Marazita: Caucasian facial norms – Spritz: Tanzanian. Identifiable data: access only upon DAC approval. Query tool: identify subsets for further analysis.
  • 29. UCSC Genome Browser Mirror ● Genomebrowser.facebase.org ● > 30 tracks, mouse mm9 ● RNA-seq ● ChIP-seq ● Various anatomic locations, time points, etc. ● Human CNV hg19
  • 30. Imaging ● microMRI ● microCT ● OPT ● Interactive Browsers?
  • 31. Fishface Atlas of zebrafish craniofacial development C. Kimmel, et al.
  • 32. Current Status As of 4 February 2013: > 200 datasets; > 100 in queue awaiting curation or investigator approval. Continuing to refine metadata and data organization. Expecting large influx of data over the next year.
  • 33. Toward Greater Integration Currently: links via metadata (comparable organism, stage, gene, etc.). Ideally: links via data (commonly expressed genes, sequences, pathways). “Easy” for similar data: expression, sequences. Harder for diverse data: images?
  • 34. FaceBase as a model ● Diverse researchers and projects ● Focus: specific clinical domain ● Goal: collect and coordinate data – community resource ● Outstanding challenges? ● Is this even the right idea?
  • 35. Build it and they (may) come? ● Speculative claim: assemble good data and it will be of use ● How to evaluate this? ● How to compare to n R01s that might otherwise have been funded? (question asked by program officer) ● ENCODE controversy “..the lesson I learned from ENCODE is that projects like ENCODE are not a good idea.” M. Eisen http://www.michaeleisen.org/blog/?p=1179
  • 36. Three key factors Tools Incentives Outreach
  • 37. Tools Desperately needed! Search tools – finding data Annotation Tools – describing data – Using ontologies – Common data models Minimize the effort required to provide the high-quality metadata needed for translational bioinformatics.
  • 38. Perhaps the worst bioinformatics data management tool
  • 39. Perhaps the best bioinformatics data management tool
  • 40. Pros and Cons of Spreadsheets Pros: – Ubiquitous – Simple – Flexible – Familiar Cons: – ad-hoc data models – ad-hoc data fields – No consistency, provenance... Alternatives: Analytic tools: more rigor, less generality. Experiment Metadata
  • 41. Usability and Ontologies Can we at least get the terms right? Maguire et al. 2012
  • 42. Beyond Autocomplete ● Substring autocomplete – state of art for finding terms in biomedical ontologies ● Cons: no context, room for specialization or generalization? ● Can we build a tool that will easily support both search and navigation? Foundational Model Explorer http://fme.biostr.washington.edu/FME/index.html
  • 43. Why not let the curators do it? ● Expense ● Accuracy ● Speed ● Scientists know their data best Claim – To be sustainable, data sharing must be self-curated. But.. why should anyone bother?
  • 44. Scientific incentives... Traditionally.... Research, Results, Funding, Papers. Where does data sharing fit in?
  • 45. How to promote sharing.. 1. Better tools → reduced effort: E_annotation < E_collection + epsilon 2. Recognize effort: alternative models of academic credit. ImpactStory... Altmetrics: Value all research products (H. Piwowar, Nature 1/9/13, doi:10.1038/493159a) “For all new grant applications from 14 January, the US National Science Foundation (NSF) asks a principal investigator to list his or her research “products” rather than “publications” in the biographical sketch section.”
  • 46. Related Efforts ● Genomics Research in Alpha-1 Antitrypsin Deficiency and Sarcoidosis (GRADS) – microbiome, expression, phenotype sharing ● LAMHDI/MONARCH – User tools for ontologically-driven discovery of model genes for human diseases ● Research Social Network Collaboration Finding tool – using VIVO and related networking tools to find collaborators
  • 47. Acknowledgments FaceBase: Mary Marazita, Jeff Murray, Yang Chai, Bruce Aronow, Kristin Artinger, Teri Beaty, Jim Brinkley, David Clouthier, Michael Cunningham, Michael Dixon, Leah Rae Donahue, Scott E. Fraser, Benedikt Hallgrimsson, Junichi Iwata, Ophir Klein, Stephen Murray, Fernando Pardo-Manuel de Villena, John Postlethwait, Steven Potter, Linda Shapiro, Richard Spritz, Axel Visel, Seth Weinberg, Paul Trainor. U. Pittsburgh: Mike Becich, Becky Boes, Chuck Borromeo, Lance Kennelty, Annette Krag-Jensen, Tom Maher, Johnson Paul, Linda Schmandt, Shiyi Shen, Bill Shirey, Cristy Spino, Mike Stefanko, Justin Stickel. OCDM: James Brinkley; Jose Leonardo Mejino; Landon Detwiler; Ravensara Travillian; Melissa Clarkson; Timothy Cox; Carrie Heike; Michael Cunningham; Linda Shapiro. NIDCR: Steven Scholnick, Emily Harris, Jeannine Helm, Nadya Lumelsky, Lillian Shum. Support: NIH Grants U01 DE020057, 3U01DE020050-03S1