Bioinformatics: Definitions, Challenges and Impact on Health Care Systems
Daniel Masys, M.D.
Professor and Chair, Department of Biomedical Informatics
Vanderbilt University School of Medicine
What is Bioinformatics?
Health Informatics compared to Bioinformatics
Scope of Bioinformatics
Genomics data and patient care
Impact of Bioinformatics on Health Information Systems
Central Dogma of Molecular Biology: DNA (replication) → transcription → RNA → translation → Protein (post-translational modification) → Phenotype
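The transcription and translation steps of the central dogma can be illustrated with a minimal Python sketch. This is a toy, not a bioinformatics tool: the function names and the example sequence are made up for illustration, and the codon table holds only the four standard-code codons that the example needs.

```python
# Toy subset of the standard genetic code: only the codons used below.
# "*" marks a stop codon.
CODON_TABLE = {"AUG": "M", "GCU": "A", "UGG": "W", "UAA": "*"}


def transcribe(dna_coding_strand: str) -> str:
    """DNA -> mRNA: on the coding strand, T is simply read as U."""
    return dna_coding_strand.upper().replace("T", "U")


def translate(mrna: str) -> str:
    """mRNA -> protein: read codons in frame until a stop codon."""
    protein = []
    for i in range(0, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE.get(mrna[i:i + 3], "X")  # "X" = not in toy table
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


mrna = transcribe("ATGGCTTGGTAA")   # hypothetical 12-base sequence
print(mrna)                          # AUGGCUUGGUAA
print(translate(mrna))               # MAW
```

Post-translational modification and the step from protein to phenotype have no comparably simple string rule, which is part of why the later stages are harder to model.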
What is Bioinformatics? Definitions…
NIH Working Definition
Bioinformatics: Research, development, or application of computational tools and approaches for expanding the use of biological, medical, behavioral or health data, including those to acquire, store, organize, archive, analyze, or visualize such data.
http://www.bisti.nih.gov/CompuBioDef.pdf
Another… NCBI (National Center for Biotechnology Information)
Bioinformatics is the field of science in which biology, computer science, and information technology merge into a single discipline. The ultimate goal of the field is to enable the discovery of new biological insights and to create a global perspective from which unifying principles in biology can be discerned.
Central Dogma of Molecular Biology: DNA → RNA → Protein → Phenotype (tissues, organs, organisms). Associated fields: Genomics, Transcriptomics, Functional Genetics, Proteomics
Proteome and Proteomics
Proteome – the entire set of proteins (and other gene products) made by the genome.
Proteomics – study of the interactions among proteins in the proteome, including networks of interacting proteins and metabolic considerations. Also includes differences in developmental stages, tissues and organs.
Nutrition and storage
Contraction and mobility
Correspond to (and are derived from) genome databases
Record observations of protein-protein interactions in cells
Attempts to detail interactions observed in thousands of small-scale experiments described in published articles
BIND: Biomolecular Interaction Network Database
DIP: Database of Interacting Proteins
MIPS: Munich Information Center for Protein Sequences
PRONET: Protein interaction on the Web
Many others, both academic and commercial
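The pairwise observations these databases record can be pictured as an undirected network. The sketch below is a minimal illustration of that data structure only; it does not reproduce the schema or API of BIND, DIP, MIPS or PRONET, and the protein names P1 to P4 are placeholders.

```python
from collections import defaultdict


def build_network(observed_pairs):
    """Turn (protein, protein) observations into an undirected adjacency map."""
    network = defaultdict(set)
    for a, b in observed_pairs:
        network[a].add(b)
        network[b].add(a)
    return network


def shared_partners(network, x, y):
    """Proteins observed to interact with both x and y."""
    return network.get(x, set()) & network.get(y, set())


# Hypothetical observations, e.g. one pair per small-scale experiment.
pairs = [("P1", "P2"), ("P2", "P3"), ("P1", "P3"), ("P3", "P4")]
net = build_network(pairs)
print(sorted(net["P3"]))                        # ['P1', 'P2', 'P4']
print(sorted(shared_partners(net, "P1", "P2")))  # ['P3']
```

Storing each interaction in both directions keeps partner lookups cheap, at the cost of holding every edge twice.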
Controlled Vocabularies in Bioinformatics
The Gene Ontology http://www.geneontology.org/
Knowledge about gene function (the ontology itself)
Annotation of gene products (for comparisons)
The MGED Ontology (arising from MIAME)
http://mged.sourceforge.net/
Annotation of microarray experiments for public repositories
Clinical Bioinformatics Ontology:
Annotation of gene tests in electronic medical records
MIAPE from Proteomics Standards Initiative (PSI)
Annotation of proteomics experiments for public repositories
Genomics Data and Patient Care From genotype to phenotype
Human Disease Gene Specifics
Genes linked to human diseases (9-2004)
+ 425 in 2 yrs
1700/20,000 ≈ 8.5% of loci (roughly 9%)
Informatics Issues related to Genomics Data and Patient Care
Linking known data for genes causing human diseases to clinical decision support and EMR documentation
Representation of genetic data in electronic medical records
Clinical Bioinformatics: Common Questions
What genes cause the condition?
What is the normal function of the gene?
What mutations have been linked to diseases?
How does the mutation alter gene function?
What laboratories are performing DNA tests?
Are there gene therapies or clinical trials?
What names are used to refer to the genes and the diseases?
What other conditions are linked to these same genes?
Answers exist online
… but it is not easy; answers in many places
Can’t navigate by gene names - must use hot links and numeric identifiers
The number and function of alternate forms of the protein are inconsistently reported
Synonymy (many names, same meaning) and polysemy (same name, different meanings) cause confusion
Upper and lower case are used for species distinctions
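The naming problems listed above can be sketched in Python. Everything in the example is hypothetical: the records, symbols, aliases and numeric identifiers are invented, and the all-capitals human versus capitalized mouse symbols are an assumed convention used only to show why a lookup has to stay case-sensitive.

```python
from collections import defaultdict

# Invented records for illustration; not taken from any real database.
RECORDS = [
    {"id": 1001, "species": "human", "symbol": "GENE1", "aliases": ["G1", "XYZ"]},
    {"id": 2001, "species": "mouse", "symbol": "Gene1", "aliases": ["G1"]},
    {"id": 1002, "species": "human", "symbol": "GENE2", "aliases": ["XYZ"]},
]


def build_index(records):
    """Map every name (symbol or alias) to the set of numeric identifiers using it."""
    index = defaultdict(set)
    for rec in records:
        for name in [rec["symbol"], *rec["aliases"]]:
            index[name].add(rec["id"])  # case-sensitive on purpose
    return index


def resolve(index, name):
    """Return all identifiers a name could refer to (empty list if unknown)."""
    return sorted(index.get(name, set()))


index = build_index(RECORDS)
print(resolve(index, "GENE1"))  # [1001]        human symbol
print(resolve(index, "Gene1"))  # [2001]        same letters, different case
print(resolve(index, "XYZ"))    # [1001, 1002]  polysemy: one name, two genes
```

Synonymy shows up in the other direction: "GENE1", "G1" and "XYZ" all lead to identifier 1001, which is why stable numeric identifiers end up as the reliable key.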
Major Challenges of Navigation
Complexity of data
Dynamic nature of the data
Diverse foci and number of data/knowledge base systems
Data and knowledge representation lack standards
Can navigate if you know what you are looking for.
Genetics Home Reference
Consumer health resource to help the public navigate from phenotype to genotype.
Focus on health implications of the Human Genome Project.
Mitchell, Fun, McCray, JAMIA, 2004 Nov;11(6):439-447
Genetics is Impacting Medicine Today
1700 genes & health conditions
> 1100 gene tests for diagnosis
Relate to diagnosis, therapy, drug dosage, occupational hazards, reproductive plans, health risks, ….
CYP450 alleles: exaggerated, diminished or ultra-rapid drug responses. E.g., warfarin: 93% of patients are OK on standard doses; 7% of patients have severe hemorrhage. CYP2C9*2 and CYP2C9*3 are the most severe of 6 known mutations.
Sickle Cell trait carrier and malaria parasite
PKU and avoidance of phenylalanine
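For the warfarin example, a minimal sketch of how a stored genotype might drive a decision-support flag is shown below. It assumes the genotype is recorded as a star-allele string such as "*1/*3"; the function name and that format are assumptions for illustration, and the check is not clinical guidance. The flagged alleles are the two the slide names as most severe.

```python
# The two CYP2C9 alleles named above as most severe.
FLAGGED_ALLELES = {"*2", "*3"}


def carries_flagged_allele(genotype: str) -> bool:
    """True if either allele in a 'a/b' star-allele string is flagged."""
    alleles = {allele.strip() for allele in genotype.split("/")}
    return bool(alleles & FLAGGED_ALLELES)


print(carries_flagged_allele("*1/*1"))  # False
print(carries_flagged_allele("*1/*3"))  # True
```

A real system would need a coded, standardized representation of the test result rather than free text for a rule like this to be dependable.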
Non-small cell lung cancer: ~140,000 patients/yr
Iressa (AstraZeneca) causes remission in 1 of 10 patients if taken daily for life.
Iressa efficacy correlates with EGFR mutation in the tumor. Now have gene testing for EGFR so can target appropriate people. http://www.sciencemag.org/cgi/content/full/305/5688/1222a
BUT: AstraZeneca can’t make money on only 14,000 patients per year.
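The 14,000 figure follows from the two numbers given earlier: about 140,000 patients per year and a 1-in-10 response rate. A short worked calculation, with a hypothetical function name:

```python
def expected_responders(annual_patients: int, one_in: int) -> int:
    """Patients per year expected to respond when 1 in `one_in` responds."""
    return annual_patients // one_in


print(expected_responders(140_000, 10))  # 14000
```

Gene testing therefore narrows the treated population to roughly a tenth of the original market.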