Semantic Web for Health Care and Biomedical Informatics


Amit Sheth, "Semantic Web for Health Care and Biomedical Informatics," Keynote at NSF Biomed Web Workshop, Corbett, Oregon, December 4-5, 2007.

  • Biomedical informatics needs a connection between the macro level (medical informatics) and the micro level (bioinformatics). Information is spread across several kinds of sources, from text to structured data, and the Semantic Web aims to bridge this gap. It will provide more advanced capabilities for search, integration, and analysis, and links to new insights and discoveries. Example questions: “Does this gene have a causal relationship with this disease?” “Which gene would be the best candidate for a knockout experiment, based on the information we have?” “What is the probable course for a patient who has these symptoms and this genetic background?”
  • We see a change of paradigm on the Web. Researchers once had to navigate extensively through pages to obtain the answer to a question. We are getting closer to the time when one can pose a question to the Web and have the answer computed from integrated sources. Some key areas of work include:
      • How to integrate pages, databases, services and human contributions on the Web
      • How to detect and propagate changes, and control authorship and trust
      • How to ask questions and visualize the results
      • How to automatically perform knowledge discovery over this global knowledge base
  • 1. The whole pathway is shown, from the dolichol compound via the first sugar, N-Acetyl-D-glucosaminyldiphosphodolichol (GlcNAc-PP-dol), to the N-glycan G00022 (KEGG accession number), or (GlcNAc)7 (Man)3 (Asn)1. These are just the numbers of residues; the glycan does not have a common name, but belongs to the class of “pentaantennary complex-type sugar chains”.
    2. GNT-I (UDP-N-acetyl-D-glucosamine:3-(alpha-D-mannosyl)-beta-D-mannosyl-glycoprotein 2-beta-N-acetyl-D-glucosaminyltransferase) catalyzes the reaction from 3-(alpha-D-mannosyl)-beta-D-mannosyl-R to 3-(2-[N-acetyl-beta-D-glucosaminyl]-alpha-D-mannosyl)-beta-D-mannosyl-R.
    3. GNT-V (UDP-N-acetyl-D-glucosamine:6-[2-(N-acetyl-beta-D-glucosaminyl)-alpha-D-mannosyl]-glycoprotein 6-beta-N-acetyl-D-glucosaminyltransferase) catalyzes the reaction from 6-(2-[N-acetyl-beta-D-glucosaminyl]-alpha-D-mannosyl)-beta-D-mannosyl-R to 6-(2,6-bis[N-acetyl-beta-D-glucosaminyl]-alpha-D-mannosyl)-beta-D-mannosyl-R, which is part of the glycan G00021.
    4. The part of the ontology tree just shows where GNT-V is.
    5. The GNT-V entry in the ontology shows that N-glycan_beta_GlcNAc_9 is added, with the help of the enzyme GNT-V, to a sugar containing the residue N-glycan_alpha_man_4.
    Why this is important for glycomics: G00021 is a so-called tetraantennary complex N-glycan. When the red GlcNAc beta 1-6 is present due to GNT-V, this chain can be extended with polylactosamine. Polylactosamine is found in some metastatic cells. A challenge now is to find out whether this glycan structure is always made by GNT-V; then we might be able to say something about GNT-V and cancer. That is where probabilistic reasoning comes into play. Mention that man_4 and glcnac_9 are contextual residues. Mention GlycoTree.
  • NIDA undertook a project to study the genes implicated in nicotine dependency. The result of this study was a list of genes with their gene symbols, chromosomal locations and a brief comment about each gene. These genes were all from humans. The next step in the study is to correlate these genes with biological pathway information to answer a variety of queries: list all interactions between genes; find ‘hub’ genes, i.e. genes that are highly active in terms of participation in pathways; or categorize genes by their anatomical or tissue location. Clearly, this required integrating genome and pathway information.
  • We identified the primary biological pathway information sources, namely HumanCyc, KEGG and Reactome. The primary genome information sources were Entrez Gene, and HomoloGene for homology information. Although we started with human genes only, we later added homologous gene records for four model organisms: zebrafish, fruit fly, mouse and C. elegans. The Gene Ontology is mainly a resource for GO annotation information. We needed to integrate these data sources effectively to answer the queries discussed on the previous slide.
  • Schema integration: as discussed earlier, we integrate the two knowledge models at the schema level, i.e. in terms of classes and relationships. Hence, instead of creating new classes for ‘pathway’ and ‘protein’, we re-used these concepts as already defined in the BioPAX ontology. These two classes serve as anchors between the two schemas, so a query can use protein as the common class to traverse from genome information to pathway information.
  • One of the primary advantages of an ontology is the ability to create and execute inference rules that lead to information gain, i.e. they make explicit information that could otherwise be obtained only through human interpretation of the actual data. For example, if we revisit the first query: given that two genes interact with each other, and given that a certain number of parameters are met, we can assert that the gene products also interact with each other. We can formally state the rule as shown.
  • Here we lay out a scenario in which a user would have to browse through multiple data sources to answer a query: “How are glycosyltransferase activity and congenital muscular dystrophy related?”
  • Here we show a user MANUALLY spotting from a web page the important concepts to answer his or her query.
  • Once the information is enhanced with ontologies, finding the connections is a matter of querying. There is no need for extensive navigation in an integrated environment. We show that three datasets (LARGE, MIM and GO) can be integrated to meet the user's needs.
  • A demonstration of how a user interface can benefit from ontologies to guide the user in formulating a query. The ontology schema is shown in the bottom-right corner as a reference to where the program is reading the possible connections between concepts.
  • Here the query builder is shown in the context of a bigger application (Tcruzi PSE), along with different perspectives for exploring results. Graphs are good for finding connections, while charts are good for an overview.
  • By N-glycosylation process, we mean the identification and quantification of glycopeptides and the separation and identification of N-glycans.
      • Proteolysis: treat with trypsin.
      • Separation technique I: chromatography, such as lectin affinity chromatography.
      • PNGase F: we get fractions that contain peptides and glycans; we focus only on peptides.
      • Separation technique II: chromatography, such as reverse-phase chromatography.
  • Core clinical/biomedical problems that we can address today or in the future. What are the semantic web technologies that can help?

    1. 1. Semantic Web for Health Care and Biomedical Informatics. Keynote at NSF Biomed Web Workshop, December 4-5, 2007. Amit P. Sheth [email_address]. Thanks: Pablo Mendes, Satya Sahoo and the Kno.e.sis team; collaborators at Athens Heart Center (Dr. Agrawal), NLM (Olivier Bodenreider), CCRC, UGA (Will York), CCHMC (Bruce Aronow)
    2. 2. Outline
        • Semantic Web – very brief intro
        • Scenarios to demonstrate the applications and benefits of semantic web technologies
            • Health care
            • Biomedical research
    3. 3. Biomedical Informatics... Medical Informatics Bioinformatics Etiology Pathogenesis Clinical findings Diagnosis Prognosis Treatment Genome Transcriptome Proteome Metabolome Physiome ...ome Genbank Uniprot ...needs a connection Hypothesis Validation Experiment design Predictions Personalized medicine Semantic Web research aims at providing this connection! More advanced capabilities for search, integration, analysis, linking to new insights and discoveries! Pubmed Clinical Biomedical Informatics
    4. 4. Evolution of the Web (1997 to 2007)
        • Web of pages: text, manually created links; extensive navigation
        • Web of databases: dynamically generated pages; web query interfaces
        • Web of services: data = service = data, mashups; ubiquitous computing
        • Web of people: social networks, user-created content; GeneRIF, Connotea
        • Web as an oracle / assistant / partner: “ask the Web”; using semantics to leverage text + data + services + people
    5. 5. Semantic Web Enablers and Techniques
        • Ontology: agreement on a common vocabulary and domain knowledge; schema + knowledge base
        • Semantic annotation (metadata extraction): manual, semi-automatic (automatic with human verification), automatic
        • Reasoning/computation: semantics-enabled search, integration, complex queries, analysis (paths, subgraphs), pattern finding, mining, hypothesis validation, discovery, visualization
    6. 6. Maturing capabilities and ongoing research
        • Text mining: entity recognition, relationship extraction
        • Integrating text, experimental data, curated and multimedia data
        • Clinical and scientific workflows with semantic web services
        • Hypothesis-driven retrieval of scientific literature, undiscovered public knowledge
    7. 7. Metadata and Ontology: Primary Semantic Web enablers Shallow semantics Deep semantics Expressiveness, Reasoning
    8. 8. Characteristics of Semantic Web Self Describing Machine & Human Readable Issued by a Trusted Authority Easy to Understand Convertible Can be Secured The Semantic Web: XML, RDF & Ontology Adapted from William Ruh (CISCO)
    9. 9. Open Biomedical Ontologies: many ontologies exist
    10. 10. Drug Ontology Hierarchy (showing is-a relationships) interaction_ with_non_ drug_reactant owl:thing prescription_drug_ brand_name brandname_undeclared brandname_composite prescription_drug monograph_ix_class cpnum_ group prescription_drug_ property indication_ property formulary_ property non_drug_ reactant interaction_property property formulary brandname_individual interaction_with_prescription_drug interaction indication generic_ individual prescription_drug_ generic generic_ composite interaction_with_monograph_ix_class
    11. 11. N-Glycosylation metabolic pathway. GNT-I attaches GlcNAc at position 2: UDP-N-acetyl-D-glucosamine + alpha-D-Mannosyl-1,3-(R1)-beta-D-mannosyl-R2 <=> UDP + N-Acetyl-beta-D-glucosaminyl-1,2-alpha-D-mannosyl-1,3-(R1)-beta-D-mannosyl-R2. GNT-V attaches GlcNAc at position 6: UDP-N-acetyl-D-glucosamine + G00020 <=> UDP + G00021. Ontology terms: N-acetyl-glucosaminyl_transferase_V, N-glycan_beta_GlcNAc_9, N-glycan_alpha_man_4
    12. 12. Opportunity: exploiting clinical and biomedical data Health Information Services Elsevier iConsult Scientific Literature PubMed 300 Documents Published Online each day User-contributed Content ( Informal) GeneRifs NCBI Public Datasets Genome, Protein DBs new sequences daily Laboratory Data Lab tests, RTPCR, Mass spec Clinical Data Personal health history Search, browsing, complex query, integration, workflow, analysis, hypothesis validation, decision support. binary text
    13. 13. Scenario 1
        • Status: In use today
        • Where: Athens Heart Center
        • What: Use of semantic Web technologies for clinical decision support
    14. 14. Operational since January 2006
    15. 15. Active Semantic Electronic Medical Records (ASEMR)
        Goals:
        • Increase efficiency with decision support
            • formulary, billing, reimbursement
            • real-time chart completion
            • automated linking with billing
        • Reduce errors, improve patient satisfaction and reporting
            • drug interactions, allergy, insurance
        • Improve profitability
        Technologies:
        • Ontologies, semantic annotations and rules
        • Service Oriented Architecture
        Thanks -- Dr. Agrawal, Dr. Wingate, and others. ISWC2006 paper
    16. 16. Demonstration
    17. 17. ASEMR Efficiency: chart completion before the preliminary deployment; chart completion after the preliminary deployment
    18. 18. Scenario 2
        • Status: Demonstration
        • Where: W3C Health Care and Life Sciences (HCLS) interest group
        • What: Using semantic web to aggregate and query data about Alzheimer’s
    19. 19. Scenario 2: Scientific Data Sets for Alzheimer’s
    20. 20. SPARQL Query spanning multiple sources
    21. 21. Scenario 3
        • Status: Completed research
        • Where: NIH
        • What: Understanding the genetic basis of nicotine dependence. Integrate gene and pathway information and show how three complex biological queries can be answered by the integrated knowledge base.
        • How: Semantic Web technologies (especially RDF, OWL, and SPARQL) support information integration and make it easy to create semantic mashups (semantically integrated resources).
    22. 22. Motivation
        • NIDA study on nicotine dependency
        • List of candidate genes in humans
        • Analysis objectives include:
            • Finding interactions between genes
            • Identifying active genes (those in the maximum number of pathways)
            • Identifying genes based on anatomical locations
        • Requires integration of genome and biological pathway information
    23. 23. Genome and pathway information integration
        • Entrez Gene
        • Reactome: pathway, protein, pmid
        • KEGG: pathway, protein, pmid
        • HumanCyc: pathway, protein, pmid
        • GeneOntology: GO ID
        • HomoloGene: HomoloGene ID
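The analysis objectives listed under Motivation (interactions between genes, and genes taking part in the maximum number of pathways) can be illustrated with a small traversal from gene to protein to pathway. This is only a sketch of the idea, not the study's implementation: the gene, protein and pathway identifiers are made-up placeholders, and the two dictionaries stand in for the integrated knowledge base.

```python
# Hypothetical integrated records; all identifiers are placeholders, not study data.
GENE_TO_PROTEIN = {"geneA": "protA", "geneB": "protB", "geneC": "protC"}
PROTEIN_TO_PATHWAYS = {
    "protA": {"pathway1", "pathway2", "pathway3"},
    "protB": {"pathway2"},
    "protC": {"pathway2", "pathway3"},
}


def pathways_per_gene(gene_to_protein, protein_to_pathways):
    """Traverse gene -> protein -> pathway, using protein as the shared class."""
    return {
        gene: set(protein_to_pathways.get(protein, set()))
        for gene, protein in gene_to_protein.items()
    }


def hub_genes(gene_pathways):
    """Return the genes that participate in the largest number of pathways."""
    if not gene_pathways:
        return []
    top = max(len(pathways) for pathways in gene_pathways.values())
    return sorted(g for g, pathways in gene_pathways.items() if len(pathways) == top)


def genes_sharing_pathway(gene_pathways):
    """Return sorted gene pairs that have at least one pathway in common."""
    genes = sorted(gene_pathways)
    return [
        (a, b)
        for i, a in enumerate(genes)
        for b in genes[i + 1:]
        if gene_pathways[a] & gene_pathways[b]
    ]


GENE_PATHWAYS = pathways_per_gene(GENE_TO_PROTEIN, PROTEIN_TO_PATHWAYS)
print(hub_genes(GENE_PATHWAYS))
print(genes_sharing_pathway(GENE_PATHWAYS))
```

Using protein as the join key mirrors the role the notes give to the shared ‘protein’ and ‘pathway’ classes as anchors between the genome and pathway schemas.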
    24. 24. JBI
    25. 25. BioPAX ontology Entrez Knowledge Model (EKoM)
    26. 26. Deductive Reasoning: Protein-Protein Interaction
        RULE: given that two genes interact with each other, and given that a certain number of parameters are met, we can assert that the gene products also interact with each other.
        IF (x have_common_pathway y) AND (x rdf:type gene) AND (y rdf:type gene) AND (x has_product m) AND (y has_product n) AND (m rdf:type gene_product) AND (n rdf:type gene_product) THEN (m interacts_with n)
        Diagram labels: gene1, gene2, have_common_pathway, has_product, gene_product, interacts_with, associated_with, database_identifier 1, database_identifier 2
    27. 27. Scenario 4
        • Status: Completed research
        • Where: NIH
        • What: queries across integrated data sources
            • Enriching data with ontologies for integration, querying, and automation
            • Ontologies beyond vocabularies: the power of relationships
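The IF/THEN rule on this slide can be sketched as forward chaining over a set of triples. This is a minimal Python illustration rather than the rule engine used in the study; the conclusion predicate `interacts_with` is taken from the graph labels on the slide, and the instance names (`gene1`, `product1`, and so on) are placeholders.

```python
def infer_product_interactions(triples):
    """Apply the slide's rule: products of two genes that share a pathway interact."""
    facts = set(triples)
    genes = {s for s, p, o in facts if p == "rdf:type" and o == "gene"}
    products = {s for s, p, o in facts if p == "rdf:type" and o == "gene_product"}
    has_product = [(s, o) for s, p, o in facts if p == "has_product"]

    inferred = set()
    for x, predicate, y in facts:
        # Premise: (x have_common_pathway y), with x and y both typed as gene.
        if predicate != "have_common_pathway" or x not in genes or y not in genes:
            continue
        for gene_x, m in has_product:
            if gene_x != x or m not in products:
                continue
            for gene_y, n in has_product:
                if gene_y == y and n in products:
                    # Conclusion: (m interacts_with n).
                    inferred.add((m, "interacts_with", n))
    return inferred


# Placeholder facts mirroring the shape of the slide's diagram.
FACTS = [
    ("gene1", "rdf:type", "gene"),
    ("gene2", "rdf:type", "gene"),
    ("gene1", "have_common_pathway", "gene2"),
    ("gene1", "has_product", "product1"),
    ("gene2", "has_product", "product2"),
    ("product1", "rdf:type", "gene_product"),
    ("product2", "rdf:type", "gene_product"),
]
print(infer_product_interactions(FACTS))
```

The inferred triples are returned separately instead of being added to the input, so asserted and derived facts stay distinguishable.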
    28. 28. Use data to test hypothesis Glycosyltransferase Congenital muscular dystrophy Link between glycosyltransferase activity and congenital muscular dystrophy? Adapted from: Olivier Bodenreider, presentation at HCLS Workshop, WWW07 gene GO PubMed Gene name OMIM Sequence Interactions
    29. 29. In a Web pages world… Adapted from: Olivier Bodenreider, presentation at HCLS Workshop, WWW07 Congenital muscular dystrophy, type 1D (GeneID: 9215) has_associated_disease has_molecular_function Acetylglucosaminyl-transferase activity
    30. 30. With the semantically enhanced data. From medinfo paper. Adapted from: Olivier Bodenreider, presentation at HCLS Workshop, WWW07
        Query: SELECT DISTINCT ?t ?g ?d { ?t is_a GO:0016757 . ?g has_molecular_function ?t . ?g has_associated_phenotype ?b2 . ?b2 has_textual_description ?d . FILTER regex(?d, "muscular dystrophy", "i") . FILTER regex(?d, "congenital", "i") }
        Diagram labels: MIM:608840 Muscular dystrophy, congenital, type 1D GO:0008375 has_associated_phenotype has_molecular_function EG:9215 LARGE acetylglucosaminyl-transferase GO:0016757 glycosyltransferase GO:0008194 isa GO:0008375 acetylglucosaminyl-transferase GO:0016758
    31. 31. Scenario 5
        • Status: Research prototype and in progress
            • Workflow with Semantic Annotation of Experimental Data already in use
        • Where: UGA
        • What:
            • Knowledge-driven query formulation
            • Semantic Problem Solving Environment (PSE) for Trypanosoma cruzi (Chagas Disease)
    32. 32. Knowledge-driven query formulation
        • Complex queries can also include:
            • on-the-fly Web services execution to retrieve additional data
            • inference rules to make implicit knowledge explicit
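The graph pattern in this query can be mimicked with plain Python over an in-memory set of triples, which may help show what the query engine is asked to do. This is a sketch under stated assumptions: the identifiers come from the slide, but the `is_a` chain between the GO terms is an assumed ordering reconstructed from the flattened diagram, and `is_a` is followed transitively here, whereas the query as written matches a single `is_a` edge.

```python
import re

# Identifiers are from the slide; the is_a ordering below is an assumption.
TRIPLES = {
    ("GO:0008375", "is_a", "GO:0008194"),
    ("GO:0008194", "is_a", "GO:0016758"),
    ("GO:0016758", "is_a", "GO:0016757"),
    ("EG:9215", "has_molecular_function", "GO:0008375"),
    ("EG:9215", "has_associated_phenotype", "MIM:608840"),
    ("MIM:608840", "has_textual_description",
     "Muscular dystrophy, congenital, type 1D"),
}


def terms_under(triples, root):
    """Return root plus every term linked to it through a chain of is_a edges."""
    found = {root}
    changed = True
    while changed:
        changed = False
        for s, p, o in triples:
            if p == "is_a" and o in found and s not in found:
                found.add(s)
                changed = True
    return found


def find_links(triples, root, keywords):
    """Find (term, gene, description) rows, like SELECT DISTINCT ?t ?g ?d."""
    terms = terms_under(triples, root)
    rows = set()
    for g, p1, t in triples:
        if p1 != "has_molecular_function" or t not in terms:
            continue
        for g2, p2, phenotype in triples:
            if g2 != g or p2 != "has_associated_phenotype":
                continue
            for b, p3, d in triples:
                if b != phenotype or p3 != "has_textual_description":
                    continue
                # Case-insensitive match, like FILTER regex(?d, ..., "i").
                if all(re.search(k, d, re.IGNORECASE) for k in keywords):
                    rows.add((t, g, d))
    return sorted(rows)


print(find_links(TRIPLES, "GO:0016757", ["muscular dystrophy", "congenital"]))
```

A real deployment would load the RDF into a triple store and run SPARQL directly; the nested loops only make the join order explicit.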
    33. 33. T.Cruzi PSE Query Interface Figure 4: Semantic annotation of ms scientific data
    34. 34. N-Glycosylation Process ( NGP ) Cell Culture Glycoprotein Fraction Glycopeptides Fraction extract Separation technique I Glycopeptides Fraction n*m n Signal integration Data correlation Peptide Fraction Peptide Fraction ms data ms/ms data ms peaklist ms/ms peaklist Peptide list N-dimensional array Glycopeptide identification and quantification proteolysis Separation technique II PNGase Mass spectrometry Data reduction Data reduction Peptide identification binning n 1
    35. 35. Semantic Annotation Applications Semantic Web Process to incorporate provenance Storage Standard Format Data Raw Data Filtered Data Search Results Final Output Agent Agent Agent Agent Biological Sample Analysis by MS/MS Raw Data to Standard Format Data Pre- process DB Search (Mascot/Sequest) Results Post-process (ProValt) O I O I O I O I O Biological Information
    36. 36. ProPreO: Ontology-mediated provenance 830.9570 194.9604 2 580.2985 0.3592 688.3214 0.2526 779.4759 38.4939 784.3607 21.7736 1543.7476 1.3822 1544.7595 2.9977 1562.8113 37.4790 1660.7776 476.5043 parent ion m/z fragment ion m/z ms/ms peaklist data fragment ion abundance parent ion abundance parent ion charge M ass S pectrometry (MS) Data
    37. 37. ProPreO: Ontology-mediated provenance <ms-ms_peak_list> <parameter instrument=“micromass_QTOF_2_quadropole_time_of_flight_mass_spectrometer” mode=“ms-ms”/> <parent_ion m-z =“830.9570” abundance=“194.9604” z=“2”/> <fragment_ion m-z =“580.2985” abundance=“0.3592”/> <fragment_ion m-z =“688.3214” abundance=“0.2526”/> <fragment_ion m-z =“779.4759” abundance=“38.4939”/> <fragment_ion m-z =“784.3607” abundance=“21.7736”/> <fragment_ion m-z =“1543.7476” abundance=“1.3822”/> <fragment_ion m-z =“1544.7595” abundance=“2.9977”/> <fragment_ion m-z =“1562.8113” abundance=“37.4790”/> <fragment_ion m-z =“1660.7776” abundance=“476.5043”/> </ms-ms_peak_list> Ontological Concepts Semantically Annotated MS Data
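Once the peak list is wrapped in annotated XML as on this slide, it can be read with a standard XML parser instead of relying on the position of each number. The sketch below uses Python's `xml.etree.ElementTree` on the slide's example, with the typographic quotes replaced by straight quotes so that it is well-formed XML; the helper names `parse_peak_list` and `base_peak` are made up for illustration.

```python
import xml.etree.ElementTree as ET

# The slide's example, with straight quotes so that it parses as XML.
PEAK_LIST_XML = """
<ms-ms_peak_list>
  <parameter instrument="micromass_QTOF_2_quadropole_time_of_flight_mass_spectrometer" mode="ms-ms"/>
  <parent_ion m-z="830.9570" abundance="194.9604" z="2"/>
  <fragment_ion m-z="580.2985" abundance="0.3592"/>
  <fragment_ion m-z="688.3214" abundance="0.2526"/>
  <fragment_ion m-z="779.4759" abundance="38.4939"/>
  <fragment_ion m-z="784.3607" abundance="21.7736"/>
  <fragment_ion m-z="1543.7476" abundance="1.3822"/>
  <fragment_ion m-z="1544.7595" abundance="2.9977"/>
  <fragment_ion m-z="1562.8113" abundance="37.4790"/>
  <fragment_ion m-z="1660.7776" abundance="476.5043"/>
</ms-ms_peak_list>
"""


def parse_peak_list(xml_text):
    """Read instrument, parent ion and fragment ions from an annotated peak list."""
    root = ET.fromstring(xml_text.strip())
    parent = root.find("parent_ion")
    fragments = [
        (float(ion.get("m-z")), float(ion.get("abundance")))
        for ion in root.findall("fragment_ion")
    ]
    return {
        "instrument": root.find("parameter").get("instrument"),
        "parent_mz": float(parent.get("m-z")),
        "parent_abundance": float(parent.get("abundance")),
        "parent_charge": int(parent.get("z")),
        "fragments": fragments,
    }


def base_peak(fragments):
    """Return the (m/z, abundance) pair with the highest abundance."""
    return max(fragments, key=lambda pair: pair[1])


peaks = parse_peak_list(PEAK_LIST_XML)
print(peaks["parent_mz"], peaks["parent_charge"], base_peak(peaks["fragments"]))
```

Because each value is named by its attribute, downstream tools do not need to know the column order of the raw peak list.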
    38. 38. Scenario 6
        • When: Research in progress
        • Where: Athens Heart Center and Cincinnati Children’s Hospital Medical Center
        • What: scientific literature mining
            • Dealing with unstructured information
            • Extracting knowledge from text
            • Complex entity recognition
            • Relationship extraction
    39. 39. Heart Failure Clinical Pathway
        • Ontology: A Framework for Schema-Driven Relationship Discovery from Unstructured Text, Ramakrishnan et al., ISWC 2006, LNCS 4273, pp. 583-596
        • Diagram labels: causes, Disease, Angiotensin Receptor Blocker (ARB)
    40. 40. Contextual delivery of information
    41. 41. Two technical challenges
        • Text mining
        • Workflow adaptation
    42. 42. Extracting the Relationship Diabetes mellitus adversely affects the outcomes in patients with myocardial infarction (MI), due in part to the exacerbation of left ventricular (LV) remodeling. Although angiotensin II type 1 receptor blocker (ARB) has been demonstrated to be effective in the treatment of heart failure, information about the potential benefits of ARB on advanced LV failure associated with diabetes is lacking. To induce diabetes, male mice were injected intraperitoneally with streptozotocin (200 mg/kg). At 2 weeks, anterior MI was created by ligating the left coronary artery. These animals received treatment with olmesartan (0.1 mg/kg/day; n = 50) or vehicle (n = 51) for 4 weeks. Diabetes worsened the survival and exaggerated echocardiographic LV dilatation and dysfunction in MI. Treatment of diabetic MI mice with olmesartan significantly improved the survival rate (42% versus 27%, P < 0.05) without affecting blood glucose, arterial blood pressure, or infarct size. It also attenuated LV dysfunction in diabetic MI. Likewise, olmesartan attenuated myocyte hypertrophy, interstitial fibrosis, and the number of apoptotic cells in the noninfarcted LV from diabetic MI. Post-MI LV remodeling and failure in diabetes were ameliorated by ARB, providing further evidence that angiotensin II plays a pivotal role in the exacerbated heart failure after diabetic MI. Angiotensin II type 1 receptor blocker attenuates exacerbated left ventricular remodeling and failure in diabetes-associated myocardial infarction, Matsusaka H, et al. ARB causes heart failure
    43. 43. Problem – Extracting relationships between MeSH terms from PubMed Biologically active substance Lipid Disease or Syndrome affects causes affects causes complicates Fish Oils Raynaud’s Disease ??????? instance_of instance_of UMLS Semantic Network MeSH PubMed 9284 documents 4733 documents 5 documents
    44. 44. Background knowledge used
        • UMLS – a high-level schema of the biomedical domain
            • 136 classes and 49 relationships
            • Synonyms of all relationships, found using variant lookup (tools from NLM)
            • 49 relationships + their synonyms = ~350 terms, mostly verbs
        • MeSH
            • 22,000+ topics organized as a forest of 16 trees
            • Used to query PubMed
        • PubMed
            • Over 16 million abstracts
            • Abstracts annotated with one or more MeSH terms
        • Example synonyms for T147: effect, induce, etiology, cause, effecting, induced
    45. 45. Method – Parse Sentences in PubMed
        SS-Tagger (University of Tokyo), SS-Parser (University of Tokyo)
        (TOP (S (NP (NP (DT An) (JJ excessive) (ADJP (JJ endogenous) (CC or) (JJ exogenous)) (NN stimulation)) (PP (IN by) (NP (NN estrogen)))) (VP (VBZ induces) (NP (NP (JJ adenomatous) (NN hyperplasia)) (PP (IN of) (NP (DT the) (NN endometrium)))))))
        • Entities (MeSH terms) in sentences occur in modified forms
            • “adenomatous” modifies “hyperplasia”
            • “An excessive endogenous or exogenous stimulation” modifies “estrogen”
        • Entities can also occur as composites of 2 or more other entities
            • “adenomatous hyperplasia” and “endometrium” occur as “adenomatous hyperplasia of the endometrium”
    46. 46. Method – Identify entities and Relationships in Parse Tree TOP NP VP S NP VBZ induces NP PP NP IN of DT the NN endometrium JJ adenomatous NN hyperplasia NP PP IN by NN estrogen DT the JJ excessive ADJP NN stimulation JJ endogenous JJ exogenous CC or MeSHID D004967 MeSHID D006965 MeSHID D004717 UMLS ID T147 Modifiers Modified entities Composite Entities
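To make the parse-tree step concrete, the bracketed parse of the example sentence can be read into a nested structure and split into subject phrase, verb and object phrase. The Python sketch below is not the SS-Parser or the authors' extraction code; it assumes the simple TOP -> S -> [NP, VP] shape of this one example, and the function names are invented for illustration.

```python
PARSE = (
    "(TOP (S (NP (NP (DT An) (JJ excessive) (ADJP (JJ endogenous) (CC or) "
    "(JJ exogenous)) (NN stimulation)) (PP (IN by) (NP (NN estrogen)))) "
    "(VP (VBZ induces) (NP (NP (JJ adenomatous) (NN hyperplasia)) "
    "(PP (IN of) (NP (DT the) (NN endometrium)))))))"
)


def parse_tree(text):
    """Turn a bracketed parse into nested (label, children) tuples."""
    tokens = text.replace("(", " ( ").replace(")", " ) ").split()
    pos = 0

    def read_node():
        nonlocal pos
        pos += 1  # skip the opening "("
        label = tokens[pos]
        pos += 1
        children = []
        while tokens[pos] != ")":
            if tokens[pos] == "(":
                children.append(read_node())
            else:
                children.append(tokens[pos])  # a word under a part-of-speech tag
                pos += 1
        pos += 1  # skip the closing ")"
        return (label, children)

    return read_node()


def leaves(node):
    """Collect the words under a node, left to right."""
    words = []
    for child in node[1]:
        if isinstance(child, str):
            words.append(child)
        else:
            words.extend(leaves(child))
    return words


def split_subject_verb_object(tree):
    """Assumes TOP -> S -> [NP, VP] and VP -> [verb, NP], as in this example."""
    sentence = tree[1][0]
    subject, verb_phrase = sentence[1][0], sentence[1][1]
    verb, obj = verb_phrase[1][0], verb_phrase[1][1]
    return " ".join(leaves(subject)), " ".join(leaves(verb)), " ".join(leaves(obj))


print(split_subject_verb_object(parse_tree(PARSE)))
```

In the approach the slides describe, the noun phrases would then be matched against MeSH terms and the verb against the UMLS relationship synonyms; that lookup is left out here.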
    47. 47. What can we do with the extracted knowledge?
        • Semantic browser demo
    48. 48. Evaluating hypotheses PubMed Keyword query: Migraine[MH] + Magnesium[MH] Complex Query Supporting Document sets retrieved Migraine Stress Patient affects isa Magnesium Calcium Channel Blockers inhibit
    49. 49. Workflow Adaptation: Why and How
        • Volatile nature of execution environments
            • May have an impact on multiple activities/tasks in the workflow
        • HF Pathway
            • New information about diseases and drugs becomes available
            • Affects treatment plans, drug-drug interactions
        • Need to incorporate the new knowledge into execution
            • Capture the constraints and relationships between different tasks/activities
    50. 50. Workflow Adaptation: Why? New knowledge about treatment is found during the execution of the pathway; new knowledge about drugs and drug-drug interactions
    51. 51. Workflow Adaptation: How
        • Decision-theoretic approaches
            • Markov Decision Processes
        • Given the state S of the workflow when an event E occurs
            • What is the optimal path to a goal state G?
            • Greedy approaches rely on local optimization
                • Need to choose actions based on optimality across the entire horizon, not just the current best action
            • Model the horizon and use an MDP to find the best path to a goal state
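The contrast between greedy choice and optimality across the whole horizon can be shown with value iteration on a toy Markov Decision Process. Everything in this Python sketch is hypothetical: the states, actions, probabilities and rewards are invented to illustrate the technique and do not come from the heart failure pathway.

```python
# Toy workflow: (state, action) -> list of (probability, next_state).
TRANSITIONS = {
    ("start", "cheap_step"): [(1.0, "slow_path")],
    ("start", "costly_step"): [(1.0, "fast_path")],
    ("slow_path", "finish"): [(1.0, "goal")],
    ("fast_path", "finish"): [(0.8, "goal"), (0.2, "fast_path")],
}
# Immediate reward (negative cost) of taking an action in a state.
REWARDS = {
    ("start", "cheap_step"): -1.0,
    ("start", "costly_step"): -2.0,
    ("slow_path", "finish"): -5.0,
    ("fast_path", "finish"): -1.0,
}
STATES = ["start", "slow_path", "fast_path", "goal"]


def q_value(state, action, values, transitions, rewards, gamma):
    """Immediate reward plus the discounted expected value of the next state."""
    expected_next = sum(p * values[nxt] for p, nxt in transitions[(state, action)])
    return rewards[(state, action)] + gamma * expected_next


def value_iteration(states, transitions, rewards, goal, gamma=0.9, sweeps=100):
    """Return state values and the best action for every non-goal state."""
    values = {s: 0.0 for s in states}
    actions = {s: [a for (st, a) in transitions if st == s] for s in states}
    for _ in range(sweeps):
        for s in states:
            if s == goal:
                continue
            values[s] = max(
                q_value(s, a, values, transitions, rewards, gamma) for a in actions[s]
            )
    policy = {
        s: max(actions[s],
               key=lambda a: q_value(s, a, values, transitions, rewards, gamma))
        for s in states
        if s != goal
    }
    return values, policy


values, policy = value_iteration(STATES, TRANSITIONS, REWARDS, goal="goal")
print(policy["start"])
```

A greedy chooser would take `cheap_step` at `start` because its immediate cost is lower, while the values computed over the whole horizon favour `costly_step`, which leads to the cheaper path overall. When new knowledge arrives, the transition and reward tables can be updated and the policy recomputed from the current state.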
    52. 52. Conclusion
        • Semantic web technologies can help with:
            • Fusion of data: semi-structured, structured, experimental, literature, multimedia
            • Analysis and mining of data, extraction, annotation, capturing provenance of data through annotation, workflows with SWS
            • Querying of data at different levels of granularity, complex queries, knowledge-driven query interface
            • Performing inference across data sets
    53. 53. Take home points
        • Shift of paradigm: from browsing to querying
        • Machine understanding:
            • Extracting knowledge from text
            • Inference, software interoperation
        • Semantic-enabled interfaces towards hypothesis validation
    54. 54. References
        • A. Sheth, S. Agrawal, J. Lathem, N. Oldham, H. Wingate, P. Yadav, and K. Gallagher, "Active Semantic Electronic Medical Record", Intl Semantic Web Conference, 2006.
        • Satya Sahoo, Olivier Bodenreider, Kelly Zeng, and Amit Sheth, "An Experiment in Integrating Large Biomedical Knowledge Resources with RDF: Application to Associating Genotype and Phenotype Information", WWW2007 HCLS Workshop, May 2007.
        • Satya S. Sahoo, Kelly Zeng, Olivier Bodenreider, and Amit Sheth, "From Glycosyltransferase to Congenital Muscular Dystrophy: Integrating Knowledge from NCBI Entrez Gene and the Gene Ontology", Amsterdam: IOS, August 2007, PMID: 17911917, pp. 1260-4.
        • Satya S. Sahoo, Olivier Bodenreider, Joni L. Rutter, Karen J. Skinner, and Amit P. Sheth, "An ontology-driven semantic mash-up of gene and biological pathway information: Application to the domain of nicotine dependence", submitted, 2007.
        • Cartic Ramakrishnan, Krzysztof J. Kochut, and Amit Sheth, "A Framework for Schema-Driven Relationship Discovery from Unstructured Text", Intl Semantic Web Conference, 2006, pp. 583-596.
        • Satya S. Sahoo, Christopher Thomas, Amit Sheth, William S. York, and Samir Tartir, "Knowledge Modeling and Its Application in Life Sciences: A Tale of Two Ontologies", 15th International World Wide Web Conference (WWW2006), Edinburgh, Scotland, May 23-26, 2006.
        • Demos at: