1. Extracting, Representing and Mining Semantic Metadata from Text: Facilitating Knowledge Discovery in Biomedicine. Cartic Ramakrishnan. Advisor: Dr. Amit Sheth. Committee Members: Dr. Michael Raymer, Dr. Guozhu Dong, Dr. Thaddeus Tarpey, Dr. Vasant Honavar, Dr. Shaojun Wang.
5. Element of surprise – Swanson’s discoveries. [Figure: Magnesium and Migraine in PubMed, joined by a "?", with Stress, Spreading Cortical Depression, and Calcium Channel Blockers between them.] Swanson’s discoveries: associations were discovered through keyword searches followed by manual analysis of the text to establish possibly relevant relationships. 11 possible associations were found.
6. Knowledge Discovery in AI. The robot scientist: planned search over a well-defined (axiomatic) space leading to knowledge discovery. Knowledge discovery by humans is done in non-axiomatic, ill-defined spaces over multi-modal data. Scientific literature is an ill-defined and loosely structured source of data used in scientific investigations. Assigning structure and interpretation to text (semantics). [Figure labels: Syntax, Structure, Semantics.]
8. Information Extraction & Text Mining
Example sentence: "This MEK dependency was observed in BRAF mutant cells regardless of tissue lineage, and correlated with both downregulation of cyclin D1 protein expression and the induction of G1 arrest."
Segmentation and classification:
* MEK dependency ISA Dependency_on_an_Organic_chemical
* BRAF mutant cells ISA Cell_type
* downregulation of cyclin D1 protein expression ISA Biological_process
* tissue lineage ISA Biological_concept
* induction of G1 arrest ISA Biological_process
Information Extraction = segmentation + classification + association + mining
Text mining = entity identification + named relationship extraction + discovering association chains…
[Figure: Segmentation, Classification, Named Relationship Extraction. Nodes: MEK dependency, BRAF mutant cells, downregulation of cyclin D1 protein expression, induction of G1 arrest. Edge labels: observed in, correlated with, correlated with.]
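The output of the steps on this slide can be held as a small set of triples. The following is a minimal sketch, not the author's implementation: the `Triple` type and `to_triples` helper are hypothetical names, and attaching both "correlated with" edges to "MEK dependency" is an assumption based on reading the example sentence, since the flattened figure does not show edge endpoints.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    """A (subject, predicate, object) statement; hypothetical helper type."""
    subject: str
    predicate: str
    obj: str


# Segmentation + classification: phrases and ISA classes copied from the slide.
entity_classes = {
    "MEK dependency": "Dependency_on_an_Organic_chemical",
    "BRAF mutant cells": "Cell_type",
    "downregulation of cyclin D1 protein expression": "Biological_process",
    "tissue lineage": "Biological_concept",
    "induction of G1 arrest": "Biological_process",
}

# Named relationship extraction. ASSUMPTION: the subject of both
# "correlated with" edges is "MEK dependency", per the example sentence.
relationships = [
    Triple("MEK dependency", "observed in", "BRAF mutant cells"),
    Triple("MEK dependency", "correlated with",
           "downregulation of cyclin D1 protein expression"),
    Triple("MEK dependency", "correlated with", "induction of G1 arrest"),
]


def to_triples(classes, relations):
    """Combine ISA statements and named relationships into one triple list."""
    isa = [Triple(entity, "ISA", cls) for entity, cls in classes.items()]
    return isa + list(relations)


triples = to_triples(entity_classes, relationships)
for t in triples:
    print(f"{t.subject} --{t.predicate}--> {t.obj}")
```

Keeping class assertions and named relationships in one uniform triple list is a design choice that makes later steps, such as searching for association chains, a matter of graph traversal.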
10. Knowledge Discovery over Text. [Figure labels: Text; Extraction of Semantics from text (assigning interpretation to text); semantic metadata in the form of semi-structured data; Semantic Metadata Guided Knowledge Explorations; Semantic Metadata Guided Knowledge Discovery; triple-based semantic search; semantic browser; subgraph discovery.]
11. Ontology-enabled Information Extraction. Cartic Ramakrishnan, Krys Kochut, Amit P. Sheth: A Framework for Schema-Driven Relationship Discovery from Unstructured Text. International Semantic Web Conference 2006: 583-596.
13. Information Extraction via Ontology-assisted Text Mining – Relationship Extraction. [Figure: UMLS Semantic Network types Biologically active substance, Lipid, and Disease or Syndrome, with relationship labels affects, causes, and complicates; MeSH terms Fish Oils and Raynaud’s Disease attached by instance_of, with the relationship between them marked "???????"; PubMed counts shown: 9284 documents, 4733 documents, 5 documents.]
16. Method – Identify entities and relationships in the parse tree. [Figure: parse tree (nodes TOP, S, NP, VP, PP, ADJP) with tagged leaves DT the, JJ excessive, JJ endogenous, CC or, JJ exogenous, NN stimulation, IN by, NN estrogen, VBZ induces, JJ adenomatous, NN hyperplasia, IN of, DT the, NN endometrium. Annotations: MeSHID D004967, MeSHID D006965, MeSHID D004717, UMLS ID T147. Legend: Modifiers, Modified entities, Composite Entities.]
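To illustrate the distinction between modifiers and modified entities, here is a rough sketch that works only on the part-of-speech tags at the leaves of the tree. It is a simplification for illustration, not the parse-tree method the slide describes: the function name `modified_entities` and the rule "adjectives attach to the next noun" are assumptions, and the token order is reconstructed from the tagged leaves.

```python
# Tagged leaves from the slide, in reconstructed sentence order (assumption).
tagged = [
    ("the", "DT"), ("excessive", "JJ"), ("endogenous", "JJ"), ("or", "CC"),
    ("exogenous", "JJ"), ("stimulation", "NN"), ("by", "IN"),
    ("estrogen", "NN"), ("induces", "VBZ"), ("adenomatous", "JJ"),
    ("hyperplasia", "NN"), ("of", "IN"), ("the", "DT"), ("endometrium", "NN"),
]


def modified_entities(tokens):
    """Attach each run of adjectives (JJ) to the noun (NN*) that follows it.

    A conjunction (CC) inside an adjective run is skipped so that
    "endogenous or exogenous" both attach to the same head noun.
    Any other tag ends the current adjective run.
    """
    result = []
    pending = []  # adjectives seen since the last non-modifier token
    for word, tag in tokens:
        if tag == "JJ":
            pending.append(word)
        elif tag == "CC" and pending:
            continue  # keep collecting modifiers across "or"/"and"
        elif tag.startswith("NN"):
            result.append({"head": word, "modifiers": pending})
            pending = []
        else:
            pending = []
    return result


entities = modified_entities(tagged)
for e in entities:
    print(e["head"], e["modifiers"])
```

A flat scan like this cannot recover composite entities that span prepositional phrases (such as a noun phrase joined by "of"); that requires the tree structure shown on the slide.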
19. Paths between Migraine and Magnesium. Paths are considered interesting if they contain one or more named relationships other than hasPart or hasModifiers.
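The interestingness rule above can be expressed as a simple filter. This is a sketch under assumptions: a path is modeled as a list of (subject, predicate, object) edges, the names `STRUCTURAL` and `is_interesting` are hypothetical, and the intermediate nodes and the "affects" label in the example are placeholders rather than extracted results.

```python
# Structural predicates named on the slide; they do not count as
# "named relationships" for the purpose of this rule.
STRUCTURAL = {"hasPart", "hasModifiers"}


def is_interesting(path):
    """Return True if at least one edge uses a non-structural predicate."""
    return any(predicate not in STRUCTURAL for _, predicate, _ in path)


# Placeholder paths; node_x, node_y and "affects" are illustrative only.
path_structural_only = [
    ("migraine", "hasModifiers", "node_x"),
    ("node_x", "hasPart", "magnesium"),
]
path_with_named_relation = [
    ("migraine", "hasPart", "node_y"),
    ("node_y", "affects", "magnesium"),
]

print(is_interesting(path_structural_only))
print(is_interesting(path_with_named_relation))
```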
25. Unsupervised Joint Extraction of Compound Entities and Relationships. Cartic Ramakrishnan, Pablo N. Mendes, Shaojun Wang and Amit P. Sheth: "Unsupervised Discovery of Compound Entities for Relationship Extraction". EKAW 2008, 16th International Conference on Knowledge Engineering and Knowledge Management: Knowledge Patterns.
47. Schema-driven edge weight assignment. [Figure: Schema: company, Entertainment company, Manufacturing company, Oil company, Automotive company, Electronics company, Sporting goods company. Instances: Ford Motors, Cartic’s Company. Edge weights shown: 0.67, 0.33, <0.5, 0.33, and 1.0 (six times).]