Driving Deep Semantics in Middleware and Networks: What, why and how?


Amit Sheth, "Driving Deep Semantics in Middleware and Networks: What, why and how?," Keynote talk at Semantic Sensor Networks Workshop at the 5th International Semantic Web Conference (ISWC-2006), November 6, 2006, Athens, Georgia, USA.

Published in: Education, Technology, Business
  • CENTRAL ROLE OF ONTOLOGIES: An ontology represents agreement and common terminology/nomenclature. An ontology is populated with extensive domain knowledge or known facts/assertions. It is a key enabler of semantic metadata extraction from all forms of content: unstructured text (and 150 file formats), semi-structured (HTML, XML), and structured data. The ontology is in turn the centerpiece that enables resolution of semantic heterogeneity, semantic integration, and semantically correlating/associating objects and documents. A large number of ontologies have been developed and many are in use.

    1. Driving Deep Semantics in Middleware and Networks: What, why and how? Amit Sheth. Keynote @ Semantic Sensor Networks Workshop @ ISWC2006, November 06, 2006, Athens GA. Thanks: Doug Brewer, Lakshmish Ramaswamy
    2. SW Today
       • Can create large populated ontologies
       • Lots of manually annotated documents; can do high-quality semantic metadata extraction/annotation
       • Have query languages (SPARQL), RDF query processing, reasoning, and rule processing capabilities
    3. Types of Ontologies (or things close to ontology)
       • Upper ontologies: modeling of time, space, process, etc.
       • Broad-based or general-purpose ontologies/nomenclatures: Cyc, WordNet
       • Domain-specific or industry-specific ontologies
         • News: politics, sports, business, entertainment (also see TAP and SWETO) (P)
         • Financial Market (C)
         • Terrorism (L/G)
         • Biology: Open Biomedical Ontologies, GlycO; ProPreO (P)
         • Clinical (see Open Clinical) (L, P, C)
         • GO (nomenclature), NCI (schema), UMLS (knowledgebase), … (P)
       • Application-specific and task-specific ontologies
         • Risk/Anti-money laundering (C), Equity Research (C), Repertoire Management (C)
         • NeedToKnow (L/G), Financial Irregularity (L/G)
       • P = Public, G = Government, L = Limited Availability, C = Commercial
       Different approaches in developing ontologies: schema vs populated; community efforts vs reusing knowledge sources
    4. Open Biomedical Ontologies, http://obo.sourceforge.net/
    5. Example Life Science Ontologies
       • ProPreO
         • An ontology for capturing process and lifecycle information related to proteomic experiments
         • 398 classes, 32 relationships
         • 3.1 million instances
         • Published through the National Center for Biomedical Ontology (NCBO) and Open Biomedical Ontologies (OBO)
       • GlycO
         • An ontology for structure and function of glycopeptides
         • 573 classes, 113 relationships
         • Published through the National Center for Biomedical Ontology (NCBO)
    6. Manual Annotation (Example PubMed abstract): Abstract; Classification/Annotation
    7. Semantic Annotation/Metadata Extraction + Enhancement [Hammond, Sheth, Kochut 2002]
    8. Automatic Semantic Annotation (© Semagix, Inc.)
       • COMTEX Tagging: limited tagging (mostly syntactic)
       • Content 'Enhancement': Rich Semantic Metatagging
       • Value-added Semagix Semantic Tagging: value-added relevant metatags added by Semagix to existing COMTEX tags:
         • Private companies
         • Type of company
         • Industry affiliation
         • Sector
         • Exchange
         • Company Execs
         • Competitors
    9. Spatio-temporal-thematic semantics: http://lsdis.cs.uga.edu/library/download/ACM-GIS_06_Perry.pdf
    10. Embedding Metadata in multimedia, a/v or sensor data
        Diagram labels: Create Scene Description Tree; Retrieve Scene Description Track; Scene Description Tree; “NSF Playoff” Node; Node = AVO Object; Enhanced XML Description; MPEG-2/4/7; Enhanced Digital Cable Video; MPEG Encoder; MPEG Decoder; Voquette/Taalee Semantic Engine; Metadata-rich Value-added Node; GREAT USER EXPERIENCE
        Object Content Information (OCI):
        • Produced by: Fox Sports
        • Creation Date: 12/05/2000
        • League: NFL
        • Teams: Seattle Seahawks, Atlanta Falcons
        • Players: Jon Kitna
        • Coaches: Mike Holmgren, Dan Reeves
        • Location: Atlanta
        Channel sales through Video Server Vendors, Video App Servers, and Broadcasters
        License metadata decoder and semantic applications to device makers
    11. Metadata for Automatic Content Enrichment: Interactive Television
        • This segment has embedded or referenced metadata that is used by a personalization application to show only the stocks that the user is interested in.
        • This screen is customizable with an interactivity feature using metadata, such as whether there is a new Conference Call video on CSCO.
        • Part of the screen can be automatically customized to show conference-call-specific information, including transcript, participation, etc., all of which are relevant metadata.
        • The Conference Call itself can have embedded metadata to support personalization and interactivity.
    12. WSDL-S Metamodel: Action Attribute for Functional Annotation; Pre and Post Conditions; Can use XML, OWL or UML types; Extension; Adaptation; schemaMapping
    13. WSDL-S
        <?xml version="1.0" encoding="UTF-8"?>
        <definitions ……………….
            xmlns:rosetta="http://lsdis.cs.uga.edu/projects/meteor-s/wsdl-s/pips.owl">
          <interface name="BatterySupplierInterface"
              description="Computer PowerSupply Battery Buy Quote Order Status"
              domain="naics:Computer and Electronic Product Manufacturing">
            <operation name="getQuote" pattern="mep:in-out" action="rosetta:#RequestQuote">
              <input messageLabel="qRequest" element="rosetta:#QuoteRequest" />
              <output messageLabel="quote" element="rosetta:#QuoteConfirmation" />
              <pre condition="qRequested.Quantity > 10000" />
            </operation>
          </interface>
        </definitions>
        Callouts: Function from Rosetta Net Ontology; Data from Rosetta Net Ontology; Pre Condition on input data
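The annotations in the WSDL-S listing can be read back with an ordinary XML parser. The following is a minimal sketch, assuming a simplified copy of the slide's document: the elided part of `<definitions>` is left out, and the `pre`/`condition` spelling follows the slide rather than any published schema. The function name `semantic_annotations` is hypothetical.

```python
import xml.etree.ElementTree as ET

# Simplified copy of the slide's listing (assumption: elided attributes omitted).
WSDL_S = """
<definitions xmlns:rosetta="http://lsdis.cs.uga.edu/projects/meteor-s/wsdl-s/pips.owl">
  <interface name="BatterySupplierInterface"
             domain="naics:Computer and Electronic Product Manufacturing">
    <operation name="getQuote" pattern="mep:in-out" action="rosetta:#RequestQuote">
      <input messageLabel="qRequest" element="rosetta:#QuoteRequest" />
      <output messageLabel="quote" element="rosetta:#QuoteConfirmation" />
      <pre condition="qRequested.Quantity > 10000" />
    </operation>
  </interface>
</definitions>
""".strip()


def semantic_annotations(xml_text):
    """Collect the ontology references attached to each operation."""
    root = ET.fromstring(xml_text)
    annotations = []
    for op in root.iter("operation"):
        pre = op.find("pre")
        annotations.append({
            "operation": op.get("name"),
            "action": op.get("action"),               # functional annotation
            "input": op.find("input").get("element"),   # data annotation
            "output": op.find("output").get("element"),
            "precondition": pre.get("condition") if pre is not None else None,
        })
    return annotations


if __name__ == "__main__":
    for entry in semantic_annotations(WSDL_S):
        print(entry)
```

A discovery or matching component could compare the `action` and `element` values against concepts in the referenced ontology instead of comparing operation names as plain strings.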
    14. Relationship Extraction: Disease or Syndrome causes affects causes complicates Fish Oils Raynaud’s Disease ??????? instance_of instance_of UMLS MeSH PubMed 9284 documents 4733 documents Biologically active substance Lipid affects 5 documents
    15. About the data used
        • UMLS: a high-level schema of the biomedical domain
          • 136 classes and 49 relationships
          • Synonyms of all relationships, using variant lookup (tools from NLM)
        • MeSH
          • Terms already asserted as instances of one or more classes in UMLS
        • PubMed
          • Abstracts annotated with one or more MeSH terms
        Example synonyms for relationship T147: effect, induce, etiology, cause, effecting, induced
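The synonym lists can be pictured as a lookup table from surface forms to a relationship identifier. This is a minimal sketch, assuming only the six T147 forms shown on the slide; the `normalize` helper is a naive stand-in for the NLM variant lookup tools, not a reimplementation of them, and both function names are hypothetical.

```python
# Relationship id -> surface forms, copied from the slide (T147 only).
SYNONYMS = {
    "T147": ["effect", "induce", "etiology", "cause", "effecting", "induced"],
}


def normalize(word):
    """Naive normalizer (assumption): lowercase and drop one trailing 's'."""
    w = word.lower()
    if len(w) > 3 and w.endswith("s"):
        return w[:-1]
    return w


def relationship_for(word):
    """Return the relationship id whose synonym list contains the word, else None."""
    w = normalize(word)
    for rel_id, forms in SYNONYMS.items():
        if w in forms:
            return rel_id
    return None


if __name__ == "__main__":
    for verb in ["induces", "causes", "induced", "treats"]:
        print(verb, relationship_for(verb))
```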
    16. Method: Parse Sentences in PubMed
        Tools: SS-Tagger (University of Tokyo), SS-Parser (University of Tokyo)
        (TOP (S (NP (NP (DT An) (JJ excessive) (ADJP (JJ endogenous) (CC or) (JJ exogenous) ) (NN stimulation) ) (PP (IN by) (NP (NN estrogen) ) ) ) (VP (VBZ induces) (NP (NP (JJ adenomatous) (NN hyperplasia) ) (PP (IN of) (NP (DT the) (NN endometrium) ) ) ) ) ) )
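To make the bracketed parse concrete, here is a minimal sketch that reads the tree from this slide and pulls out a subject phrase, verb, and object phrase. It is an illustration under simple assumptions (one `S` node with an `NP` and a `VP` child), not the method of the cited paper; the function names are hypothetical.

```python
import re

# Parse tree copied from the slide.
SENTENCE = (
    "(TOP (S (NP (NP (DT An) (JJ excessive) (ADJP (JJ endogenous) (CC or) "
    "(JJ exogenous) ) (NN stimulation) ) (PP (IN by) (NP (NN estrogen) ) ) ) "
    "(VP (VBZ induces) (NP (NP (JJ adenomatous) (NN hyperplasia) ) "
    "(PP (IN of) (NP (DT the) (NN endometrium) ) ) ) ) ) )"
)


def parse_tree(text):
    """Turn a bracketed parse into nested (label, children) tuples."""
    tokens = re.findall(r"\(|\)|[^\s()]+", text)
    pos = 0

    def node():
        nonlocal pos
        pos += 1                      # consume "("
        label = tokens[pos]
        pos += 1
        children = []
        while tokens[pos] != ")":
            if tokens[pos] == "(":
                children.append(node())
            else:
                children.append(tokens[pos])  # a word (leaf)
                pos += 1
        pos += 1                      # consume ")"
        return (label, children)

    return node()


def words(tree):
    """Flatten a subtree back into its words, left to right."""
    out = []
    for child in tree[1]:
        out.extend(words(child) if isinstance(child, tuple) else [child])
    return out


def subject_verb_object(tree):
    """Return (subject, verb, object) from the first S node, or None."""
    label, children = tree
    subtrees = [c for c in children if isinstance(c, tuple)]
    if label == "S":
        np = next(c for c in subtrees if c[0] == "NP")
        vp = next(c for c in subtrees if c[0] == "VP")
        vp_parts = [c for c in vp[1] if isinstance(c, tuple)]
        verb = next(c for c in vp_parts if c[0].startswith("VB"))
        obj = next(c for c in vp_parts if c[0] == "NP")
        return " ".join(words(np)), words(verb)[0], " ".join(words(obj))
    for child in subtrees:
        found = subject_verb_object(child)
        if found:
            return found
    return None


if __name__ == "__main__":
    print(subject_verb_object(parse_tree(SENTENCE)))
```

The verb returned here ("induces") is the kind of token that a relationship synonym list would then map to a schema relationship.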
    17. Method: Identify Entities and Relationships in Parse Tree [Ramakrishnan, Kochut, Sheth 2006]: Modifiers; Modified entities; Composite Entities
    18. Limitations of Current N/W Design
        • Too rigid
          • Knowledge of the exact IP address is mandatory
        • No support for content-based communication
          • Content-based communication is implemented on the overlay N/W as an application
        • Overlay-underlay mismatch leads to inefficiencies
          • The overlay network creates logical links over the links provided by the physical network
            • A logical link can traverse many physical links and nodes
            • At each node a packet must traverse the network stack to be routed
    19. Limitations (Contd.)
        • Security mechanisms not adequate
        • Communication control based on firewalls & black/white lists is not powerful
          • Newer applications like P2P file sharing circumvent communication controls
        • Cannot prevent deliberate information leakage
        • Weak accountability and audit mechanisms
    20. What Can Semantics Do For N/Ws
        • “Richer” communication paradigm
          • Liberates parties from needing to know exact addresses
            • Routing based on semantic concepts
        • Improved efficiency
          • Single or very few traversals of the network stack
            • Content is routed by the physical network based on the semantics
        • Enhanced security and control
          • Control based on message content rather than origin (or destination)
        • Better accountability and audit
    21. Content Based Networking
        • Several existing products use “rudimentary” forms of semantics
        • Content switches
          • Redirect incoming requests to appropriate content servers/caches
        • Application Oriented Networking
          • Cisco’s XML-based networking platform
          • Performs in-router processing of XML
            • XPath, XSLT, etc.
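The content-switch idea, choosing a destination from what a message contains rather than from its address, can be sketched with XPath-style rules. This is a minimal illustration using the limited XPath subset in Python's standard `xml.etree.ElementTree`; it does not represent the Cisco AON API, and the rule table and server names are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical rule table: first matching path wins.
RULES = [
    (".//order[@priority='urgent']", "fast-lane-server"),
    (".//order", "order-server"),
    (".//quote", "quote-server"),
]


def route(message_xml, rules=RULES, default="default-server"):
    """Pick a destination by inspecting message content, not its address."""
    root = ET.fromstring(message_xml)
    for path, destination in rules:
        # find() searches below the root, so messages are wrapped in an envelope.
        if root.find(path) is not None:
            return destination
    return default


if __name__ == "__main__":
    print(route("<message><order priority='urgent' id='7'/></message>"))
    print(route("<message><order id='8'/></message>"))
    print(route("<message><ping/></message>"))
```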
    22. Cisco AON. Diagram: Cisco AON (www.cisco.com). Think of a modern router as a blade server.
    23. Semantic Aware Networking. Figure: Semantic Enabled Network Systems, NSF Proposal, Sheth, A., Ramaswamy, L., et al.
    24. Semantic Network Auditing. Figure: Semantics-enabled Accountable Systems, LSDIS Lab, SAIC, Cisco
    25. Medical Domain Example
        • Use semantics at the network level to deliver critical information to doctors in a timely manner.
          • This allows the doctor to treat the patient more efficiently with the most current, relevant information
    26. Data Sources
        • Elsevier iConsult: health information through SOAP Web Services
        • PubMed: 300 documents published online each day
        • NCBI Genome, Protein DBs: updated daily with new sequences
        Heterogeneous data sources create a need for integration and for getting the right information to those who need it.
    27. Profiles (Subscriptions)
        • Human Constructed
          • Users select part of an ontology for their subscription through a graphical interface
        • Computer Constructed
          • The computer uses information it already has (like a clinical pathway) and an ontology to generate a subscription
        Example relationship in the figure: Angiotensin Receptor Blocker causes Disease
        Sources:
          • Heart Failure Clinical Pathway: SEIII Proposal, Sheth, et al.
          • Ontology: A Framework for Schema-Driven Relationship Discovery from Unstructured Text, Ramakrishnan, et al., ISWC 2006, LNCS 4273, pp. 583-596
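Both ways of building a profile can be sketched as set operations over an ontology. The following is a minimal sketch, assuming a toy class hierarchy invented for illustration (it is not taken from the cited ontology) and hypothetical function names; the "computer constructed" case simply matches ontology class names against the text of pathway steps.

```python
# Toy ontology (hypothetical): child class -> parent class.
SUBCLASS_OF = {
    "Angiotensin Receptor Blocker": "Drug",
    "Olmesartan": "Angiotensin Receptor Blocker",
    "Heart Failure": "Disease",
    "Myocardial Infarction": "Disease",
}


def descendants(concept):
    """The concept plus everything below it in the hierarchy."""
    found = {concept}
    changed = True
    while changed:
        changed = False
        for child, parent in SUBCLASS_OF.items():
            if parent in found and child not in found:
                found.add(child)
                changed = True
    return found


def human_subscription(selected_concepts):
    """Human constructed: the user picks ontology nodes, subtrees come along."""
    subscription = set()
    for concept in selected_concepts:
        subscription |= descendants(concept)
    return subscription


def pathway_subscription(pathway_steps):
    """Computer constructed: keep concepts named in the pathway text."""
    known = set(SUBCLASS_OF) | set(SUBCLASS_OF.values())
    mentioned = {
        concept
        for concept in known
        for step in pathway_steps
        if concept.lower() in step.lower()
    }
    return human_subscription(mentioned)


if __name__ == "__main__":
    print(sorted(human_subscription(["Disease"])))
    print(sorted(pathway_subscription(["Start angiotensin receptor blocker therapy"])))
```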
    28. Extracting the Relationship
        Abstract: Diabetes mellitus adversely affects the outcomes in patients with myocardial infarction (MI), due in part to the exacerbation of left ventricular (LV) remodeling. Although angiotensin II type 1 receptor blocker (ARB) has been demonstrated to be effective in the treatment of heart failure, information about the potential benefits of ARB on advanced LV failure associated with diabetes is lacking. To induce diabetes, male mice were injected intraperitoneally with streptozotocin (200 mg/kg). At 2 weeks, anterior MI was created by ligating the left coronary artery. These animals received treatment with olmesartan (0.1 mg/kg/day; n = 50) or vehicle (n = 51) for 4 weeks. Diabetes worsened the survival and exaggerated echocardiographic LV dilatation and dysfunction in MI. Treatment of diabetic MI mice with olmesartan significantly improved the survival rate (42% versus 27%, P < 0.05) without affecting blood glucose, arterial blood pressure, or infarct size. It also attenuated LV dysfunction in diabetic MI. Likewise, olmesartan attenuated myocyte hypertrophy, interstitial fibrosis, and the number of apoptotic cells in the noninfarcted LV from diabetic MI. Post-MI LV remodeling and failure in diabetes were ameliorated by ARB, providing further evidence that angiotensin II plays a pivotal role in the exacerbated heart failure after diabetic MI.
        Source: Angiotensin II type 1 receptor blocker attenuates exacerbated left ventricular remodeling and failure in diabetes-associated myocardial infarction, Matsusaka H, et al.
        Extracted relationship: ARB causes heart failure
    29. Ontology Work at the Network Level
        • What can be done?
          • Routing documents based on annotation
          • Distributed relationship computation
            • On document arrival, a router can compute whether the document should be forwarded over a named relationship or not
              • Store a minimal set of related entities and relationships at each node
        • What are the challenges?
          • Distributing the annotation across all nodes
            • Instance bases for ontologies are quite large
              • Cannot expect a node to have that much storage
            • How to forward the documents/events across the network?
              • Possibly to all children? (Overhead in document duplication)
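The routing idea on this slide can be sketched as a tree of nodes that forward a document only toward subtrees interested in one of its annotations. This is a minimal sketch under stated assumptions: annotations are (subject, relationship, object) triples, the topology is a tree, and the `Node` class and node names are hypothetical. It recomputes subtree interest on every call, where a real router would presumably keep a compact summary per link.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    name: str
    # Triples that subscribers attached to this node asked for.
    local_interest: set = field(default_factory=set)
    children: list = field(default_factory=list)
    delivered: list = field(default_factory=list)

    def interest(self):
        """All triples wanted somewhere in this node's subtree."""
        wanted = set(self.local_interest)
        for child in self.children:
            wanted |= child.interest()
        return wanted

    def route(self, doc_id, annotations):
        """Deliver locally if relevant, then forward only to interested children."""
        if self.local_interest & annotations:
            self.delivered.append(doc_id)
        for child in self.children:
            if child.interest() & annotations:
                child.route(doc_id, annotations)


if __name__ == "__main__":
    cardiology = Node("cardiology", {("ARB", "causes", "heart failure")})
    genomics = Node("genomics", {("gene", "produces", "protein")})
    root = Node("root", children=[cardiology, genomics])

    root.route("PMID:17031262", {("ARB", "causes", "heart failure")})
    print(cardiology.delivered, genomics.delivered)
```

Forwarding only where interest exists avoids the "send to all children" duplication named in the challenges, at the cost of each node holding some summary of downstream interest.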
    30. Ontology Network
        Figure labels: causes; produces; ARB causes heart failure; PubMed; NCBI; Elsevier
        Ontology: A Framework for Schema-Driven Relationship Discovery from Unstructured Text, Ramakrishnan, et al., ISWC 2006, LNCS 4273, pp. 583-596
    31. Conclusions
        • In the future, content will be addressed to nodes on the network using concepts and topics instead of IP addresses
          • Providing users with critical information in a timely manner
        • Semantics will allow networks to audit the information flowing through them in a more in-depth, reliable manner
    32. References
        • Clinical Pathways: SEIII Proposal, Sheth, et al.
        • AON: www.cisco.com
        • PubMed
          • http://www.ncbi.nlm.nih.gov/entrez/
          • PMID: 17031262, Angiotensin II type 1 receptor blocker attenuates exacerbated left ventricular remodeling and failure in diabetes-associated myocardial infarction, Matsusaka H, et al.
        • Ontology: A Framework for Schema-Driven Relationship Discovery from Unstructured Text, Ramakrishnan, et al.
        • Semantic Auditing: Semantics-enabled Accountable Systems, LSDIS Lab, SAIC, Cisco
        • Relationship Extraction: A Framework for Schema-Driven Relationship Discovery from Unstructured Text, Ramakrishnan, et al., ISWC 2006, LNCS 4273, pp. 583-596
        • Open Biological Ontologies
          • http://obo.sourceforge.net/
        • Semantic Networking Figure
          • Semantic Enabled Network Systems, NSF Proposal, Sheth, A., Ramaswamy, L., et al.
    33. For more information
        • LSDIS Lab: http://lsdis.cs.uga.edu
        • Kno.e.sis Center: http://www.knoesis.org