Semantic Interoperability & Information Brokering in Global Information Systems

Amit Sheth, "Semantic Interoperability and Information Brokering in Global Information Systems," Keynote talk at IEEE-Metadata Conference, Bethesda, MD, USA, April 6, 1999.

Key coverage:
Use of ontologies for semantic interoperability (http://knoesis.org/library/resource.php?id=00277); InfoHarness (http://knoesis.org/library/resource.php?id=00275) and VisualHarness (http://knoesis.org/library/resource.php?id=00267) demonstrate faceted search; MREF - putting metadata on HREF is way ahead of its time (see: http://knoesis.org/library/resource.php?id=00294); multi-ontology query processing in OBSERVER system (http://knoesis.org/library/resource.php?id=00273)

Published in: Education
  1. Bethesda, Maryland, April 6, 1999. Amit Sheth, Large Scale Distributed Information Systems Lab, University of Georgia. http://lsdis.cs.uga.edu
  2. Three perspectives to GlobIS. Information Integration Perspective: distribution, autonomy, heterogeneity. Information Brokering Perspective: data, meta-data, semantic (terminological, contextual). “Vision” Perspective: data, connectivity, computing, information, knowledge.
  3. Evolving targets and approaches in integrating data and information (a personal perspective). Generation I (1980s): Mermaid, DDTS, Multibase, MRDSM, ADDS, IISS, Omnibase, ... Generation II (1990s): InfoSleuth, KMed, DL-I projects, Infoscopes, HERMES, SIMS, Garlic, TSIMMIS, Harvest, RUFUS, ... Generation III (1997...): DL-II projects, ADEPT, InfoQuilt. VisualHarness, InfoHarness. A society for ubiquitous exchange of (tradeable) information in all digital forms of representation; information anywhere, anytime, any forms.
  4. Generation I • Data recognized as corporate resource: leverage it! • Data predominantly in structured databases, different data models, transitioning from network and hierarchical to relational DBMSs • Heterogeneity (system, modeling and schematic) as well as need to support autonomy posed main challenges; major issues were data access and connectivity • Information integration through Federated architecture • Support for corporate IS applications as the primary objective, update often required, data integrity important
  5. Generation I (heterogeneity in FDBMSs), 1970s and 1980s. Communication. Hardware/System: • instruction set • data representation/coding • configuration. Operating System: • file system • naming, file types, operation • transaction support • IPC. Database System: • Semantic Heterogeneity • Differences in DBMS: data models (abstractions, constraints, query languages); system level support (concurrency control, commit, recovery)
  6. Generation I (Federated Database Systems: Schema Architecture). [Figure labels: Component DBS, Local Schema, Component Schema, Export Schema, Federated Schema, External Schema; schema translation; schema integration] • Model Heterogeneity: Common/Canonical Data Model, Schema Translation • Information sharing while preserving autonomy • Dimensions for interoperability and integration: distribution, autonomy and heterogeneity
  7. Generation I (characterization of schematic conflicts in multidatabase systems). Schematic Conflicts (Sheth & Kashyap, Kim & Seo). Abstraction Level Incompatibility: Generalization Conflicts, Aggregation Conflicts. Schematic Discrepancies: Data Value Attribute Conflict, Entity Attribute Conflict, Data Value Entity Conflict. Entity Definition Incompatibility: Naming Conflicts, Database Identifier Conflicts, Schema Isomorphism Conflicts, Missing Data Items Conflicts. Domain Definition Incompatibility: Naming Conflicts, Data Representation Conflicts, Data Scaling Conflicts, Data Precision Conflicts, Default Value Conflicts, Attribute Integrity Constraint Conflicts. Data Value Incompatibility: Known Inconsistency, Temporal Inconsistency, Acceptable Inconsistency. BUT these techniques for dealing with schematic heterogeneity do not directly map to dealing with a much larger variety of heterogeneous media.
  8. Generation II • Significant improvements in computing and connectivity (standardization of protocol, public network, Internet/Web); remote data access as given • Increasing diversity in data formats, with focus on variety of textual data and semi-structured documents • Many more data sources, heterogeneous information sources, but not necessarily better understanding of data • Use of data beyond traditional business applications: mining + warehousing, marketing, e-commerce • Web search engines for keyword based querying against HTML pages; attribute-based querying available in a few search systems • Use of metadata for information access; early work on ontology support distribution applied to metadata in some cases • Mediator architecture for information management
  9. Generation II (limited types of metadata, extractors, mappers, wrappers). [Figure labels: Global/Enterprise Web Repositories; METADATA EXTRACTORS; Digital Maps, Nexis, UPI, AP, Documents, Digital Audios, Data Stores, Digital Videos, Digital Images, ...] Example request: Find Marketing Manager positions in a company that is within 15 miles of San Francisco and whose stock price has been growing at a rate of at least 25% per year over the last three years (Junglee, SIGMOD Record, Dec. 1997)
  10. Generation II (a metadata classification: the information pyramid). Data (Heterogeneous Types/Media). Content Independent Metadata (creation-date, location, type-of-sensor...). Content Dependent Metadata (size, max colors, rows, columns...). Direct Content Based Metadata (inverted lists, document vectors, WAIS, Glimpse, LSI). Domain Independent (structural) Metadata (C++ class-subclass relationships, HTML/SGML Document Type Definitions, C program structure...). Domain Specific Metadata: area, population (Census), land-cover, relief (GIS), metadata concept descriptions from ontologies. Ontologies: Classifications, Domain Models. User. METADATA STANDARDS: General Purpose: Dublin Core, MCF. Domain/industry specific: Geographic (FGDC, UDK, …), Library (MARC, …). Move in this direction to tackle information overload!!
  11. VisualHarness – an example
  12. Query processing and information requests. NOW: • traditional queries based on keywords • attribute based queries • content-based queries. NEXT: • ‘high level’ information requests involving ontology-based, iconic, mixed-media, and media-independent information requests • user selected ontology, use of profiles. What’s next (after comprehensive use of metadata)?
  13. GIS Data Representation – Example: multiple heterogeneous metadata models with different tag names for the same data in the same GIS domain. FGDC Metadata Model: Theme keywords: digital line graph, hydrography, transportation...; Title: Dakota Aquifer; Online linkage: http://gisdasc.kgs.ukans.edu/dasc/; Direct Spatial Reference Method: Vector; Horizontal Coordinate System Definition: Universal Transverse Mercator; ... UDK Metadata Model: Search terms: digital line graph, hydrography, transportation...; Topic: Dakota Aquifer; Adress Id: http://gisdasc.kgs.ukans.edu/dasc/; Measuring Techniques: Vector; Co-ordinate System: Universal Transverse Mercator; ... Kansas State
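The tag-name mismatch on slide 13 can be illustrated with a small sketch. It assumes a hand-written table that maps each model's tag names to shared terms; the shared term names (`keywords`, `title`, `location`, and so on) and the `normalize` helper are hypothetical and do not come from the slides, while the tag names and values are copied from slide 13.

```python
# Hypothetical shared vocabulary: each model-specific tag maps to one common term.
TAG_TO_TERM = {
    "FGDC": {
        "Theme keywords": "keywords",
        "Title": "title",
        "Online linkage": "location",
        "Direct Spatial Reference Method": "spatial-method",
        "Horizontal Coordinate System Definition": "coordinate-system",
    },
    "UDK": {
        "Search terms": "keywords",
        "Topic": "title",
        "Adress Id": "location",  # tag spelled as on the slide
        "Measuring Techniques": "spatial-method",
        "Co-ordinate System": "coordinate-system",
    },
}


def normalize(model, record):
    """Rewrite a record's model-specific tags into the shared terms."""
    mapping = TAG_TO_TERM[model]
    return {mapping[tag]: value for tag, value in record.items()}


# The two descriptions of the same data set, as given on the slide.
fgdc_record = {
    "Theme keywords": "digital line graph, hydrography, transportation",
    "Title": "Dakota Aquifer",
    "Online linkage": "http://gisdasc.kgs.ukans.edu/dasc/",
    "Direct Spatial Reference Method": "Vector",
    "Horizontal Coordinate System Definition": "Universal Transverse Mercator",
}
udk_record = {
    "Search terms": "digital line graph, hydrography, transportation",
    "Topic": "Dakota Aquifer",
    "Adress Id": "http://gisdasc.kgs.ukans.edu/dasc/",
    "Measuring Techniques": "Vector",
    "Co-ordinate System": "Universal Transverse Mercator",
}

same = normalize("FGDC", fgdc_record) == normalize("UDK", udk_record)
print(same)
```

Once both records are expressed in the shared terms they compare equal, which is the point the slide makes: the data is the same and only the tag vocabulary differs.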
  14. Generation III • Increasing information overload and broader variety of information content (video content, audio clips etc) with increasing amount of visual information, scientific/engineering data • Continued standardization related to Web for representational and metadata issues (MCF, RDF, XML) • Changes in Web architecture; distributed computing (CORBA, Java) • Users demand simplicity, but complexities continue to rise • Web is no longer just another information source, but decision support through “data mining and information discovery, information fusion, information dissemination, knowledge creation and management”, “information management complemented by cooperation between the information system and humans” • Information Brokering Architecture proposed for information management
  15. Information Brokering: An Enabler for the Infocosm. INFORMATION/DATA OVERLOAD. INFORMATION PROVIDERS: Newswires, Universities, Corporations, Research Labs (Information System, Data Repository, Information System). INFORMATION CONSUMERS: Corporations, Universities, People, Government Programs (User Query, User Query, User Query). INFORMATION BROKERING (Information System, Data Repository, Information System; Information Request, Information Request, Information Request): arbitration between information consumers and providers for resolving information impedance; dynamic reinterpretation of information requests for determination of relevant information services and products; dynamic creation and composition of information products
  16. Information Brokering: Three Dimensions. [Figure axes: SEMANTICS, STRUCTURE, SYNTAX, SYSTEM; CONSUMERS, BROKERS, PROVIDERS; DATA, METADATA, VOCABULARY] Objective: Reduce the problem of knowing structure and semantics of data in the huge number of information sources on a global scale to understanding and navigating a significantly smaller number of domain ontologies
  17. WWW: a confusing heterogeneity of media, formats (Tower of Babel); information correlation using physical (HREF) links at the extensional data level; location dependent browsing of information using physical (HREF) links; user has to keep track of information content!! WWW + Information Brokering: Domain Specific Ontologies as “semantic conceptual views”; information correlation using concept mappings at the intensional concept level; browsing of information using terminological relationships across ontologies; higher level of abstraction, closer to user view of information!! What else can Information Brokering do?
  18. Concepts, tools and techniques to support semantics: context; media-independent information correlations; semantic proximity; inter-ontological relations; ontologies (esp. domain-specific); profiles; domain-specific metadata
  19. Tools to support semantics • Context, context, context • Media-independent information correlations • Multiple ontologies – Semantic Proximity (relationships between concepts within and across ontologies) using domain, context, modeling/abstraction/representation, state – Characterizing Loss of Information incurred due to differences in vocabulary. BIG challenge: identifying relationship or similarity between objects of different media, developed and managed by different persons and systems
  20. Information Brokering over Heterogeneous Digital Data: A Metadata-based Approach. INFORMATION OVERLOAD = HETEROGENEITY + GLOBALIZATION • Systems Heterogeneity: information system heterogeneity (DBMSs, concurrency control); platform heterogeneity (operating systems, hardware) • Syntactic Heterogeneity: different formats and storage for digital media; machine readable aspects of data representation • Structural Heterogeneity: heterogeneity in data model constructs; schematic/representational heterogeneity • Semantic Heterogeneity: terminological/vocabulary heterogeneity; contextual heterogeneity • Information Resource Discovery – which/where are the relevant information sources? • Modeling of Information Content – increasing number of modeling possibilities • Querying of Information Content – Information Focusing – Information Correlation – combinatorial number of ways of combining/subsetting information. We shall focus on these!
  21. Heterogeneity... is a Babel Tower!! SEMANTIC HETEROGENEITY; SEMANTIC INTEROPERABILITY: metadata, ontologies, contexts
  22. The InfoQuilt Project. THE INFOQUILT VISION • Semantic interoperability between systems, sharing knowledge using multiple ontologies • Logical correlation of information • Media independent information processing. REALIZATION OF THE VISION • fully distributed, adaptable, agent-based system • information/knowledge management supported by collaborative processes. http://lsdis.cs.uga.edu/proj/iq/iq.html
  23. InfoQuilt Project: using the Metadata REFerence link (MREF). http://lsdis.cs.uga.edu/proj/iq/iq.html MREF complements HREF, creating a “logical web” through media independent ontology & metadata based correlation. It is a description of the information asset we want to retrieve. MREF draws on: domain ontologies; IQ_Asset ontology + extension ontologies; attributes; relations; constraints; keywords; content attributes (color, scene cuts, …). Semantic Correlation using MREF. MREF: Concept Model for logical correlation using ontological terms and metadata. RDF: Framework for representing MREFs. XML: Serialization (one implementation choice).
  24. Domain Specific Correlation – example. Domain specific metadata: terms chosen from domain specific ontologies. Potential locations for a future shopping mall identified by all regions having a population greater than 5000, and area greater than 50 sq. ft., having an urban land cover and moderate relief: <A MREF ATTRIBUTES(population > 5000; area > 50; region-type = ‘block’; land-cover = ‘urban’; relief = ‘moderate’) can be viewed here</A> [Figure labels: Population, Area, Land cover, Relief, Boundaries; Census DB, TIGER/Line DB, US Geological Survey; Regions (SQL) ← Boundaries → Image Features (image processing routines)] => media-independent relationships between domain specific metadata: population, area, land cover, relief => correlation between image and structured data at a higher domain specific level as opposed to physical “link-chasing” in the WWW
  25. Domain Specific Correlation – example
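A minimal sketch of how the attribute constraints in the MREF on slide 24 could be evaluated against region metadata. The slides do not show how InfoQuilt resolves an MREF, so the representation below (a list of attribute/operator/value triples), the `matches` and `resolve` helpers, and the sample regions are all assumptions made for illustration; only the five constraints are taken from the slide.

```python
import operator

# The constraints from the MREF ATTRIBUTES(...) clause on slide 24.
MREF_CONSTRAINTS = [
    ("population", operator.gt, 5000),
    ("area", operator.gt, 50),
    ("region-type", operator.eq, "block"),
    ("land-cover", operator.eq, "urban"),
    ("relief", operator.eq, "moderate"),
]

# Hypothetical region metadata. In the slide's scenario, population and area
# come from structured databases, land cover and relief from image analysis.
REGIONS = [
    {"id": "R1", "population": 7200, "area": 80, "region-type": "block",
     "land-cover": "urban", "relief": "moderate"},
    {"id": "R2", "population": 4100, "area": 95, "region-type": "block",
     "land-cover": "urban", "relief": "moderate"},
    {"id": "R3", "population": 9000, "area": 60, "region-type": "block",
     "land-cover": "forest", "relief": "steep"},
]


def matches(region, constraints):
    """True if the region carries every attribute and satisfies every constraint."""
    return all(
        attr in region and op(region[attr], value)
        for attr, op, value in constraints
    )


def resolve(regions, constraints):
    """Return the ids of the regions that satisfy all constraints."""
    return [r["id"] for r in regions if matches(r, constraints)]


candidates = resolve(REGIONS, MREF_CONSTRAINTS)
print(candidates)
```

The constraints refer only to metadata terms, never to a file or URL, which is what lets the same request range over database rows and image-derived features alike.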
  26. A DL II approach for Information Brokering. CONSTRUCTING ADDITIONAL META-INFORMATION RESOURCES. Physical/Simulation World. DISCOVERING COLLECTIONS OF HETEROGENEOUS INFORMATION AND META-INFORMATION RESOURCES: Images, Data Stores, Documents, Digital Media; Domain Specific Ontologies, Domain Independent Ontologies. CONSTRUCTING APPROPRIATE INFORMATION LANDSCAPES: Iscape 1 ... Iscape N
  27. ADEPT Information Landscape Concept Prototype (a scenario for Digital Earth: learning in the context of the “El Niño” phenomenon). Sample Iscapes Requests: – How does El Niño affect sea animals? Look for broadcast videos of less than 2 minutes. – How are some regions affected by El Niño? Look at East/West Pacific regions. – What disasters have been related to El Niño? – What storm occurrences are attributed to El Niño? – Show reports related to El Niño that contain Clinton. TRY ISCAPE CONCEPT DEMO: request information using keywords, domain-specific attributes, domain-independent attributes
  28. Putting MREFs to work. [Figure labels: User; User Agent; Profile Manager (user information, retrieve profile, change profile); User profiles; MREF Builder (design MREF, construct new MREF; domain ontologies; IQ_Asset ontology + extension ontologies); Broker Agent (MREF request, send MREF, send results, display results); MREF repository (retrieve MREF)]
  29. Context: the lynchpin of semantics. “For instance, if you were to use Yahoo! or Infoseek to search the web for pizza, your results would probably be hundreds of matches for the word pizza. Many of these could be pizza parlors around the world. Yet if you run the same search within NeighborNet, you will allows you to order pizza to be delivered instead of shipped.” From a Press Release of FutureOne, Inc., March 24, 1999, http://home.futureone.com/about/pr/021699.asp Cricket
  30. Constructing c-contexts from ontological terms. Advantages: • Use of ontologies for an intensional domain specific description of data • Representation of extra information → Relationships between objects not represented in the database schema → Using terminological relationships in the ontology. C-Context = <(C1, V1) (C2, V2) ... (Ck, Vk)>: a collection of contextual coordinates Ci (roles) and values Vi (concepts/concept descriptions). C-CONTEXT: “All documents stored in the database have been published by some agency” => Cdef(DOC) = <(hasOrganization, AgencyConcept)>. ONTOLOGICAL TERMS: Document Concept, hasOrganization, Agency Concept. DATABASE OBJECTS: AGENCY(RegNo, Name, Affiliation), DOC(Id, Title, Agency)
  31. Using c-contexts to reason about information in database. EXAMPLE: Cdef(DOC) = <(hasOrganization, AgencyConcept)>; CQ = <(hasOrganization, { “USGS” })>. Reasoning with c-contexts: glb(Cdef(DOC), CQ). Ontological Inferences: DocumentConcept; (hasOrganization, { “USGS” }). glb(Cdef(DOC), CQ) = <(self, DocumentConcept), (hasOrganization, { “USGS” })>. Challenge 1: use of multiple ontologies. Challenge 2: estimating the loss of information
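The glb example on slide 31 can be approximated with a toy sketch. It assumes a c-context is a mapping from contextual coordinates (roles) to values, that a value is either a concept name or an explicit set of instances, and that the greatest lower bound of two values can be stood in for by intersecting their instance sets. The slides do not define glb operationally, so this is a simplification rather than the authors' reasoning procedure; the `INSTANCES` table, the extra agency names in it, and the explicit `self` coordinate on `cdef_doc` are hypothetical.

```python
# Hypothetical toy ontology: the known instances of each concept.
INSTANCES = {"AgencyConcept": {"USGS", "NASA", "EPA"}}


def extension(value):
    """Return the set of instances that a coordinate value denotes."""
    if isinstance(value, (set, frozenset)):
        return set(value)
    return set(INSTANCES[value])


def glb(ctx_a, ctx_b):
    """Combine two c-contexts, keeping the most specific value per coordinate.

    Coordinates present in only one context are carried over unchanged;
    shared coordinates are narrowed by intersecting their extensions.
    """
    result = {}
    for role in ctx_a.keys() | ctx_b.keys():
        if role in ctx_a and role in ctx_b:
            result[role] = extension(ctx_a[role]) & extension(ctx_b[role])
        else:
            result[role] = ctx_a.get(role, ctx_b.get(role))
    return result


# Cdef(DOC) from slide 30, with the 'self' coordinate added explicitly
# (on the slide it appears only in the result, as an ontological inference).
cdef_doc = {"self": "DocumentConcept", "hasOrganization": "AgencyConcept"}
# CQ from slide 31: documents whose organization is USGS.
cq = {"hasOrganization": {"USGS"}}

combined = glb(cdef_doc, cq)
print(combined)
```

Under these assumptions `combined` has the same shape as the slide's result, `<(self, DocumentConcept), (hasOrganization, { “USGS” })>`.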
  32. Estimating information loss for multi-ontology based query processing in the OBSERVER/InfoQuilt system. OBSERVER architecture, Eduardo Mena (III’98). [Figure labels: USER NODE (User Query, Query Processor, Ontology Server, Ontologies, Mappings, Data Repositories); COMPONENT NODE, shown twice (Query Processor, Ontology Server, Ontologies, Mappings, Data Repositories); IRM NODE (IRM, Interontologies Terminological Relationships)]
  33. Estimating information loss for multi-ontology based query processing in the OBSERVER/InfoQuilt system. Query construction - Example, Eduardo Mena (III’98). “Get title and number of pages of books written by Carl Sagan”. User ontology: WN. [name pages] for (AND book (FILLS creator “Carl Sagan”)). Target ontology: Stanford-I. Integrated ontology WN-Stanford-I. [title number-of-pages] for (AND book (FILLS doc-author-name “Carl Sagan”)). Ontologies sites: http://www.cogsci.princeton.edu/~wn/w3wn.html and http://www-ksl.stanford.edu/knowledge-sharing/ontologies/html/bibliographic-data/
  34. Estimating information loss for multi-ontology based query processing in the OBSERVER/InfoQuilt system. Query construction - Example, Eduardo Mena (III’98). Re-use of Knowledge: Bibliography Data Ontology (Stanford-I). Concepts: Biblio-Thing, Document, Book, Edited-Book, Technical-Report, Periodical-Publication, Journal, Magazine, Newspaper, Miscellaneous-Publication, Technical-Manual, Computer-Program, Multimedia-Document, Artwork, Cartographic-Map, Thesis, Doctoral-Thesis, Master-Thesis, Proceedings, Conference, Agent, Person, Author, Organization, Publisher, University
  35. Estimating information loss for multi-ontology based query processing in the OBSERVER/InfoQuilt system. Query construction - Example, Eduardo Mena (III’98). Re-use of Knowledge: A subset of WordNet 1.5. Concepts: Print-Media, Press, Publication, Journalism, Newspaper, Magazine, Book, Periodical, Trade-Book, Brochure, TextBook, Reference-Book, SongBook, PrayerBook, Pictorial, Series, Journals, CookBook, Instruction-Book, WordBook, HandBook, Directory, Annual, Encyclopedia, Manual, Bible, GuideBook, Instructions, Reference-Manual
  36. Estimating information loss for multi-ontology based query processing in the OBSERVER/InfoQuilt system. Query construction - Example, Eduardo Mena (III’98). WN ontology and user query
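The query construction example on slides 33 to 36 shows the same request written with WN terms and with Stanford-I terms. A minimal sketch of that rewriting, assuming a flat term-to-term mapping between the two ontologies: the `Query` class, the `translate` and `render` helpers, and the `WN_TO_STANFORD` table are hypothetical, and the mapping entries are simply read off the two query forms on the slide.

```python
from dataclasses import dataclass

# Hypothetical term mapping from the user ontology (WN) to the target
# ontology (Stanford-I), read off the two query forms on slide 33.
WN_TO_STANFORD = {
    "name": "title",
    "pages": "number-of-pages",
    "creator": "doc-author-name",
    "book": "book",
}


@dataclass(frozen=True)
class Query:
    projections: tuple  # attributes to return
    concept: str        # concept being queried
    fills: tuple        # (role, value) constraints


def translate(query, mapping):
    """Rewrite every ontology term in the query using the mapping."""
    return Query(
        projections=tuple(mapping[p] for p in query.projections),
        concept=mapping[query.concept],
        fills=tuple((mapping[role], value) for role, value in query.fills),
    )


def render(query):
    """Print a query in the bracketed form used on the slides."""
    fills = " ".join(f'(FILLS {role} "{value}")' for role, value in query.fills)
    return f"[{' '.join(query.projections)}] for (AND {query.concept} {fills})"


user_query = Query(("name", "pages"), "book", (("creator", "Carl Sagan"),))
target_query = translate(user_query, WN_TO_STANFORD)
print(render(target_query))
```

In this sketch a term with no entry in the mapping raises `KeyError`. That is the case the following slides address: when no exact counterpart exists, a more general or more specific term has to be substituted, and the substitution costs information.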
  37. Estimating information loss for multi-ontology based query processing in the OBSERVER/InfoQuilt system. Estimating the loss of information, Eduardo Mena (III’98) • To choose the plan with the least loss • To present a level of confidence in the answer • Based on intensional information (terminological difference) • Based on extensional information (precision and recall). Plans in the example. User Query: (AND book (FILLS doc-author-name “Carl Sagan”)). Plan 1: (AND document (FILLS doc-author-name “Carl Sagan”)). Plan 2: (AND periodical-publication (FILLS doc-author-name “Carl Sagan”)). Plan 3: (AND journal (FILLS doc-author-name “Carl Sagan”)). Plan 4: (AND UNION(book, proceedings, thesis, misc-publication, technical-report) (FILLS doc-author-name “Carl Sagan”))
  38. Estimating information loss for multi-ontology based query processing in the OBSERVER/InfoQuilt system. Loss of information based on intensional information, Eduardo Mena (III’98). User Query: (AND book (FILLS doc-author-name “Carl Sagan”)). Plan 1: (AND document (FILLS doc-author-name “Carl Sagan”)). book := (AND publication (AT-LEAST 1 ISBN)). publication := (AND document (AT-LEAST 1 place-of-publication)). Loss: “Instead of books written by Carl Sagan, OBSERVER is providing all the documents written by Carl Sagan (even if they do not have an ISBN and place of publication)”
  39. Estimating information loss for multi-ontology based query processing in the OBSERVER/InfoQuilt system. Example: loss for the plans, Eduardo Mena (III’98). Plan 1: (AND document (FILLS doc-author-name “Carl Sagan”)) [case 2] 91.57% < (1-Loss) < 91.75%. Plan 2: (AND periodical-publication (FILLS doc-author-name “Carl Sagan”)) [case 3] 94.03% < (1-Loss) < 100%. Plan 3: (AND journal (FILLS doc-author-name “Carl Sagan”)) [case 3] 98.56% < (1-Loss) < 100%. Plan 4: (AND UNION(book, proceedings, thesis, misc-publication, technical-report) (FILLS doc-author-name “Carl Sagan”)) [case 1] 0% < (1-Loss) < 7.22%
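Slide 37 says loss can be estimated from extensional information (precision and recall) and used to choose the plan with the least loss. The slides do not give the formula, so the sketch below assumes one common way to combine the two measures: loss is one minus a weighted harmonic combination of precision and recall, with the weight `gamma` as an assumed parameter. The document identifiers and plan result sets are hypothetical, not OBSERVER data, and the sketch does not reproduce the intervals on slide 39.

```python
def precision_recall(retrieved, relevant):
    """Precision and recall of a retrieved set against the relevant set."""
    hits = len(retrieved & relevant)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


def estimated_loss(retrieved, relevant, gamma=0.5):
    """Assumed loss measure: 1 - 1 / (gamma / precision + (1 - gamma) / recall).

    With gamma = 0.5 this is one minus the harmonic mean of precision and recall.
    """
    precision, recall = precision_recall(retrieved, relevant)
    if precision == 0.0 or recall == 0.0:
        return 1.0
    return 1.0 - 1.0 / (gamma / precision + (1.0 - gamma) / recall)


# Hypothetical extension of the user's concept: the books by the author.
relevant = {"b1", "b2", "b3", "b4"}

# Hypothetical result sets for three alternative plans.
plans = {
    "document": {"b1", "b2", "b3", "b4", "d1", "d2", "d3", "d4"},  # too general
    "journal": {"j1", "j2"},                                       # disjoint from books
    "union": {"b1", "b2", "b3", "b4", "t1"},                       # near match
}

losses = {name: estimated_loss(result, relevant) for name, result in plans.items()}
best_plan = min(losses, key=losses.get)
print(best_plan, round(losses[best_plan], 4))
```

In practice the relevant set is exactly what is unknown, so an estimate has to work from bounds on set sizes rather than exact sets, which is why slide 39 reports intervals instead of single values.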
  40. Summary. [Figure labels: Data: Structured Databases; Syntax, System; Federated DB. Metadata: Text, Semi-structured; Structural, Schematic; Mediator, Federated IS. Knowledge: Visual, Scientific/Eng.; Semantic; Knowledge Mgmt., Information Brokering, Cooperative IS]
  41. Agenda for research • Interoperation not at systems level, but at informational and possibly knowledge level – traditional database and information retrieval solutions do not suffice – need to understand context; measures of similarities • Need to increase impetus on semantic level issues involving terminological and contextual differences, possibly perceptual or cognitive differences in future – information systems and humans need to cooperate, possibly involving coordination and collaborative processes
  42. http://lsdis.cs.uga.edu [See publications on Metadata, Semantics, Context, InfoHarness/InfoQuilt] amit@cs.uga.edu Acknowledgements: Tarcisio Lima, Vipul Kashyap. Related Reading • Books: → Information Brokering for Digital Media, Kashyap and Sheth, Kluwer, 1999 (to appear) → Multimedia Data Management: Using Metadata to Integrate and Apply Digital Media, Sheth and Klas Eds., McGraw-Hill, 1998 → Cooperative Information Systems, Papazoglou and Schlageter Eds., Academic Press, 1998 → Management of Heterogeneous and Autonomous Database Systems, Elmagarmid, Rusinkiewicz, Sheth Eds., Morgan Kaufmann, 1998 • Special Issues and Proceedings: → Formal Ontologies in Information Systems, Guarino Ed., IOS Press, 1998 → Semantic Interoperability in Global Information Systems, Ouksel and Sheth, SIGMOD Record, March 1999.
