Patterns of Semantic Integration
Dan McCreary, President
Dan McCreary & Associates
dan@danmccreary.com
(952) 931-9198
MD Metadata Solutions

Licensed Under Creative Commons 3.0
- Attribution. You must attribute the work in the manner specified by the author or licensor.
- Noncommercial. You may not use this work for commercial purposes.
- Share Alike. If you alter, transform, or build upon this work, you may distribute the resulting work only under a license identical to this one.

Patterns of Semantic Integration
Our ever-increasing understanding of solid-state physics has allowed Moore’s Law to proceed unabated for the last 40 years.  Exciting developments in quantum physics, nanotechnology and molecular self-assembly will continue this trend for the foreseeable future.  But why is it that an instructor can’t quickly import a database of 10,000 subject-appropriate lesson plans and quiz items into their learning-management system and dynamically adjust classroom content and assessments to individual student learning styles and interests?  The key to this and other computer-to-computer interoperability challenges lies in the difficulty computer systems have in finding and precisely exchanging data.  Enter the Semantic Web.  The designers of the current World Wide Web realized that the gateway to this does not require faster computers and networks but instead lies in the careful publishing and exchange of data semantics (or meaning) and the precise publishing of data-that-describes-data (metadata) in a machine-readable structure.  This presentation will review patterns that researchers around the world are using to make the job of computer integration easier, allowing even ultimate frisbee™ coaches access to vast amounts of structured information.

Background for Dan McCreary
- Carleton Class of ’82
- Physics Major
- First year of “Computer Science Concentrations” ever granted to a Carleton graduate
- Worked in computer center and Carleton Library with Les Lacroix doing VMS/RMS programming to create first on-line card catalog for science library
- Helped blow up lab equipment for Bruce Thomas
- Semantic Solutions Consultant in Minneapolis

Physics 123
… intended to give students some perspective on the kinds of work done by people with a physics background … discuss their work and work-related experiences
- Physics taught me how to create and use precise models of the world and to discover underlying patterns
- Computer-to-computer communication also requires precise models and the discovery of underlying patterns

Agenda
- The steps required for precise exchange of information between computer systems
- Define “semantics” and key concepts in the semantic web
- HTML, XML, RDF
- Discuss limitations of current HTML web and XML
- Show how Semantic Web technologies attempt to solve many of these problems
- Semantic patterns
- Predictions
- References

Bruce’s Integration Challenge
Diagram labels: The PDP-8; Gamma Ray Spectrometer; Uranium samples from Columbia mines; Ohio Scientific 6502; Carleton VAX; 1024 Channel Accumulator; FFT (Fortran); Tektronix 4014 Terminal; 8-bit teletype port; RS-232 port

1970 Sci-Fi Classic: “The Forbin Project”
A New Intersystem Language!
Lesson: Before you take over the world you must exchange semantically precise metadata!

Moore’s Law
Note: Log Scale
Creative Commons 1.0. Courtesy of Ray Kurzweil and Kurzweil Technologies, Inc.

Thesis: We Need Semantics
For the next revolution in computing
- We don’t need faster CPUs
- We don’t need larger hard drives
- We don’t need faster networks
- We don’t need more HTML linking
- We need to link our concepts using semantic technologies
- There are standard patterns that are used to solve these problems

Patterns
- “Design Patterns” were developed by Christopher Alexander in 1979 in the building architecture domain
- Applied by “Gang of Four” to object-oriented software in 1994
- Each pattern has: Name, Icon; Problem Description; Solution Description; Diagrams; Examples; Related Patterns

The Agent Vision
The Semantic Web will bring structure to the meaningful content of Web pages, creating an environment where software agents roaming from page to page can readily carry out sophisticated tasks for users.
Source: The Semantic Web: A new form of Web content that is meaningful to computers will unleash a revolution of new possibilities, by Tim Berners-Lee, James Hendler and Ora Lassila

Overlapping Terminology
Diagram labels: Data Mining; Statistical Analysis; HTML Web; Pattern Discovery; Business Semantics; Data Dictionary; Data Warehouse; Enterprise Application Integration (EAI); Semantic Web; Relational Database; Metadata; Metadata Discovery

Computer Science Is About Abstraction
Chart of Level of Abstraction against Time, lowest level first:
- Machine Language: 10100101
- Assembly Language: MOV R0, A1 / BNE F32C
- FORTRAN: DO I=1, 100 / I=I+1
- Structured Programming: Proc(i1, i2, o1)
- Object-oriented Programming
- GUI
- XML

Person to Person Dialog
Layers, from higher abstraction to lower:
- Problem Solving
- Conversation
- Sentences
- Concepts
- Words
- Sound

Computer to Computer Dialog
Layers, from higher abstraction to lower (marked “You Are Here” on the slide):
- Agents
- Semantic Integration
- Graphs/Ontologies/RDF/OWL
- Documents/XML Schema
- XML Tags
- Internet

Semantic Triangle
- Concept: a pattern of neural activity in our brain
- Symbol: “cat”, “gato” (Spanish), “katze” (German)
- Referent: physical objects
- Links: Symbolizes, Refers To, Stands For
Source: Ogden, C. K., & Richards, I. A. (1923) The Meaning of Meaning

Symbols Can Only Directly Link to Concepts
- The link between a symbol and its referent is an INDIRECT link
- The path to the referent MUST pass through the Concept
- Only symbols can be transmitted between computers
Diagram labels: Concept; Referent; Symbol “cat”
Source: Ogden, C. K., & Richards, I. A. (1923) The Meaning of Meaning

The Problem of Semantic Ambiguity
- context=hardware
- context=food
Did you say you were looking for mixed nuts?
People use context to derive the correct meaning.

59 meanings of "run"
18 noun "senses" (sense: context):
- tally: "the Yankees scored a run in the bottom of the 9th"
- test: "The experiment ran for over an hour"
- footrace: "she broke mile run record"
- streak: "her run of luck was just starting"
- play: "the football 3rd down play was a run"
- … 13 other noun meanings …
41 verb "senses" (sense: context):
- move fast: "the kids ran to the store"
- scat: "I would run from a ticking bomb."
- go: "The path runs up the hill."
- operate: "you need training to run this machine."
- has form: "the movie plot runs like this."
- … 36 other verb meanings …
Source: WordNet at http://wordnet.princeton.edu/

Analogy: English Dictionary
Diagram labels: Term; Definitions; Metadata (data about data)
Note: people use context to find the correct meaning.
Source: www.m-w.com

Word Senses
A single word maps to many concepts
“run”: footrace, streak, duration, play, test, go, operate, tally, move fast, has form, scat
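
The one-word-to-many-concepts mapping can be sketched as a lookup keyed by context. This is only an illustration: the sense labels come from the slide, while the context names ("baseball", "track", and so on) and the function `sense_of` are assumptions made up for the example.

```python
# Hypothetical sense table: (word, context) -> concept label.
# Sense labels are from the slide; the context keys are invented here.
SENSES = {
    ("run", "baseball"): "tally",
    ("run", "track"): "footrace",
    ("run", "luck"): "streak",
    ("run", "machine"): "operate",
}

def sense_of(word, context):
    """Return the concept a word stands for in a given context, or None."""
    return SENSES.get((word, context))

print(sense_of("run", "baseball"))
print(sense_of("run", "machine"))
```

Without the context key the word alone is ambiguous, which is the point of the slide.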

Synonym Ring
Many symbols for the same object
Diagram labels: Joe Smith; Refers To; Symbolizes; Stands For
<Person>Joe Smith</Person>
<Individual>Joe Smith</Individual>
<Human>Joe Smith</Human>
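
A synonym ring can be sketched as several tag names resolving to one concept identifier. The identifier `concept:person` and the function name `same_concept` are hypothetical; only the three tag names come from the slide.

```python
# Hypothetical synonym ring: three element names, one shared concept.
SYNONYM_RING = {
    "Person": "concept:person",
    "Individual": "concept:person",
    "Human": "concept:person",
}

def same_concept(tag_a, tag_b):
    """True when both tags are known and resolve to the same concept."""
    concept_a = SYNONYM_RING.get(tag_a)
    return concept_a is not None and concept_a == SYNONYM_RING.get(tag_b)

print(same_concept("Person", "Human"))
print(same_concept("Person", "Address"))
```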

I’m Thinking of an Animal…
- It has four legs
- It has fur
- It has whiskers
- It chases mice
- It goes “meow”
Note: since “concepts” are neural patterns in the brain, the concept of “exact” is difficult to measure.
If you describe enough of the properties of a concept, you can have reasonable assurance that two concepts are the same.

Concept Linking
Question: How can you tell if two concepts are the same if two systems don’t share the same symbol?
Answer: If they have the same properties (and relationships) you can assume with reasonable probability that they are the same concept.
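
One way to put a number on "the same properties" is set overlap. The sketch below uses Jaccard similarity, which is a choice made for this example rather than something the slides prescribe; the property lists for `gato` and `dog` are invented for illustration.

```python
def overlap(props_a, props_b):
    """Jaccard similarity: shared properties divided by all properties."""
    a, b = set(props_a), set(props_b)
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)

# "cat" properties follow the slide; "gato" and "dog" are made up here.
cat = {"four legs", "fur", "whiskers", "chases mice", "says meow"}
gato = {"four legs", "fur", "whiskers", "chases mice", "says meow"}
dog = {"four legs", "fur", "says woof"}

print(overlap(cat, gato))  # identical property sets
print(overlap(cat, dog))   # 2 shared properties out of 6 distinct ones
```

A threshold on this score would stand in for the slide's "reasonable probability"; where to set it is a judgment call that depends on the data.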

Concept Overlap
Diagram labels: Robo-Cat; Cat; Kitten

Semantics is About Concept Linking
Wouldn’t it be nice…
- If computers could name things internally or on a web site however they liked (keep using the current web)
- But we could always link those names back to a centralized database of concepts
- Computers could do this automatically just like they translate domain names (www.google.com) into IP addresses (64.233.187.99)
- Then we could communicate precisely without dictating the names that are used inside a computer system or on a web page

HTML Sample
<title>The Problem of Semantics</title>
<p>This is a standard document that is sent between two computers using the <a href="http://w3c.org/Protocols">HTTP</a> protocol.  Note that other than the markup tags like <b>bold</b> there is very little that a computer can do to understand the meaning of the text.</p>
Unless computers "understand" the words in the English language it will be very difficult for them to understand the meaning or semantics of the web.

What Computers "See" Today
<title>The Problem of Semantics</title>
<p>This is a standard document that is sent between two computers using the <a href="http://w3c.org">HTTP</a> protocol.  Note that other than the markup tags like <b>bold</b> there is very little that a computer can do to understand the meaning of the text.</p>
Today computers see the web as linked opaque strings with keywords.

XML allows you to create new “tags”
<tag>data</tag>
<PersonGivenName>Joe</PersonGivenName>
<PersonFamilyName>Smith</PersonFamilyName>
<Address>123 Main Street</Address>
<City>Anytown</City>
<State>Minnesota</State>
<Phone>(651) 555-1234</Phone>
Without a data dictionary, it is difficult to know what the data elements mean.  The tags appear in patterns but what they "mean" is still a mystery to a computer.

Which external computers may not understand
<PersonGivenName>Dan</PersonGivenName>
<PersonFamilyName>McCreary</PersonFamilyName>
<Address>123 Main Street</Address>
<City>Minneapolis</City>
<Phone>(651) 555-1234</Phone>
Without a “data dictionary”, it is difficult to know what the data elements mean.  The tags appear in patterns but what they mean is still a mystery to a computer.
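
To see what a program actually gets from such a document, here is a small sketch using Python's standard `xml.etree.ElementTree`. The `<Record>` wrapper element is an assumption added so the fragment parses as one document; the child elements are from the slide.

```python
import xml.etree.ElementTree as ET

# The slide's elements wrapped in a hypothetical <Record> root element.
doc = """<Record>
  <PersonGivenName>Dan</PersonGivenName>
  <PersonFamilyName>McCreary</PersonFamilyName>
  <City>Minneapolis</City>
</Record>"""

root = ET.fromstring(doc)
# Each child yields a (tag, text) pair; to the parser the tag is only a string.
pairs = [(child.tag, child.text) for child in root]
print(pairs)
```

The parser recovers the structure reliably, but nothing in the output says that `PersonGivenName` is a first name. That knowledge has to come from somewhere else, such as a data dictionary.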

Metadata & Ontologies
- Metadata is any data that describes other data
- Metadata is itself data and is stored in specialized structures (directed graphs) to aid comparison with other metadata
- A controlled store of metadata is called a “registry”
- Complex directed graphs can evolve into “ontologies”
Diagram: Metadata describes Data. Examples shown: source-code; RDBMS; web navigation; tables; org-chart; columns; document keywords; product-specs

Hypertext Links and Data Element Links
- The Hypertext Web: the current HTML web is focused on linking published documents with HTML
- The Semantic Web: the semantic web is about linking conceptual data elements in published metadata registries
Diagram labels: Metadata Registry A; Metadata Registry B

Enter the URI…
- Today's web allows documents to be accessed by people if people put links between documents (the hypertext web)
- But it is very difficult for machines to "understand" what we are saying, what we mean, and what to do with the data
- But machines CAN determine if two URIs match:
<SurName>Smith</SurName>
<LastName>Smith</LastName>
Hey, you both “mean” the same thing!
http://www.shared_dictionary.com/PersonGivenName
Diagram label: MDR
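
The URI comparison can be sketched as plain string equality. In this illustration the registry URI under `example.org`, the two mapping tables, and the function `means_same` are all hypothetical; only the tag names `SurName` and `LastName` come from the slide.

```python
# Hypothetical registry URI; each system maps its own tag name to it.
FAMILY_NAME = "http://example.org/registry/PersonFamilyName"

SYSTEM_A = {"SurName": FAMILY_NAME}
SYSTEM_B = {"LastName": FAMILY_NAME}

def means_same(tag_a, tag_b):
    """Two tags mean the same thing when their URIs are identical strings."""
    uri_a = SYSTEM_A.get(tag_a)
    return uri_a is not None and uri_a == SYSTEM_B.get(tag_b)

print(means_same("SurName", "LastName"))
```

No understanding of English is needed: the machine only checks that two strings are equal.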

Subject-Verb-Object Triple
- Subject: Person
- Verb: Has-a-Given-Name
- Object: “Joe”
The person is named “Joe”.
<PersonGivenName>Joe</PersonGivenName>

Triples are Almost all URIs
- Subject: http://MyDictionary/DataElement/Person
- Verb, the “type” of link: http://MyDictionary/DataElement/PersonGivenName
- Object: “Dan”
URIs can point to a standard location in a metadata registry.
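
A triple can be held in a three-field record. This is a minimal sketch: the class name `Triple` and its field names are assumptions, and the URIs are the illustrative ones from the slide, not resolvable addresses.

```python
from typing import NamedTuple

class Triple(NamedTuple):
    subject: str    # URI of the thing being described
    predicate: str  # URI naming the "type" of link
    object: str     # a URI or, as here, a literal value

t = Triple(
    subject="http://MyDictionary/DataElement/Person",
    predicate="http://MyDictionary/DataElement/PersonGivenName",
    object="Dan",
)

# Count how many of the three parts are URIs rather than literals.
uri_parts = [part for part in t if part.startswith("http://")]
print(len(uri_parts))
```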

Sample RDF Document
<?xml version="1.0"?>
<RDF>
  <Description about="http://www.danmccreary.com/Training/Classes/Semantic_Web">
    <author>Dan McCreary</author>
    <created>2006-01-01</created>
    <modified>2006-03-15</modified>
  </Description>
</RDF>
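
The sample can be unpacked into subject-verb-object triples with a few lines of Python. This sketch treats the simplified sample as plain XML and reads it with the standard `xml.etree.ElementTree`; real RDF/XML uses namespaces and would normally be read with a dedicated RDF parser, so take this as an illustration of the idea only.

```python
import xml.etree.ElementTree as ET

rdf = """<?xml version="1.0"?>
<RDF>
  <Description about="http://www.danmccreary.com/Training/Classes/Semantic_Web">
    <author>Dan McCreary</author>
    <created>2006-01-01</created>
    <modified>2006-03-15</modified>
  </Description>
</RDF>"""

root = ET.fromstring(rdf)
triples = []
for desc in root.findall("Description"):
    subject = desc.get("about")  # the resource being described
    for prop in desc:
        # Each child element becomes one (subject, verb, object) row.
        triples.append((subject, prop.tag, prop.text.strip()))

for row in triples:
    print(row)
```

One `Description` with three properties yields three triples that share the same subject.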

Massive Databases of "Triple Stores"
RDF "Triple Store"
A triple store is:
- A database with just 3 columns
- but millions/billions of rows
May require specialized hardware
Key Metrics:
- Time to load triples into application
- Time to save triples into database
- Time to browse to an element
- Time to configure system
Sample Projects:
- Kowari
- 3Store
- Sesame
See: http://simile.mit.edu/reports/stores/
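
The "three columns, many rows" idea can be sketched as a toy in-memory store. The class `TripleStore`, its methods, and the sample identifiers are hypothetical and say nothing about how Kowari, 3Store, or Sesame are built. Production stores add indexing and persistence that this sketch omits.

```python
class TripleStore:
    """Toy in-memory store: one list of (subject, predicate, object) rows."""

    def __init__(self):
        self.rows = []

    def add(self, s, p, o):
        self.rows.append((s, p, o))

    def match(self, s=None, p=None, o=None):
        """Return rows matching the pattern; None acts as a wildcard."""
        return [
            row for row in self.rows
            if all(want is None or want == got
                   for want, got in zip((s, p, o), row))
        ]

store = TripleStore()
store.add("person:1", "givenName", "Joe")
store.add("person:1", "familyName", "Smith")
store.add("person:2", "givenName", "Dan")

print(store.match(p="givenName"))
print(store.match(s="person:1"))
```

Every query here is a linear scan, which is why the slide's load and browse times become the key metrics once the row count reaches millions.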

Semantic Web Standards Stack
Layers, top to bottom:
- Trusted Semantic Web
- Proof
- Logic
- Rules/Query
- Ontology (OWL)
- RDF Model & Syntax
- XML Query; XML Schema
- XML; Namespaces
- URI/IRI; Unicode
Alongside the stack: Signature; Encryption
Source: Tim Berners-Lee, www.w3c.org
http://www.w3.org/Consortium/Offices/Presentations/SemanticWeb/34.html

Example of Metadata Registry

Hub and Spokes
Goal: create semantic maps to a few metadata standards, not many standards
- Mapping from each metadata registry to N other metadata registries: the O(N²) problem
- Mapping to one metadata registry: the O(N) problem
Diagram: registries R1, R2, R3, R4, R5, R6, R7 … RN connected point to point, and the same registries connected through an ESB hub
(ESB: Enterprise Service Bus)
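
The difference between the two layouts is easy to count. This sketch assumes one two-way mapping per pair of registries in the point-to-point case and one mapping per registry in the hub case; the function names are made up for the example.

```python
def point_to_point(n):
    """Mappings needed when every registry maps directly to every other."""
    return n * (n - 1) // 2

def hub_and_spoke(n):
    """Mappings needed when every registry maps once to a shared hub."""
    return n

for n in (7, 50):
    print(n, point_to_point(n), hub_and_spoke(n))
```

With the 7 registries drawn on the slide that is 21 mappings against 7; at 50 registries it is 1225 against 50.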

Metaphor: The Translator Agent
- Customer (Spanish Only): “Me gustaría una cerveza”
- Translation Service (Speaks Spanish and English): “May I have a beer?”
- Internal Server (English Only): “Coming right up!”

Semantic Mappers and Semantic Brokers
Diagram labels: Report Request in Model A; Metadata Translation Service; Metadata Registry; Metadata Mappings; RDF Queries; XML Results; Model A; Model B; SQL or XMLA Queries in Model B; Data Warehouse (RDBMS); TDS in Model B; XML Response in Model A
XMLA: XML for Analysis
Gartner: Vocabulary-based transformation
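
A translation service of this kind can be sketched as two lookups through a shared set of concepts. Everything named here is hypothetical: the field names in `MODEL_A` and `MODEL_B`, the concept identifiers, and the function `translate` are invented to illustrate vocabulary-based transformation.

```python
# Hypothetical field mappings from each model to shared registry concepts.
MODEL_A = {"GivenName": "person.given_name", "SurName": "person.family_name"}
MODEL_B = {"first_nm": "person.given_name", "last_nm": "person.family_name"}

def translate(record, source, target):
    """Rename a record's fields from the source model to the target model."""
    # Invert the target mapping so a concept leads back to a target field.
    concept_to_target = {concept: name for name, concept in target.items()}
    return {
        concept_to_target[source[field]]: value
        for field, value in record.items()
        if field in source and source[field] in concept_to_target
    }

request_a = {"GivenName": "Joe", "SurName": "Smith"}
print(translate(request_a, MODEL_A, MODEL_B))
```

Neither system needs to know the other's field names. Each only maintains its own mapping to the shared concepts, which is what keeps the mapping count linear.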

Wikipedia Rocks!
- Knowledge is growing at an exponential rate
- The more there is out there, the more need there is to re-use rather than reinvent knowledge
- Tools can extract 50M RDF triples
- How many instructors share their database of exam questions and the effectiveness of each question?
See: Wikipedia: “Semantic Wiki”

Open Source Learning Mgmt. System

Retrieving Data: An Evolution
Increasing responsiveness: from monthly “Green Bar” reports to a browseable graphical interface (PivotTables, Cognos)
- Shorten the time-to-report interval
- Allow users to "browse" data sets interactively
- Remove programmers with "backlogs" of reports
- Users frequently waited days, weeks, or months to get a custom report created

Metadata Discovery
Tools that “scan” data sources and create new ontologies or mappings to existing ontologies
Diagram labels: Relational Database; Metadata Registry; Data Source Mappings

Classification and Categorization
Whenever we break the continuous observable world into a predefined list of labeled categories, we call each label a categorical value.  These will then become the "dimensions" of our cube.
Discrete breaks in continuous values become “rules”
- Color: "green", "red", "blue" (Note: NO OVERLAP!)
- Expense, from $0 up to $500: “normal expense”
- Expense, above $500: “large expense” (requires supervisor approval)
George Lakoff: Women, Fire, and Dangerous Things: What Categories Reveal about the Mind
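
The expense rule can be written as a one-line classifier. The function name is hypothetical, and treating exactly $500 as a large expense is an assumption, since the slide does not say which side of the break the boundary falls on.

```python
def expense_category(amount):
    """Map a continuous dollar amount onto one of two non-overlapping labels."""
    if amount < 0:
        raise ValueError("amount must be non-negative")
    # Assumption: the boundary value itself counts as a large expense.
    return "large expense" if amount >= 500 else "normal expense"

print(expense_category(120.00))
print(expense_category(750.00))
```

Because every amount lands in exactly one category, the labels can serve as a dimension without double counting.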

Federated Ontologies
What do you do when you have more than one Ontology?
1) Combine
2) Map
3) Federate
Tools for combination and federation

“Linking is Power”
Multiple Overlapping Ontologies

Cost of Poor Semantics
- Information Technology Departments can spend 40-60% of their costs on Integration
- 90% of integration costs are due to poor semantics
- If every application used and "published" a machine readable ontology with mappings to published ontologies, integration could be almost "automatic"

Gartner
Metadata cast into formal logics will drive interoperability, automation, cost cutting, better search capabilities and new business opportunities.
Source: Semantic Web Drives Data Management, Automation and Knowledge Discovery, Alexander Linden, March 2005, G00125145

Semantic Spectrum
Chart of Semantic Precision against Time/Money, from strong semantics (high precision) down to weak semantics:
- Ontologies
- Taxonomies
- OWL
- Enterprise Data Models
- Concept Maps
- Controlled Vocabularies
- RDF
- Thesaurus
- UML, XMI
- Glossaries
- XML, XSLT
- Word/HTML
See also: Wikipedia/semantic spectrum

Structures for Increased Semantics
In order of increased semantic precision: HTML, PDF, Word, PowerPoint, Excel, Access, Server, XML, RDBMS, RDF, Taxonomies, Ontologies
Also shown: SOA; WSDL
Source: Network Inference

Friend of a Friend
A "Proof of Concept for RDF"
- Requires each person to put an RDF file on their web pages
- System in place to prevent spammers from getting e-mail accounts
- Sample RDF vocabulary
