Semantic Integration Patterns

How software developers need to manage metadata and data dictionaries to make software integration faster and more cost effective. This presentation is a general overview of the concepts around data semantics for college-level students. This presentation was originally created for a seminar at Carleton College.

Usage Rights

CC Attribution-NonCommercial-ShareAlike License


    Semantic Integration Patterns: Presentation Transcript

    • Patterns of Semantic Integration
      Dan McCreary
      President
      Dan McCreary & Associates
      dan@danmccreary.com
      (952) 931-9198
      Metadata Solutions
    • Licensed Under Creative Commons 3.0
      2
      Creative Commons 3.0
      Attribution. You must attribute the work in the manner specified by the author or licensor.
      Noncommercial. You may not use this work for commercial purposes.
      Share Alike. If you alter, transform, or build upon this work, you may distribute the resulting work only under a license identical to this one.
    • Patterns of Semantic Integration
      Our ever-increasing understanding of solid-state physics has allowed Moore’s Law to proceed unabated for the last 40 years. Exciting developments in quantum physics, nanotechnology and molecular self-assembly will continue this trend for the foreseeable future. But why is it that an instructor can’t quickly import a database of 10,000 subject-appropriate lesson plans and quiz items into their learning-management system and dynamically adjust classroom content and assessments to individual student learning styles and interests? The key to this and other computer-to-computer interoperability challenges lies in the difficulty computer systems have in finding and precisely exchanging data. Enter the Semantic Web. The designers of the current World Wide Web realized that the gateway to this does not require faster computers and networks but instead lies in the careful publishing and exchange of data semantics (or meaning) and the precise publishing of data-that-describes-data (metadata) in a machine-readable structure. This presentation will review patterns that researchers around the world are using to make the job of computer integration easier, allowing even ultimate frisbee™ coaches access to vast amounts of structured information.
      3
    • Background for Dan McCreary
      Carleton Class of ’82
      Physics Major
      First year of “Computer Science Concentrations” ever granted to a Carleton graduate
      Worked in computer center and Carleton Library with Les Lacroix doing VMS/RMS programming to create first on-line card catalog for science library
      Helped blow up lab equipment for Bruce Thomas
      Semantic Solutions Consultant in Minneapolis
      4
    • 5
    • 6
      Physics 123
      … intended to give students some perspective on the kinds of work done by people with a physics background…discuss their work and work-related experiences
      Physics taught me how to create and use precise models of the world and to discover underlying patterns
      Computer-to-computer communication also requires precise models and the discovery of underlying patterns
    • 7
      Agenda
      The steps required for precise exchange of information between computer systems
      Define “semantics” and key concepts in the semantic web
      • HTML, XML, RDF
      Discuss limitations of current HTML web and XML
      Show how Semantic Web technologies attempt to solve many of these problems
      Semantic patterns
      Predictions
      References
    • 8
      Bruce’s Integration Challenge
      The PDP-8
      Gamma Ray Spectrometer
      Uranium samples from Columbia mines
      Ohio Scientific 6502
      Carleton VAX
      1024-Channel Accumulator
      FFT (Fortran)
      Tektronix 4014 Terminal
      8-bit teletype port
      RS-232 port
    • 9
      1970 Sci-Fi Classic: “The Forbin Project”
      A New
      Intersystem
      Language!
      Lesson: Before you take over the world you must exchange semantically precise metadata!
    • 10
      Moore’s Law
      Note:
      Log
      Scale
      Creative Commons 1.0 Courtesy of Ray Kurzweil and Kurzweil Technologies, Inc
    • 11
      Thesis: We Need Semantics
      For the next revolution in computing
      We don’t need faster CPUs
      We don’t need larger hard drives
      We don’t need faster networks
      We don’t need more HTML linking
      We need to link our concepts using semantic technologies
      There are standard patterns that are used to solve these problems
    • 12
      Patterns
      “Design Patterns” were developed by Christopher Alexander in 1979 in the building architecture domain
      Applied by “Gang of Four” to object-oriented software in 1994
      Each pattern has:
      Name, Icon
      Problem Description
      Solution Description
      Diagrams
      Examples
      Related Patterns
    • 13
      The Agent Vision
      The Semantic Web will bring structure to the meaningful content of Web pages, creating an environment where software agents roaming from page to page can readily carry out sophisticated tasks for users.
      The Semantic Web
      A new form of Web content that is meaningful to computers will unleash a revolution of new possibilities
      By Tim Berners-Lee, James Hendler and Ora Lassila
    • Overlapping Terminology
      Data Mining
      Statistical Analysis
      HTML Web
      Pattern Discovery
      Business Semantics
      Data Dictionary
      Data Warehouse
      Enterprise Application Integration (EAI)
      Semantic Web
      Relational Database
      Metadata
      Metadata Discovery
      14
    • Computer Science Is About Abstraction
      Figure: level of abstraction (vertical axis) rising over time (horizontal axis)
      Machine Language: 10100101
      Assembly Language: MOV R0, A1; BNE F32C
      FORTRAN: DO I=1, 100; I=I+1
      Structured Programming: Proc(i1, i2, o1)
      Object-oriented Programming
      GUI
      XML
      15
    • 16
      Person to Person Dialog
      higher abstraction
      Problem Solving
      Conversation
      Sentences
      Concepts
      Words
      Sound
    • 17
      Computer to Computer Dialog
      You Are
      Here
      Agents
      Semantic Integration
      Graphs/Ontologies/RDF/OWL
      Documents/XML Schema
      XML Tags
      Internet
    • 18
      Semantic Triangle
      A pattern of neural activity in our brain
      Concept
      Refers To
      Symbolizes
      Symbol
      Referent
      “cat”
      “gato” (Spanish)
      Stands For
      “katze” (German)
      Physical Objects
      Ogden, C. K., & Richards, I. A. (1923) The Meaning of Meaning
    • 19
      Symbols Can Only Directly Link to Concepts
      The link between a symbol and its referent is an INDIRECT link
      The path from symbol to referent MUST pass through the Concept
      Only symbols can be transmitted between computers
      Concept
      Referent
      Symbol
      “cat”
      Ogden, C. K., & Richards, I. A. (1923) The Meaning of Meaning
    • 20
      The Problem of Semantic Ambiguity
      context=hardware
      context=food
      Did you say you were looking for mixed nuts?
      People use context to derive the correct meaning.
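      The role of context can be sketched in a few lines of Python. This is only an illustration: the sense inventory, its wording, and the function name resolve_sense are assumptions made for the example, not part of the original slides.

```python
# Hypothetical sense inventory keyed by (word, context); the entries are
# illustrative assumptions, not taken from a real dictionary.
SENSES = {
    ("nuts", "hardware"): "threaded fasteners that screw onto bolts",
    ("nuts", "food"): "edible seeds such as almonds or cashews",
}


def resolve_sense(word, context):
    """Return the sense of `word` in `context`, or None if it stays ambiguous."""
    return SENSES.get((word.lower(), context))


if __name__ == "__main__":
    print(resolve_sense("nuts", "food"))
    print(resolve_sense("nuts", None))  # no context, so no sense can be chosen
```

      Without a context the lookup returns nothing, which is roughly the position a computer is in when it receives only the symbol.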
    • 21
      59 meanings of "run"
      18 noun "senses" (context: example)
      tally: "the Yankees scored a run in the bottom of the 9th"
      test: "The experiment ran for over an hour"
      footrace: "she broke the mile run record"
      streak: "her run of luck was just starting"
      play: "the football 3rd down play was a run"
      13 other noun meanings…
      41 verb "senses" (context: example)
      move fast: "the kids ran to the store"
      scat: "I would run from a ticking bomb."
      go: "The path runs up the hill."
      operate: "you need training to run this machine."
      has form: "the movie plot runs like this."
      36 other verb meanings…
      Source: WordNet at http://wordnet.princeton.edu/
    • 22
      Analogy: English Dictionary
      Term
      Metadata (data about data)
      Definitions
      Note: people use
      context to find
      the correct meaning.
      source: www.m-w.com
    • 23
      Word Senses
      footrace
      streak
      duration
      play
      test
      go
      operate
      tally
      move fast
      has form
      scat
      A single word maps
      To many concepts
      “run”
    • 24
      Synonym Ring
      Joe Smith
      Refers To
      Symbolizes
      Many symbols for the same object
      Stands For
      <Person>Joe Smith</Person>
      <Individual>Joe Smith</Individual>
      <Human>Joe Smith</Human>
    • 25
      I’m Thinking of an Animal…
      Note: since “concepts” are neural patterns in the brain, the concept of “exact” is difficult to measure
      It has four legs
      It has fur
      It has whiskers
      It chases mice
      It goes “meow”
      If you describe enough of the properties of a concept, you can have reasonable assurance that two concepts are the same
    • 26
      Concept Linking
      symbol
      Question: How can you tell if two concepts are the same if two systems don’t share the same symbol?
      Answer: If they have the same properties (and relationships)
      you can assume with reasonable probability they are the same concept
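      One way to make "the same properties" measurable is to compare property sets. The sketch below uses Jaccard similarity, which is a technique chosen for this example rather than one named in the slides; the property lists, the 0.8 threshold and all function names are assumptions.

```python
def property_overlap(props_a, props_b):
    """Jaccard similarity: shared properties divided by all distinct properties."""
    if not props_a and not props_b:
        return 0.0
    return len(props_a & props_b) / len(props_a | props_b)


# Hypothetical descriptions published by three systems that use different symbols.
system_a = {"symbol": "cat",
            "properties": {"four legs", "fur", "whiskers", "chases mice", "says meow"}}
system_b = {"symbol": "gato",
            "properties": {"four legs", "fur", "whiskers", "chases mice", "says meow"}}
system_c = {"symbol": "dog",
            "properties": {"four legs", "fur", "barks"}}

THRESHOLD = 0.8  # assumed cutoff for "reasonable probability"


def probably_same_concept(a, b, threshold=THRESHOLD):
    """True when two descriptions share enough properties to be linked."""
    return property_overlap(a["properties"], b["properties"]) >= threshold


if __name__ == "__main__":
    print(probably_same_concept(system_a, system_b))
    print(probably_same_concept(system_a, system_c))
```

      The symbols "cat" and "gato" never need to match; only the published properties are compared.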
    • 27
      Concept Overlap
      Robo-Cat
      Cat
      Kitten
    • 28
      Semantics is About Concept Linking
      Wouldn’t it be nice…
      If computers could name things internally or on a web site however they liked (keep using the current web)
      But we could always link those names back to a centralized database of concepts
      Computers could do this automatically just like they translate domain names (www.google.com) into IP addresses (64.233.187.99)
      Then we could communicate precisely without dictating the names that are used inside a computer system or on a web page
    • 29
      HTML Sample
      <title>The Problem of Semantics</title>
      <p>This is a standard document that is sent between two computers using the <a href="http://w3c.org/Protocols">HTTP</a> protocol. Note that other than the markup tags like <b>bold</b> there is very little that a computer can do to understand the meaning of the text.</p>
      Unless computers "understand" the words in the English language it will be very difficult for them to understand the meaning or semantics of the web.
    • 30
      What Computers "See" Today
      <title>The Problem of Semantics</title>
      <p>This is a standard document that is sent between two computers using the <a href="http://w3c.org">HTTP</a> protocol. Note that other than the markup tags like <b>bold</b> there is very little that a computer can do to understand the meaning of the text.</p>
      • Today computers see the web as linked opaque strings with keywords
      • Unless computers "understand" the words in the English language it will be very difficult for them to understand the meaning or semantics of the web
    • 31
      XML allows you to create new “tags”
      <tag>
      </tag>
      data
      <PersonGivenName>Joe</PersonGivenName>
      <PersonFamilyName>Smith</PersonFamilyName>
      <Address>123 Main Street</Address>
      <City>Anytown</City>
      <State>Minnesota</State>
      <Phone>(651) 555-1234</Phone>
      Without a data dictionary, it is difficult to know what the meaning of the data elements is. The tags appear in patterns but what they "mean" is still a mystery to a computer.
    • 32
      Which external computers may not understand
      <PersonGivenName>Dan</PersonGivenName>
      <PersonFamilyName>McCreary</PersonFamilyName>
      <Address>123 Main Street</Address>
      <City>Minneapolis</City>
      <Phone>(651) 555-1234</Phone>
      Without a “data dictionary”, it is difficult to know what the meaning of the data elements is. The tags appear in patterns but what they mean is still a mystery to a computer.
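      A short Python sketch of the same point: a parser can read every tag, but the tags only gain meaning when they can be looked up somewhere. The enclosing Person element, the dictionary entries and the function name describe are assumptions added for the example.

```python
import xml.etree.ElementTree as ET

# The record from the slide, wrapped in an assumed <Person> root element
# so that it is a well-formed XML document.
RECORD = """<Person>
  <PersonGivenName>Dan</PersonGivenName>
  <PersonFamilyName>McCreary</PersonFamilyName>
  <Address>123 Main Street</Address>
  <City>Minneapolis</City>
  <Phone>(651) 555-1234</Phone>
</Person>"""

# Hypothetical data dictionary that covers only some of the tags.
DATA_DICTIONARY = {
    "PersonGivenName": "The first (given) name of a person.",
    "PersonFamilyName": "The last (family) name of a person.",
}


def describe(xml_text, dictionary):
    """Return (tag, value, definition) for each child element.

    Tags missing from the dictionary are reported as "UNKNOWN": the parser
    sees the pattern of tags but has no definition for them.
    """
    root = ET.fromstring(xml_text)
    return [(child.tag, child.text, dictionary.get(child.tag, "UNKNOWN"))
            for child in root]


if __name__ == "__main__":
    for tag, value, definition in describe(RECORD, DATA_DICTIONARY):
        print(f"{tag} = {value!r}: {definition}")
```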
    • 33
      Metadata
      Metadata & Ontologies
      Metadata is any data that describes other data
      Metadata is itself data and is stored in specialized structures (directed graphs) to aid comparison with other metadata
      A controlled store of metadata is called a “registry”
      Complex directed graphs can evolve into “ontologies”
      describes
      Data
      source-code
      RDBMS
      web navigation
      tables
      org-chart
      columns
      document keywords
      product-specs
    • 34
      Hypertext Links and Data Element Links
      The Hypertext Web
      MetadataRegistry A
      MetadataRegistry B
      The Semantic Web
      The semantic web is about linking conceptual data elements in published metadata registries
      The current HTML web is focused on linking published documents with HTML
    • 35
      Enter the URI…
      Today's web allows documents to be accessed by people if people put links in between documents – the hypertext web
      But it is very difficult for machines to "understand" what we are saying and what we mean and what to do with the data
      But machines CAN determine if two URIs match:
      <SurName>Smith</SurName>
      <LastName>Smith</LastName>
      Hey, you both “mean” the same thing!
      http://www.shared_dictionary.com/PersonGivenName
      MDR
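      The URI comparison can be sketched as follows. The example.org URIs, the mapping table and the function name same_meaning are hypothetical; a real system would obtain these mappings from a metadata registry (MDR) rather than from a hard-coded dictionary.

```python
# Hypothetical mappings from each system's local tag name to a shared
# dictionary URI.
TAG_TO_URI = {
    "SurName": "http://example.org/dictionary/PersonFamilyName",
    "LastName": "http://example.org/dictionary/PersonFamilyName",
    "FirstName": "http://example.org/dictionary/PersonGivenName",
}


def same_meaning(tag_a, tag_b, mappings=TAG_TO_URI):
    """Two tags mean the same thing when both map to the same URI."""
    uri_a = mappings.get(tag_a)
    uri_b = mappings.get(tag_b)
    return uri_a is not None and uri_a == uri_b


if __name__ == "__main__":
    print(same_meaning("SurName", "LastName"))
    print(same_meaning("SurName", "FirstName"))
```

      The machine never interprets the words "SurName" or "LastName"; it only tests two strings for equality, which is something it can do reliably.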
    • 36
      Subject-Verb-Object Triple
      Person
      Has-a-Given-Name
      The person is named “Joe”.
      “Joe”
      <PersonGivenName>Joe</PersonGivenName>
    • 37
      Triples are Almost all URIs
      http://MyDictionay/DataElement/Person
      http://MyDictionay/DataElement/PersonGivenName
      “Dan”
      The “type” of link.
      URIs can point to a standard location in a metadata registry.
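      As a minimal sketch, a triple can be held in a three-field record. The Triple class and the is_uri helper are names invented for this example; the base URI is copied from the slide as written.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    """One subject-predicate-object statement."""
    subject: str
    predicate: str
    obj: str


# Base URI copied from the slide as written.
BASE = "http://MyDictionay/DataElement/"

triple = Triple(
    subject=BASE + "Person",
    predicate=BASE + "PersonGivenName",  # the "type" of link
    obj="Dan",                           # a literal value, not a URI
)


def is_uri(value):
    """Rough check used only for this illustration."""
    return value.startswith(("http://", "https://"))


if __name__ == "__main__":
    for name, value in triple._asdict().items():
        print(name, "URI" if is_uri(value) else "literal", value)
```

      Here the subject and predicate are URIs and only the object is a literal, which is the sense in which triples are "almost all URIs".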
    • 38
      Sample RDF Document
      <?xml version="1.0"?>
      <RDF>
      <Description about="http://www.danmccreary.com/Training/Classes/Semantic_Web">
      <author>Dan McCreary</author>
      <created>2006-01-01</created>
      <modified>2006-03-15</modified>
      </Description>
      </RDF>
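      The sample document can be read as three triples about one subject. The sketch below parses the slide's simplified form with Python's standard library; real RDF/XML qualifies these elements with namespaces (for example rdf:Description and rdf:about), and the function name extract_triples is an assumption.

```python
import xml.etree.ElementTree as ET

# The simplified document from the slide (no namespaces).
RDF_TEXT = """<?xml version="1.0"?>
<RDF>
  <Description about="http://www.danmccreary.com/Training/Classes/Semantic_Web">
    <author>Dan McCreary</author>
    <created>2006-01-01</created>
    <modified>2006-03-15</modified>
  </Description>
</RDF>"""


def extract_triples(rdf_text):
    """Turn each property of each Description into (subject, predicate, object)."""
    root = ET.fromstring(rdf_text)
    triples = []
    for description in root.findall("Description"):
        subject = description.get("about")
        for prop in description:
            triples.append((subject, prop.tag, (prop.text or "").strip()))
    return triples


if __name__ == "__main__":
    for triple in extract_triples(RDF_TEXT):
        print(triple)
```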
    • 39
      Massive Databases of "Triple Stores"
      RDF "Triple Store"
      Triple store is:
      - A database with just 3 Columns
      - but millions/billions of rows
      May require specialized hardware
      Key Metrics:
      - Time to load triples into application
      - Time to save triples into database
      - Time to browse to an element
      - Time to configure system
      Sample Projects:
      • Kowari
      • 3Store
      • Sesame
      See: http://simile.mit.edu/reports/stores/
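      The "database with just 3 columns" idea can be tried with an in-memory SQLite table. This is a toy sketch, not how the listed projects are implemented; the table name, the ex: identifiers and the function names are assumptions.

```python
import sqlite3


def make_store():
    """Create an in-memory table with exactly three columns."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE triples (subject TEXT, predicate TEXT, object TEXT)")
    return conn


def add_triples(conn, triples):
    """Insert an iterable of (subject, predicate, object) tuples."""
    conn.executemany("INSERT INTO triples VALUES (?, ?, ?)", triples)


def match(conn, subject=None, predicate=None, obj=None):
    """Return triples matching the pattern; None acts as a wildcard."""
    cursor = conn.execute(
        "SELECT subject, predicate, object FROM triples "
        "WHERE (? IS NULL OR subject = ?) "
        "AND (? IS NULL OR predicate = ?) "
        "AND (? IS NULL OR object = ?) "
        "ORDER BY rowid",
        (subject, subject, predicate, predicate, obj, obj),
    )
    return cursor.fetchall()


store = make_store()
add_triples(store, [
    ("ex:dan", "ex:name", "Dan McCreary"),
    ("ex:dan", "ex:knows", "ex:bill"),
    ("ex:bill", "ex:name", "Bill Titus"),
])

if __name__ == "__main__":
    print(match(store, predicate="ex:name"))
    print(match(store, subject="ex:dan", predicate="ex:knows"))
```

      With millions or billions of rows, load time and lookup time become the metrics that matter, which is why dedicated triple stores exist.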
    • 40
      Semantic Web Standards Stack
      Trusted Semantic Web
      Proof
      Logic
      Rules/Query
      Signature
      Encryption
      Ontology (OWL)
      RDF Model & Syntax
      XML Query
      XML Schema
      XML
      Namespaces
      URI/IRI
      Unicode
      Source: Tim Berners-Lee www.w3c.org
      http://www.w3.org/Consortium/Offices/Presentations/SemanticWeb/34.html
    • 41
      Example of Metadata Registry
    • 42
      Hub and Spokes
      Goal: create semantic maps to a few metadata standards, not to many
      Figure: registries R1 through RN mapped directly to one another (point-to-point), compared with the same registries each mapped to a central ESB hub
      Mapping each metadata registry to N other metadata registries: the O(N²) problem
      Mapping each metadata registry to one central registry: the O(N) problem
      (ESB: Enterprise Service Bus)
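      The difference in growth can be checked with simple arithmetic. The sketch assumes that a separate mapping is needed for each direction of translation; the function names are made up for the example.

```python
def point_to_point_mappings(n):
    """Every registry maps directly to each of the other n - 1 registries."""
    return n * (n - 1)


def hub_mappings(n):
    """Every registry maps only to and from one central hub."""
    return 2 * n


if __name__ == "__main__":
    for n in (3, 7, 20, 100):
        print(n, point_to_point_mappings(n), hub_mappings(n))
```

      For the seven registries R1 to R7 this gives 42 direct mappings against 14 through a hub, and the gap widens quadratically as registries are added.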
    • 43
      Metaphor: The Translator Agent
      Coming
      right up!
      May I have a beer?
      Me gustaría una cerveza
      Translation
      Service
      (Speaks Spanish and English)
      Internal
      Server
      (English Only)
      Customer
      (Spanish Only)
    • 44
      Semantic Mappers and Semantic Brokers
      Metadata Registry
      Metadata Translation Service
      RDF Queries
      Metadata Mappings (Model A, Model B)
      XML Results
      Report Request in Model A
      SQL or XMLA Queries in Model B
      Data Warehouse (RDBMS)
      TDS in Model B
      XML Response in Model A
      XMLA: XML for Analysis
      Gartner: Vocabulary-based transformation
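      The broker's job, translating a request from Model A into Model B and the response back again, can be sketched with two lookup tables. The element names on the Model B side, the stand-in warehouse and all function names are assumptions; a real broker would load its mappings from the metadata registry.

```python
# Hypothetical mappings between Model A element names and Model B column names.
A_TO_B = {"PersonGivenName": "first_name", "PersonFamilyName": "last_name"}
B_TO_A = {b: a for a, b in A_TO_B.items()}


def translate(record, mapping):
    """Rename every key of `record`; unmapped keys raise KeyError."""
    return {mapping[key]: value for key, value in record.items()}


def fake_warehouse(query_b):
    """Stand-in for the data warehouse: it understands Model B names only."""
    rows = [
        {"first_name": "Dan", "last_name": "McCreary"},
        {"first_name": "Joe", "last_name": "Smith"},
    ]
    for row in rows:
        if all(row.get(key) == value for key, value in query_b.items()):
            return row
    return {}


def broker(request_a, backend):
    """Translate A -> B, call the backend, translate the answer B -> A."""
    request_b = translate(request_a, A_TO_B)
    response_b = backend(request_b)
    return translate(response_b, B_TO_A)


if __name__ == "__main__":
    print(broker({"PersonFamilyName": "Smith"}, fake_warehouse))
```

      The requester and the warehouse each keep their own vocabulary; only the broker needs to know both.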
    • 45
      Wikipedia Rocks!
      Knowledge is growing at an exponential rate
      The more there is out there, the more need there is to re-use rather than reinvent knowledge
      Tools can extract 50M RDF triples
      How many instructors share their database of exam questions and the effectiveness of each question?
      See: Wikipedia: “Semantic Wiki”
    • 46
      Open Source Learning Mgmt. System
    • 47
      Retrieving Data: An Evolution
      Increasing Responsiveness
      Monthly “Green Bar” Reports
      Browsable Graphical Interface
      (PivotTables, Cognos)
      Shorten the time-to-report interval
      Allow users to "browse" data sets interactively
      Remove programmers with "backlogs" of reports
      Users frequently waited days, weeks or months to get a custom report created
    • 48
      Metadata Discovery
      Tools that “scan” data sources and create new ontologies or mappings to existing ontologies
      Relational Database
      Metadata Registry
      Data Source Mappings
    • 49
      Classification and Categorization
      Whenever we decide to break the continuous observable world into a predefined list of categories, where each category has a label, we call this a categorical value. These will then become the "dimensions" of our cube.
      Discrete breaks in continuous values become “rules”
      "green"
      "red"
      "blue"
      Note: NO OVERLAP!
      $500
      $0
      “normal expense”
      “large expense” (requires supervisor approval)
      George Lakoff: Women, Fire, and Dangerous Things: What Categories Reveal about the Mind
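      A discrete break in a continuous value can be written as a one-line rule. The sketch assumes that an expense of exactly $500 counts as large, which the slide does not specify; the function name is also an assumption.

```python
def classify_expense(amount, threshold=500.00):
    """Map a continuous dollar amount onto one of two non-overlapping categories.

    Assumption: an amount exactly equal to the threshold is "large expense".
    """
    if amount < 0:
        raise ValueError("expense amounts cannot be negative")
    return "normal expense" if amount < threshold else "large expense"


if __name__ == "__main__":
    for amount in (0, 499.99, 500, 1200):
        print(amount, classify_expense(amount))
```

      Because every amount falls on exactly one side of the threshold, the categories cannot overlap.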
    • 50
      Federated Ontologies
      What do you do when you have more than one Ontology?
      1) Combine
      2) Map
      3) Federate
      • Tools for combination and federation
      • “Linking is Power”
      Multiple Overlapping Ontologies
    • 51
      Cost of Poor Semantics
      Information Technology Departments can spend 40-60% of their costs on Integration
      90% of integration costs are due to poor semantics
      If every application used and "published" a machine readable ontology with mappings to published ontologies integration could be almost "automatic"
    • 52
      Gartner
      Metadata cast into formal logics will drive interoperability, automation, cost cutting, better search capabilities and new business opportunities.
      Semantic Web Drives Data Management, Automation and Knowledge and Discovery
      Alexander Linder
      March 2005
      G00125145
    • 53
      Semantic Spectrum
      High Semantic Precision
      Strong Semantics
      Ontologies
      Taxonomies
      OWL
      Enterprise Data Models
      Concept Maps
      Controlled Vocabularies
      RDF
      Thesaurus
      UML, XMI
      Glossaries
      XML, XSLT
      Word/HTML
      Weak Semantics
      Time/Money
      See also: Wikipedia/semantic spectrum
    • 54
      Structures for Increased Semantics
      HTML PDF Word PowerPoint Excel Access Server XML RDBMS RDF Taxonomies Ontologies
      SOA
      WSDL
      Increased Semantic Precision
      Source: Network Inference
    • 55
      Friend of a Friend
      • A "Proof of Concept for RDF"
      • Requires each person to put an RDF file on their web pages
      • System in place to prevent spammers from getting e-mail accounts
      • Sample RDF vocabulary
      • Sample FoaF file:
      <foaf:Person>
        <foaf:name>Dan McCreary</foaf:name>
        <foaf:knows>
          <foaf:Person>
            <foaf:name>Bill Titus</foaf:name>
          </foaf:Person>
        </foaf:knows>
      </foaf:Person>
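      A FoaF file like this can be read with Python's standard XML parser. The sketch wraps the sample in an rdf:RDF element with namespace declarations, which the slide omits, so that it parses as a complete document; the function name who_knows_whom is an assumption.

```python
import xml.etree.ElementTree as ET

FOAF_NS = "http://xmlns.com/foaf/0.1/"
RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"

# The sample from the slide inside an assumed rdf:RDF wrapper.
FOAF_TEXT = f"""<rdf:RDF xmlns:rdf="{RDF_NS}" xmlns:foaf="{FOAF_NS}">
  <foaf:Person>
    <foaf:name>Dan McCreary</foaf:name>
    <foaf:knows>
      <foaf:Person>
        <foaf:name>Bill Titus</foaf:name>
      </foaf:Person>
    </foaf:knows>
  </foaf:Person>
</rdf:RDF>"""


def who_knows_whom(foaf_text):
    """Return (person, acquaintance) name pairs for each top-level Person."""
    ns = {"foaf": FOAF_NS}
    root = ET.fromstring(foaf_text)
    pairs = []
    for person in root.findall("foaf:Person", ns):
        name = person.findtext("foaf:name", namespaces=ns)
        for friend in person.findall("foaf:knows/foaf:Person", ns):
            pairs.append((name, friend.findtext("foaf:name", namespaces=ns)))
    return pairs


if __name__ == "__main__":
    print(who_knows_whom(FOAF_TEXT))
```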
    • 56
      Ontology Architectures
      One "big" ontology (see Cycorp, cyc.com)
      Using a single "Uber-Ontology"
      Akin to "Boiling the Ocean"
      Compared to:
      Many smaller ontologies
      Micro-formats (RDF/A)
      How to combine?
      CYC contains over
      3 Million "assertions"
      Source: cyc.com
    • 57
      If You Give A Kid A Hammer…
      …the whole world becomes a nail
      People solve problems with the tools they know
      Semantics are new tools for solving computer-to-computer communication problems
      Intelligent agents will be prevalent when we teach organizations to publish their metadata
      Example: Procedural vs. Declarative Programming
    • 58
      Cognitive Styles
      The way we solve problems is dependent on the tools we know how to use.
      Shoshana Zuboff (1988)
      In the Age of the Smart Machine
      Technology creates:
      - new ways of thinking
      - new ways of approaching and solving problems
      - new sets of "Cognitive Styles"
      It is only if we share these cognitive styles that we will be able to create a coherent technology strategy that everyone understands
    • 59
      Metadata
      Publishing
      Open The Door To The Semantic Web!
      Agents
      Metadata publishing is hard
      It is a foundation upon which the Semantic Web will be built
      The benefits are indirect and need strong executive sponsorship
      Metadata publishing is no “silver bullet”
      I believe it is the most direct way to get to the Semantic Web
      This will be the most practical way to build intelligent agents
    • 60
      Top AI Researchers Agree…
      If software is ever going to be able to effectively inter-operate (in ways that were not explicitly preconceived and engineered), it will be because applications share enough of the semantics of their data elements.
      Doug Lenat, Cycorp
      Semantic Technology Conference
      2005
    • Thank You
      Questions…
      Copyright Dan McCreary & Associates
      61