Ontologies and Semantic Web
STANLEY WANG
SOLUTION ARCHITECT, TECH LEAD
@SWANG68
http://www.linkedin.com/in/stanley-wang-a2b143b
Ontologies and Semantic Web
• In general, an ontology formally describes a domain of
discourse and consists of a finite list of terms and the
relationships between those terms;
• The terms denote important concepts (classes of objects) of
the domain, e.g. in a University Model, staff members,
students, courses, modules, lecture theatres, and schools are
some important concepts;
• In the context of the Web, ontologies provide a shared
understanding of a domain, which is necessary to overcome
differences in terminology.
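As a minimal sketch of the definition above, an ontology can be held as a finite set of terms plus named relationships between them. The terms come from the University Model example; the relationship names (`teaches`, `enrolled_in`, `held_in`, `part_of`, `member_of`) are assumptions made purely for illustration.

```python
from typing import NamedTuple


class Relationship(NamedTuple):
    subject: str
    name: str
    target: str


# Terms taken from the University Model example.
TERMS = {"StaffMember", "Student", "Course", "Module", "LectureTheatre", "School"}

# Relationship names are illustrative assumptions, not taken from the slides.
RELATIONSHIPS = [
    Relationship("StaffMember", "teaches", "Course"),
    Relationship("Student", "enrolled_in", "Course"),
    Relationship("Course", "held_in", "LectureTheatre"),
    Relationship("Module", "part_of", "Course"),
    Relationship("StaffMember", "member_of", "School"),
]


def is_well_formed(terms, relationships):
    """Every relationship must connect two declared terms."""
    return all(r.subject in terms and r.target in terms for r in relationships)


def related_terms(term, relationships):
    """Return the terms directly linked to `term`, in either direction."""
    linked = set()
    for r in relationships:
        if r.subject == term:
            linked.add(r.target)
        elif r.target == term:
            linked.add(r.subject)
    return linked
```

Calling `related_terms("Course", RELATIONSHIPS)` returns every term that shares a relationship with `Course`, which is the kind of lookup a shared vocabulary makes possible.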
Ontology Engineering
Ontological Vision of Semantic Web
• An ontology is a document or file that formally, and in a
standardized way, defines the hierarchy of classes within
the domain, the semantic relations among terms, and
inference rules;
• Ontologies allow the semantics of your data to be shared
across complex distributed applications, e.g. Gene Ontology,
Glycomics, Pharmaceutical Drug, Treatment-Diagnosis, Repertoire
Management, Equity Markets, Anti-Money Laundering,
Suspicious Activity Monitoring, OFAC, Financial Risk,
Terrorism, Customer Profile, etc.;
• An ontology model can be public, government,
limited-availability, or commercial.
What is an ontology?
A formal, explicit specification of a shared conceptualization:
• Formal: machine readable
• Explicit: concepts, properties, functions, axioms
are explicitly defined
• Shared: consensual knowledge
• Conceptualization: abstract model of some phenomena
in the world
What is an Ontology?
A model of (some aspect of) the world
• Introduces vocabulary
relevant to domain, e.g.:
o Anatomy
o Cellular biology
o Aerospace
o Dogs
o Hotdogs
o …
What is an Ontology?
A model of (some aspect of) the world
• Introduces vocabulary
relevant to domain
• Specifies meaning of terms
Heart is a muscular organ that
is part of the circulatory system
• Formalised using suitable logic
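One plausible formalisation of the Heart example, written in description logic notation. The concept and role names are assumptions chosen to mirror the sentence above, not identifiers taken from a published ontology.

```latex
\mathit{Heart} \sqsubseteq \mathit{MuscularOrgan} \sqcap \exists\, \mathit{isPartOf}.\mathit{CirculatorySystem}
```

Read: every Heart is a MuscularOrgan and is part of some CirculatorySystem.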
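Ontology and Annotation
Links have explicit meanings!
[Figure: the ontology defines PhD Student and AssProf, each
rdfs:subClassOf AcademicStaff, and the property cooperate_with
with its rdfs:domain and rdfs:range. Two web pages are annotated
as instances of these classes, and the link between them is
annotated as an instance of cooperate_with.]
Annotation of the web page http://www.aifb.uni-karlsruhe.de/WBS/sst
(instance of AssProf):
<swrc:AssProf rdf:ID="sst">
  <swrc:name>Steffen Staab</swrc:name>
  ...
</swrc:AssProf>
Annotation of the web page http://www.aifb.uni-karlsruhe.de/WBS/sha
(instance of PhD Student):
<swrc:PhD_Student rdf:ID="sha">
  <swrc:name>Siegfried Handschuh</swrc:name>
  ...
</swrc:PhD_Student>
Annotation of the link between the two pages (instance of cooperate_with):
<swrc:cooperate_with rdf:resource="http://www.aifb.uni-karlsruhe.de/WBS/sst#sst"/>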
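To illustrate why explicit link meanings matter, the sketch below stores the annotation from this slide as plain triples and derives facts that were never stated directly. It is a simplified, hand-rolled version of subclass inference, not the behaviour of any particular reasoner; the `swrc:AcademicStaff` prefix and the short instance names `sha` and `sst` are assumptions.

```python
# Triples are (subject, predicate, object) tuples; names follow the slide.
SUBCLASS = "rdfs:subClassOf"
TYPE = "rdf:type"

triples = {
    ("swrc:PhD_Student", SUBCLASS, "swrc:AcademicStaff"),
    ("swrc:AssProf", SUBCLASS, "swrc:AcademicStaff"),
    ("sha", TYPE, "swrc:PhD_Student"),
    ("sst", TYPE, "swrc:AssProf"),
    ("sha", "swrc:cooperate_with", "sst"),
}


def superclasses(cls, graph):
    """All classes reachable from `cls` via rdfs:subClassOf (transitive)."""
    found, todo = set(), [cls]
    while todo:
        current = todo.pop()
        for s, p, o in graph:
            if s == current and p == SUBCLASS and o not in found:
                found.add(o)
                todo.append(o)
    return found


def types_of(instance, graph):
    """Asserted types plus those inferred through the class hierarchy."""
    asserted = {o for s, p, o in graph if s == instance and p == TYPE}
    inferred = set(asserted)
    for cls in asserted:
        inferred |= superclasses(cls, graph)
    return inferred
```

Although no triple says so directly, `types_of("sha", triples)` includes `swrc:AcademicStaff`, because the annotation is interpreted against the class hierarchy.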
Ontology Model Example: Agency Partnership
Ontology Model Example: Terrorist Organization
• These ontologies are accessed at remote locations.
Ontology Model Example: Customer Profile
• Ontology personalization is a mechanism that allows users to
have their own conceptual view and to use it for semantic
querying of search facilities.
[Figure: the term “Driver” appears in several personal views,
which are linked to a common ontology used for search.]
Ontology Model Example: University Research
[Figure: two similar ontologies expressed in F-Logic, in which
“PhD Student” in one corresponds to “Doktoral Student” in the other.
Concepts: Object, Person, Topic, Document, Researcher, Student,
PhD Student (linked by is_a).
Relations and attributes: knows, writes, is_about, described_in,
subTopicOf, Affiliation, Tel.
Rules: defined over writes, is_about, knows and described_in
(the rule bodies are not recoverable from the extracted text).
Instance (instance_of): Siggi, with Affiliation AIFB and
Tel +49 721 608 6554.]
A Typical Enterprise Semantic Application Lifecycle
• Build Ontology
o Build the schema (model-level representation)
o Populate it with a knowledge base (people, locations,
organizations, events)
• Automatic Semantic Annotation (extract semantic
metadata)
o Any type of document, multiple sources of documents
o Metadata can be stored with or separately from
documents
• Applications: semantic search (ranked list of documents),
portal integration, summarize & explain, analyze, make
decisions
o Reasoning techniques: graph analysis, logic inference
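The annotation step of this lifecycle can be sketched as a simple lookup of known entity names in document text. This is a deliberately naive stand-in for real semantic annotation; the knowledge-base entries and the `kb:` identifiers are hypothetical. The returned records are kept apart from the document, which corresponds to storing metadata separately.

```python
import re

# Hypothetical knowledge base: surface form -> (entity id, entity type).
KNOWLEDGE_BASE = {
    "karlsruhe": ("kb:Karlsruhe", "Location"),
    "hp labs": ("kb:HPLabs", "Organization"),
    "steffen staab": ("kb:SteffenStaab", "Person"),
}


def annotate(text, knowledge_base):
    """Return metadata records for every known entity mentioned in `text`."""
    annotations = []
    # Offsets stay valid because lower-casing ASCII text preserves length.
    lowered = text.lower()
    for surface, (entity_id, entity_type) in knowledge_base.items():
        pattern = r"\b" + re.escape(surface) + r"\b"
        for found in re.finditer(pattern, lowered):
            annotations.append({
                "entity": entity_id,
                "type": entity_type,
                "start": found.start(),
                "end": found.end(),
            })
    # Order by position so the metadata reads like the document.
    return sorted(annotations, key=lambda a: a["start"])
```

Each record carries character offsets, so an application can link an entity back to the exact span of text that mentions it.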
Ontology Creation and Maintenance
1. Ontology Model Creation (Description)
2. Knowledge Agent Creation
3. Automatic aggregation of Knowledge
4. Querying the Ontology
[Figure: the ontology is served through a semantic query server.]
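Step 4, querying the ontology, can be sketched as pattern matching over stored triples, where any position left unspecified acts as a wildcard. The relation names `instance_of` and `affiliation` and the way the Siggi/AIFB facts are encoded are assumptions based on the University Research example; a real deployment would use a query language rather than this helper.

```python
# Illustrative facts; the encoding as triples is an assumption.
GRAPH = [
    ("Siggi", "instance_of", "PhD Student"),
    ("Siggi", "affiliation", "AIFB"),
    ("PhD Student", "is_a", "Researcher"),
]


def match(graph, subject=None, predicate=None, obj=None):
    """Return triples matching the pattern; None acts as a wildcard."""
    return [
        (s, p, o)
        for (s, p, o) in graph
        if subject in (None, s) and predicate in (None, p) and obj in (None, o)
    ]
```

For example, `match(GRAPH, subject="Siggi")` lists everything known about Siggi, while `match(GRAPH, predicate="is_a")` lists the class hierarchy.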
Ontology Editors and Environments
• Protégé, SWOOP, GrOWL, TopBraid, Ontotrack, SemanticWorks, ...
JENA
• Jena is a Java framework for building Semantic Web
applications. It provides a programmatic environment for RDF,
RDFS and OWL, including a rule-based inference engine.
• Jena is open source and grew out of work with the HP Labs
Semantic Web Program.
• The Jena Framework includes:
o An RDF API
o Reading and writing RDF in RDF/XML, N3 and N-Triples
o An OWL API
o In-memory and persistent storage
o RDQL, a query language for RDF
Jena Integration of Protégé-OWL
• Jena is one of the most widely used Java APIs for RDF and
OWL, providing services for model representation, parsing,
database persistence, querying and some visualization tools.
Protege-OWL has always had a close relationship with Jena. The
Jena ARP parser is still used in the Protege-OWL parser, and
various other services such as species validation and datatype
handling have been reused from Jena. It was furthermore possible
to convert a Protege OWLModel into a Jena OntModel, to get a
static snapshot of the model at run time. This snapshot, however,
had to be rebuilt after each change in the model.
• As of August 2005, Protege-OWL is much more closely integrated
with Jena. This integration allows programmers to use certain
Jena functions at run time, without having to go through the slow
rebuild process each time. The architecture of this integration is
illustrated on the next slide.
Jena Integration of Protégé-OWL
• The OWLModel API has a new method getJenaModel() to access a
Jena view of the Protege model at run time. This can be used by
Protege plugin developers. Many other Jena services can be wrapped
into Protege plugins this way, by providing them with a pointer to
the Model created by Protege.
• The key to this integration is the fact that both systems
operate on a low-level "triple" representation of the model.
Protege has its native frame store mechanism, which has been
wrapped in Protege-OWL with the TripleStore classes. In the Jena
world, the corresponding interfaces are called Graph and Model.
• The Protege TripleStore has been wrapped into a Jena Graph, so
that any read access from the Jena API in fact operates on the
Protege triples. In order to modify these triples, the
conventional Protege-OWL API must be used. However, this mechanism
allows Jena methods to be used for querying while the ontology is
being edited inside Protege.
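The wrapping idea described above can be sketched as a read-only view over a live store: reads go straight to the underlying triples, so nothing has to be rebuilt after an edit, while writes must go through the store itself. The classes below are hypothetical Python stand-ins written for illustration; they are not the Protege-OWL or Jena APIs.

```python
class TripleStore:
    """Stand-in for the editor's native store; the only place triples change."""

    def __init__(self):
        self._triples = []

    def add(self, subject, predicate, obj):
        self._triples.append((subject, predicate, obj))

    def iterate(self):
        return iter(self._triples)


class ReadOnlyGraphView:
    """Query-side wrapper: reads hit the wrapped store directly, with no copy."""

    def __init__(self, store):
        self._store = store

    def find(self, subject=None, predicate=None, obj=None):
        """Return matching triples; None acts as a wildcard."""
        return [
            (s, p, o)
            for (s, p, o) in self._store.iterate()
            if subject in (None, s) and predicate in (None, p) and obj in (None, o)
        ]

    def add(self, subject, predicate, obj):
        # Writes are rejected here, mirroring the rule that edits use the editor API.
        raise TypeError("read-only view: modify triples through the TripleStore")
```

Because the view holds a reference rather than a snapshot, a triple added to the store after the view was created is visible to the next `find` call.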
Joseki - a SPARQL Server for Jena
• Joseki: the Jena RDF server. Joseki is a server for publishing
RDF models on the web. Models have URLs and they can be
accessed by HTTP GET. Joseki is part of the Jena RDF framework.
• Joseki is an HTTP and SOAP engine that supports the SPARQL
Protocol and the SPARQL RDF Query Language. SPARQL was
developed by the W3C RDF Data Access Working Group.
• Joseki features:
o RDF data from files and databases
o HTTP (GET and POST) implementation of the SPARQL protocol
o SOAP implementation of the SPARQL protocol
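In the HTTP GET binding of the SPARQL protocol, the query text travels URL-encoded in a `query` parameter. The sketch below only builds such a request URL; the endpoint address is a hypothetical placeholder, and no request is actually sent.

```python
from urllib.parse import parse_qs, urlencode, urlsplit


def sparql_get_url(endpoint, query):
    """Build the URL for a SPARQL protocol query sent via HTTP GET."""
    return endpoint + "?" + urlencode({"query": query})


# Hypothetical endpoint; substitute the address of an actual SPARQL service.
ENDPOINT = "http://localhost:2020/sparql"
QUERY = "SELECT ?s ?p ?o WHERE { ?s ?p ?o } LIMIT 10"

url = sparql_get_url(ENDPOINT, QUERY)

# Decoding the URL again recovers the original query text.
decoded = parse_qs(urlsplit(url).query)["query"][0]
```

The resulting URL could then be fetched with any HTTP client; the response format depends on the server and on the `Accept` header sent with the request.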
Real Life Example: Semantic Application in a Global Bank
• Goal
o Legislation (the PATRIOT Act) requires banks to identify who
they are doing business with
• Problem
o Large volume of internal and external data to be accessed
o Complex name matching and disambiguation criteria
o Requirement to ‘risk score’ certain attributes of this data
• Approach
o Creation of a ‘risk ontology’ populated from trusted sources
such as OFAC
o Sophisticated entity disambiguation
o Semantic querying, rules specification and processing
• Solution
o Rapid and accurate KYC checks
o Risk scoring of relationships, allowing for prioritisation of
results
o Full visibility of sources and trustworthiness
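The name matching problem above can be sketched with a normalisation step followed by fuzzy comparison. This is a toy illustration using Python's standard `difflib`, not the disambiguation method used in the bank project; the 0.85 threshold is an arbitrary assumption, and the watch list holds only the example name from the next slide.

```python
import re
from difflib import SequenceMatcher


def normalize(name):
    """Lower-case, drop punctuation, and sort tokens so word order is ignored.

    Simplification: only ASCII letters are kept.
    """
    tokens = re.findall(r"[a-z]+", name.lower())
    return " ".join(sorted(tokens))


def best_match(candidate, watch_list, threshold=0.85):
    """Return (entry, similarity) for the closest entry, or None below threshold."""
    if not watch_list:
        return None
    target = normalize(candidate)
    scored = [
        (entry, SequenceMatcher(None, target, normalize(entry)).ratio())
        for entry in watch_list
    ]
    entry, score = max(scored, key=lambda pair: pair[1])
    return (entry, score) if score >= threshold else None


# Example name taken from the deck's own watch-list slide.
WATCH_LIST = ["Ahmed Yaseer"]
```

Normalisation makes "YASEER, Ahmed" identical to the listed name, while the fuzzy ratio tolerates small spelling variations; production systems would add transliteration rules and further attributes such as date of birth.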
Process from Business Perspective
[Figure: entity graph with the classes Watch List, Organization
and Company, the instances FBI Watch List, Hamas and WorldCom, and
the relations "appears on Watchlist", "member of organization" and
"works for Company".]
Ahmed Yaseer:
• Appears on Watchlist ‘FBI’
• Works for Company ‘WorldCom’
• Member of organization ‘Hamas’
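Risk scoring of relationships can be sketched by attaching a weight to each relation type and summing the weights of an entity's relations. The facts below restate the slide's example; the relation identifiers and the point values are hypothetical and would in practice come from a risk policy.

```python
# Relationship facts from the slide, as (subject, relation, object) triples.
FACTS = [
    ("Ahmed Yaseer", "appears_on_watchlist", "FBI"),
    ("Ahmed Yaseer", "works_for_company", "WorldCom"),
    ("Ahmed Yaseer", "member_of_organization", "Hamas"),
]

# Hypothetical points per relation type (integers avoid rounding surprises).
RELATION_POINTS = {
    "appears_on_watchlist": 60,
    "member_of_organization": 30,
    "works_for_company": 10,
}


def risk_score(entity, facts, points, cap=100):
    """Sum the points of the entity's relations, capped, and keep the evidence."""
    evidence = [(relation, obj) for subj, relation, obj in facts if subj == entity]
    score = min(cap, sum(points.get(relation, 0) for relation, _ in evidence))
    return score, evidence
```

Returning the evidence alongside the score keeps the source of each contribution visible, which supports prioritising results and reviewing why an entity was flagged.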
Fraud Prevention Application using Semantics
[Figure: data sources feeding the ontology model during the
"Establishing New Account" process:
• Unstructured text and semi-structured data: World Wide Web
content, public records, blogs, RSS
• Semi-structured government data: watch lists, law enforcement,
regulators]
• Users will be able to navigate the ontology using a number
of different interfaces.
Semantic Technology in Summary
• Semantic Web is not only a technology, as many
used to call it;
• Semantic Web is not only an environment, as many
call it now;
• Semantic Web is a new context within which one
should rethink and re-interpret existing
businesses, resources, services, technologies,
processes, environments, products, etc. to raise
them to a totally new level of performance.
