In this paper, we propose a domain ontology construction tool based on OWL. The distinguishing feature of our tool is its focus on the quality-refinement phase of ontology construction. Through interactive support for refining the initial ontology, an OWL-Lite-level ontology, consisting of taxonomic relationships (defined as classes) and non-taxonomic relationships (defined as properties), is constructed effectively. The tool also provides semi-automatic generation of the initial ontology from domain-specific documents and general ontologies.
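A minimal sketch of the distinction the abstract draws, using plain Python tuples to stand in for OWL triples (an actual tool would use an OWL library such as rdflib; the class and property names here are invented for illustration):

```python
# Toy triple store standing in for an OWL graph.
triples = set()

def add(s, p, o):
    triples.add((s, p, o))

# Taxonomic relationships: classes arranged in a subclass hierarchy.
add("Dog", "rdfs:subClassOf", "Animal")
add("Cat", "rdfs:subClassOf", "Animal")

# Non-taxonomic relationship: an object property linking classes.
add("eats", "rdf:type", "owl:ObjectProperty")
add("eats", "rdfs:domain", "Animal")

def subclasses_of(cls):
    """All direct subclasses of a class (the taxonomic backbone)."""
    return {s for s, p, o in triples if p == "rdfs:subClassOf" and o == cls}

print(sorted(subclasses_of("Animal")))  # → ['Cat', 'Dog']
```

The split mirrors the abstract: the subclass axioms form the taxonomy, while the object property carries the non-taxonomic structure.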
Different Semantic Perspectives for Question Answering Systems (Andre Freitas)
Question Answering systems address one of the most complex tasks in computational semantics. The intrinsic complexity of the QA task allows researchers of QA systems to investigate and explore different perspectives on semantics. However, this complexity also induces a bias towards a systems perspective, where researchers become detached from deeper reasoning about the semantic principles at work within the different components of the system. In this talk we will explore the semantic challenges, principles and perspectives behind the components of QA systems, aiming to provide a principled map and overview of the contribution of each component to the QA semantic interpretation goal.
Introduction to Ontology Engineering with Fluent Editor 2014 (Cognitum)
An introductory course on Ontology Engineering using Controlled Natural Language. Fluent Editor (FE) is a tool for editing and manipulating ontologies. Its main feature is that it uses controlled natural language (CNL) to communicate with the user. For human users, communication in CNL is a more suitable alternative to XML-based OWL editors.
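As a toy illustration of the CNL idea (not Fluent Editor's actual grammar, which is far richer), a single controlled sentence pattern can be mapped directly to an OWL-style axiom:

```python
import re

def cnl_to_axiom(sentence):
    """Parse one toy CNL pattern, 'Every X is a(n) Y.', into a
    subclass axiom. Real CNL editors cover a far richer grammar;
    this only illustrates the idea."""
    m = re.match(r"Every (\w+) is an? (\w+)\.", sentence)
    if not m:
        raise ValueError("unsupported CNL sentence: " + sentence)
    sub, sup = m.group(1).capitalize(), m.group(2).capitalize()
    return (sub, "rdfs:subClassOf", sup)

print(cnl_to_axiom("Every cat is an animal."))
# → ('Cat', 'rdfs:subClassOf', 'Animal')
```

The appeal of CNL is precisely this determinism: each accepted sentence shape has one formal reading, so users never touch the XML serialization.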
Schema-agnostic queries over large-schema databases: a distributional semanti... (Andre Freitas)
The evolution of data environments towards growth in the size, complexity, dynamicity and decentralisation (SCoDD) of schemas drastically impacts contemporary data management. The SCoDD trend emerges as a central data management concern in Big Data scenarios, where users and applications demand more complete data, produced by independent data sources under different semantic assumptions and contexts of use. Most Database Management Systems (DBMSs) today target a closed communication scenario, where the symbolic schema of the database is known a priori by the database user, who is able to interpret it unambiguously. The context in which the data is consumed and produced is well-defined and is typically the same context in which the data was created. In contrast, data management under SCoDD conditions targets an open communication scenario, where the symbolic system of the database is unknown to the user and multiple interpretation contexts are possible. In this case the database can be created in a context different from that of the database user. The emergence of this new data environment demands revisiting the semantic assumptions behind databases and designing data access mechanisms that can support semantically heterogeneous (open communication) data environments.
This work aims at filling this gap by proposing a complementary semantic model for databases, based on distributional semantic models. Distributional semantics provides a complementary perspective to the formal perspective of database semantics, supporting semantic approximation as a first-class database operation. Differently from models describing uncertain and incomplete data or from probabilistic databases, distributional-relational models focus on the construction of conceptual approximation approaches for databases, supported by a comprehensive semantic model automatically built from large-scale unstructured data external to the database, which serves as a semantic/commonsense knowledge base. The semantic model can be used to support schema-agnostic queries, i.e. abstracting the data consumer from the specific conceptualization behind the data.
The proposed distributional-relational semantic model is supported by a distributional structured vector space model, named τ-Space, which represents structured data under a distributional semantic representation and, in coordination with a query planning approach, supports a schema-agnostic query mechanism for large-schema databases. The query mechanism is materialized in the Treo query engine and is evaluated using schema-agnostic natural language queries.
The evaluation of the query mechanism confirms that distributional semantics provides a high-recall, medium-to-high-precision, low-maintenance solution to cope with the abstraction and conceptual-level differences in schema-agnostic queries over large-schema/schema-less open-domain datasets.
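The semantic-approximation idea at the core of the thesis above can be sketched with toy distributional vectors: a query term in the user's vocabulary is matched to the closest database schema term by cosine similarity. The vectors, dimensions and terms below are invented stand-ins for the corpus-derived statistics a real τ-Space would hold:

```python
import math

# Toy distributional vectors (invented co-occurrence dimensions).
vectors = {
    "spouse":     {"marriage": 0.9, "partner": 0.8, "person": 0.3},
    "wife":       {"marriage": 0.8, "partner": 0.7, "person": 0.4},
    "birthplace": {"city": 0.9, "born": 0.8},
}

def cosine(u, v):
    dims = set(u) | set(v)
    dot = sum(u.get(d, 0.0) * v.get(d, 0.0) for d in dims)
    nu = math.sqrt(sum(x * x for x in u.values()))
    nv = math.sqrt(sum(x * x for x in v.values()))
    return dot / (nu * nv) if nu and nv else 0.0

def best_schema_match(query_term, schema_terms):
    """Rank database schema terms by distributional similarity to the
    user's query term, i.e. semantic approximation as a query operation."""
    return max(schema_terms, key=lambda t: cosine(vectors[query_term], vectors[t]))

print(best_schema_match("wife", ["spouse", "birthplace"]))  # → spouse
```

The point is that "wife" never appears in the schema, yet the distributional model bridges the vocabulary gap without any manually curated mapping.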
On the Semantic Mapping of Schema-agnostic Queries: A Preliminary Study (Andre Freitas)
The growing size, heterogeneity and complexity of databases demand strategies that make it easier for users and systems to consume data. Ideally, query mechanisms should be schema-agnostic or vocabulary-independent, i.e. they should be able to match user queries, expressed in the users' own vocabulary and syntax, to the data, abstracting data consumers from the representation of the data. Despite being a central requirement across natural language interfaces and entity search, there is a lack of conceptual analysis of schema-agnosticism and of the associated semantic differences between queries and databases. This work provides an initial conceptualization of schema-agnostic queries, aiming at a fine-grained classification which can support the scoping, evaluation and development of semantic matching approaches for schema-agnostic queries.
How hard is this Query? Measuring the Semantic Complexity of Schema-agnostic ... (Andre Freitas)
The growing size, heterogeneity and complexity of databases demand strategies that make it easier for users and systems to consume data. Ideally, query mechanisms should be schema-agnostic, i.e. they should be able to match user queries in the users' own vocabulary and syntax to the data, abstracting data consumers from the representation of the data. This work provides an information-theoretical framework to evaluate the semantic complexity involved in query-database communication under a schema-agnostic query scenario. Different entropy measures are introduced to quantify the semantic phenomena involved in user-database communication, including structural complexity, ambiguity, synonymy and vagueness. The entropy measures are validated using natural language queries over Semantic Web databases. The analysis of the semantic complexity is used to improve the understanding of the core semantic dimensions present in the query-data matching process, supporting the design of schema-agnostic query mechanisms and defining measures to assess the semantic uncertainty or difficulty of a schema-agnostic querying task.
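One basic shape such an entropy measure can take is Shannon entropy over the distribution of candidate database elements for a query term; the measures in the paper are richer, so this is only a simplified sketch:

```python
import math

def matching_entropy(probs):
    """Shannon entropy (bits) of the distribution over candidate
    database elements for one query term: 0 when the mapping is
    unambiguous, higher as ambiguity or vagueness grows."""
    return sum(-p * math.log2(p) for p in probs if p > 0)

# An unambiguous term maps to a single candidate...
print(matching_entropy([1.0]))        # → 0.0
# ...while a vague term spreads over four equally likely candidates.
print(matching_entropy([0.25] * 4))   # → 2.0
```

Under this view, a "hard" query is one whose terms carry high matching entropy, so the query mechanism must discriminate among many plausible database elements.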
This work introduces faceted service discovery. It uses the Programmable Web directory as its corpus of APIs and enhances its search to enable faceted search, given an OWL ontology. The ontology describes semantic features of the APIs. We designed the API classification ontology using LexOnt, a semi-automatic ontology creation tool we have built. LexOnt is geared toward non-experts within a service domain who want to create a high-level ontology that describes the domain. Using well-known NLP algorithms, LexOnt generates a list of top terms and phrases from the Programmable Web corpus to enable users to find high-level features that distinguish one Programmable Web service category from another. To further aid non-experts, LexOnt relies on outside sources such as Wikipedia and WordNet to help the user identify the important terms within a service category. Using the ontology created with LexOnt, we built APIBrowse, a faceted search interface for APIs. The ontology, combined with the Apache Solr search platform, is used to generate a faceted search interface for APIs based on their distinguishing features. With this ontology, an API is classified and displayed under multiple categories within the APIBrowse interface. APIBrowse gives programmers the ability to search for APIs by their semantic features and keywords, and presents a filtered, more accurate set of search results.
Knarig Arabshian is an Assistant Professor in the Computer Science Department at Hofstra University, since Fall 2014. Prior to that she was a Member of Technical Staff at Bell Labs in Murray Hill, NJ. She received her Ph.D. in Computer Science from Columbia University in 2008.
Professor Arabshian’s interests lie in the field of semantic web, service discovery and composition, context-aware computing and distributed systems. The goal of her research is to drive forward the idea of a personalized web. Her work explores ways of describing data meaningfully and designing frameworks and systems for efficient data discovery. During her tenure at Bell Labs, she worked on different aspects of ontology creation, distribution and querying.
A Distributional Semantics Approach for Selective Reasoning on Commonsense Gr... (Andre Freitas)
Tasks such as question answering and semantic search depend on the ability to query and reason over large-scale commonsense knowledge bases (KBs). However, dealing with commonsense data demands coping with problems such as increased schema complexity, semantic inconsistency, incompleteness and scalability. This paper proposes a selective graph navigation mechanism based on a distributional relational semantic model, which can be applied to querying and reasoning over heterogeneous knowledge bases. The approach can be used for approximative reasoning, querying and associational knowledge discovery. In this paper we focus on commonsense reasoning as the main motivational scenario for the approach. The approach focuses on addressing the following problems: (i) providing a semantic selection mechanism for facts which are relevant and meaningful in a specific reasoning and querying context, and (ii) coping with information incompleteness in large KBs. The approach is evaluated using ConceptNet as a commonsense KB, and achieved high selectivity, high scalability and high accuracy in the selection of meaningful navigational paths. Distributional semantics is also used as a principled mechanism to cope with information incompleteness.
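The selective navigation idea can be sketched as a graph traversal that expands only edges whose targets score above a distributional relevance threshold; the graph, relevance scores and threshold below are invented for illustration:

```python
# Toy commonsense graph and distributional relevance scores. The scores
# stand in for the query-context similarity a real distributional
# relational model would compute.
graph = {
    "car": [("has_part", "wheel"), ("used_for", "driving"),
            ("at_location", "garage")],
    "driving": [("requires", "license")],
    "garage": [("used_for", "storage")],
}
relevance = {"wheel": 0.2, "driving": 0.9, "garage": 0.1,
             "license": 0.8, "storage": 0.1}

def selective_expand(node, threshold=0.5):
    """Expand only edges whose target is distributionally relevant to
    the reasoning context, pruning the rest of the graph."""
    frontier, visited = [node], set()
    while frontier:
        cur = frontier.pop()
        visited.add(cur)
        for _, tgt in graph.get(cur, []):
            if relevance.get(tgt, 0.0) >= threshold and tgt not in visited:
                frontier.append(tgt)
    return visited

print(sorted(selective_expand("car")))  # → ['car', 'driving', 'license']
```

Pruning at expansion time is what gives the approach its selectivity and scalability: irrelevant regions of a large KB like ConceptNet are never visited.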
Improving Document Clustering by Eliminating Unnatural Language (Jinho Choi)
Technical documents contain a fair amount of unnatural language, such as tables, formulas, and pseudo-code. Unnatural language can be an important source of confusion for existing NLP tools. This paper presents an effective method for distinguishing unnatural language from natural language, and evaluates the impact of unnatural language detection on NLP tasks such as document clustering. We view this problem as an information extraction task and build a multiclass classification model that identifies unnatural language components in four categories. First, we create a new annotated corpus by collecting slides and papers in various formats (PPT, PDF, and HTML), in which unnatural language components are annotated into four categories. We then explore features available from plain text to build a statistical model that can handle any format as long as it is converted into plain text. Our experiments show that removing unnatural language components gives an absolute improvement in document clustering of up to 15%. Our corpus and tool are publicly available.
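A rough surface-feature heuristic hints at how plain-text features can separate natural from unnatural language; the paper trains a statistical multiclass model, and the thresholds and categories here are simplified stand-ins:

```python
def classify_line(line):
    """Very rough surface-feature heuristic for unnatural language:
    symbol density and layout cues, with invented thresholds."""
    if not line.strip():
        return "natural"
    symbols = sum(1 for c in line if not (c.isalnum() or c.isspace()))
    ratio = symbols / len(line)
    if "|" in line or "\t" in line:     # column separators suggest a table
        return "table"
    if ratio > 0.25:                    # operator-heavy lines look like formulas
        return "formula"
    if line.strip().endswith((":", ";", "{", "}")):  # statement punctuation
        return "code"
    return "natural"

print(classify_line("x | y | z"))                    # → table
print(classify_line("E = m*c**2 + sum(a_i^2)"))      # → formula
print(classify_line("The results are shown below.")) # → natural
```

Even this crude filter shows why such components confuse NLP tools: their token statistics differ sharply from running prose, which is what the learned features exploit.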
A little more semantics goes a lot further! Getting more out of Linked Data ... (Michel Dumontier)
This tutorial will provide detailed instruction on creating and making use of formalized ontologies from linked open data for advanced knowledge discovery, including consistency checking and answering sophisticated questions.
Automated reasoning in OWL offers the tantalizing possibility of advanced knowledge discovery, including verifying the consistency of conceptual schemata in information systems, verifying data integrity, and answering expressive queries over the conceptual schema and the data. Given that a large amount of structured knowledge is now available as linked data, the challenge is to formalize this knowledge so that its intended semantics become explicit and reasoning over it is efficient and scalable. While using the full expressiveness of OWL 2 yields ontologies that can be used for consistency verification, classification and query answering, the use of less expressive OWL profiles enables efficient reasoning and supports different application scenarios. In this tutorial:
- we describe how to generate OWL ontologies from linked data
- check the consistency of the resulting knowledge
- automatically transform ontologies into OWL profiles
- use this knowledge in applications to integrate data and answer sophisticated questions across domains.
Key points:
- expressive ontologies enable data integration, consistency verification and question answering
- formalization of linked data will create new opportunities for knowledge discovery
- OWL 2 profiles support more efficient reasoning and query answering procedures
- recent technology facilitates the automatic conversion of OWL 2 ontologies into profiles
- OWL ontologies can dramatically extend the functionality of semantically-enabled web sites
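A minimal sketch of the consistency-checking step listed above, assuming toy class assertions and a single disjointness axiom; a real OWL reasoner (e.g. HermiT or ELK) handles far more expressive entailments:

```python
# Invented toy assertions: Protein and Disease are declared disjoint,
# and each individual is asserted to belong to one or more classes.
disjoint = {frozenset({"Protein", "Disease"})}
types = {"p53": {"Protein"}, "flu": {"Disease", "Protein"}}

def inconsistent_individuals():
    """An individual typed with two classes declared disjoint makes
    the knowledge base inconsistent."""
    bad = []
    for ind, classes in types.items():
        for pair in disjoint:
            if pair <= classes:       # both disjoint classes asserted
                bad.append(ind)
    return bad

print(inconsistent_individuals())  # → ['flu']
```

This is the payoff of formalizing linked data: once the intended semantics (here, disjointness) are explicit, modeling and data errors become mechanically detectable.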
Word Tagging with Foundational Ontology Classes (Andre Freitas)
Semantic annotation is fundamental for dealing with large-scale lexical information, mapping the information to an enumerable set of categories over which rules and algorithms can be applied, and foundational ontology classes can be used as a formal set of categories for such tasks. A previous alignment between WordNet noun synsets and DOLCE provided a starting point for ontology-based annotation, but in NLP tasks verbs are also of substantial importance. This work presents an extension to the WordNet-DOLCE noun mapping, aligning verbs according to their links to nouns denoting perdurants, transferring to the verb the DOLCE class assigned to the noun that best represents that verb's occurrence. To evaluate the usefulness of this resource, we implemented a foundational-ontology-based semantic annotation framework that assigns a high-level foundational category to each word or phrase in a text, and compared it to a similar annotation tool, obtaining an increase of 9.05% in accuracy.
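The tagging step can be sketched as a lexicon lookup from words to top-level DOLCE categories; the few entries below are invented for illustration, whereas the paper derives its mapping from WordNet:

```python
# Hand-made toy lexicon mapping words to top-level DOLCE categories.
lexicon = {
    "dog": "PhysicalObject",   # an endurant
    "run": "Perdurant",        # verbs aligned via perdurant-denoting nouns
    "speed": "Quality",
    "race": "Perdurant",
}

def tag(sentence):
    """Assign a foundational-ontology category to each known word."""
    return [(w, lexicon.get(w.lower(), "Unknown"))
            for w in sentence.split()]

print(tag("the dog can run"))
```

The verb extension described in the abstract is what makes entries like "run" possible: the verb inherits the DOLCE class of the perdurant noun it links to.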
Schema-Agnostic Queries (SAQ-2015): Semantic Web Challenge (Andre Freitas)
The Challenge in a Nutshell
To create a query mechanism that semantically matches schema-agnostic user queries to knowledge base elements
The Goal
To support easy querying over complex databases with large schemata, relieving users from the need to understand the formal representation of the data
Relevance
The increase in the size and semantic heterogeneity of database schemas is bringing new requirements for users querying and searching structured data. At this scale it can become unfeasible for data consumers to be familiar with the representation of the data in order to query it. At the center of this discussion is the semantic gap between users and databases, which becomes more pronounced as the scale and complexity of the data grow. Addressing this gap is a fundamental part of the Semantic Web vision.
Schema-agnostic query mechanisms aim at allowing users to be abstracted from the representation of the data, supporting the automatic matching between queries and databases. This challenge aims at emphasizing the role of schema-agnosticism as a key requirement for contemporary database management, by providing a test collection for evaluating flexible query and search systems over structured data in terms of their level of schema-agnosticism (i.e. their ability to map a query issued in the user's terminology and structure to the dataset vocabulary). The challenge is instantiated in the context of Semantic Web datasets.
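Evaluation over such a test collection typically reduces to precision and recall of the knowledge-base elements a system matches for each query; a sketch with invented DBpedia-style identifiers:

```python
def precision_recall(predicted, gold):
    """Precision and recall of the knowledge-base elements a query
    mechanism matched, against a gold-standard mapping."""
    predicted, gold = set(predicted), set(gold)
    tp = len(predicted & gold)
    precision = tp / len(predicted) if predicted else 0.0
    recall = tp / len(gold) if gold else 0.0
    return precision, recall

# Hypothetical matches for a query like "wife of Obama":
p, r = precision_recall(
    {"dbo:spouse", "dbr:Barack_Obama", "dbo:birthPlace"},  # system output
    {"dbo:spouse", "dbr:Barack_Obama"},                    # gold mapping
)
print(round(p, 2), r)  # → 0.67 1.0
```

Scoring at the level of matched elements, rather than final answers, is what lets the collection measure schema-agnosticism specifically: a system is rewarded for bridging the vocabulary gap, not merely for retrieving a correct result.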
Semantic Relation Classification: Task Formalisation and Refinement (Andre Freitas)
The identification of semantic relations between terms within texts is a fundamental task in Natural Language Processing which can support applications requiring a lightweight semantic interpretation model. Currently, semantic relation classification concentrates on relations which are evaluated over open-domain data. This work provides a critique of the set of abstract relations used for semantic relation classification with regard to their ability to express relationships between terms found in domain-specific corpora. Based on this analysis, this work proposes an alternative semantic relation model based on reusing and extending the set of abstract relations present in the DOLCE ontology. The resulting set of relations is well grounded, can capture a wide range of relations, and could thus be used as a foundation for automatic classification of semantic relations.
Towards a mnemonic classification of software languages (Mikhail Barash)
Slides of a lightning talk by Mikhail Barash at Fifth International Workshop on Open and Original Problems in Software Language Engineering (https://oopsle.github.io/2020/).
Many applications, such as data mining and data/information fusion, require the integration of data from different sources. The problem facing any such project is that the data are structured in different ways and that terms and their meanings differ from source to source. In this paper we discuss the most important of these problems and how to solve them using an ontology.
How hard is this Query? Measuring the Semantic Complexity of Schema-agnostic ...Andre Freitas
The growing size, heterogeneity and complexity of databases demand the creation of strategies to facilitate users and systems to consume data. Ideally, query mechanisms should be schema-agnostic, i.e. they should be able to match user queries in their own vocabulary and syntax to the data, abstracting data consumers from the representation of the data. This work provides an informationtheoretical framework to evaluate the semantic complexity involved in the query-database communication, under a schema-agnostic query scenario. Different entropy measures are introduced to quantify the semantic phenomena involved in the user-database communication, including structural complexity, ambiguity, synonymy and vagueness. The entropy measures are validated using natural language queries over Semantic Web databases. The analysis of the semantic complexity is used to improve the understanding of the core semantic dimensions present at the query-data matching process, allowing the improvement of the design of schema-agnostic query mechanisms and defining measures which can be used to assess the semantic uncertainty or difficulty behind a schema-agnostic querying task.
This work introduces faceted service discovery. It uses the Programmable Web directory as its corpus of APIs and enhances the search to enable faceted search, given an OWL ontology. The ontology describes semantic features of the APIs. We have designed the API classification ontology using LexOnt, a software we have built for semi-automatic ontology creation tool. LexOnt is geared toward non-experts within a service domain who want to create a high-level ontology that describes the domain. Using well- known NLP algorithms, LexOnt generates a list of top terms and phrases from the Programmable Web corpus to enable users to find high-level features that distinguish one Programmable Web service category from another. To also aid non-experts, LexOnt relies on outside sources such as Wikipedia and Wordnet to help the user identify the important terms within a service category. Using the ontology created from LexOnt, we have created APIBrowse, a faceted search interface for APIs. The ontology, in combination with the use of the Apache Solr search platform, is used to generate a faceted search interface for APIs based on their distinguishing features. With this ontology, an API is classified and displayed underneath multiple categories and displayed within the APIBrowse interface. APIBrowse gives programmers the ability to search for APIs based on their semantic features and keywords and presents them with a filtered and more accurate set of search results.
Knarig Arabshian is an Assistant Professor in the Computer Science Department at Hofstra University, since Fall 2014. Prior to that she was a Member of Technical Staff at Bell Labs in Murray Hill, NJ. She received her Ph.D. in Computer Science from Columbia University in 2008.
Professor Arabshian’s interests lie in the field of semantic web, service discovery and composition, context-aware computing and distributed systems. The goal of her research is to drive forward the idea of a personalized web. Her work explores ways of describing data meaningfully and designing frameworks and systems for efficient data discovery. During her tenure at Bell Labs, she worked on different aspects of ontology creation, distribution and querying.
A Distributional Semantics Approach for Selective Reasoning on Commonsense Gr...Andre Freitas
Tasks such as question answering and semantic search are dependent
on the ability of querying & reasoning over large-scale commonsense knowledge
bases (KBs). However, dealing with commonsense data demands coping with
problems such as the increase in schema complexity, semantic inconsistency, incompleteness
and scalability. This paper proposes a selective graph navigation
mechanism based on a distributional relational semantic model which can be applied
to querying & reasoning over heterogeneous knowledge bases (KBs). The
approach can be used for approximative reasoning, querying and associational
knowledge discovery. In this paper we focus on commonsense reasoning as the
main motivational scenario for the approach. The approach focuses on addressing
the following problems: (i) providing a semantic selection mechanism for facts
which are relevant and meaningful in a specific reasoning & querying context
and (ii) allowing coping with information incompleteness in large KBs. The approach
is evaluated using ConceptNet as a commonsense KB, and achieved high
selectivity, high scalability and high accuracy in the selection of meaningful nav-
igational paths. Distributional semantics is also used as a principled mechanism
to cope with information incompleteness.
Improving Document Clustering by Eliminating Unnatural LanguageJinho Choi
Technical documents contain a fair amount of unnatural language, such as tables, formulas, and pseudo-code. Unnatural language can be an important factor of confusing existing NLP tools. This paper presents an effective method of distinguishing unnatural language from natural language, and evaluates the impact of unnatural language detection on NLP tasks such as document clustering. We view this problem as an information extraction task and build a multiclass classification model identifying unnatural language components into four categories. First, we create a new annotated corpus by collecting slides and papers in various formats, PPT, PDF, and HTML, where unnatural language components are annotated into four categories. We then explore features available from plain text to build a statistical model that can handle any format as long as it is converted into plain text. Our experiments show that removing unnatural language components gives an absolute improvement in document clustering by up to 15%. Our corpus and tool are publicly available.
A little more semantics goes a lot further! Getting more out of Linked Data ...Michel Dumontier
This tutorial will provide detailed instruction to create and make use of formalized ontologies from linked open data for advanced knowledge discovery including consistency checking and answering sophisticated questions.
Automated reasoning in OWL offers the tantalizing possibility to undertake advanced knowledge discovery including verifying the consistency of conceptual schemata in information systems, verifying data integrity and answering expressive queries over the conceptual schema and the data. Given that a large amount of structured knowledge is now available as linked data, the challenge is to formalize this knowledge iso that intended semantics become explicit and that the reasoning is efficient and scalable. While using the full expressiveness of OWL 2 yields ontologies that can be used for consistency verification, classification and query answering, use of less expressive OWL profiles enable efficient reasoning and support different application scenarios. In this tutorial,
- we describe how to generate OWL ontologies from linked data
- check consistency of knowledge
- automatically transform ontologies into OWL profiles
- use this knowledge in applications to integrate data and answer sophisticated questions across domains.
- expressive ontologies enables data integration, verifying consistency of knowledge and answering questions
- formalization of linked data will create new opportunities for knowledge discovery
- OWL 2 profiles support more efficient reasoning and query answering procedures
- recent technology facilitates the automatic conversion of OWL 2 ontologies into profiles
- OWL ontologies can dramatically extend the functionality of semantically-enabled web sites
Word Tagging with Foundational Ontology ClassesAndre Freitas
Semantic annotation is fundamental to deal with large-scale
lexical information, mapping the information to an enumerable set of
categories over which rules and algorithms can be applied, and foundational
ontology classes can be used as a formal set of categories for
such tasks. A previous alignment between WordNet noun synsets and
DOLCE provided a starting point for ontology-based annotation, but in
NLP tasks verbs are also of substantial importance. This work presents
an extension to the WordNet-DOLCE noun mapping, aligning verbs according
to their links to nouns denoting perdurants, transferring to the
verb the DOLCE class assigned to the noun that best represents that
verb’s occurrence. To evaluate the usefulness of this resource, we implemented
a foundational ontology-based semantic annotation framework,
that assigns a high-level foundational category to each word or phrase
in a text, and compared it to a similar annotation tool, obtaining an
increase of 9.05% in accuracy.
Schema-Agnostic Queries (SAQ-2015): Semantic Web ChallengeAndre Freitas
The Challenge in a Nutshell
To create a query mechanism that semantically matches schema-agnostic user queries to knowledge base elements
The Goal
To support easy querying over complex databases with large schemata, relieving users from the need to understand the formal representation of the data
Relevance
The increase in the size and semantic heterogeneity of database schemas is bringing new requirements for users querying and searching structured data. At this scale it can become unfeasible for data consumers to be familiar with the representation of the data in order to query it. At the center of this discussion is the semantic gap between users and databases, which becomes more pronounced as the scale and complexity of the data grow. Addressing this gap is a fundamental part of the Semantic Web vision.
Schema-agnostic query mechanisms aim at allowing users to be abstracted from the representation of the data, supporting the automatic matching between queries and databases. This challenge aims at emphasizing the role of schema-agnosticism as a key requirement for contemporary database management, by providing a test collection for evaluating flexible query and search systems over structured data in terms of their level of schema-agnosticism (i.e. their ability to map a query issued in the user's own terminology and structure to the dataset vocabulary). The challenge is instantiated in the context of Semantic Web datasets.
Semantic Relation Classification: Task Formalisation and Refinement
Andre Freitas
The identification of semantic relations between terms within texts is a fundamental task in Natural Language Processing which can support applications requiring a lightweight semantic interpretation model. Currently, semantic relation classification concentrates on relations which are evaluated over open-domain data. This work provides a critique of the set of abstract relations used for semantic relation classification with regard to their ability to express relationships between terms found in domain-specific corpora. Based on this analysis, this work proposes an alternative semantic relation model based on reusing and extending the set of abstract relations present in the DOLCE ontology. The resulting set of relations is well grounded, captures a wide range of relations, and could thus be used as a foundation for automatic classification of semantic relations.
Towards a mnemonic classification of software languages
Mikhail Barash
Slides of a lightning talk by Mikhail Barash at Fifth International Workshop on Open and Original Problems in Software Language Engineering (https://oopsle.github.io/2020/).
Many applications, such as data mining and data/information fusion, require the integration of data from different sources. The problem facing any such project is that the data are structured in different ways and the terms and their meanings differ from source to source. In this paper we discuss the most important of these problems and how to solve them using an ontology.
Presented at DocTrain East 2007 by Joe Gelb, Suite Solutions -- Designing, building and maintaining a coherent information architecture is critical to proper planning, creation, management and delivery of documentation and training content. This is especially true when your content is based on a modular or topic-based model such as DITA and SCORM or if you are migrating to such a model.
But where to start? Terms such as taxonomy, semantics, and ontology can be intimidating, and recognized standards like RDF, OWL, Topic Maps (XTM) and SKOS seem so abstract. This pragmatic workshop will provide an overview of the standards and concepts, and a chance to use them hands-on to turn the abstract into tangible skills. We will demonstrate how a well-designed information architecture facilitates reuse and how the information model is integrally connected to conditional and multi-purpose publishing.
We will introduce an innovative, comprehensive methodology for information modeling and content development called SOTA (Solution Oriented Topic Architecture). SOTA does not aim to be yet another new standard, but rather a concrete methodology backed up with open-source and accessible tools for using existing standards. We will demonstrate, and practice hands-on, how this powerful methodology can help you organize and express information, determine which content actually needs to be created or updated, and build documentation and training deliverables from your content based on the rules you define.
This workshop is essential for successfully implementing topic models like DITA and SCORM, multi-purpose conditional publishing, and successfully facilitating content reuse.
Searching Repositories of Web Application Models
Marco Brambilla
Project repositories are a central asset in software development, as they preserve the technical knowledge gathered in past development activities. However, locating relevant information in a vast project repository is problematic, because it requires manually tagging projects with accurate metadata, an activity which is time consuming and prone to errors and omissions. This paper investigates the use of classical Information Retrieval techniques for easing the discovery of useful information from past projects. Differently from approaches based on textual search over the source code of applications or on querying structured metadata, we propose to index and search the models of applications, which are available in companies applying Model-Driven Engineering practices. We contrast alternative index structures and result presentations, and evaluate a prototype implementation on real-world experimental data.
Natural Language Understanding of Systems Engineering Artifacts
Ákos Horváth
This paper examines in close relation two fields of growing importance: model-based systems engineering (MBSE) and natural language processing (NLP). System models provide a structured description of engineering data, whose inherent semantics often remains hard to explore. Natural language understanding (i.e., the machine analysis of texts produced by humans), an important field of NLP, focuses on semantic text comprehension but cannot directly account for structured information sources.
Neural Models for Information Retrieval
Bhaskar Mitra
In the last few years, neural representation learning approaches have achieved very good performance on many natural language processing (NLP) tasks, such as language modelling and machine translation. This suggests that neural models will also yield significant performance improvements on information retrieval (IR) tasks, such as relevance ranking, addressing the query-document vocabulary mismatch problem by using semantic rather than lexical matching. IR tasks, however, are fundamentally different from NLP tasks leading to new challenges and opportunities for existing neural representation learning approaches for text.
We begin this talk with a discussion on text embedding spaces for modelling different types of relationships between items which makes them suitable for different IR tasks. Next, we present how topic-specific representations can be more effective than learning global embeddings. Finally, we conclude with an emphasis on dealing with rare terms and concepts for IR, and how embedding based approaches can be augmented with neural models for lexical matching for better retrieval performance. While our discussions are grounded in IR tasks, the findings and the insights covered during this talk should be generally applicable to other NLP and machine learning tasks.
ModelWriter Presentation International 01-07-2015
Ferhat Erata
The project envisions an integrated authoring environment called "ModelWriter" for Technical Authors (such as Software or Systems Engineers etc.) which will combine a Semantic Word Processor (= the "Writer" part), looking like a usual word processor but capable to "understand" pieces of text and transparently create models of contents out of them; and a Knowledge Capture Tool (= the "Model" part), looking like familiar information modelling tools such as UML, BPMN, ReqIF, etc. ModelWriter will allow Technical Authors to freely move bi-directionally and interactively between text and model to enhance the quality (consistency and completeness) of the technical documents.
Structured Dynamics provides 'ontology-driven applications'. Our product stack is geared to enable the semantic enterprise. The products are premised on preserving and leveraging existing information assets in an incremental, low-risk way. SD's products span from converters to authoring environments to Web services middleware and to eventual ontologies and user interfaces and applications.
In this talk I intend to review some basic and high-level concepts like formal languages, grammars and ontologies: languages to transmit knowledge from a sender to a receiver; grammars to formally specify languages; ontologies as formal specifications of specific knowledge domains. After this introductory revision, highlighting the role of each of those elements in the context of computer-based problem solving (programming), I will talk about a project aimed at automatically inferring and generating a grammar for a Domain Specific Language (DSL) from a given ontology that describes this specific domain. The transformation rules will be presented and the system, Onto2Gra, that fully implements this "ontological approach for DSL development" will be introduced.
Semantic Web: Technologies and Applications for Real-World
Amit Sheth
Amit Sheth and Susie Stephens, "Semantic Web: Technologies and Applications for Real-World," Tutorial at 2007 World Wide Web Conference, Banff, Canada.
Tutorial discusses technologies and deployed real-world applications through 2007.
Tutorial description at: http://www2007.org/tutorial-T11.php
Illustrated Code: Building Software in a Literate Way
Andreas Zeller, CISPA Helmholtz Center for Information Security
Notebooks – rich, interactive documents that join together code, documentation, and outputs – are all the rage with data scientists. But can they be used for actual software development? In this talk, I share experiences from authoring two interactive textbooks – fuzzingbook.org and debuggingbook.org – and show how notebooks not only serve for exploring and explaining code and data, but also how they can be used as software modules, integrating self-checking documentation, tests, and tutorials all in one place. The resulting software focuses on the essential, is well-documented, highly maintainable, easily extensible, and has a much higher shelf life than the "duct tape and wire" prototypes frequently found in research and beyond.
Diagrammatic knowledge modeling for managers – ontology-based approach
Dmitry Kudryavtsev
Diagrams are an effective and popular tool for visual knowledge structuring. Managers also often use them to acquire and transfer business knowledge. There are many currently available diagrams and visual modeling languages for managerial needs; unfortunately, the choice between them is frequently error-prone and inconsistent. This situation raises the following questions. What diagrams or visual modeling languages are the most suitable for a specific type of business content? What domain-specific diagrams are the most suitable for visualizing particular elements of an organizational ontology? In order to provide the answers, the paper suggests a light-weight specification of diagrams and knowledge content types, based on competency questions and ontology design patterns. The proposed approach provides a classification of qualitative business diagrams.
Kudryavtsev, D. V., Gavrilova, T. A. (2011). Diagrammatic knowledge modeling for managers – ontology-based approach. Accepted poster. International Conference on Knowledge engineering and Ontology Development, 26-29 October, 2011, Paris, France. P. 386-389.
1. DODDLE-OWL: A Domain Ontology Construction Tool with OWL
Takeshi Morita 1), Naoki Fukuta 2), Noriaki Izumi 3), and Takahira Yamaguchi 1)
1) Keio University, Japan
2) Shizuoka University, Japan
3) National Institute of AIST, Japan
2. Contents
• Motivation
• Related Works
• DODDLE-OWL Overview
• Implementation Architecture
• Case Studies
• Demonstration
• Conclusions
3. Contents
• Motivation
• Related Works
• DODDLE-OWL Overview
• Implementation Architecture
• Case Studies
• Demonstration
• Conclusions
4. Motivation
• Background
– The role of domain ontologies is important for the
Semantic Web
– Sharing common understanding among people and
software agents
– Finding appropriate information on the web
• Issue: the large cost of building up domain
ontologies
– A domain contains many concepts
– Each concept has a highly specific meaning
– Knowledge of domain experts is needed
– The cost-benefit performance of domain ontologies is
lower than that of general ontologies (e.g. WordNet,
EDR)
5. Semi-Automatic Construction
[System overview diagram (our goal): domain-specific documents (English or Japanese) and general ontologies (WordNet, EDR general and EDR technical) are used to semi-automatically construct an initial concept hierarchy (taxonomic relationships) and a set of concept pairs (non-taxonomic relationships); the user (domain expert) refines these in the quality refinement phase, and translation produces a domain ontology in OWL format.]
DODDLE-OWL: a Domain Ontology rapiD DeveLopment Environment – OWL extension, focusing on the quality refinement phase of ontology construction.
6. Contents
• Motivation
• Related Works
• DODDLE-OWL Overview
• Implementation Architecture
• Case Studies
• Demonstration
• Conclusions
7. Related Works (Mehrnoush Shamsfard, Ahmad Abdollahzadeh Barforoush, The State of the Art in Ontology Learning: A Framework for Comparison)

Learning System | Element(s) Learned | Prior Knowledge | Input
DODDLE-OWL (Keio University) | Taxonomic and non-taxonomic conceptual relations | WordNet, EDR | Unstructured domain-specific texts (English and Japanese)
ASIUM (Paris-Sud University) | Verb subcategorization frames + hierarchies | Linguistic knowledge | Unstructured (corpora) (French)
HASTI (Amir Kabir University of Technology) | Words, concepts, taxonomic and non-taxonomic conceptual relations, axioms | Almost empty (small kernel) | Unstructured NL texts (Persian)
SVETLAN' (CNRS laboratory) | Noun classes | – | Structured + unstructured input to SEGAPSITH (French)
SYNDIKATE (University of Albert-Ludwigs) | Words, concepts, taxonomic and non-taxonomic conceptual relations | Generic and domain lexicons and ontologies | Unstructured NL texts (German)
TEXT-TO-ONTO (University of Karlsruhe) | Concepts, taxonomic and non-taxonomic conceptual relations | Lexical DB + domain lexicon | NL texts, Web docs, semi-structured (XML, DTD) and structured (German, HTML, XML, DTD)
WEB→KB | Instances of classes and … | The ontology for … | An ontology + training …

DODDLE-OWL supports the construction of taxonomic and non-taxonomic relationships.
8. Related Works (cont.) (Mehrnoush Shamsfard, Ahmad Abdollahzadeh Barforoush, The State of the Art in Ontology Learning: A Framework for Comparison)

Learning System | Degree of Automation
DODDLE-OWL (Keio University) | User interaction, hand-made modification
ASIUM (Paris-Sud University) | Cooperative
HASTI (Amir Kabir University of Technology) | Both automatic and cooperative modes
SVETLAN' (CNRS laboratory) | Automatic
SYNDIKATE (University of Albert-Ludwigs) | Automatic
TEXT-TO-ONTO (University of Karlsruhe) | Semi-automatic, interactive, balanced cooperative
WEB→KB (Distributed Systems Technology Centre) | Automatic

Many ontology learning systems focus on automatic ontology construction; however, it is difficult for the user to refine automatically generated ontologies. Our system therefore focuses on high-level support for user interaction: the user can easily refine the semi-automatically generated ontologies and construct high-quality domain ontologies.
9. Contents
• Motivation
• Related Works
• DODDLE-OWL Overview
• Implementation Architecture
• Case Studies
• Demonstration
• Conclusions
10. System Overview
[Diagram: domain-specific documents (English or Japanese) and general ontologies (WordNet, EDR general and EDR technical) feed the Input Module; the Construction Module semi-automatically builds the initial concept hierarchy (taxonomic relationships) and the set of concept pairs (non-taxonomic relationships); the user (domain expert) refines them via the Refinement Module and Visualization Module (quality refinement); the Translation Module outputs a domain ontology in OWL format.]
DODDLE-OWL: a Domain Ontology rapiD DeveLopment Environment – OWL extension, focusing on the quality refinement phase of ontology construction.
11. System Overview
[Diagram: the same overview, highlighting the five modules: Input Module, Construction Module, Refinement Module, Visualization Module and Translation Module.]
12. Procedure of Input Module
[Diagram:
Document Selection – domain-specific documents (English or Japanese);
Morphological Analysis and Complex Word Extraction – produce a table of candidate words (word, POS, TF, IDF, TF-IDF);
Input Word Selection – select words significant for the domain (the input words W1, W2, W3, …);
Ontology Selection – WordNet, EDR (general), EDR (technical);
Disambiguation – identify the sense of each input word in order to map it to concepts in the general ontologies (e.g. W1 → EDR (general): Ci; WordNet: Cj; EDR (technical): Ck), turning the input word set into the input concept set.]
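The input-word selection step ranks candidate words by TF-IDF. A minimal sketch of that scoring (with a hypothetical two-document token corpus, not the actual xCBL data) might look like:

```python
import math
from collections import Counter

def tf_idf(docs):
    """docs: list of token lists. Returns {(doc_index, word): tf-idf score}."""
    n = len(docs)
    df = Counter()                      # document frequency per word
    for doc in docs:
        df.update(set(doc))
    scores = {}
    for i, doc in enumerate(docs):
        tf = Counter(doc)               # term frequency within the document
        for word, freq in tf.items():
            scores[(i, word)] = freq * math.log(n / df[word])
    return scores

# Invented toy corpus: two "documents" of business terms.
docs = [["order", "invoice", "order"], ["invoice", "catalog"]]
scores = tf_idf(docs)
# "order" occurs twice in doc 0 and in no other doc, so it scores highest there;
# "invoice" occurs in every doc, so its idf (and score) is zero.
```

Words with high scores become the input words handed to the disambiguation step.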
13. Construction Module
[Diagram: the input concepts from the Input Module go to Input Concept Selection; Hierarchy Construction (matching & trimming against the general ontologies: WordNet, EDR general and EDR technical) produces the initial concept hierarchy, and Relationship Construction (statistical methods over the documents: WordSpace and Association Rule) produces the set of concept pairs; both are handed on to the Refinement Module, with the Visualization and Translation Modules on the path to a domain ontology in OWL format.]
15. Hierarchy Construction Module
Merge the input concepts with the taxonomic relationships in the general ontologies: get the paths related to the input concepts, generate an initial model, and then trim it.
[Diagram: from root to leaves, the initial model contains best matched nodes, salient internal nodes and unnecessary internal nodes; trimming the unnecessary internal nodes turns the initial model into the initial concept hierarchy.]
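The merge-and-trim step can be sketched as follows: hypernym paths for the input concepts are merged into one tree, and intermediate nodes that are neither input concepts nor branch points (the "unnecessary internal nodes") are spliced out. The paths and concept names below are invented for illustration; the actual tool takes the paths from WordNet or EDR.

```python
def build_trimmed_hierarchy(paths, input_concepts):
    """Merge hypernym paths into a tree, then splice out 'unnecessary'
    internal nodes: nodes that are not input concepts and have only one
    child (input concepts, leaves and branch points are kept)."""
    children = {}
    for path in paths:
        for parent, child in zip(path, path[1:]):
            children.setdefault(parent, set()).add(child)

    def trim(node):
        result = []
        for kid in sorted(children.get(node, ())):
            sub = trim(kid)
            if kid in input_concepts or len(sub) != 1 or kid not in children:
                result.append((kid, sub))      # keep this node
            else:
                result.extend(sub)             # splice out the intermediate node
        return result

    root = paths[0][0]
    return (root, trim(root))

# Invented example paths (the real tool takes them from WordNet/EDR):
paths = [["entity", "object", "animal", "dog"],
         ["entity", "object", "artifact", "car"]]
tree = build_trimmed_hierarchy(paths, {"dog", "car"})
# "animal" and "artifact" are spliced out; "object" stays as a branch point.
```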
16. Relationship Construction Module
Extract concept pairs from the documents by different methods and match them against the input concepts:
• WordSpace – a method based on context similarity
• Association Rule – a popular method in the field of data mining
The result is the set of concept pairs.
17. WordSpace Method (Marti A. Hearst, Hinrich Schütze)
• Words and phrases in documents can be expressed by vector representations containing co-occurrence statistics.
• Inner products among the vectors work as the similarity between the words and phrases.
[Diagram: two context windows, "… wi … wj … C1 … wk …" and "… wi … wj … C2 … wk …", illustrating the context similarity between concepts C1 and C2.]
High similarity indicates a significant related concept pair for the domain.
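A toy version of such context vectors and their inner-product similarity (the window size and sample sentence are made up; the real method uses 4-gram statistics over the corpus):

```python
from collections import Counter

def context_vectors(tokens, targets, window=2):
    """Co-occurrence vector for each target word: counts of the words
    appearing within +/- `window` positions of its occurrences."""
    vecs = {t: Counter() for t in targets}
    for i, word in enumerate(tokens):
        if word in vecs:
            lo, hi = max(0, i - window), i + window + 1
            for ctx in tokens[lo:i] + tokens[i + 1:hi]:
                vecs[word][ctx] += 1
    return vecs

def similarity(u, v):
    """Inner product of two sparse co-occurrence vectors."""
    return sum(u[k] * v[k] for k in u if k in v)

# Invented sample text: "price" and "cost" share the contexts "of", "goods".
tokens = "price of goods rises cost of goods rises".split()
vecs = context_vectors(tokens, {"price", "cost"})
sim = similarity(vecs["price"], vecs["cost"])
```

A high inner product between two concepts' context vectors marks the pair as a candidate non-taxonomic relationship.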
18. Association Rule
• Find associations between items in a set of
transactions
• In our research
– Each item is an input concept appearing in the
document
– One transaction is one sentence in the document
• Parameters
– Support = |transactions containing X and Y| / |all transactions|
– Confidence = |transactions containing X and Y| / |transactions containing X|
(X and Y: input concepts)
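With one transaction per sentence, the two parameters reduce to simple counts. A sketch over hypothetical sentence/concept data:

```python
def support_confidence(transactions, x, y):
    """transactions: iterable of sets of input concepts (one per sentence)."""
    transactions = list(transactions)
    both = sum(1 for t in transactions if x in t and y in t)
    with_x = sum(1 for t in transactions if x in t)
    support = both / len(transactions)
    confidence = both / with_x if with_x else 0.0
    return support, confidence

# Hypothetical sentences, each reduced to the input concepts it contains:
transactions = [{"order", "invoice"}, {"order"}, {"invoice"}, {"order", "invoice"}]
support, confidence = support_confidence(transactions, "order", "invoice")
# support = 2/4 = 0.5, confidence = 2/3
```

Pairs whose support and confidence exceed the minimum thresholds become candidate concept pairs.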
19. [Diagram: Construction Module and Refinement Module. In the Construction Module, Input Concept Selection takes the input concepts from the Input Module; Hierarchy Construction (matching & trimming against the general ontologies: WordNet, EDR general and EDR technical) produces the initial concept hierarchy, and Relationship Construction (statistical methods over the documents: WordSpace and Association Rule, using the value of co-occurrence) produces the set of concept pairs. In the Refinement Module, Matched Result Analysis, Trimmed Result Analysis and Hierarchy Refinement refine the hierarchy, Relationship Refinement (with a Concept Specification Template) refines the concept pairs, and the Visualization and Translation Modules turn the result into a domain ontology in OWL format.]
20. Refinement Module
[Diagram: the Hierarchy Refinement Module (with Concept Drift Management) refines the initial concept hierarchy, and the Relationship Refinement Module (with a Concept Specification template) refines the set of concept pairs, achieving better performance by changing parameters through interaction with the user. The Visualization Module, MR3 (RDF & RDFS visual editing), supports refining the initial concept hierarchy graphically; results are passed to the Translation Module.]
21. Hierarchy Refinement Module – Concept Drift
Concept drift: the position of particular concepts changes depending on the domain. When an initial concept hierarchy is constructed from general ontologies, part of it is reusable and part is not because of concept drift; the module adjusts the initial concept hierarchy to the specific domain.
[Diagram: general ontologies vs. a domain ontology, with the reusable and non-reusable parts of the constructed initial concept hierarchy marked.]
22. Hierarchy Refinement Module
Strategy 1, Matched Result Analysis: point out differences of abstraction level among sibling nodes according to the matched results.
Strategy 2, Trimmed Result Analysis: divide the initial concept hierarchy into a reusable area and a non-reusable area according to the position of the best matched nodes, and suggest that the user move the non-reusable area.
[Diagram: an initial model with best matched nodes and internal nodes; after trimming, counts of trimmed nodes (0, 0, 3) are attached to the sibling nodes B, C, D under A, nodes are marked MOVE or STAY, and the affected areas are reconstructed by the user.]
23. Concept Drift Management
[Screenshots of the Visualization Module for Matched Result Analysis and Trimmed Result Analysis.]
The parts to modify are highlighted based on matched result analysis and trimmed result analysis.
24. Relationship Refinement Module (Non-Taxonomic Relationship Learning)
• Identify correct pairs among the generated candidates
• Set parameters for WordSpace and Association Rule
• Construct non-taxonomic relationships by considering the relation within each concept pair
25. [Diagram: the same Construction Module / Refinement Module overview as slide 19.]
27. Contents
• Motivation
• Related Works
• DODDLE-OWL Overview
• Implementation Architecture
• Case Studies
• Demonstration
• Conclusions
28. Implementation Architecture
[Diagram: the Input Module, Construction and Refinement Module, Visualization Module and Translation Module run on the Java Virtual Machine, on top of Jena2, MR3, the Java WordNet Library (JWNL), Gensen, Sen and SS-Tagger.]
JWNL: http://jwordnet.sourceforge.net/
Gensen: a complex word extraction tool
Sen: a Japanese morphological analyzer, http://ultimania.org/sen/
SS-Tagger: an English tagger
MR3: an RDF & RDFS graphical editor, http://mmm.semanticweb.org/mr3/
Jena Semantic Web Tool Framework: HP Labs, http://jena.sourceforge.net/
29. Contents
• Motivation
• Related Works
• DODDLE-OWL Overview
• Implementation Architecture
• Case Studies
• Demonstration
• Conclusions
30. Case Studies
• Purpose
– To check whether DODDLE-OWL can support the user in
constructing taxonomic and non-taxonomic relationships
• Target Field
– Particular field of business
– xCBL (XML Common Business Library)
• http://www.xcbl.org/
• Domain Specific Document
– xCBL Document Description
– about 150 sentences and 2500 words
• Input Concepts
– 57 business concepts from the document
• User
– Not an expert but has business knowledge
31. Results and Evaluation for Taxonomic Relationships Construction
[Diagram: from the 57 input concepts, paths related to the input concepts are taken from WordNet, then trimmed, then modified into the business ontology. Number of concepts in each model: input concepts – 57; initial model constructed with the Hierarchy Construction Module – 152 before trimming and 82 after; business ontology constructed with the Hierarchy Refinement Module – 83.]
Evaluation of the two strategies by the user:
ST1 (Matched Result Analysis): precision 5/25 (= 0.2), recall 5/7 (= 0.71)
32. Results and Evaluation for Non-Taxonomic Relationships Construction

                          | WS           | AR           | Union of WS & AR
# Extracted concept pairs | 40           | 39           | 66
# Accepted concept pairs  | 30           | 20           | 39
# Rejected concept pairs  | 10           | 19           | 27
Precision                 | 0.75 (30/40) | 0.51 (20/39) | 0.59 (39/66)

[Venn diagram of accepted/rejected pairs: WordSpace only – 19 accepted, 8 rejected; overlap of WS and AR – 11 accepted, 2 rejected; Association Rule only – 9 accepted, 17 rejected.]

WordSpace parameters: frequency of extracted 4-grams = 2; context scope (before : after) = 10:10; similarity threshold = 0.6.
Association Rule parameters: minimum support = 0.7 %; minimum confidence = 55 %.
33. Results and Evaluation for Non-Taxonomic Relationships Acquisition
(Same table and Venn diagram as the previous slide: WS 0.75 (30/40), AR 0.51 (20/39), union 0.59 (39/66).)
The precision of the WS method is good, but the method has a bias, so certain types of concept pairs cannot be obtained from it. We therefore combine the two different methods to get a wider range of concept pairs.
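The precision figures in the table follow directly from the accepted/rejected counts; a quick check:

```python
def precision(accepted, rejected):
    # precision = accepted pairs / extracted pairs
    return accepted / (accepted + rejected)

# Figures from the case study table (slides 32/33)
ws = precision(30, 10)      # WordSpace
ar = precision(20, 19)      # Association Rule
union = precision(39, 27)   # union of WS & AR
print(round(ws, 2), round(ar, 2), round(union, 2))  # 0.75 0.51 0.59
```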
34. Contents
• Motivation
• Related Works
• DODDLE-OWL Overview
• Implementation Architecture
• Case Studies
• Demonstration
• Conclusions
36. Contents
• Motivation
• Related Works
• DODDLE-OWL Overview
• Implementation Architecture
• Case Studies
• Demonstration
• Conclusions
37. Conclusions
• Summary
– DODDLE-OWL: a Domain Ontology rapiD DeveLopment
Environment – OWL extension
• Focusing on the quality refinement phase of ontology construction
– Case studies
• Constructed a domain ontology for xCBL
• Supported the user in constructing and refining the domain ontology
• Future Work
– Reuse existing (domain) ontologies in any forms
– Apply DODDLE-OWL to large scale domain ontology
construction
• Rocket operation ontology
• About 40,000 concepts
38. Thank you for your attention.
DODDLE-OWL has been released.
Please visit this web site if you are interested.
It has about 100 users now.
http://mmm.semanticweb.org/doddle/