Ontology Engineering
for Big Data
Kouji Kozaki
The Institute of Scientific and Industrial Research (I.S.I.R),
Osaka University, Japan
2013/09/03 1
Ontology and Semantic Web for Big Data
(ONSD2013) Workshop in the 2013
International Computer Science and
Engineering Conference
(ICSEC2013), Bangkok, Thailand, 5th
Sep. 2013
ONSD2013@ICEC2013
Ontological topics
• Some examples of topics I work on
  – Definition of disease
      What is a "disease"?
      What is a "causal chain"?
      Is it an object or a process?
  – Role theory
      What is the ontological difference among the following concepts?
        Person … Natural type
        Teacher, Walker, Murderer, Mother … Role (dependent concept)
Self introduction: Kouji KOZAKI
• Brief biography
  – 2002: Received Ph.D. from Graduate School of Engineering, Osaka University.
  – 2002-: Assistant Professor, 2008-: Associate Professor in ISIR, Osaka University.
• Specialty
  – Ontological Engineering
• Main research topics
  – Fundamental theories of ontological engineering
  – Ontology development tool based on the ontological theories
  – Ontology development in several domains and ontology-based applications
• Hozo (法造): an environment for ontology building/using (1996-)
  – Software that supports ontology (=法) building (=造) and use
  – Available at http://www.hozo.jp as free software
  – Registered users: 3,500 (June 2012)
  – A Java API for application development is provided.
  – Supported formats: original format, RDF(S), OWL.
  – Linked Data publishing support is coming soon.
My history on Ontology Building
• 2002-2007 Nanotechnology ontology
  – Supported by NEDO (New Energy and Industrial Technology Development Organization)
• 2006- Clinical Medical ontology
  – Supported by Ministry of Health, Labour and Welfare, Japan
  – In cooperation with: Graduate School of Medicine, The University of Tokyo.
• 2007-2009 Sustainable Science ontology
  – In cooperation with: Research Institute for Sustainability Science, Osaka Univ.
• 2007-2010 IBMD (Integrated Bio Medical Database)
  – Supported by MEXT through "Integrated Database Project".
  – In cooperation with: Tokyo Medical and Dental University, Graduate School of Medicine, Osaka U.
• 2008-2012 Protein Experiment Protocol ontology
  – In cooperation with: Institute for Protein Research, Osaka Univ.
• 2008-2010 Biofuel ontology
  – Supported by the Ministry of Environment, Japan.
• 2009-2012 Disaster Risk ontology
  – In cooperation with: NIED (National Research Institute for Earth Science and Disaster Prevention)
• 2012- Biomimetic ontology
  – Supported by JSPS KAKENHI Grant-in-Aid for Scientific Research on Innovative Areas
• 2012- Ontology of User Action on Web
  – In cooperation with: Consumer first Corp.
• 2013- Information Literacy ontology
  – Supported by JSPS KAKENHI
Agenda
• (1) Motivation
  – Ontology vs. Big Data
  – How can we use ontology for big data?
• (2) Case Studies towards Ontology Engineering for Big Data
  – Ontology exploration according to the users' viewpoints
  – A disease ontology developed in the Japanese Medical Ontology Project
• (3) Concluding Remarks
Ontology vs. Big Data
• Question
  – Is ontology useful for Big Data?
  – My answer: (I believe) Yes
• A combination of ontology and Big Data could provide new solutions for many problems.
• Ontology
  – Not so big (though some are big).
  – Built by hand.
  – Used based on semantics, by reasoning.
• Big Data
  – Very big.
  – Collected automatically.
  – Used without semantics, by machine learning or data mining.
How to combine Ontology and Big Data
• Basic technology
  – Mapping ontology to database
      Mapping classes (concepts) defined in an ontology to a database schema
      Mapping classes/instances defined in an ontology to data in a DB
  – Adding metadata to data using vocabulary defined in an ontology
      e.g. annotation of documents such as web pages, papers, etc.
  – Converting a database (e.g. RDB) to an ontology-based (RDF) database
      e.g. linked data such as DBpedia, some bioinformatics DBs, etc.
• You can choose among these technologies according to your purpose.
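The third option above, converting relational data into an ontology-based (RDF) form, can be sketched roughly as follows. This is only a minimal illustration using plain Python tuples as triples; the namespace `http://example.org/`, the column names, the class name `Patient` and the property names are all hypothetical and do not come from the slides.

```python
# Minimal sketch: turn relational rows into (subject, predicate, object) triples.
# All IRIs, column names and property names below are hypothetical examples.
EX = "http://example.org/onto#"

# Assumed mapping from database columns to properties defined in an ontology.
COLUMN_TO_PROPERTY = {
    "systolic_bp": EX + "hasBloodPressure",
    "glucose": EX + "hasBloodGlucoseLevel",
}

def row_to_triples(table_class, row):
    """Map one row to triples: one rdf:type triple plus one triple per non-null column."""
    subject = f"http://example.org/data/patient/{row['id']}"
    triples = [(subject, "rdf:type", EX + table_class)]
    for column, prop in COLUMN_TO_PROPERTY.items():
        if row.get(column) is not None:
            triples.append((subject, prop, row[column]))
    return triples

rows = [{"id": 1, "systolic_bp": 180, "glucose": None}]
triples = [t for r in rows for t in row_to_triples("Patient", r)]
for t in triples:
    print(t)
```

In practice such a mapping would usually be written in a dedicated mapping language or tool rather than by hand; the sketch only shows the idea that the table maps to a class and the columns map to properties.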
Case Study
A method for mapping the Abnormality Ontology (in the medical domain) to a medical database

Classification of Abnormality Representations 1
Various types of abnormality representations are used in the medical domain:
• blood pressure 200 mmHg / blood pressure is high / hypertension
• blood glucose level 150 mg/dL / blood glucose level is high / hyperglycemia
Classification of Abnormality Representations 2
※ Based on the quality and quantity ontologies in the upper ontology "YAMATO".

Quantitative representation      Qualitative representation     Property representation
blood pressure 200 mmHg          blood pressure is high         hypertension
blood glucose level 150 mg/dL    blood glucose level is high    hyperglycemia

☑ Diagnosis (quantitative representation): identify a concrete value for each patient in clinical tests
☑ Definition of disease (property representation)
Is-a hierarchy of the Abnormality Ontology (the Abnormality Ontology is mapped to the medical database)
• Layer 1: Generic abnormal states (object-independent)
  – Structural abnormality, Material abnormality, Size abnormality, Formational abnormality, Conformational abnormality, Topological abnormality, …
  – Small in size (Small in line, Small in area, Small in volume), Large in size
• Layer 2: Object-dependent abnormal states (tube-dependent, blood vessel dependent, …)
  – Narrowing tube, Narrowing of valve, …
  – Vascular stenosis, Arterial stenosis, Coronary stenosis, …
  – Gastrointestinal tract stenosis, Intestinal stenosis, Esophageal stenosis, …
• Layer 3: Specific context-dependent (disease-dependent) abnormal states
  – Coronary stenosis in Angina pectoris
  – Coronary stenosis in Arteriosclerosis
  – Intestinal stenosis in Ileus
  – Esophageal stenosis in Esophagitis
How can we deal with clinical test data?
• In hospitals, a huge volume of diagnostic/clinical test data has been accumulated.
• Most of it is quantitative data: e.g., blood pressure 180 mmHg, cross-sectional area of a blood vessel 40 mm².
• A quantitative value (Vqt) is converted into a qualitative value (Vql) by comparing it with a threshold value:
  blood pressure 180 mmHg (Vqt), threshold e.g. 140 mmHg → blood pressure high (Vql)
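The threshold comparison above can be sketched as follows. This is only an illustration: the 140 mmHg threshold is the example value from the slide, while the function name, the table layout and the choice of a strict "greater than" comparison are assumptions.

```python
# Minimal sketch: convert a quantitative value (Vqt) into a qualitative value (Vql).
# The threshold table layout and the strict ">" comparison are assumptions.
THRESHOLDS = {
    "blood pressure": ("mmHg", 140.0),  # example threshold taken from the slide
}

def to_qualitative(attribute, value, unit):
    """Return 'high' if the value exceeds the threshold for the attribute, else 'not high'."""
    expected_unit, threshold = THRESHOLDS[attribute]
    if unit != expected_unit:
        raise ValueError(f"expected {expected_unit}, got {unit}")
    return "high" if value > threshold else "not high"

print(to_qualitative("blood pressure", 180, "mmHg"))
```

Real clinical thresholds depend on the test, the patient and the guideline in use, so a production table would need far more context than a single number per attribute.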
Basic policy for definition of abnormal states
• A property is decomposed into a tuple <Attribute (A), Attribute Value (V)> in a qualitative form.
  e.g. hypertension (Property P) = <blood pressure (Attribute A), high (Value V)>
• A qualitative representation can be converted into a property representation.
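The decomposition of a property into an <Attribute, Value> tuple can be illustrated with a simple two-way lookup. The table contents follow the examples on the slides (hypertension, hyperglycemia, narrowing); the dictionary names and functions are hypothetical.

```python
# Minimal sketch: property <-> (attribute, value) tuple, following the slide examples.
PROPERTY_TO_TUPLE = {
    "hypertension": ("blood pressure", "high"),
    "hyperglycemia": ("blood glucose level", "high"),
    "narrowing": ("cross-section area", "small"),
}
# Reverse index for converting a qualitative representation into a property.
TUPLE_TO_PROPERTY = {av: p for p, av in PROPERTY_TO_TUPLE.items()}

def decompose(prop):
    """Property representation -> <Attribute, Value> tuple."""
    return PROPERTY_TO_TUPLE[prop]

def to_property(attribute, value):
    """Qualitative representation -> property representation (None if unknown)."""
    return TUPLE_TO_PROPERTY.get((attribute, value))

print(decompose("hypertension"))
print(to_property("cross-section area", "small"))
```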
Our model enables "interoperability" from clinical test data to conceptual knowledge about abnormal states.

Quantity (clinical test data)     Quality                       Property (abnormality knowledge)
blood pressure 180 mmHg           blood pressure high           Hypertension
cross-section area xx cm²         cross-section area small      Narrowing

Quantitative data can be converted, via the qualitative representation, into a property representation.
Case Study
Annotation on web browsing history of users based on
Web User Action Ontology
[Chart: trend of usage types per conference; axes: the number of papers surveyed in each conference, the number of types of usage]
• Web browsing history (access logs) of users: list of all URLs each user accessed, for 130M users × 2 years
• Web User Action Ontology: annotation of the users' web browsing history based on the ontology
• Analysis of consumption behavior
This is collaborative work with Consumer first, Inc.
Basic Idea
• The format of the access logs (web browsing history) of users provided by Consumer first, Inc.
  – User id, access date and time, URL …
• Problem
  – A URL is a meaningless string for humans, although one can guess its content if it belongs to a famous site.
  – The access logs are diverse.
  – In order to analyze them, we need a consistent meaning.
• Annotations on the access log
  – We tried to add metadata that presents a human-understandable meaning for each URL.
  – We also developed a prototype for automatic annotation.
      Its recall and relevance rates are about 0.7 to 0.9.
      We think this result is good enough for statistical analysis.
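A rule-based annotator of this kind, and the way its quality could be measured, might look roughly like the sketch below. The prototype's actual method is not described on the slide; the rules, host names and concept labels here are hypothetical, and "relevance rate" is interpreted as precision, which is an assumption.

```python
from urllib.parse import urlparse

# Hypothetical rules: (host, path prefix, concept label from an action ontology).
RULES = [
    ("shop.example.com", "/cart", "AddToCart"),
    ("shop.example.com", "/item", "ViewProduct"),
    ("news.example.org", "", "ReadNews"),
]

def annotate(url):
    """Return the concept label of the first matching rule, or None if no rule matches."""
    parsed = urlparse(url)
    for host, path_prefix, concept in RULES:
        if parsed.netloc == host and parsed.path.startswith(path_prefix):
            return concept
    return None

def precision_recall(predicted, gold):
    """Compare predicted labels with manually assigned ones (None = no annotation)."""
    true_pos = sum(1 for p, g in zip(predicted, gold) if p is not None and p == g)
    n_pred = sum(1 for p in predicted if p is not None)
    n_gold = sum(1 for g in gold if g is not None)
    precision = true_pos / n_pred if n_pred else 0.0
    recall = true_pos / n_gold if n_gold else 0.0
    return precision, recall

print(annotate("http://shop.example.com/cart/add"))
```

Once each log entry carries a concept label instead of a raw URL, entries from different sites can be aggregated by concept, which is what makes statistical analysis of the logs feasible.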
Ontology Engineering for Big Data
• Basic technology = how to combine ontology and Big Data
  – Mapping ontology to database
  – Adding metadata to data using vocabulary defined in an ontology
  – Converting a database (e.g. RDB) to an ontology-based (RDF) database
• How to use combinations of ontology and Big Data
  – Ontology can provide semantics to add to raw data.
  – Generalized concepts in an ontology can connect data at various concept levels across domains.
  – We can use ontology as given (and authorized) knowledge to analyze big data.
Ontology Engineering for Big Data
• Features of ontology at the class level
  – It reflects an understanding of the target world.
  – Well-organized ontologies contain generalized, rich knowledge based on consistent semantics.
  – Ontologies are systematized knowledge of domains.
• Combination of ontology and big data
  – Ontology can provide semantics to add to raw data.
  – Generalized concepts in an ontology can connect data at various concept levels across domains.
  – We can use ontology as given (and authorized) knowledge to analyze big data.
Two possible ways to use ontology for big data
• Use ontology to bridge datasets across domains
• Use ontology to combine deep domain knowledge and raw data
[Figure labels: Ontology, Metadata, LOD (Linked Open Data), Big Data]
Case studies
• Use ontology to bridge datasets across domains
  – Understanding an Ontology through Divergent Exploration
      Presented at ESWC2011
• Use ontology to combine deep domain knowledge and raw data
  – Japanese Medical Ontology project
  – Disease ontology and Ontology of Abnormal State
      Presented at ICBO (International Conference on Biomedical Ontology) 2011, 2012 and 2013
Use ontology to bridge datasets across domains
• Basic technology
  – Terms (classes/instances) defined in an ontology are used as a common vocabulary for searching data.
  – If the ontology has mappings to multiple DBs, the user can search across them.
• Motivation and issue
  – Combinations of multiple datasets could be valuable for Big Data analysis.
      e.g. climate and agriculture, healthcare and life science, etc.
  – However, obtaining all combinations across multiple big datasets is not realistic because of their size.
  – Requests by users also differ widely according to their interests.
  – It is important to consider an efficient method to obtain meaningful combinations.
[Figure: the ontology provides a common vocabulary for searching across multiple DBs. Labels: Ontology, Search, Documents / Law Data, Raw]
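The idea of searching several databases through one ontology term can be sketched as follows. The two in-memory "databases", their field names and the mapping table are hypothetical; the climate and agriculture pairing simply echoes the example domains mentioned above.

```python
# Minimal sketch: one ontology term is mapped to a local (field, value) in each DB.
# Databases, fields and mappings are hypothetical.
MAPPINGS = {
    "Drought": {
        "climate_db": ("event", "drought"),
        "agri_db": ("stress_factor", "water shortage"),
    },
}
DATABASES = {
    "climate_db": [{"event": "drought", "year": 2010}, {"event": "flood", "year": 2011}],
    "agri_db": [{"stress_factor": "water shortage", "crop": "rice"}],
}

def search(term):
    """Return matching records from every database that has a mapping for the term."""
    results = {}
    for db_name, (field, value) in MAPPINGS.get(term, {}).items():
        results[db_name] = [r for r in DATABASES[db_name] if r.get(field) == value]
    return results

hits = search("Drought")
print(hits)
```

The user only needs to know the ontology term; the per-database vocabulary stays hidden in the mapping table.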
A method to obtain meaningful combinations using ontology exploration
• An ontology presents an explicit, essential understanding of the target world.
• It provides base knowledge to be shared among the users.
• The users explore the ontology according to their viewpoints and generate conceptual maps as the result.
• These maps represent understanding from their own viewpoints.
• They can use the maps as viewpoints (combinations) to get data from multiple DBs.
[Figure: Layers 0 to 4, spanning problem setting, problem solution and innovation. Labels: Ontology-based Information Retrieval; Contents Management using the Metadata; Divergent Exploration; Map Generation Depending on Viewpoints; Comparison and Convergence of multiple Maps; Context Based Convergence]
(Divergent) Ontology exploration tool
1) Exploration of multi-perspective conceptual chains
   – Exploration of an ontology built with "Hozo" (ontology editor)
   – Multi-perspective conceptual chains represent the explorer's understanding of the ontology from a specific viewpoint.
2) Visualization of conceptual chains
   – Visualization as conceptual maps from different viewpoints
[Figure: an ontology in Hozo. A node represents a concept (= rdfs:Class). A slot represents a relationship (= rdf:Property) referring to another concept. Links between nodes show is-a (sub-class-of) relationships.]
[Figure: aspect dialog and conceptual map visualizer. Labels of the aspect dialog: kinds of aspects, property names, constriction tracing classes, option settings for exploration. Selected relationships are traced and shown as links in the conceptual map.]
[Figure: explore the focused (selected) path.]
Functions for ontology exploration
• Exploration using the aspect dialog (manual exploration):
  – Divergent exploration from one concept, using the aspect dialog at each step
• Search path (machine exploration):
  – Exploration of paths from a starting point to ending points
  – The tool allows post-hoc editing to extract only the interesting portions of the map.
• Change view:
  – The tool can highlight specified paths of conceptual chains on the generated map according to given viewpoints.
• Comparison of maps:
  – The system can compare generated maps and show the conceptual chains common to both maps.
Search Path
• Select a starting point and ending points (1), (2), (3).
• The tool finds all possible paths from the starting point to the ending points.
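The "search path" function can be approximated by enumerating all simple paths in a graph of concepts. The sketch below is not the tool's implementation; it is a plain depth-first search over a tiny hypothetical ontology graph, with made-up concept and relationship names.

```python
# Minimal sketch: enumerate all cycle-free paths from a starting concept to any ending concept.
# The graph (concept -> list of (relationship, concept)) is a hypothetical example.
GRAPH = {
    "Problem": [("is solved by", "Countermeasure"), ("has cause", "Deforestation")],
    "Deforestation": [("is solved by", "Forest protection")],
    "Countermeasure": [("has subclass", "Forest protection")],
    "Forest protection": [],
}

def find_paths(graph, start, goals, path=None):
    """Depth-first search returning every path (list of concepts) that ends at a goal."""
    path = (path or []) + [start]
    found = [path] if start in goals and len(path) > 1 else []
    for _relationship, nxt in graph.get(start, []):
        if nxt not in path:  # skip concepts already on the path to avoid cycles
            found.extend(find_paths(graph, nxt, goals, path))
    return found

paths = find_paths(GRAPH, "Problem", {"Forest protection"})
for p in paths:
    print(" -> ".join(p))
```

On a real ontology the number of paths grows quickly, which is one reason post-hoc editing of the resulting map is useful.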
Search Path: result for the selected ending points
What does the result mean?
• Problem
• Kinds of methods to solve the problem
• Possible combinations of them

DEMO: Ontology Exploration
Usage and evaluation of the ontology exploration tool
• Step 1: Usage for knowledge structuring in sustainability science
• Step 2: Verification of the exploring abilities of the ontology exploration tool
• Step 3: Experiments for evaluating the ontology exploration tool
Sustainability Science
 Sustainability Science probes interactions
between global, social, and human systems,
the complex mechanisms that lead to
degradation of these systems, and
concomitant risks to human well-being.
 The journal provides a platform for building
sustainability science as a new academic
discipline.
 These include endeavors to simultaneously
understand phenomena and solve problems,
uncertainty and application of the
precautionary principle, the co-evolution of
knowledge and recognition of problems, and
trade-offs between global and local problem
solving.
Volume 1 / 2006 - Volume 8 / 2013
Editor-in-Chief: Kazuhiko Takeuchi
Managing Editor: Osamu Saito
ISSN: 1862-4065 (print version)
ISSN: 1862-4057 (electronic version)
Knowledge Structuring in Sustainability Science
• Sustainability Science (SS)
  – We aimed at establishing a new interdisciplinary scheme that serves as a basis for constructing a vision that will lead global society to a sustainable one.
  – An integrated understanding of the entire field is required, instead of domain-wise knowledge structuring.
• Sustainability science ontology
  – Developed in collaboration with domain experts in the Osaka University Research Institute for Sustainability Science (RISS).
  – Number of concepts: 649, number of slots: 1,075
• Usage of the ontology exploration tool
  – It was confirmed that the exploration was fun for them and that the tool had a certain utility for achieving knowledge structuring in sustainability science. [Kumazawa 2009]
[Figure: Sustainability Science, http://en.ir3s.u-tokyo.ac.jp/about_sus]
Biofuel Use Strategies for Sustainable Development (BforSD, FY2008-FY2010)
One of the sub-themes: development of an ontology-based mapping system which creates comprehensive views of problems and policy measures on biofuel
(1) Structuring biofuel problems: develop the biofuel ontology, which explicitly conceptualizes biofuel problems through literature review and interviews.
(2) Develop an ontology exploration tool which interactively generates conceptual maps with paths between concepts in the biofuel ontology.
(3) In collaboration with other sub-themes, develop an application method of this map tool for policy-making support, to find, frame and prioritize relevant problems and policy measures.
[Figure source: US DOE]
Verification of Ontology Exploration Tool
• Verification methods
  1) Enrichment of the SS ontology
     We enriched the SS ontology on the basis of 29 typical scenarios (cases) structured by domain experts in biofuel through literature review and interviews.
[Figure: 29 scenarios (cases) → 27 conceptual maps]
Positive and negative effects of biofuel

1) Energy services for the poor
(+/−) Competition of biomass energy systems with the present use of biomass resources (such as agricultural residues) in applications such as animal feed and bedding, fertilizer, and construction materials [1]
(−) In many developing countries, small-scale biomass energy projects face challenges obtaining finance from traditional financing institutions [1]
(−) Liquid biofuels are likely to replace only a small share of global energy supplies and cannot alone eliminate our dependence on fossil fuels [2]

2) Agro-industrial development and job creation
(+) Biofuel is powering new small- and large-scale agro-industrial development and spawning new industries in industrialized and developing countries [1]
(+/−) In the short-to-medium term, bioenergy use will depend heavily on feedstock costs and reliability of supply, cost and availability of competing energy sources, and government policy decisions [1]
(+) In the longer term, the economics of biofuel will probably improve as agricultural productivity and agro-industrial efficiency improve, more supportive agricultural and energy policies are adopted, carbon markets mature and expand, and new methodologies for carbon sequestration accounting are developed [1]
(+) In the longer term, expanded demand and increased prices for agricultural commodities may represent opportunities for agricultural and rural development [2]
(+) Biofuel industries create jobs, including highly skilled science, engineering, and business-related employment; medium-level technical staff; low-skill industrial plant jobs; and unskilled agricultural labor [1]
(+/−) Small-scale and labor-intensive production often leads to trade-offs between production efficiency and economic competitiveness [1]

3) Health and gender
(−) Market opportunities cannot overcome existing social and institutional barriers to equitable growth, with exclusion factors such as gender, ethnicity, and political powerlessness, and may even worsen them [2]
(−) Forest burning for development of feedstock plantations and sugarcane burning to facilitate manual harvesting result in air pollution, higher surface water runoff, soil erosion, and unintended forest fires [3,4]
(−) Exploitation of cheap labor (plantation and migrant workers) [4]
(−) Increased use of pesticides could create health hazards for laborers and communities living near areas of feedstock production [1,3]

4) Agricultural structure
(−) The demand for land to grow biofuel crops could put pressure on competing land usage for food crops, resulting in an increase in food prices [1,2]
(+/−) Significant economies of scale can be gained from processing and distributing biofuels on a large scale. The transition to liquid biofuels can be harmful to farmers who do not own their own land, and to the rural and urban poor who are net buyers of food [1]
(−) While global market forces could lead to new and stable income streams, they could also increase marginalization of poor and indigenous people and affect traditional ways of living if they end up driving small farmers without clear titles from their land and destroying their livelihood [1]

(+): Positive effects, (−): Negative effects, (+/−): Both positive and negative effects
(Source) 1: UN-Energy (2007), 2: FAO (2008), 3: CBD (2008), 4: Martinelli et al. (2008)
5) Food security
(−) Demand for agricultural feedstock for liquid biofuels will be a significant factor for agricultural markets and world agriculture over the next decade and perhaps beyond [2]
(−) Rapidly growing demand for biofuel feedstock has contributed to higher food prices, which poses an immediate threat to the food security of poor net food buyers in both urban and rural areas [2]
(+/−) The effect of biofuels on food security is context-specific, depending on the particular technology and country characteristics involved [1]

6) Government budget
(−) Because ethanol is used largely as a substitute for gasoline, providing a large tax reduction for blending ethanol and gasoline reduces government revenue from this tax, mainly targeting the non-poor [1]
(−) Production of biofuels in many countries, except sugarcane-based ethanol production in Brazil, is not currently economically viable without subsidies, given existing agricultural production and biofuel-processing technologies and recent relative prices of commodity feedstock and crude oil [2]
(−) Policy interventions, especially in the form of subsidies and mandated blending of biofuels with fossil fuels, are driving the rush to liquid biofuels, which leads to high economic, social, and environmental costs in both developed and developing countries [2]

7) Trade, foreign exchange balance, and energy security
(+) Diversifying global fuel supplies could have beneficial effects on the global oil market and many developing countries because fossil fuel dependence has become a major risk for many developing economies [1]
(+/−) Rapidly rising demand for ethanol has had an impact on the price of sugar and maize in recent years, bringing substantial rewards to farmers not only in Brazil and the United States but around the world [1,2]
(−) Linking of agricultural prices to the vicissitudes of the world oil market clearly presents risks; however, it is an essential transition to the development of a biofuel industry that does not rely on major food commodity crops [1]

8) Biodiversity and natural resource management
(+/−) Depending on the types of crop grown, what they replaced, and the methods of cultivation and harvesting, biofuels can have negative and positive effects on land use, soil and water quality, and biodiversity [1,3]
(−) Problems with water availability and use may represent a limitation on agricultural biofuel production [1,3]
(−) Introduction of criteria, standards, and certification schemes for biofuels may generate indirect negative environmental and biodiversity effects, passively in other countries [3]
(−) If the production of biofuel feedstock requires increased fertilizer and pesticide use, there could be additional detrimental effects such as increase in GHG emissions and eutrophicating nutrients and biodiversity loss [3]
(−) Wild biodiversity is threatened by loss of habitat when the area under crop production is expanded, whereas agricultural biodiversity is vulnerable in the case of large-scale monocropping, which is based on a narrow pool of genetic material, and can also lead to reduced use of traditional varieties [2,3]
(+) If crops are grown on degraded or abandoned land, such as previously deforested areas or degraded crop- and grasslands, and if soil disturbances are minimized, feedstock production for biofuels can have a positive impact on biodiversity by restoring or conserving habitat and ecosystem function [3]

9) Climate change
(+/−) Full lifecycle GHG emissions of biofuel vary widely based on land use changes, choice of feedstock, agricultural practices, refining or conversion processes, and end-use practices [1,2]
(−) Land use change associated with production of biofuel feedstock can affect GHG emissions; draining wetlands and clearing land with fire are detrimental with regard to GHG emissions and air quality [2,3]
(−) The greatest potential for reducing GHG emissions comes from replacement of coal rather than petroleum fuels [1]
(+) Biofuels offer the only realistic near-term renewable option for displacing and supplementing liquid transport fuels [1]

(+): Positive effects, (−): Negative effects, (+/−): Both positive and negative effects
(Source) 1: UN-Energy (2007), 2: FAO (2008), 3: CBD (2008), 4: Martinelli et al. (2008)
Verification of Ontology Exploration Tool
The concepts appearing in these scenarios were extracted and generalized to be added to the ontology.
Example: Air pollution, forest fires, soil deterioration and water pollution are attributed to intentional burning when forest is logged or sugarcane is harvested in farmland development for biofuel crops.
Extracted chain: burn agriculture (deforestation, soil deterioration caused by farmland development for biofuel crops) ⇒ harvest sugarcane (air pollution caused by intentional burn), disruption of ecosystem caused by deforestation (water pollution)
Verification of Ontology Exploration Tool
• Verification methods
  1) Enrichment of the SS ontology
     We enriched the SS ontology on the basis of 29 typical scenarios (cases) structured by domain experts in biofuel through literature review and interviews.
  2) Verification of scenario-reproducing operations
     We verified whether the ontology exploration tool could generate conceptual maps which represent the original scenarios.
• Result:
  – 93% (27/29) of the scenarios were successfully reproduced as conceptual maps.
[Figure: 29 scenarios (cases) → 27 conceptual maps]
Usage and evaluation of the ontology exploration tool
• Step 1: Usage for knowledge structuring in sustainability science
• Step 2: Verification of the exploring abilities of the ontology exploration tool
• Step 3: Experiments for evaluating the ontology exploration tool
  – 1) Whether maps meaningful for domain experts were obtained.
  – 2) Whether meaningful maps other than the anticipated maps were obtained.
Anticipated maps: maps representing the contents of the scenarios anticipated by the ontology developers at the time of ontology construction.
Note: the subjects do not know which scenarios are anticipated.
Experiment for evaluating the ontology exploration tool
• Subjects: 4 experts in different fields
  – Expert A: Agricultural economics
  – Expert B: Social science (stakeholder analysis)
  – Expert C: Risk analysis
  – Expert D: Metropolitan environmental planning
• Experimental method
  1) The four experts generated conceptual maps with the tool in accordance with the condition settings of given tasks.
  2) They removed paths that were apparently inappropriate from the paths of conceptual chains included in the generated maps.
  3) They selected paths according to their interests and entered a four-level general evaluation with free comments.
• Four-level general evaluation
  – A: Interesting
  – B: Important but ordinary
  – C: Neither good nor poor
  – D: Obviously wrong
Experimental results (1)
Table 2. Experimental results: number of selected paths and path distribution based on the general evaluation (A to D).

Task     Expert                    Number of selected paths    Path distribution over A to D (empty cells omitted)
Task 1   Expert A                  2                           2
Task 1   Expert A (second time)    1                           1
Task 1   Expert B                  7                           4, 1, 2
Task 1   Expert B (second time)    6                           3, 3
Task 1   Expert C                  8                           1, 5, 2
Task 1   Expert D                  3                           1, 1, 1
Task 2   Expert A                  1                           1
Task 2   Expert B                  6                           5, 1
Task 2   Expert C                  7                           2, 4, 1
Task 2   Expert D                  5                           3, 1, 1
Task 3   Expert B                  8                           4, 2, 2
Task 3   Expert C                  4                           2, 2
Task 3   Expert D                  3                           3
Total                              61                          A: 30, B: 22, C: 8, D: 1
Number of maps generated: 13
Number of paths evaluated: 61
  A: Interesting 30 (49%)
  B: Important but ordinary 22 (36%)
  C: Neither good nor poor 8 (13%)
  D: Obviously wrong 1 (2%)
  A + B: 85%
We can conclude that the tool could generate maps or paths sufficiently meaningful for experts.
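The percentages above follow directly from the totals in Table 2. As a quick check, assuming simple rounding to the nearest percent:

```python
# Recompute the shares reported on the slide from the Table 2 totals.
counts = {"A": 30, "B": 22, "C": 8, "D": 1}
total = sum(counts.values())  # number of evaluated paths

# Share of each evaluation level, rounded to the nearest percent.
shares = {level: round(100 * n / total) for level, n in counts.items()}
# Share of paths rated A (interesting) or B (important but ordinary).
meaningful = round(100 * (counts["A"] + counts["B"]) / total)

print(total, shares, meaningful)
```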
Experimental results (2)
• Quantitative comparison of the anticipated maps with the maps generated by the subjects
  – (N) Nodes and links included in the paths of the anticipated maps
  – (M) Nodes and links included in the paths generated and selected by the experts
  – N only: 50, N∩M: 50, M only: 150
• About half (50%) of the paths included in the anticipated maps were included in the maps generated by the experts.
• About 75% of the paths in the generated maps are new paths which were not anticipated from the typical scenarios.
• This is meaningful enough to claim positive support for the developed tool.
• It suggests that the tool has a sufficient possibility of presenting unexpected contents and stimulating conception by the user.
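The two ratios can be reproduced from the three region sizes of the Venn diagram. The reading of the figure used here (50 only in N, 50 in both, 150 only in M) is an assumption, chosen because it is consistent with the stated 50% and 75%.

```python
# Region sizes of the Venn diagram (assumed reading of the figure).
n_only, both, m_only = 50, 50, 150

size_n = n_only + both   # elements in the anticipated maps (N)
size_m = m_only + both   # elements in the maps generated by the experts (M)

coverage = both / size_n   # share of anticipated content that the experts reproduced
novelty = m_only / size_m  # share of generated content that was not anticipated

print(size_n, size_m, coverage, novelty)
```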
Summary: Use ontology to bridge datasets across domains
• Basic technology
  – Terms (classes/instances) defined in an ontology are used as a common vocabulary for searching data.
  – If the ontology has mappings to multiple DBs, the user can search across them.
• Motivation and issue
  – Combinations of multiple datasets could be valuable for Big Data analysis.
  – However, obtaining all combinations across multiple big datasets is not realistic because of their size.
  – Requests by users differ widely according to their interests.
• Ontology engineering for Big Data to solve the issue
  – Ontology exploration contributes to obtaining meaningful combinations (= viewpoints) according to the users' interests.
Medical ontology project in Japan
• Developed ontologies
  – Disease ontology:
      Definitions of diseases as causal chains of abnormal states.
      6,000+ diseases
  – Anatomy ontology:
      Connections between blood vessels, nerves, bones: 10,000+
• It is based on ontological frameworks (an upper-level ontology) which can be applied to other domains
  – Models for causal chains
  – Abnormal state ontology for data integration
  – General framework to define complicated structures
Disease Ontology
• Definition of the disease ontology
• How to connect the disease ontology to a medical database
An example of a causal chain
constituting diabetes
[Figure: causal chains for diabetes]
Legend: nodes are disorders; arrows are causal relationships; each color marks the core causal chain of a disease; the remaining nodes are possible causes and effects.
Disorders shown: Destruction of pancreatic beta cells; Lack of insulin in the blood; Long-term steroid treatment; Deficiency of insulin; Elevated level of glucose in the blood; loss of sight.
Diseases shown: Diabetes; Type I diabetes; Steroid diabetes; Diabetes-related Blindness.
Is-a relations between diseases are determined by the
chain-inclusion relationship between their causal chains.
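The chain-inclusion idea behind the is-a relation between diseases can be sketched as a subset test over causal links. The node names are taken from the diabetes figure, but the exact links assigned to each disease below are assumptions made for illustration, and `is_a` is a hypothetical helper rather than part of Hozo.

```python
# Each disease is modelled as the set of causal links (cause, effect) in its chain.
# The link assignments are assumed for illustration, not read from the ontology.
CHAINS = {
    "Diabetes": {
        ("Deficiency of insulin", "Elevated level of glucose in the blood"),
    },
    "Type I diabetes": {
        ("Destruction of pancreatic beta cells", "Deficiency of insulin"),
        ("Deficiency of insulin", "Elevated level of glucose in the blood"),
    },
    "Steroid diabetes": {
        ("Long-term steroid treatment", "Deficiency of insulin"),
        ("Deficiency of insulin", "Elevated level of glucose in the blood"),
    },
}


def is_a(sub_disease, super_disease):
    """True if the chain of `super_disease` is included in that of `sub_disease`."""
    if sub_disease == super_disease:
        return False
    return CHAINS[super_disease] <= CHAINS[sub_disease]
```

Under these assumed chains, both `Type I diabetes` and `Steroid diabetes` come out as kinds of `Diabetes`, while neither is a kind of the other.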
Is-a hierarchy of Abnormality Ontology
[Figure: is-a hierarchy of abnormal states in three layers]
Layer 1: Generic Abnormal States (object-independent)
Structural abnormality; Size abnormality; Formational abnormality; Conformational abnormality; Material abnormality; Topological abnormality; Small in size; Small in line; Small in area; Small in volume; Large in size; …
Layer 2: Object-dependent Abnormal States (tube-dependent, blood vessel dependent, …)
Narrowing tube; Narrowing of valve; Vascular stenosis; Arterial stenosis; Coronary stenosis; Gastrointestinal tract stenosis; Intestinal stenosis; Esophageal stenosis; …
Layer 3: Specific context-dependent (disease-dependent) Abnormal States
Coronary stenosis in Angina pectoris; Coronary stenosis in Arteriosclerosis; Intestinal stenosis in Ileus; Esophageal stenosis in Esophagitis; …
Medical Department                No. of Abnormal States   No. of Diseases
Allergy and Rheumatoid                             1,195                87
Cardiovascular Medicine                            3,052               546
Diabetes and Metabolic Diseases                    1,989               445
Orthopedic Surgery                                 1,883               198
Nephrology and Endocrinology                       1,706               198
Neurology                                          2,960               396
Digestive Medicine                                 1,125               233
Respiratory Medicine                               1,739               788
Ophthalmology                                      1,306               561
Hematology and Oncology                              354               415
Dermatology                                          908             1,086
Pediatrics                                         2,334               879
Otorhinolaryngology                                1,118               470
Total                                             21,669             6,302
Tools shown: graphical tool for disease chains; Hozo ontology editor
Clinicians from 13 medical
departments describe
causal chains of diseases:
• 6,302 diseases
• 21,669 abnormal states
Each clinician defines diseases in terms of
causal chains in his/her own department
[Figure: causal chain of abnormal states (nodes) and causal relationships (arrows) for Myocardial Infarction (disease), arranged across Layer 3, Layer 2 and Layer 1]
•Using the three-layer model of the abnormality ontology
•Combining causal chains that include the same or related
abnormal states by consulting the is-a hierarchy
⇒Generic causal chains can be generated.
From 13 medical departments
All of the roughly 21,000 abnormal states
can be visualized with
possible causal relationships
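One way to picture how generic causal chains arise is to lift each context-dependent state to its parent in the is-a hierarchy and then merge the links. The sketch below assumes a tiny hand-made hierarchy; `X` and `Y` are placeholder states, and the function names are hypothetical.

```python
# Assumed is-a links from Layer 3 (context-dependent) to Layer 2 states.
IS_A = {
    "Coronary stenosis in Angina pectoris": "Coronary stenosis",
    "Coronary stenosis in Arteriosclerosis": "Coronary stenosis",
}

# Causal links described separately in two departments.
# "X" and "Y" are placeholders, not real abnormal states.
DEPARTMENT_CHAINS = [
    [("X", "Coronary stenosis in Arteriosclerosis")],
    [("Coronary stenosis in Angina pectoris", "Y")],
]


def generalize(state):
    """Replace a context-dependent state by its parent, if it has one."""
    return IS_A.get(state, state)


def generic_links(chains):
    """Merge all chains after lifting every state one level up the hierarchy."""
    merged = set()
    for chain in chains:
        for cause, effect in chain:
            merged.add((generalize(cause), generalize(effect)))
    return merged
```

After generalization, the two department-specific chains share the node `Coronary stenosis`, so they join into one chain from `X` to `Y`.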
Knowledge provided by
the Disease Ontology
 Definition of diseases
 It can answer questions such as the following:
 Which abnormal states could be a cause of which
diseases?
 What conditions may occur in a patient with the
disease?
 That is, it can provide base knowledge for
analyzing big data related to diseases.
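The two questions above can be read as simple graph queries over causal chains. The following is a rough sketch with toy data: the node names come from the diabetes example, but the links and chain memberships are assumptions, and both functions are hypothetical helpers.

```python
from collections import deque

# Toy causal graph: cause -> effects (links assumed for illustration).
CAUSAL_LINKS = {
    "Destruction of pancreatic beta cells": ["Deficiency of insulin"],
    "Long-term steroid treatment": ["Deficiency of insulin"],
    "Deficiency of insulin": ["Elevated level of glucose in the blood"],
    "Elevated level of glucose in the blood": ["loss of sight"],
}

# Abnormal states assumed to appear in each disease's causal chain.
DISEASE_CHAINS = {
    "Type I diabetes": {
        "Destruction of pancreatic beta cells",
        "Deficiency of insulin",
        "Elevated level of glucose in the blood",
    },
    "Steroid diabetes": {
        "Long-term steroid treatment",
        "Deficiency of insulin",
        "Elevated level of glucose in the blood",
    },
}


def diseases_involving(state):
    """Which diseases have `state` in their causal chain?"""
    return sorted(d for d, states in DISEASE_CHAINS.items() if state in states)


def possible_conditions(disease):
    """States in the disease chain plus every state reachable downstream."""
    seen = set(DISEASE_CHAINS[disease])
    queue = deque(seen)
    while queue:
        for effect in CAUSAL_LINKS.get(queue.popleft(), []):
            if effect not in seen:
                seen.add(effect)
                queue.append(effect)
    return sorted(seen)
```

With this toy data, `possible_conditions("Type I diabetes")` also returns `loss of sight`, a state outside the disease's own chain that is reachable through the causal links.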
DEMO:
 Visualization of the abnormal state ontology
with possible causal relationships
 Java client application developed with the Hozo API.
 Disease Chain LOD
 Linked Open Data converted from the disease ontology.
 SPARQL endpoint (web API for queries) and a visualization
tool for disease chains built with HTML5.
 http://lodc.med-ontology.jp/
SPARQL Endpoint
(a) The user can make simple
SPARQL queries by selecting
a property and an object from
lists.
(b) When the user selects a resource
shown as a query result, triples
connected to the resource are visualized.
(c) The user can also browse
connected triples by clicking
rectangles that represent the objects.
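Step (a) amounts to filling a property and an object into a fixed query pattern. The sketch below shows that pattern only; it does not reproduce the Disease Chain LOD vocabulary or its endpoint address. The IRIs and the endpoint URL are placeholders, and the `query` request parameter follows the SPARQL 1.1 Protocol.

```python
from urllib.parse import urlencode

# Placeholder endpoint; the real service address is not assumed here.
ENDPOINT = "http://example.org/sparql"


def build_query(property_iri, object_iri, limit=20):
    """Build a SPARQL query for subjects that have the given property and object."""
    return (
        "SELECT DISTINCT ?s WHERE {\n"
        f"  ?s <{property_iri}> <{object_iri}> .\n"
        "}\n"
        f"LIMIT {limit}"
    )


def build_request_url(query, endpoint=ENDPOINT):
    """Encode the query as an HTTP GET request URL."""
    return endpoint + "?" + urlencode({"query": query})


# Hypothetical property and object chosen from the two lists.
example_query = build_query("http://example.org/hasCause", "http://example.org/StateA")
example_url = build_request_url(example_query)
```

Selecting a result resource, as in step (b), would correspond to issuing a second query with that resource in the subject position.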
[Figure labels: Abnormal state is-a hierarchy; Clinical DB; knowledge; data; attribute⇔property interoperability; Anomaly representation; Abnormal states; Layers; Generic chains; Disease chains]
Summary(2): Disease Ontology
 Disease Ontology
 Provides domain knowledge described by medical
experts.
 Medical DB (Big Data)
 Provides evidential data from medical information systems
such as electronic medical records.
This could be a good example of combining
Ontology and Big Data.
[Diagram labels: Existing Knowledge; Evidence / New Knowledge]
Concluding Remarks
 Ontology Engineering for Big Data
 Combining the two works well!
 Basic technology: how to combine ontology with big data
 Map the ontology to databases
 Add metadata to data using the vocabulary defined in the ontology
 Convert databases (e.g. RDB) to ontology-based (RDF) databases
 How to use combinations of Ontology and Big Data:
two possible approaches
 Use ontology to bridge datasets across domains
 Ontology exploration method to obtain meaningful combinations (=
viewpoints)
 Use ontology to combine deep domain knowledge and raw data
 Future plan
 Generalize our approaches and feed them back as new functions of
Hozo
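Of the basic technologies listed above, converting a relational table to RDF is the easiest to show in a few lines. This is a simplified sketch, not the conversion used in the project: the namespace, table, and column names are hypothetical, and every non-key value is emitted as a plain string literal in N-Triples syntax.

```python
# Hypothetical base namespace for generated IRIs.
BASE = "http://example.org/"


def escape_literal(value):
    """Escape backslashes and double quotes for an N-Triples string literal."""
    return str(value).replace("\\", "\\\\").replace('"', '\\"')


def row_to_ntriples(table, key_column, row):
    """Turn one relational row into N-Triples lines, one triple per non-key column."""
    subject = f"<{BASE}{table}/{row[key_column]}>"
    lines = []
    for column, value in row.items():
        if column == key_column:
            continue
        predicate = f"<{BASE}{table}#{column}>"
        lines.append(f'{subject} {predicate} "{escape_literal(value)}" .')
    return lines


# Example row from an imagined "patient" table.
example = row_to_ntriples("patient", "id", {"id": 7, "diagnosis": "diabetes"})
```

In an ontology-based setting, the generated predicates would be replaced by properties defined in the ontology, which is where the mapping step in the list above comes in.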
Acknowledgements
 A part of this work was supported by JSPS KAKENHI
Grant Numbers 24120002 and 22240011.
 A part of the research on medical ontology is supported
by the Ministry of Health, Labour and Welfare, Japan,
through its “Research and development of medical
knowledge base databases for medical information
systems” and by the Japan Society for the Promotion
of Science (JSPS) through its “Funding Program for
World-Leading Innovative R&D on Science and
Technology (FIRST Program)”.
 I’m also grateful to all collaborators in each study.
Thank you for your attention!
Hozo Support Site:
http://www.hozo.jp/
Contact:
kozaki@ei.sanken.osaka-u.ac.jp

Dev Dives: Streamline document processing with UiPath Studio Web
 
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
 
Hot Sexy call girls in Panjabi Bagh 🔝 9953056974 🔝 Delhi escort Service
Hot Sexy call girls in Panjabi Bagh 🔝 9953056974 🔝 Delhi escort ServiceHot Sexy call girls in Panjabi Bagh 🔝 9953056974 🔝 Delhi escort Service
Hot Sexy call girls in Panjabi Bagh 🔝 9953056974 🔝 Delhi escort Service
 
Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365
 

Ontology Engineering for Big Data

  • 1. Ontology Engineering for Big Data Kouji Kozaki The Institute of Scientific and Industrial Research (I.S.I.R), Osaka University, Japan 2013/09/03 1 Ontology and Semantic Web for Big Data (ONSD2013) Workshop in the 2013 International Computer Science and Engineering Conference (ICSEC2013), Bangkok, Thailand, 5th Sep. 2013 ONSD2013@ICEC2013
  • 2. Self introduction: Kouji KOZAKI. Brief biography: 2002, received Ph.D. from the Graduate School of Engineering, Osaka University; 2002-, Assistant Professor, and 2008-, Associate Professor at ISIR, Osaka University. Specialty: Ontological Engineering. Main research topics: fundamental theories of ontological engineering.
  • 3. Ontological topics. Some examples of topics I work on. Definition of disease: What is “disease”? What is a “causal chain”? Is it an object or a process? Role theory: What is the ontological difference among the following concepts: Person, Teacher, Walker, Murderer, Mother, ...? (Natural type vs. role, i.e. a dependent concept.)
  • 4. Self introduction: Kouji KOZAKI. Brief biography: 2002, received Ph.D. from the Graduate School of Engineering, Osaka University; 2002-, Assistant Professor, and 2008-, Associate Professor at ISIR, Osaka University. Specialty: Ontological Engineering. Main research topics: fundamental theories of ontological engineering; an ontology development tool based on the ontological theories; ontology development in several domains and ontology-based applications. Hozo (法造), an environment for building and using ontologies (1996-): software to support ontology (= 法) building (= 造) and use. It is available at http://www.hozo.jp as free software. Registered users: 3,500 (June 2012). A Java API for application development is provided. Supported formats: original format, RDF(S), OWL. Linked Data publishing support is coming soon.
  • 5. My history on Ontology Building  2002-2007 Nano technology ontology  Supported by NEDO(New Energy and Industrial Technology Development Organization)  2006- Clinical Medical ontology  Supported by Ministry of Health, Labour and Welfare, Japan  Cooperated with: Graduate School of Medicine, The University of Tokyo.  2007-2009 Sustainable Science ontology  Cooperated with: Research Institute for Sustainability Science, Osaka Univ.  2007-2010 IBMD(Integrated Bio Medical Database)  Supported by MEXT through "Integrated Database Project".  Cooperated with: Tokyo Medical and Dental University, Graduate School of Medicine, Osaka U.  2008-2012 Protein Experiment Protocol ontology  Cooperated with: Institute for Protein Research, Osaka Univ.  2008-2010 Bio Fuel ontology  Supported by the Ministry of Environment, Japan.  2009-2012 Disaster Risk ontology  Cooperated with: NIED (National Research Institute for Earth Science and Disaster Prevention)  2012- Bio mimetic ontology  Supported by JSPS KAKENHI Grant-in-Aid for Scientific Research on Innovative Areas  2012- Ontology of User Action on Web  Cooperated with: Consumer first Corp.  2013- Information Literacy ontology  Supported by JSPS KAKENHI 2013/09/03 5ONSD2013@ICEC2013
  • 6. Agenda. (1) Motivation: Ontology vs. Big Data; how can we use ontology for big data? (2) Case studies towards Ontology Engineering for Big Data: ontology exploration according to the users' viewpoints; a disease ontology developed in the Japanese Medical Ontology Project. (3) Concluding remarks.
  • 7. Ontology vs. Big Data. Question: Is ontology useful for Big Data? My answer: (I believe) yes. Combining ontology and Big Data could provide new solutions to many problems. Ontology: not so big (though some are big); built by hand; used on the basis of semantics, by reasoning. Big Data: very big; collected automatically; used without semantics, by machine learning or data mining.
  • 8. How to combine Ontology and Big Data. Basic technologies: (a) Mapping an ontology to a database: mapping classes (concepts) defined in the ontology to the database schema, or mapping classes/instances defined in the ontology to data in the DB. (b) Adding metadata to data using vocabulary defined in the ontology, e.g. annotation of documents such as web pages and papers. (c) Converting a database (e.g. an RDB) to an ontology-based (RDF) database, e.g. linked data such as DBpedia and some bioinformatics DBs. You can choose among these technologies according to your purpose.
  • 9. How to combine Ontology and Big Data (the list of basic technologies from slide 8, repeated). Case study: a method for mapping the Abnormality Ontology (in the medical domain) to a medical database.
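
As a rough illustration of the third option on slide 8 (converting relational data to an RDF-style representation), the sketch below turns one table row into subject-predicate-object triples. The base URI, the table name and the column names are all hypothetical and do not come from the talk; a real conversion would normally rely on an RDF library and a proper mapping language.

```python
# Minimal sketch: relational rows -> RDF-style triples.
# BASE, the "disease" table and its columns are illustrative assumptions.
BASE = "http://example.org/"


def row_to_triples(table, key, row):
    """Turn one relational row into (subject, predicate, object) triples."""
    subject = f"{BASE}{table}/{row[key]}"
    # The table name is mapped to an ontology class of the same name.
    triples = [(subject, "rdf:type", f"{BASE}ontology/{table.capitalize()}")]
    for column, value in row.items():
        # Skip the key column and NULL values; every other column becomes a property.
        if column != key and value is not None:
            triples.append((subject, f"{BASE}ontology/{column}", value))
    return triples


rows = [{"id": 1, "name": "hypertension", "icd": None}]
triples = [t for r in rows for t in row_to_triples("disease", "id", r)]
for t in triples:
    print(t)
```

The class and property URIs play the role of the "vocabulary defined in ontology": once several databases are converted with the same vocabulary, they can be queried together.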
  • 10. Classification of Abnormality Representations 1. Various types of abnormality representations are used in the medical domain, e.g. "blood pressure 200 mmHg", "blood pressure is high", "hypertension"; "blood glucose level 150 mg/dL", "blood glucose level is high", "hyperglycemia".
  • 11. Classification of Abnormality Representations 2 (based on the quality and quantity ontologies in the upper ontology “YAMATO”). Quantitative representation: blood pressure 200 mmHg; blood glucose level 150 mg/dL. Qualitative representation: blood pressure is high; blood glucose level is high. Property representation: hypertension; hyperglycemia. Uses: diagnosis (identify a concrete value for each patient in clinical tests); definition of disease. Mapping: Abnormality Ontology to medical database.
  • 12. Is-a hierarchy of Abnormality Ontology. Layer 1, generic abnormal states (object-independent): Structural abnormality, Material abnormality, Size abnormality, Formational abnormality, Conformational abnormality, Topological abnormality, Small in size (Small in line, Small in area, Small in volume), Large in size, ... Layer 2, object-dependent abnormal states: Narrowing tube (tube-dependent), Narrowing of valve, Vascular stenosis (blood-vessel-dependent), Arterial stenosis, Coronary stenosis, Gastrointestinal tract stenosis, Intestinal stenosis, Esophageal stenosis, ... Layer 3, specific context-dependent (disease-dependent) abnormal states: Coronary stenosis in Angina pectoris, Coronary stenosis in Arteriosclerosis, Intestinal stenosis in Ileus, Esophageal stenosis in Esophagitis.
  • 13. How can we deal with clinical test data? In hospitals, a huge volume of diagnostic/clinical test data has been accumulated. Most are quantitative data, e.g. blood pressure 180 mmHg, blood cross-sectional area 40 mmx2. A quantitative value (Vqt), e.g. 180 mmHg, is converted into a qualitative value (Vql), e.g. high, by comparison with a threshold value (e.g. 140 mmHg): "blood pressure 180 mmHg" becomes "blood pressure high".
  • 14. Basic policy for definition of abnormal states. A property is decomposed into a tuple <Attribute (A), Attribute Value (V)> in a qualitative form; e.g. the property (P) hypertension corresponds to <blood pressure (A), high (V)>. A qualitative representation can therefore be converted into a property representation.
  • 15. Clinical test data (quantity): blood pressure 180 mmHg; cross-section area xxcmx2. Quality: blood pressure high; cross-section area small. Abnormality knowledge (property): Hypertension; Narrowing. Quantitative data can be converted to a qualitative representation and then to a property representation. Our model enables “interoperability” from clinical test data to conceptual knowledge about abnormal states.
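
The conversion chain on slides 13 to 15 (quantitative value, then qualitative value, then property) can be sketched as follows. The 140 mmHg threshold and the hypertension example come from the slides; the rule table, the function names and the use of a strict "greater than" comparison are assumptions made only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    """One hypothetical mapping rule: attribute + threshold -> qualitative value -> property."""
    threshold: float
    unit: str
    value_if_above: str
    property_name: str


# Threshold taken from the slide's example (140 mmHg); everything else is illustrative.
RULES = {"blood pressure": Rule(140.0, "mmHg", "high", "hypertension")}


def to_qualitative(attribute, quantity, unit):
    """Quantitative value (Vqt) -> <Attribute, Value> tuple (Vql), or None if not abnormal."""
    rule = RULES[attribute]
    if unit != rule.unit:
        raise ValueError(f"expected {rule.unit}, got {unit}")
    # Assumption: a value strictly above the threshold counts as abnormal.
    return (attribute, rule.value_if_above) if quantity > rule.threshold else None


def to_property(pair):
    """<Attribute, Value> tuple -> property name, e.g. <blood pressure, high> -> hypertension."""
    if pair is None:
        return None
    attribute, value = pair
    rule = RULES[attribute]
    return rule.property_name if value == rule.value_if_above else None


pair = to_qualitative("blood pressure", 180, "mmHg")
print(pair, "->", to_property(pair))
```

Keeping the two steps separate mirrors the slides: the threshold comparison is data-dependent, while the step from <attribute, value> to a property is purely conceptual.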
  • 16. How to combine Ontology and Big Data (the list of basic technologies from slide 8, repeated). Case study: annotation of users' web browsing history based on the Web User Action Ontology.
  • 17. Annotation of users' web browsing history based on ontology. Input: web browsing history (access logs) of users, i.e. the list of all URLs each user accessed, for 130M users × 2 years. The logs are annotated using the Web User Action Ontology and then used for analysis of consumption behavior. This is collaborative work with Consumer first, Inc. [Figure: bar chart titled "Trend in usage types by conference"; axis labels: the amount of papers surveyed in each conference, the amounts of types of usage.]
  • 18. Basic Idea. Format of the access logs (web browsing history) provided by Consumer first, Inc.: user id, access date and time, URL, ... Problem: a URL is a meaningless string for humans, although its content can sometimes be guessed when the site is famous; the access logs are also very diverse. In order to analyze them, we need consistent meaning. Annotations on the access log: we tried to add metadata that presents a human-understandable meaning for each URL. We also developed a prototype for automatic annotation; its recall and relevance rate are roughly 0.7 to 0.9. We think this result is not bad for statistical analysis.
  • 19. Ontology Engineering for Big Data. Basic technology (= how to combine ontology and Big Data): mapping an ontology to a database; adding metadata to data using vocabulary defined in the ontology; converting a database (e.g. an RDB) to an ontology-based (RDF) database. How to use combinations of ontology and Big Data: ontology can provide semantics to add to raw data; generalized concepts in an ontology can connect data at various concept levels across domains; we can use ontology as given (and authorized) knowledge to analyze big data.
  • 20. Ontology Engineering for Big Data. Features of ontology at the class level: it reflects an understanding of the target world; well-organized ontologies have generalized, rich knowledge based on consistent semantics; ontologies are systematized knowledge of domains. Combination of ontology and big data: ontology can provide semantics to add to raw data; generalized concepts in an ontology can connect data at various concept levels across domains; we can use ontology as given (and authorized) knowledge to analyze big data.
  • 21. Two possible ways to use ontology for big data: (1) use ontology to bridge datasets across domains; (2) use ontology to combine deep domain knowledge and raw data. [Diagram labels: Ontology, Metadata, LOD (Linked Open Data), Big Data.]
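
To make the annotation idea on slide 18 concrete, here is a minimal sketch of rule-based URL annotation together with a precision/recall check against hand-labelled data. The host names, path prefixes and concept names are invented for illustration, the actual prototype's method is not described in the slides, and reading "relevance rate" as precision is an assumption.

```python
from urllib.parse import urlparse

# Hypothetical rules: (host, path prefix, ontology concept). Not from the talk.
RULES = [
    ("shop.example.com", "/cart", "AddToCart"),
    ("shop.example.com", "/item", "ViewProduct"),
    ("search.example.com", "", "Search"),
]


def annotate(url):
    """Return the ontology concept for a URL, or None when no rule matches."""
    parsed = urlparse(url)
    for host, prefix, concept in RULES:
        if parsed.netloc == host and parsed.path.startswith(prefix):
            return concept
    return None


def precision_recall(predicted, gold):
    """Compare predicted labels with gold labels (None means 'no annotation')."""
    correct = sum(1 for p, g in zip(predicted, gold) if p is not None and p == g)
    n_pred = sum(1 for p in predicted if p is not None)
    n_gold = sum(1 for g in gold if g is not None)
    precision = correct / n_pred if n_pred else 0.0
    recall = correct / n_gold if n_gold else 0.0
    return precision, recall


urls = [
    "http://shop.example.com/cart/42",
    "http://news.example.com/today",
    "http://search.example.com/q",
]
predicted = [annotate(u) for u in urls]
gold = ["AddToCart", "ViewProduct", "ViewProduct"]
print(predicted, precision_recall(predicted, gold))
```

Annotating each log entry with a concept rather than a raw URL is what gives the logs the "consistent meaning" needed for statistical analysis.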
  • 22. Case studies  Use ontology to bridge datasets across domains  Understanding an Ontology through Divergent Exploration  Presented at ESWC2011  Use ontology to combine deep domain knowledge and raw data  Japanese Medical Ontology project  Disease ontology and Ontology of Abnormal State  presented at ICBO (International Conference on Biomedical Ontology) 2011, 2012 and 2013 2013/09/03 22ONSD2013@ICEC2013
  • 23. Use ontology to bridge datasets across domains. Basic technology: terms (classes/instances) defined in the ontology are used as a common vocabulary for searching data; if the ontology has mappings to multiple DBs, the user can search across them. Motivation and issue: combinations of multiple datasets could be valuable for Big Data analysis, e.g. climate and agriculture, healthcare and life science. However, obtaining all combinations across multiple Big Data sets is not realistic because of their size. Requests by users also differ widely according to their interests. It is important to consider an efficient method for obtaining meaningful combinations. [Diagram labels: Ontology; Common Vocabulary; Documents / Raw Data; Search; Search across multiple DBs.]
  • 24. A method to obtain meaningful combinations using ontology exploration. An ontology presents an explicit, essential understanding of the target world. It provides base knowledge to be shared among the users. They explore the ontology according to their viewpoints and generate conceptual maps as the result. These maps represent understanding from their own viewpoints. They can use the maps as viewpoints (combinations) to get data from multiple DBs. [Diagram labels: Problem Setting, Problem Solution, Innovation; Layer 0 to Layer 4; Contents Management using the Metadata; Map Generation Depending on Viewpoints; Comparison and Convergence of Multiple Maps; Context-Based Convergence; Divergent Exploration; Ontology-based Information Retrieval.]
  • 25. (Divergent) Ontology exploration tool Exploration of an ontology “Hozo” – Ontology Editor Multi-perspective conceptual chains represent the explorer’s understanding of ontology from the specific viewpoint. Conceptual maps Visualizations as conceptual maps from different view points 1) Exploration of multi-perspective conceptual chains 2) Visualizations of conceptual chains 2013/09/03 25ONSD2013@ICEC2013
  • 26. A node represents a concept (= rdfs:Class). A slot represents a relationship (= rdf:Property). [Other diagram labels: is-a (sub-class-of) relationship; referring to another concept.]
  • 28. 2013/09/03 28 Aspect dialog constriction tracing classes Option settings for exploration property names Conceptual map visualizer Kinds of aspects Selected relationships are traced and shown as links in conceptual map ONSD2013@ICEC2013
  • 29. 29 Explore the focused (selected) path. 2013/09/03 ONSD2013@ICEC2013
  • 30. Functions for ontology exploration. Exploration using the aspect dialog: divergent exploration from one concept, using the aspect dialog at each step. Search path: exploration of paths from a starting point to ending points; the tool also allows post-hoc editing so that users can extract only the interesting portions of the map. Change view: the tool can highlight specified paths of conceptual chains on the generated map according to given viewpoints. Comparison of maps: the system can compare generated maps and show the conceptual chains common to both maps. [Diagram labels: Manual exploration; Machine exploration.]
  • 31. Search Path: select a starting point and ending points (1), (2), (3); the tool finds all possible paths from the starting point to the ending points.
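
The "comparison of maps" function can be pictured with a very small sketch in which a conceptual map is a set of links and the common conceptual chains are the links present in both maps. This representation and the example concepts are assumptions for illustration; they are not Hozo's internal data model.

```python
def common_links(map_a, map_b):
    """Return the (source, relation, target) links that appear in both conceptual maps."""
    return set(map_a) & set(map_b)


# Two hypothetical maps generated from different viewpoints.
map_expert_1 = {
    ("biofuel production", "causes", "deforestation"),
    ("deforestation", "causes", "water pollution"),
}
map_expert_2 = {
    ("biofuel production", "causes", "deforestation"),
    ("biofuel production", "causes", "higher food prices"),
}

shared = common_links(map_expert_1, map_expert_2)
print(shared)
```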
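
The "search path" function, finding all paths from a starting point to a set of ending points, can be sketched as a depth-first enumeration of simple paths. The graph below, the concept names and the depth limit are illustrative assumptions; relation labels are omitted to keep the example short, and this is not the tool's actual algorithm.

```python
def find_paths(graph, start, goals, max_depth=6):
    """Enumerate simple paths (no repeated node) from start to any node in goals."""
    goals = set(goals)
    paths = []

    def walk(node, path):
        # Record the path whenever an ending point is reached.
        if node in goals and len(path) > 1:
            paths.append(list(path))
        # Depth limit keeps exploration of large ontologies bounded.
        if len(path) > max_depth:
            return
        for neighbor in graph.get(node, []):
            if neighbor not in path:
                path.append(neighbor)
                walk(neighbor, path)
                path.pop()

    walk(start, [start])
    return paths


# Hypothetical fragment of an ontology: concept -> related concepts.
graph = {
    "Problem": ["Method A", "Method B"],
    "Method A": ["Method B"],
    "Method B": [],
}
paths = find_paths(graph, "Problem", {"Method B"})
for p in paths:
    print(" -> ".join(p))
```

Each returned path corresponds to one conceptual chain that can be drawn on the conceptual map; several ending points simply enlarge the `goals` set.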
  • 32. 2013/09/03 32 Search Path Selected ending points ONSD2013@ICEC2013
  • 33. 2013/09/03 33 What does the result mean? Selected ending points ONSD2013@ICEC2013 Problem Kinds of method to solve the problem Possible combination of them
  • 35. Usage and evaluation of ontology exploration tool  Step 1: Usage for knowledge structuring in sustainability science  Step 2: Verification of exploring the abilities of the ontology exploration tool  Step 3: Experiments for evaluating the ontology exploration tool 2013/09/03 35ONSD2013@ICEC2013
  • 36. Sustainability Science  Sustainability Science probes interactions between global, social, and human systems, the complex mechanisms that lead to degradation of these systems, and concomitant risks to human well-being.  The journal provides a platform for building sustainability science as a new academic discipline.  These include endeavors to simultaneously understand phenomena and solve problems, uncertainty and application of the precautionary principle, the co-evolution of knowledge and recognition of problems, and trade-offs between global and local problem solving. Volume 1 / 2006 - Volume 8 / 2013 Editor-in-Chief: Kazuhiko Takeuchi Managing Editor: Osamu Saito ISSN: 1862-4065 (print version) ISSN: 1862-4057 (electronic version) 36
  • 37. Knowledge Structuring in Sustainability Science. Sustainability Science (SS): we aimed at establishing a new interdisciplinary scheme that serves as a basis for constructing a vision that will lead global society to a sustainable one. This requires an integrated understanding of the entire field instead of domain-wise knowledge structuring. Sustainability science ontology: developed in collaboration with domain experts at the Osaka University Research Institute for Sustainability Science (RISS). Number of concepts: 649; number of slots: 1,075. Usage of the ontology exploration tool: it was confirmed that the exploration was fun for them and that the tool had a certain utility for achieving knowledge structuring in sustainability science [Kumazawa 2009]. http://en.ir3s.u-tokyo.ac.jp/about_sus
  • 38. Biofuel Use Strategies for Sustainable Development (BforSD, FY2008-FY2010). One of the sub-themes: development of an ontology-based mapping system which creates comprehensive views of problems and policy measures on biofuel. (1) Structuring biofuel problems: develop the biofuel ontology, which explicitly conceptualizes biofuel problems, through literature review and interviews. (2) Develop an ontology exploration tool which interactively generates conceptual maps with paths between concepts in the biofuel ontology. (3) In collaboration with other sub-themes, develop an application method of this map tool for policy-making support, to find, frame and prioritize relevant problems and policy measures. (Image source: US DOE)
  • 39. Usage and evaluation of ontology exploration tool  Step 1: Usage for knowledge structuring in sustainability science  Step 2: Verification of exploring the abilities of the ontology exploration tool  Step 3: Experiments for evaluating the ontology exploration tool 2013/09/03 39ONSD2013@ICEC2013
  • 40. Verification of Ontology Exploration Tool. Verification methods: 1) Enrichment of SS ontology: we enriched the SS ontology on the basis of 29 typical scenarios (cases) structured by domain experts in biofuel through literature review and interviews. [Diagram: 29 scenarios (cases), 27 conceptual maps.]
  • 41. Positive and negative effects of biofuel. Legend: (+) positive effects, (−) negative effects, (+/−) both positive and negative effects. Sources: [1] UN-Energy (2007), [2] FAO (2008), [3] CBD (2008), [4] Martinelli et al. (2008).
      1) Energy services for the poor
         (+/−) Competition of biomass energy systems with the present use of biomass resources (such as agricultural residues) in applications such as animal feed and bedding, fertilizer, and construction materials [1]
         (−) In many developing countries, small-scale biomass energy projects face challenges obtaining finance from traditional financing institutions [1]
         (−) Liquid biofuels are likely to replace only a small share of global energy supplies and cannot alone eliminate our dependence on fossil fuels [2]
      2) Agro-industrial development and job creation
         (+) Biofuel is powering new small- and large-scale agro-industrial development and spawning new industries in industrialized and developing countries [1]
         (+/−) In the short-to-medium term, bioenergy use will depend heavily on feedstock costs and reliability of supply, cost and availability of competing energy sources, and government policy decisions [1]
         (+) In the longer term, the economics of biofuel will probably improve as agricultural productivity and agro-industrial efficiency improve, more supportive agricultural and energy policies are adopted, carbon markets mature and expand, and new methodologies for carbon sequestration accounting are developed [1]
         (+) In the longer term, expanded demand and increased prices for agricultural commodities may represent opportunities for agricultural and rural development [2]
         (+) Biofuel industries create jobs, including highly skilled science, engineering, and business-related employment; medium-level technical staff; low-skill industrial plant jobs; and unskilled agricultural labor [1]
         (+/−) Small-scale and labor-intensive production often leads to trade-offs between production efficiency and economic competitiveness [1]
      3) Health and gender
         (−) Market opportunities cannot overcome existing social and institutional barriers to equitable growth, with exclusion factors such as gender, ethnicity, and political powerlessness, and may even worsen them [2]
         (−) Forest burning for development of feedstock plantations and sugarcane burning to facilitate manual harvesting result in air pollution, higher surface water runoff, soil erosion, and unintended forest fires [3,4]
         (−) Exploitation of cheap labor (plantation and migrant workers) [4]
         (−) Increased use of pesticides could create health hazards for laborers and communities living near areas of feedstock production [1,3]
      4) Agricultural structure
         (−) The demand for land to grow biofuel crops could put pressure on competing land usage for food crops, resulting in an increase in food prices [1,2]
         (+/−) Significant economies of scale can be gained from processing and distributing biofuels on a large scale. The transition to liquid biofuels can be harmful to farmers who do not own their own land, and to the rural and urban poor who are net buyers of food [1]
         (−) While global market forces could lead to new and stable income streams, they could also increase marginalization of poor and indigenous people and affect traditional ways of living if they end up driving small farmers without clear titles from their land and destroying their livelihood [1]
  • 42. Positive and negative effects of biofuel (continued). Legend: (+) positive effects, (−) negative effects, (+/−) both positive and negative effects. Sources: [1] UN-Energy (2007), [2] FAO (2008), [3] CBD (2008), [4] Martinelli et al. (2008).
      5) Food security
         (−) Demand for agricultural feedstock for liquid biofuels will be a significant factor for agricultural markets and world agriculture over the next decade and perhaps beyond [2]
         (−) Rapidly growing demand for biofuel feedstock has contributed to higher food prices, which poses an immediate threat to the food security of poor net food buyers in both urban and rural areas [2]
         (+/−) The effect of biofuels on food security is context-specific, depending on the particular technology and country characteristics involved [1]
      6) Government budget
         (−) Because ethanol is used largely as a substitute for gasoline, providing a large tax reduction for blending ethanol and gasoline reduces government revenue from this tax, mainly targeting the non-poor [1]
         (−) Production of biofuels in many countries, except sugarcane-based ethanol production in Brazil, is not currently economically viable without subsidies, given existing agricultural production and biofuel-processing technologies and recent relative prices of commodity feedstock and crude oil [2]
         (−) Policy interventions, especially in the form of subsidies and mandated blending of biofuels with fossil fuels, are driving the rush to liquid biofuels, which leads to high economic, social, and environmental costs in both developed and developing countries [2]
      7) Trade, foreign exchange balance, and energy security
         (+) Diversifying global fuel supplies could have beneficial effects on the global oil market and many developing countries because fossil fuel dependence has become a major risk for many developing economies [1]
         (+/−) Rapidly rising demand for ethanol has had an impact on the price of sugar and maize in recent years, bringing substantial rewards to farmers not only in Brazil and the United States but around the world [1,2]
         (−) Linking of agricultural prices to the vicissitudes of the world oil market clearly presents risks; however, it is an essential transition to the development of a biofuel industry that does not rely on major food commodity crops [1]
      8) Biodiversity and natural resource management
         (+/−) Depending on the types of crop grown, what they replaced, and the methods of cultivation and harvesting, biofuels can have negative and positive effects on land use, soil and water quality, and biodiversity [1,3]
         (−) Problems with water availability and use may represent a limitation on agricultural biofuel production [1,3]
         (−) Introduction of criteria, standards, and certification schemes for biofuels may generate indirect negative environmental and biodiversity effects, passively in other countries [3]
         (−) If the production of biofuel feedstock requires increased fertilizer and pesticide use, there could be additional detrimental effects such as increases in GHG emissions and eutrophicating nutrients, and biodiversity loss [3]
         (−) Wild biodiversity is threatened by loss of habitat when the area under crop production is expanded, whereas agricultural biodiversity is vulnerable in the case of large-scale monocropping, which is based on a narrow pool of genetic material, and can also lead to reduced use of traditional varieties [2,3]
         (+) If crops are grown on degraded or abandoned land, such as previously deforested areas or degraded crop- and grasslands, and if soil disturbances are minimized, feedstock production for biofuels can have a positive impact on biodiversity by restoring or conserving habitat and ecosystem function [3]
      9) Climate change
         (+/−) Full lifecycle GHG emissions of biofuel vary widely based on land use changes, choice of feedstock, agricultural practices, refining or conversion processes, and end-use practices [1,2]
         (−) Land use change associated with production of biofuel feedstock can affect GHG emissions; draining wetlands and clearing land with fire are detrimental with regard to GHG emissions and air quality [2,3]
         (−) The greatest potential for reducing GHG emissions comes from replacement of coal rather than petroleum fuels [1]
         (+) Biofuels offer the only realistic near-term renewable option for displacing and supplementing liquid transport fuels [1]
  • 43. Verification of Ontology Exploration Tool. The concepts appearing in these scenarios were extracted and generalized to add into the ontology. Example: air pollution, forest fires, soil deterioration and water pollution are attributed to intentional burning when forest is logged or sugarcane is harvested in farmland development for biofuel crops. In scenario form: burn agriculture = (deforestation, soil deterioration caused by farmland development for biofuel crops) ⇒ harvest sugarcanes (air pollution caused by intentional burn), disruption of ecosystem caused by deforestation (water pollution).
  • 44. Verification of Ontology Exploration Tool. Verification methods: 1) Enrichment of SS ontology: we enriched the SS ontology on the basis of 29 typical scenarios (cases) structured by domain experts in biofuel through literature review and interviews. 2) Verification of scenario-reproducing operations: we verified whether the ontology exploration tool could generate conceptual maps which represent the original scenarios. Result: 93% (27/29) of the scenarios were successfully reproduced as conceptual maps. [Diagram: 29 scenarios (cases), 27 conceptual maps.]
  • 45. Usage and evaluation of ontology exploration tool. Step 1: Usage for knowledge structuring in sustainability science. Step 2: Verification of exploring the abilities of the ontology exploration tool. Step 3: Experiments for evaluating the ontology exploration tool: 1) whether meaningful maps for domain experts were obtained; 2) whether meaningful maps other than anticipated maps were obtained. Anticipated maps are maps representing the contents of the scenarios anticipated by the ontology developers at the time of ontology construction. Note: the subjects do not know which scenarios are anticipated.
  • 46. Experiment for evaluating ontology exploration tool. Experimental method: 1) The four experts generated conceptual maps with the tool in accordance with the condition settings of given tasks. 2) They removed paths that were apparently inappropriate from the paths of conceptual chains included in the generated maps. 3) They selected paths according to their interests and entered a four-level general evaluation with free comments. The subjects: 4 experts in different fields: A, agricultural economics; B, social science (stakeholder analysis); C, risk analysis; D, metropolitan environmental planning. Evaluation levels: A, Interesting; B, Important but ordinary; C, Neither good nor poor; D, Obviously wrong.
  • 47. Experimental results (1). Table 2: Experimental results. For each generated map: the number of selected paths, followed in parentheses by the non-empty counts of the path distribution based on general evaluation (columns A to D, in column order).
      Task 1: Expert A: 2 (2); Expert A (second time): 1 (1); Expert B: 7 (4, 1, 2); Expert B (second time): 6 (3, 3); Expert C: 8 (1, 5, 2); Expert D: 3 (1, 1, 1).
      Task 2: Expert A: 1 (1); Expert B: 6 (5, 1); Expert C: 7 (2, 4, 1); Expert D: 5 (3, 1, 1).
      Task 3: Expert B: 8 (4, 2, 2); Expert C: 4 (2, 2); Expert D: 3 (3).
      Total: 61 selected paths (A: 30, B: 22, C: 8, D: 1).
  • 48. Experimental results (1), summary of Table 2. Number of maps generated: 13. Number of paths evaluated: 61. A: Interesting, 30 (49%); B: Important but ordinary, 22 (36%); C: Neither good nor poor, 8 (13%); D: Obviously wrong, 1 (2%). A and B together account for 85%. We can conclude that the tool could generate maps or paths sufficiently meaningful for experts.
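
The percentages on slide 48 follow directly from the totals in Table 2; a short check, using only the counts stated on the slide, is sketched below.

```python
# Totals per evaluation level, as stated on the slide (61 evaluated paths).
counts = {"A": 30, "B": 22, "C": 8, "D": 1}
total = sum(counts.values())

# Share of each evaluation level, rounded to whole percent.
shares = {level: round(100 * n / total) for level, n in counts.items()}

# "Meaningful" paths: Interesting (A) plus Important but ordinary (B).
meaningful = round(100 * (counts["A"] + counts["B"]) / total)

print(total, shares, meaningful)
```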
  • 49. Experimental results (2)  Quantitative comparison of the anticipated maps with the maps generated by the subjects. (N) Nodes and links included in the paths of the anticipated maps. (M) Nodes and links included in the paths generated and selected by the experts. Only in N: 50; N∩M: 50; only in M: 150. About half (50%) of the paths included in the anticipated maps were also included in the maps generated by the experts. About 75% of the paths in the generated maps are new paths that were not anticipated from the typical scenarios. This is meaningful enough to claim positive support for the developed tool: it suggests that the tool has a good chance of presenting unexpected contents and stimulating the user's thinking.
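The two percentages on this slide follow from the three region sizes of the N/M overlap. A minimal sketch of that arithmetic, assuming the regions are 50 (only in N), 50 (shared) and 150 (only in M); the integer IDs are hypothetical stand-ins for nodes and links:

```python
# Hypothetical element IDs standing in for the nodes and links of each map set.
anticipated = set(range(0, 100))   # N: 50 elements only in N + 50 shared
generated = set(range(50, 250))    # M: 50 shared + 150 elements only in M

shared = anticipated & generated
new_in_generated = generated - anticipated

# Share of the anticipated maps that the experts reproduced.
coverage_of_anticipated = len(shared) / len(anticipated)
# Share of the expert-generated maps that was not anticipated.
novelty_of_generated = len(new_in_generated) / len(generated)

print(f"{coverage_of_anticipated:.0%} of N is covered by M")
print(f"{novelty_of_generated:.0%} of M is new")
```

The same two ratios can be recomputed for any other pair of map sets by replacing the two sets.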
  • 50. Summary: Use ontology to bridge datasets across domains  Basic technology  Terms (classes/instances) defined in the ontology are used as a common vocabulary for searching data.  If the ontology is mapped to multiple DBs, the user can search across them.  Motivation and issue  Combinations of multiple datasets could be valuable for Big Data analysis.  However, obtaining all combinations across multiple Big Data sets is not realistic because of their size.  Users' requests differ widely according to their interests.  Ontology Engineering for Big Data to solve the issue  Ontology exploration helps to obtain meaningful combinations (= viewpoints) according to the users’ interests.
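The "common vocabulary" idea can be sketched as a lookup from one ontology term to the local value each database uses. This is only an illustration: the term, database names, column name and records below are all hypothetical and do not come from the talk.

```python
from typing import Dict, List

# Hypothetical mapping from an ontology term to the value used in each database.
TERM_TO_DB_VALUE: Dict[str, Dict[str, str]] = {
    "Deforestation": {"land_use_db": "forest_loss", "policy_db": "deforestation"},
}

# Hypothetical in-memory stand-ins for two independent databases.
DATABASES: Dict[str, List[Dict[str, str]]] = {
    "land_use_db": [
        {"topic": "forest_loss", "record": "Satellite survey 2010"},
        {"topic": "urban_growth", "record": "City census 2012"},
    ],
    "policy_db": [
        {"topic": "deforestation", "record": "Logging regulation"},
    ],
}


def search_across(term: str) -> List[Dict[str, str]]:
    """Return records from every mapped database that match the ontology term."""
    hits = []
    for db_name, local_value in TERM_TO_DB_VALUE.get(term, {}).items():
        for row in DATABASES[db_name]:
            if row["topic"] == local_value:
                hits.append({"db": db_name, "record": row["record"]})
    return hits


print(search_across("Deforestation"))
```

The user only ever names the ontology term; the per-database vocabulary stays inside the mapping table.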
  • 51. Case studies  Use ontology to bridge datasets across domains  Understanding an Ontology through Divergent Exploration  Presented at ESWC2011  Use ontology to combine deep domain knowledge and raw data  Japanese Medical Ontology project  Disease ontology and Ontology of Abnormal State  Presented at ICBO (International Conference on Biomedical Ontology) 2011, 2012 and 2013
  • 52. Medical ontology project in Japan  Developed ontologies  Disease ontology:  Definitions of diseases as causal chains of abnormal states.  6000+ diseases  Anatomy ontology:  Connections between blood vessels, nerves, and bones: 10,000+  They are based on ontological frameworks (upper-level ontology) which can be applied to other domains  Models for causal chains  Abnormal state ontology for data integration  General framework to define complicated structures
  • 53. Disease Ontology  Definition of the disease ontology  How to connect the disease ontology to medical databases
  • 54. An example of causal chains constituting diabetes. Legend: disorders (nodes); causal relationships; core causal chain of a disease (each color represents a disease). Diseases shown: Diabetes, Type I diabetes, Steroid diabetes, Diabetes-related Blindness. Abnormal states shown: Destruction of pancreatic beta cells, Lack of insulin I in the blood, Long-term steroid treatment, Deficiency of insulin, Elevated level of glucose in the blood, loss of sight, and further possible causes and effects (…). The is-a relation between diseases is defined using the chain-inclusion relationship between causal chains.
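The chain-inclusion idea can be sketched by treating each disease as a set of causal links and testing for strict set inclusion. The node labels are taken from the slide, but which link belongs to which disease is an assumption made for illustration, not the content of the actual ontology.

```python
from typing import Dict, FrozenSet, Tuple

Link = Tuple[str, str]  # (cause, effect)

# Assumed core chain of the generic disease "Diabetes".
DIABETES_CORE: FrozenSet[Link] = frozenset(
    {("Deficiency of insulin", "Elevated level of glucose in the blood")}
)

# Assumed chains: each subtype extends the core chain with extra links.
CHAINS: Dict[str, FrozenSet[Link]] = {
    "Diabetes": DIABETES_CORE,
    "Type I diabetes": DIABETES_CORE | {
        ("Destruction of pancreatic beta cells", "Lack of insulin I in the blood"),
        ("Lack of insulin I in the blood", "Deficiency of insulin"),
    },
    "Steroid diabetes": DIABETES_CORE | {
        ("Long-term steroid treatment", "Deficiency of insulin"),
    },
    "Diabetes-related Blindness": DIABETES_CORE | {
        ("Elevated level of glucose in the blood", "loss of sight"),
    },
}


def is_a(child: str, parent: str) -> bool:
    """True if the parent's chain is strictly included in the child's chain."""
    return CHAINS[parent] < CHAINS[child]


print(is_a("Type I diabetes", "Diabetes"))
print(is_a("Steroid diabetes", "Type I diabetes"))
```

With this representation the is-a hierarchy does not need to be asserted by hand; it falls out of comparing the chains.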
  • 55. Is-a hierarchy of Abnormality Ontology
      Layer 1: Generic abnormal states (object-independent): Structural abnormality, Material abnormality, Size abnormality, Formational abnormality, Conformational abnormality, Topological abnormality, Small in size, Large in size, Small in line, Small in area, Small in volume, …
      Layer 2: Object-dependent abnormal states (tube-dependent, blood vessel dependent, …): Narrowing tube, Narrowing of valve, Vascular stenosis, Arterial stenosis, Coronary stenosis, Gastrointestinal tract stenosis, Intestinal stenosis, Esophageal stenosis, …
      Layer 3: Specific context-dependent abnormal states (disease dependent): Coronary stenosis in Angina pectoris, Coronary stenosis in Arteriosclerosis, Intestinal stenosis in Ileus, Esophageal stenosis in Esophagitis, …
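A minimal sketch of how such a layered is-a hierarchy can be walked to find what two context-dependent states have in common. The state names come from the slide, but the individual parent links below are assumptions, since the transcript does not preserve the edges of the figure.

```python
from typing import Dict, List, Optional

# Assumed is-a links (child -> parent); the real ontology may differ.
PARENT: Dict[str, str] = {
    "Coronary stenosis in Angina pectoris": "Coronary stenosis",
    "Coronary stenosis in Arteriosclerosis": "Coronary stenosis",
    "Coronary stenosis": "Arterial stenosis",
    "Arterial stenosis": "Vascular stenosis",
    "Vascular stenosis": "Narrowing tube",
    "Intestinal stenosis in Ileus": "Intestinal stenosis",
    "Intestinal stenosis": "Gastrointestinal tract stenosis",
    "Gastrointestinal tract stenosis": "Narrowing tube",
    "Narrowing tube": "Small in area",
    "Small in area": "Small in size",
    "Small in size": "Size abnormality",
    "Size abnormality": "Structural abnormality",
}


def generalizations(state: str) -> List[str]:
    """All ancestors of a state, from the nearest to the most generic."""
    result = []
    while state in PARENT:
        state = PARENT[state]
        result.append(state)
    return result


def nearest_common(a: str, b: str) -> Optional[str]:
    """Most specific state that generalizes both a and b, if any."""
    ancestors_of_b = set(generalizations(b))
    for candidate in generalizations(a):
        if candidate in ancestors_of_b:
            return candidate
    return None


print(nearest_common("Coronary stenosis in Angina pectoris",
                     "Intestinal stenosis in Ileus"))
```

Two states described in different departments can thus be recognized as the same kind of abnormality at a more generic layer.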
  • 56. Clinicians from 13 medical departments describe causal chains of diseases (disease chains) with a graphical tool, the Hozo-Ontology Editor: 6,302 diseases and 21,669 abnormal states.
      Medical Department                 No. of Abnormal States   No. of Diseases
      Allergy and Rheumatoid             1,195                    87
      Cardiovascular Medicine            3,052                    546
      Diabetes and Metabolic Diseases    1,989                    445
      Orthopedic Surgery                 1,883                    198
      Nephrology and Endocrinology       1,706                    198
      Neurology                          2,960                    396
      Digestive Medicine                 1,125                    233
      Respiratory Medicine               1,739                    788
      Ophthalmology                      1,306                    561
      Hematology and Oncology            354                      415
      Dermatology                        908                      1,086
      Pediatrics                         2,334                    879
      Otorhinolaryngology                1,118                    470
      Total                              21,669                   6,302
  • 57. Each clinician defines diseases in terms of causal chains at his/her division. Figure labels: Causal Relationship; Abnormal States; Myocardial Infarction (disease).
  • 58. Each clinician defines diseases in terms of causal chains at his/her division. Figure labels: Causal Relationship; Abnormal States; Myocardial Infarction (disease); Layer 3, Layer 2, Layer 1. • Using the three-layer model of the abnormality ontology • Combining causal chains that include the same or related abnormal states by consulting the is-a hierarchy ⇒ Generic causal chains can be generated.
  • 59. Each clinician describes the definition of a disease (the causal chains of the disease) at a particular department. Across the 13 medical divisions, all 21,000 abnormal states can be visualized with their possible causal relationships. Figure labels: Causal Relationship; Abnormal States; Myocardial Infarction (disease); Layer 3, Layer 2, Layer 1. • Using the three-layer model of the abnormality ontology • Combining causal chains that include the same or related abnormal states by consulting the is-a hierarchy ⇒ Generic causal chains can be generated.
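The step "combine causal chains by consulting the is-a hierarchy" can be sketched as follows: each department's links are lifted from context-dependent states to their object-dependent generalization, after which links from different departments meet at a shared state. The department names and the cause/effect placeholders P and Q are hypothetical; only the stenosis labels come from the slides.

```python
from typing import Dict, List, Set, Tuple

Link = Tuple[str, str]  # (cause, effect)

# Assumed generalization from Layer 3 states to their Layer 2 state.
GENERALIZE: Dict[str, str] = {
    "Coronary stenosis in Arteriosclerosis": "Coronary stenosis",
    "Coronary stenosis in Angina pectoris": "Coronary stenosis",
}

# Hypothetical links as two departments might describe them.
DEPARTMENT_LINKS: Dict[str, List[Link]] = {
    "Department X": [("Hypothetical cause P", "Coronary stenosis in Arteriosclerosis")],
    "Department Y": [("Coronary stenosis in Angina pectoris", "Hypothetical effect Q")],
}


def all_links(by_department: Dict[str, List[Link]]) -> Set[Link]:
    """Union of the links described by every department."""
    return {link for links in by_department.values() for link in links}


def generic_links(links: Set[Link]) -> Set[Link]:
    """Replace each state by its generalization when one is known."""
    return {(GENERALIZE.get(c, c), GENERALIZE.get(e, e)) for c, e in links}


def junction_states(links: Set[Link]) -> Set[str]:
    """States that are an effect of one link and a cause of another."""
    return {e for _, e in links} & {c for c, _ in links}


raw = all_links(DEPARTMENT_LINKS)
print(junction_states(raw))                 # no shared state before generalizing
print(junction_states(generic_links(raw)))  # chains now join at one state
```

Before generalization the two departments' chains are disconnected; after it they join into one generic chain through "Coronary stenosis".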
  • 60. Knowledge provided by the Disease Ontology  Definition of disease  It can answer the following questions:  Which abnormal states could be a cause of which diseases?  What conditions may occur in a patient with the disease?  That is, it can provide base knowledge for analyzing big data related to diseases.
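Both questions reduce to simple lookups once disease definitions are stored as causal links. A rough sketch, assuming a toy set of chains whose labels are borrowed from the diabetes example; the assignment of links to diseases is illustrative only.

```python
from typing import Dict, List, Set, Tuple

Link = Tuple[str, str]  # (cause, effect)

# Toy disease definitions; the link assignments are assumptions.
DISEASE_CHAINS: Dict[str, List[Link]] = {
    "Type I diabetes": [
        ("Destruction of pancreatic beta cells", "Deficiency of insulin"),
        ("Deficiency of insulin", "Elevated level of glucose in the blood"),
    ],
    "Steroid diabetes": [
        ("Long-term steroid treatment", "Deficiency of insulin"),
        ("Deficiency of insulin", "Elevated level of glucose in the blood"),
    ],
}


def diseases_caused_by(state: str) -> List[str]:
    """Diseases whose chain contains the state as a cause of something."""
    return sorted(
        disease
        for disease, links in DISEASE_CHAINS.items()
        if any(cause == state for cause, _ in links)
    )


def possible_conditions(disease: str) -> Set[str]:
    """States that appear as an effect somewhere in the disease's chain."""
    return {effect for _, effect in DISEASE_CHAINS[disease]}


print(diseases_caused_by("Deficiency of insulin"))
print(possible_conditions("Steroid diabetes"))
```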
  • 61. DEMO:  Visualization of the abnormal state ontology with possible causal relationships  Java client application developed with the Hozo API.  Disease Chain LOD  Linked Open Data converted from the disease ontology.  SPARQL endpoint (web API for queries) and a visualization tool for disease chains written in HTML5.  http://lodc.med-ontology.jp/
  • 62. SPARQL Endpoint (a) The user can make simple SPARQL queries by selecting a property and an object from lists. (b) When the user selects a resource shown as a query result, the triples connected to that resource are visualized. (c) The user can also browse connected triples by clicking the rectangles that represent the objects.
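Step (a), choosing a property and an object, corresponds to a one-pattern SPARQL query. The sketch below builds such a query and the GET URL defined by the SPARQL protocol (a `query` parameter). The endpoint address and both IRIs are placeholders under example.org, because the actual endpoint path and vocabulary of Disease Chain LOD are not given here; no request is sent.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

# Placeholder values; the real endpoint and vocabulary are not stated in the talk.
ENDPOINT = "http://example.org/sparql"
PROPERTY_IRI = "http://example.org/prop/hasCause"
OBJECT_IRI = "http://example.org/state/DeficiencyOfInsulin"


def build_query(property_iri: str, object_iri: str, limit: int = 20) -> str:
    """Select every subject linked to the chosen object by the chosen property."""
    return f"SELECT ?s WHERE {{ ?s <{property_iri}> <{object_iri}> . }} LIMIT {limit}"


def build_request_url(endpoint: str, query: str) -> str:
    """Encode the query as the 'query' parameter of a GET request."""
    return endpoint + "?" + urlencode({"query": query})


query = build_query(PROPERTY_IRI, OBJECT_IRI)
url = build_request_url(ENDPOINT, query)
print(query)
print(url)
```

Steps (b) and (c) would then issue follow-up queries with the selected resource in the subject or object position.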
  • 64. Figure: interoperability between knowledge and data. Labels: abnormal state is-a hierarchy; anomaly representation; abnormal state layers; generic chains; disease chains; clinical DB; knowledge; data; attribute⇔property interoperability.
  • 65. Summary (2): Disease Ontology  Disease Ontology  Provides domain knowledge described by medical experts.  Medical DB (Big Data)  Provides evidential data from medical information systems such as electronic medical records. This could be a good example of combining ontology and Big Data. Figure labels: Existing Knowledge; Evidence / New Knowledge.
  • 66. Concluding Remarks  Ontology Engineering for Big Data  The combination of the two is good!  Basic technology: how to combine ontology with big data  Mapping the ontology to databases  Adding metadata to data using the vocabulary defined in the ontology  Converting databases (e.g. RDB) to ontology-based (RDF) databases  How to use combinations of ontology and Big Data: two possible approaches  Use ontology to bridge datasets across domains  Ontology exploration method to obtain meaningful combinations (= viewpoints)  Use ontology to combine deep domain knowledge and raw data  Future plan  Generalizing our approaches and feeding them back as new functions of Hozo
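As one illustration of "convert database (RDB) to RDF", a relational row can be turned into N-Triples by using the key as the subject and each remaining column as a predicate. This is a bare-bones sketch, not the conversion used in the project: the base IRI, table name and columns are hypothetical, and every value is emitted as a plain string literal.

```python
from typing import Dict, List

BASE = "http://example.org/"  # hypothetical namespace


def escape_literal(value: str) -> str:
    """Escape backslashes, quotes and newlines for an N-Triples literal."""
    return value.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")


def row_to_ntriples(table: str, key_column: str, row: Dict[str, object]) -> List[str]:
    """One triple per non-key column: <table/key> <table#column> "value" ."""
    subject = f"<{BASE}{table}/{row[key_column]}>"
    lines = []
    for column, value in row.items():
        if column == key_column:
            continue
        predicate = f"<{BASE}{table}#{column}>"
        lines.append(f'{subject} {predicate} "{escape_literal(str(value))}" .')
    return lines


for line in row_to_ntriples("disease", "id", {"id": 1, "name": "Diabetes"}):
    print(line)
```

Replacing the generated predicate IRIs with properties defined in the ontology is what would make the resulting RDF "ontology-based".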
  • 67. Acknowledgements  Part of this work was supported by JSPS KAKENHI Grant Numbers 24120002 and 22240011.  Part of the research on medical ontology is supported by the Ministry of Health, Labor and Welfare, Japan, through its “Research and development of medical knowledge base databases for medical information systems” and by the Japan Society for the Promotion of Science (JSPS) through its “Funding Program for World-Leading Innovative R&D on Science and Technology (FIRST Program)”.  I’m also grateful to all collaborators in each study.
  • 68. Acknowledgement. Thank you for your attention! Hozo Support Site: http://www.hozo.jp/ Contact: kozaki@ei.sanken.oaka-u.ac.jp