Some thoughts about the gaps across languages and domains through the experience on building the core common vocabularies

Hideaki Takeda
National Institute of Informatics
takeda@nii.ac.jp
Glocal KO Workshop, Thursday August 13, 2015, Copenhagen
Who am I?
Hideaki Takeda, Dr., Eng.
• Professor, National Institute of Informatics
– Research Institute mainly for Computer Science
• Background: Computer Science, in particular, Artificial
Intelligence
• Current interest: Semantic Web, Ontology, Linked Open
Data (LOD), Social Media Analysis
• Social activities
– President, Linked Open Data Initiative (NPO)
– Founder, DBpedia Japanese Chapter
– Specialist, Information-technology Promotion Agency,
Japan (IPA)
– Chair, Japan Link Center (Registration Agency of
International DOI Foundation)
– Board, ORCID
Core Vocabularies
• Background
– Everything is on the infosphere, i.e., the Web
– Lots of information, lots of data, lots of systems
• Problems
– Misunderstandings, mismatches, and “missing links” across different domains
– Gap between humans and machines (computers)
Core Vocabularies
• Aim
– Increase interoperability of information/data
– Bridge human and machine understanding
• Target
– Governmental documents/data
• Method
– Define a set of concepts that bridge (human-readable) terms and
(computer-processable) symbols (URIs)
– Starting from the most common concepts
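The method above can be pictured as a small lookup structure. This is only a minimal sketch, not the IMI implementation: the `Concept` class, the `find_by_label` helper, and the `http://example.org/core#...` URIs are hypothetical names introduced for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Concept:
    """A concept: one machine-processable URI, many human-readable terms."""
    uri: str
    labels: Dict[str, str] = field(default_factory=dict)  # language tag -> term


# Hypothetical URIs; the real IMI namespace is not reproduced here.
CONCEPTS: List[Concept] = [
    Concept("http://example.org/core#Person", {"en": "person", "ja": "人"}),
    Concept("http://example.org/core#Address", {"en": "address", "ja": "住所"}),
]


def find_by_label(term: str, concepts: List[Concept] = CONCEPTS) -> Optional[Concept]:
    """Return the concept that carries `term` as a label in any language."""
    for concept in concepts:
        if term in concept.labels.values():
            return concept
    return None


# Terms in different languages resolve to the same symbol.
print(find_by_label("人").uri)
print(find_by_label("person").uri)
```

The entry is the concept; the terms are only its notation, so adding a language means adding a label, not a new entry.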
Core Vocabularies
• Activities worldwide
– USA: NIEM Core
• NIEM (National Information Exchange Model)
– Europe: ISA Core Vocabularies
– UN: United Nations Centre for Trade Facilitation
and Electronic Business (UN/CEFACT)
• Core Components Library (UN/CCL)
– Japan: IMI Core Vocabulary
ISA Core Vocabularies v 1.1
NIEM Architecture
http://niem.github.io/technical/iepd-versions/
NIEM
http://reference.niem.gov/niem/guidance/user-guide/vol1/user-guide-vol1.pdf
http://www.epa.gov/oei/symposium/2010/roy.pdf
IMI Project
• Supported by
– Ministry of Economy, Trade,
and Industry, Japan
• Technical Framework
– Data Model
– Core Vocabulary
– Design Rules
• Support Framework
– Tools
• for data developer
• for schema developer
– Database
• schema / tools / templates/ …
[Figure: IMI data model]
Person Type: Name, Gender, Gender Code, Birth Date, Address, …
Name Type: Type, Name, Family Name, Given Name, …
Address Type: Type, Notation, Zip Code, Prefecture, City, …
Code Type: Type, Value
Value types shown: String, Code Type, Name Type, Address Type, Codelist Type, Thing Type
IMI as a template for schema
[Figure: deriving a form-specific schema from the IMI template]
Source form: “Registration form for Confere” (title truncated in the source), with fields Name, Address, Gender, Affiliation, Affiliation Address, Attending date
IMI template: the IMI data model (Person Type, Name Type, Address Type, Code Type)
IMI Individual Form
Person Type: Name, Gender, Address, Affiliation
Name Type: Name
Address Type: Notation, Zip-code
Event Participation Type: Participant, Date
Value types shown: String, Name, Address, Org., Person, Date
Design Schema
• Remove unnecessary items
• Add necessary items
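The "remove unnecessary items, add necessary items" step can be sketched as a small function over a template. This is an illustration under assumptions: `CORE_PERSON`, `derive_schema`, and the type names assigned to each property are hypothetical and are not taken from the IMI tools.

```python
from typing import Dict, Iterable, Optional

# Template: property name -> type name. The type assignments are assumed.
CORE_PERSON: Dict[str, str] = {
    "Name": "Name Type",
    "Gender": "String",
    "Gender Code": "Code Type",
    "Birth Date": "String",
    "Address": "Address Type",
}


def derive_schema(
    template: Dict[str, str],
    remove: Iterable[str] = (),
    add: Optional[Dict[str, str]] = None,
) -> Dict[str, str]:
    """Copy `template`, drop the properties in `remove`, then add those in `add`."""
    dropped = set(remove)
    schema = {name: typ for name, typ in template.items() if name not in dropped}
    schema.update(add or {})
    return schema


# A conference registration form needs no birth date but does need an affiliation.
registration_person = derive_schema(
    CORE_PERSON,
    remove=["Gender Code", "Birth Date"],
    add={"Affiliation": "String"},
)
print(list(registration_person))
```

The template itself is left untouched, so several forms can be derived from the same core definition.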
Roles of IMI
• Structured concept dictionary
– Concept dictionary
• Terms as notation of concepts
– The entry is concept, not term
• Class concept and relation concept
• General-specific relation
– Structured dictionary
• Concepts form a network, which in turn represents the
meaning of individual concepts
• A class concept consists of relation concepts representing
attributes and general/specific relations
• A relation concept consists of class concepts connected as
domains and ranges and general/specific relations
• Template for schemata
– Add or remove items for the specific needs
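The structured dictionary described above can be sketched with two record types. This is a minimal sketch, not the IMI data model itself: the class names `ClassConcept` and `RelationConcept`, the field names, and the `ancestors` helper are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass(eq=False)
class ClassConcept:
    name: str
    broader: Optional["ClassConcept"] = None  # general/specific relation
    relations: List["RelationConcept"] = field(default_factory=list)  # attributes


@dataclass(eq=False)
class RelationConcept:
    name: str
    domain: ClassConcept   # class the relation starts from
    range_: ClassConcept   # class the relation points to
    broader: Optional["RelationConcept"] = None

    def __post_init__(self) -> None:
        # Register the relation as an attribute of its domain class.
        self.domain.relations.append(self)


def ancestors(concept: ClassConcept) -> List[str]:
    """Walk the general/specific chain upward from `concept`."""
    chain = []
    current = concept.broader
    while current is not None:
        chain.append(current.name)
        current = current.broader
    return chain


entity = ClassConcept("ic:実体型")
person = ClassConcept("ic:人型", broader=entity)
address = ClassConcept("ic:住所型")
residence = RelationConcept("residence address", domain=person, range_=address)

print(ancestors(person))
print([r.name for r in person.relations])
```

Classes and relations refer to each other, which is what makes the dictionary a network rather than a flat term list.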
Use of IMI
• Define the concept model
• “Serialize” it into specific “physical” forms
• Use a suitable physical form
IMI Concept Model, serialized as:
• RDF: for open data
– Relaxed definition
– Interoperability with other open data schemata
• XML: for data exchange
– Strict definition
– Interoperability with DB schemata
• Natural language form: for spreadsheets and documents
– Relaxed definition with simple structure
– Readability by humans
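Serializing one concept model into several physical forms can be sketched as follows. This is only an illustration: the `MODEL` dictionary, the `ex:` namespace, and the three functions are assumptions, and the output does not follow the actual IMI serialization rules.

```python
import xml.etree.ElementTree as ET

# Hypothetical concept model: one class with typed properties.
MODEL = {"class": "Person", "properties": {"name": "string", "birthDate": "date"}}

EX = "http://example.org/vocab#"  # placeholder namespace
XS = "http://www.w3.org/2001/XMLSchema"


def to_turtle(model: dict) -> str:
    """RDF Schema view: a class plus properties with domain and range."""
    cls = model["class"]
    lines = [
        "@prefix ex: <%s> ." % EX,
        "@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .",
        "@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .",
        "@prefix xsd: <%s#> ." % XS,
        "",
        "ex:%s a rdfs:Class ." % cls,
    ]
    for prop, typ in model["properties"].items():
        lines.append("ex:%s a rdf:Property ;" % prop)
        lines.append("    rdfs:domain ex:%s ;" % cls)
        lines.append("    rdfs:range xsd:%s ." % typ)
    return "\n".join(lines)


def to_xsd(model: dict) -> str:
    """XML Schema view: a complex type with one element per property."""
    ET.register_namespace("xs", XS)
    schema = ET.Element("{%s}schema" % XS)
    ctype = ET.SubElement(schema, "{%s}complexType" % XS, name=model["class"] + "Type")
    seq = ET.SubElement(ctype, "{%s}sequence" % XS)
    for prop, typ in model["properties"].items():
        ET.SubElement(seq, "{%s}element" % XS, name=prop, type="xs:" + typ)
    return ET.tostring(schema, encoding="unicode")


def to_text(model: dict) -> str:
    """Natural language view for documents and spreadsheets."""
    return "%s has the following items: %s." % (
        model["class"], ", ".join(model["properties"]))


print(to_turtle(MODEL))
print(to_xsd(MODEL))
print(to_text(MODEL))
```

All three outputs come from the same model, so a change to the concept model propagates to every physical form.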
IMI Core vocabulary v2.2
• Published on Feb.3 2015
• 48 core class terms
– person, address, facility, location, date, …
• 206 core property terms
– name of person, birth date, birth country, …
• Multiple formats
– RDF Schema, XML Schema, and documents for humans
http://imi.ipa.go.jp/ns/core/2/
Class definition (person class)
person 人
説明 (description): 人の情報を表現するためのデータ型 / Data type to describe a person
継承 (inherits from): ic:実体型
Property (en) | Property (ja) | Data type | Cardinality | 説明 (ja) | Description (en)
ID | ID | ic:ID型 | 0..n | ID | Identification of a Person
Name of person | 氏名 | ic:氏名型 | 0..n | 氏名 | Name of a Person
Gender | 性別 | xsd:string | 0..1 | 性別の表記 | Gender of a Person
Gender code | 性別コード | ic:コード型 | 0..1 | 性別コード | Gender of a Person
Birth date | 生年月日 | ic:日付型 | 0..1 | 生年月日 | Date of Birth of a Person
Death date | 死亡年月日 | ic:日付型 | 0..1 | 死亡年月日 | Date of Death of a Person
Residence address | 住所 | ic:住所型 | 0..n | 現住所 | Present address of a Person
Domicile of origin | 本籍 | ic:住所型 | 0..1 | 本籍 | Legal residence address of a Person
Contact information | 連絡先 | ic:連絡先型 | 0..n | 連絡先 | Contact information of a Person
Nationality | 国籍 | xsd:string | 0..n | 国籍の表記 | A country that assigns rights, duties, and privileges to a person because of the birth or naturalization of the person in that country.
Nationality code | 国籍コード | ic:コード型 | 0..n | 住民基本台帳で利用されている国籍コード | A country that assigns rights, duties, and privileges to a person because of the birth or naturalization of the person in that country.
Birth country | 出生国 | xsd:string | 0..1 | 生まれた国名 | A location where a person was born.
Birth country code | 出生国コード | ic:コード型 | 0..1 | 生まれた国のコード | A location where a person was born.
Birth place | 出生地 | ic:住所型 | 0..1 | 生まれた場所 | A location where a person was born.
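The cardinality column can be read as a constraint on how many values a record may carry per property. The following is a minimal sketch of such a check; `parse_cardinality`, `violations`, and the record layout (property name to list of values) are hypothetical and not part of the IMI tooling.

```python
from typing import Dict, List, Optional, Tuple

Cardinality = Tuple[int, Optional[int]]  # (min, max); max None means "n"


def parse_cardinality(text: str) -> Cardinality:
    """Parse notation such as '0..1' or '0..n' into (min, max)."""
    low, high = text.split("..")
    return int(low), (None if high == "n" else int(high))


# A few rows of the person class, using the cardinalities from the table.
PERSON_RULES: Dict[str, Cardinality] = {
    name: parse_cardinality(card)
    for name, card in {
        "ID": "0..n",
        "Gender": "0..1",
        "Birth date": "0..1",
        "Residence address": "0..n",
    }.items()
}


def violations(
    record: Dict[str, List[str]],
    rules: Dict[str, Cardinality] = PERSON_RULES,
) -> List[str]:
    """Return the properties whose number of values breaks the cardinality."""
    problems = []
    for prop, (low, high) in rules.items():
        count = len(record.get(prop, []))
        if count < low or (high is not None and count > high):
            problems.append(prop)
    return problems


print(violations({"ID": ["a", "b"], "Gender": ["F"]}))
print(violations({"Birth date": ["1970-01-01", "1971-01-01"]}))
```

Because every property in the table has a minimum of 0, only the upper bounds can be violated here.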
Class Structure
person 人
• name: ic:氏名型
• Contact: ic:連絡先型
• …
氏名 (name)
• Family name: xsd:string
• Romanized Family name: xsd:string
• …
contact 連絡先
• Phone number: ic:電話番号型
• Address: ic:住所型
• …
電話番号 (phone number)
• …
address 住所
• Country: xsd:string
• Prefecture: xsd:string
• …
A class term has property terms as sub-elements, and a property term can refer to a class term. That class term in turn has its own list of property terms. This builds a layered structure of terms, as in the figure.
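The layered structure can be walked recursively to list every property path from a class down to a plain value. This is a sketch under assumptions: the `CLASSES` dictionary only contains the fragments visible in the figure, and `leaf_paths` is a hypothetical helper. Properties elided in the source stay elided.

```python
from typing import Dict, Iterator, Tuple

# class term -> {property term: type}; only the parts shown in the figure.
CLASSES: Dict[str, Dict[str, str]] = {
    "ic:人型": {"name": "ic:氏名型", "Contact": "ic:連絡先型"},
    "ic:氏名型": {"Family name": "xsd:string", "Romanized Family name": "xsd:string"},
    "ic:連絡先型": {"Phone number": "ic:電話番号型", "Address": "ic:住所型"},
    "ic:電話番号型": {},  # properties not shown in the source
    "ic:住所型": {"Country": "xsd:string", "Prefecture": "xsd:string"},
}


def leaf_paths(
    class_name: str,
    classes: Dict[str, Dict[str, str]] = CLASSES,
    prefix: Tuple[str, ...] = (),
) -> Iterator[str]:
    """Yield dotted property paths from `class_name` down to the leaves.

    Assumes the class graph has no cycles; a cyclic model would need a
    visited set to stop the recursion.
    """
    for prop, typ in classes.get(class_name, {}).items():
        path = prefix + (prop,)
        if classes.get(typ):
            # The property refers to a class term with its own properties.
            yield from leaf_paths(typ, classes, path)
        else:
            yield ".".join(path)


for path in leaf_paths("ic:人型"):
    print(path)
```

Dotted paths of this kind are the same notation used for IMI identifiers in the mapping results, for example ic:人型.ic:氏名.ic:姓.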
Concept of the IMI framework
International interoperability was a major consideration in designing IMI.
[Figure: the IMI framework consists of a Core Vocabulary, a Cross Domain Vocabulary, and Domain-specific Vocabularies. Labels in the figure: Shelter, Location, Hospital, Station; Geographical Space/Facilities, Transportation, Disaster Prevention, Finance; Disaster Restoration Cost. IMI is shown alongside the Japanese local government standard (APPLIC), de facto standards (DC, foaf, etc.), NIEM (US), ISA (EU), and Schema.org.]
Mapping between concepts in
different core vocabularies
• Difficulty of concept-concept mapping
– Matching of meaning tends to become a very abstract
discussion
– Matching of references is easier
[Figure labels: Ontology; Real world; two Concepts, each with a reference; “?”]
Mapping between concepts in
different core vocabularies
• Difficulty of concept-concept mapping
– Syntactical mapping vs. semantic mapping
• Consider only what a concept refers to in the real world, not how it
is represented in systems.
[Figure labels: Ontology; two Concepts, each with a reference; “?”; Systems World; Cognitive World]
Person
[Figure: the person class definition table from “Class definition (person class)”, repeated here with the labels Systems World and Cognitive World and “?” marks]
Postal Code
“101-8430”^^xsd:string (postal code in Japan)
“SW1A 0AA”@en (postal code in Europe)
[Figure labels: Systems World; Cognitive World; “?” marks]
Semantic Mapping
• Semantic Mapping
– Mapping on the cognitive layer
– Two ways of judging mapping
• Extensional Mapping
– Check whether ‘things’ are shared
– e.g., person
– Mostly for Class Mapping
• Intensional Mapping
– Check whether ‘values’ are shared
– e.g., postal-code
– Mostly for Property Mapping
• Syntactical Mapping
– Mapping on the systems layer
Types of matching: SKOS
• Exact Match
• Close Match
• Broad/Narrow Match
• Related Match
Close match
• Close match: nearly but not exactly matched.
• Extensional mapping
– Coverage of ‘things’ overlaps substantially
• Coverage of ‘Country’ is slightly different
– ‘Things’ are close
• Reference of ‘Person’ is slightly different (person vs. legal
person)
• Intensional mapping
– Coverage of ‘values’ overlaps substantially
Broad match/narrow match
• Broad/narrow match
– One subsumes the other
• Extensional mapping
– Coverage of ‘things’ is subsumed, i.e., the subset
is an exact match
• Intensional mapping
– Coverage of ‘values’ is subsumed, i.e., the subset
is an exact match
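The coverage-based reading of exact, close, and broad/narrow matches can be sketched with plain set comparison. This is an illustration only: the `classify` function, the Jaccard overlap measure, and the 0.8 threshold for "close" are assumptions, not criteria stated in the talk. The same function applies to sets of ‘things’ (extensional) or sets of ‘values’ (intensional).

```python
from typing import Hashable, Set


def classify(a: Set[Hashable], b: Set[Hashable], close_threshold: float = 0.8) -> str:
    """Judge how concept A maps to concept B from what each one covers.

    Result names follow SKOS: 'A broadMatch B' means B is the broader concept.
    """
    if a == b:
        return "exactMatch"
    if a < b:
        return "broadMatch"   # B subsumes A
    if a > b:
        return "narrowMatch"  # A subsumes B
    # Neither subsumes the other: measure how much the coverage overlaps.
    overlap = len(a & b) / len(a | b)
    return "closeMatch" if overlap >= close_threshold else "undecided"


print(classify({1, 2, 3}, {1, 2, 3}))
print(classify({1, 2}, {1, 2, 3}))
print(classify(set(range(1, 11)), set(range(2, 12))))
```

In practice the full coverage of a concept is rarely enumerable, so a check like this would run on samples and only support, not replace, human judgment.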
Other kinds of matching
• Complex match
– An element of one system matches a combination of
two or more elements of the other.
– “Pathway” match
• A single property matches a combination of two or
more properties
– “Conditional” match
• An element matches the other element if some
condition holds
IdentifierIssuingAuthority Link Has related match IMI ic:ID型.ic:ID体系.ic:発行者
LegalEntityRegisteredAddress Link Has broad match IMI ic:法人型.ic:住所
(It is an exact match if the value of ic:住所.種別 is "登記住所", i.e., registered address.)
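Pathway and conditional matches need more than a single relation name per pair of terms. The sketch below shows one way to record them. It is hypothetical throughout: `follow`, `ConditionalMapping`, and the nested-dictionary record layout are assumptions introduced for illustration.

```python
from dataclasses import dataclass
from typing import Any, Callable


def follow(record: dict, path: str) -> Any:
    """Resolve a dotted pathway such as 'a.b.c' in nested dicts; None if absent."""
    current: Any = record
    for key in path.split("."):
        if not isinstance(current, dict) or key not in current:
            return None
        current = current[key]
    return current


@dataclass
class ConditionalMapping:
    source: str                         # term in the other vocabulary
    target_path: str                    # pathway of IMI terms
    relation: str                       # relation that holds in general
    condition: Callable[[dict], bool]   # test on the target value
    relation_if_condition: str          # relation when the test passes

    def relation_for(self, target_value: dict) -> str:
        if self.condition(target_value):
            return self.relation_if_condition
        return self.relation


registered = ConditionalMapping(
    source="LegalEntityRegisteredAddress",
    target_path="ic:法人型.ic:住所",
    relation="broadMatch",
    condition=lambda addr: addr.get("種別") == "登記住所",
    relation_if_condition="exactMatch",
)

# Hypothetical record layout: class term -> property term -> value.
record = {"ic:法人型": {"ic:住所": {"種別": "登記住所"}}}
print(registered.relation_for(follow(record, registered.target_path)))
```

A plain table of term pairs cannot express the condition, which is one reason a more flexible mapping framework is called for.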
Results
Core Vocabulary Identifier Link Mapping relation Data model Identifier
Address Link Has exact match IMI ic:住所型
AddressAddressArea Link Has narrow match IMI ic:住所型.ic:町名
AddressAddressArea Link Has narrow match IMI ic:住所型.ic:丁目
AddressAddressArea Link Has narrow match IMI ic:住所型.ic:番地補足
AddressAddressArea Link Has narrow match IMI ic:住所型.ic:番地
AddressAddressArea Link Has narrow match IMI ic:住所型.ic:号
AddressAddressID Link Has exact match IMI ic:住所型.ic:ID
AddressAdminUnitL1 Link Has exact match IMI ic:住所型.ic:国
AddressAdminUnitL2 Link Has narrow match IMI ic:住所型.ic:都道府県
AddressFullAddress Link Has exact match IMI ic:住所型.ic:表記
AddressLocatorDesignator Link Has narrow match IMI ic:住所型.ic:ビル番号
AddressLocatorDesignator Link Has narrow match IMI ic:住所型.ic:部屋番号
AddressLocatorName Link Has narrow match IMI ic:住所型.ic:ビル名
AddressPOBox Link Has related match IMI ic:住所型.ic:方書
AddressPostCode Link Has exact match IMI ic:住所型.ic:郵便番号
AddressPostName Link Has narrow match IMI ic:住所型.ic:市区町村
AddressPostName Link Has narrow match IMI ic:住所型.ic:区
AddressThoroughfare Link Has no match IMI
Agent Link Has exact match IMI ic:実体型
Results
Identifier Link Has exact match IMI ic:ID型
IdentifierIdentifier Link Has exact match IMI ic:ID型.ic:識別値
IdentifierIssueDate Link Has no match IMI
IdentifierIssuingAuthority Link Has related match IMI ic:ID型.ic:ID体系.ic:発行者
IdentifierIssuingAuthorityURI Link Has exact match IMI ic:ID型.ic:ID体系.ic:URI
IdentifierType Link Has no match IMI
JurisdictionIdentifier Link Has related match IMI ic:国籍コード
JurisdictionName Link Has related match IMI ic:国籍
LegalEntity Link Has exact match IMI ic:法人型
LegalEntityAddress Link Has broad match IMI ic:法人型.ic:住所
LegalEntityAlternativeName Link Has no match IMI
LegalEntityCompanyActivity Link Has close match IMI ic:法人型.ic:事業種目
LegalEntityCompanyStatus Link Has related match IMI ic:法人型.ic:活動状況
LegalEntityCompanyType Link Has exact match IMI ic:法人型.ic:組織種別
LegalEntityIdentifier Link Has exact match IMI ic:法人型.ic:ID
LegalEntityLegalIdentifier Link Has no match IMI
LegalEntityLegalName Link Has broad match IMI ic:法人型.ic:名称.表記
LegalEntityLocation Link Has related match IMI ic:法人型.ic:地物.説明
LegalEntityRegisteredAddress Link Has broad match IMI ic:法人型.ic:住所
Location Link Has exact match IMI ic:場所型
LocationAddress Link Has exact match IMI ic:場所型.ic:住所
LocationGeographicIdentifier Link Has broad match IMI ic:場所型.ic:地理識別子
LocationGeographicName Link Has exact match IMI ic:場所型.ic:名称.ic:表記
LocationGeometry Link Has exact match IMI ic:場所型.ic:地理座標
Results
Person Link Has exact match IMI ic:人型
PersonAddress Link Has exact match IMI ic:人型.ic:住所
PersonAlternativeName Link Has broad match IMI ic:人型.ic:氏名.ic:姓名
PersonBirthName Link Has broad match IMI ic:人型.ic:氏名.ic:姓名
PersonCitizenship Link Has no match IMI
PersonCountryOfBirth Link Has exact match IMI ic:人型.ic:出生国
PersonCountryOfDeath Link Has no match IMI
PersonDateOfBirth Link Has exact match IMI ic:人型.ic:生年月日
PersonDateOfDeath Link Has exact match IMI ic:人型.ic:死亡年月日
PersonFamilyName Link Has exact match IMI ic:人型.ic:氏名.ic:姓
PersonFullName Link Has exact match IMI ic:人型.ic:氏名.ic:姓名
PersonGender Link Has exact match IMI ic:人型.ic:性別コード
PersonGivenName Link Has exact match IMI ic:人型.ic:氏名.ic:名
PersonIdentifier Link Has broad match IMI ic:人型.ic:ID
PersonPatronymicName Link Has no match IMI ic:人型.ic:氏名.ic:姓名
PersonPlaceOfBirth Link Has narrow match IMI ic:人型.ic:出生地
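Rows in the result tables can be turned into machine-readable mapping statements. This is a minimal sketch that assumes the row layout shown above ("term Link Has ... match IMI path"); `parse_row` and the sample `ROWS` list are hypothetical, while the `skos:...Match` names are the standard SKOS mapping properties.

```python
import re
from collections import Counter
from typing import Optional, Tuple

ROW = re.compile(
    r"^(\S+) Link Has (exact|close|broad|narrow|related|no) match IMI ?(\S*)$"
)


def parse_row(line: str) -> Tuple[str, Optional[str], Optional[str]]:
    """Split one result row into (source term, SKOS relation, IMI path)."""
    match = ROW.match(line.strip())
    if match is None:
        raise ValueError("unrecognized row: %r" % line)
    source, kind, target = match.groups()
    relation = None if kind == "no" else "skos:%sMatch" % kind
    return source, relation, (target or None)


# A few rows copied from the result tables.
ROWS = [
    "Address Link Has exact match IMI ic:住所型",
    "AddressAddressArea Link Has narrow match IMI ic:住所型.ic:町名",
    "AddressThoroughfare Link Has no match IMI",
    "LegalEntityCompanyActivity Link Has close match IMI ic:法人型.ic:事業種目",
]

parsed = [parse_row(row) for row in ROWS]
tally = Counter(relation for _, relation, _ in parsed)

for source, relation, target in parsed:
    print(source, relation, target)
print(tally)
```

Counting relations this way gives a quick picture of how much of one vocabulary is covered exactly and how much only approximately.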
Bridging core and domain vocabularies
(work in progress)
• Aim: extend the core vocabulary to
domain vocabularies
– Agriculture
– Finance
– Traffic
– …
• Task:
– Can concepts really be shared between the core and the domains?
Agricultural Activity Ontology (AAO)
Agricultural activity
crop production activity
activity for propagation
activity in the vegetative growth stage
activity in the reproductive growth stage
activity for environment control
activity for soil control
activity for climate control
activity for water control
activity for biotic control
activity for chemical control
post production activity
activity for harvesting
activity for processing
activity for extending shelf-life
activity for wrapping
indirect activity
activity for preparing materials
activity for cleaning
activity for transport
activity for monitoring
activity for maintaining farm equipment
administrative activity
activity for business administration
http://cavoc.org/aao/
An example: “activity” (and “event”)
• S: (n) activity (any specific behavior) "they avoided all recreational activity"
– direct hyponym / full hyponym
– direct hypernym / inherited hypernym / sister term
• S: (n) act, deed, human action, human activity (something that people do or cause to happen)
– S: (n) event (something that happens at a given place and time)
– [WordNet]
• Each activity is a Happening which involves volition and participants. It has
temporal dimension. It is distinguished from Events by the fact that the activity
does not trigger change of state and does not have a conceptual end point.
– [PROTON Extent module (a lightweight upper-level ontology)]
• Activity: This class represents the abstract content of an event, which may be
repeated many times, once or never. For example a training course, or a play.
– [The Event Programme Vocabulary (prog)]
• E5 Event
– Subclass of: E4 Period
– Superclass of: E7 Activity, E63 Beginning of Existence, E64 End of Existence
• E7 Activity
– Subclass of: E5 Event
– Superclass of: E8 Acquisition, E9 Move, E10 Transfer of Custody, E11 Modification,
E13 Attribute Assignment, E65 Creation …
– [CIDOC Conceptual Reference Model]
Summary
• Sharing concepts is a long road
• No ground truth
– Step-by-step understanding of the world
– Careful consensus making
• A more flexible framework is needed
– Simple mapping is not satisfactory
1 of 35

More Related Content

Viewers also liked(14)

corripio corripio
corripio
Sabrina Amaral1.9K views
Sigmund freud   obras completas - lopez ballesterosSigmund freud   obras completas - lopez ballesteros
Sigmund freud obras completas - lopez ballesteros
Gabinete de Psicología Profesional15.1K views
Augmenter la satisfaction de l'utilisateurAugmenter la satisfaction de l'utilisateur
Augmenter la satisfaction de l'utilisateur
Digicomp Academy Suisse Romande SA1.3K views
Manual dqpManual dqp
Manual dqp
Ana Barroca1.6K views
Estrategia nal. obesidad 1Estrategia nal. obesidad 1
Estrategia nal. obesidad 1
Universidad de Ixtlahuaca CUI794 views
Marketing na InternetMarketing na Internet
Marketing na Internet
renatofrigo5.8K views
Roteiro de estudo de caso simulação do processo de comprasRoteiro de estudo de caso simulação do processo de compras
Roteiro de estudo de caso simulação do processo de compras
Antonio Marcos Montai Messias1.3K views
Problemas de aprendizajeProblemas de aprendizaje
Problemas de aprendizaje
LISS9.5K views
Seo proposal for tensator groupSeo proposal for tensator group
Seo proposal for tensator group
Parixit Dwivedi1.3K views
Mai2010 einladung doktorandenkolloquiumMai2010 einladung doktorandenkolloquium
Mai2010 einladung doktorandenkolloquium
w&p Wilde & Partner Public Relations GmbH1.1K views

Similar to Some thoughts about the gaps across languages and domains through the experience on building the core common vocabularies

Similar to Some thoughts about the gaps across languages and domains through the experience on building the core common vocabularies(20)

More from National Institute of Informatics (NII)(20)

"分人"型社会とAI"分人"型社会とAI
"分人"型社会とAI
National Institute of Informatics (NII)759 views
研究オープンデータにおける大学と研究者の役割研究オープンデータにおける大学と研究者の役割
研究オープンデータにおける大学と研究者の役割
National Institute of Informatics (NII)5.9K views
Presenting and Preserving the Change in Taxonomic Knowledge for Linked DataPresenting and Preserving the Change in Taxonomic Knowledge for Linked Data
Presenting and Preserving the Change in Taxonomic Knowledge for Linked Data
National Institute of Informatics (NII)546 views
Crop vocabulary (CVO): Core vocabulary of crop namesCrop vocabulary (CVO): Core vocabulary of crop names
Crop vocabulary (CVO): Core vocabulary of crop names
National Institute of Informatics (NII)1.9K views
ORCIDとオープンサイエンスORCIDとオープンサイエンス
ORCIDとオープンサイエンス
National Institute of Informatics (NII)2.2K views
How to build ontologies - a case study of Agriculture Activity OntologyHow to build ontologies - a case study of Agriculture Activity Ontology
How to build ontologies - a case study of Agriculture Activity Ontology
National Institute of Informatics (NII)1.1K views
LODとオープンデータ(DBpediaとIMIの周辺を中心に)LODとオープンデータ(DBpediaとIMIの周辺を中心に)
LODとオープンデータ (DBpediaとIMIの周辺を中心に)
National Institute of Informatics (NII)751 views
Working with Global Infrastructure at a National LevelWorking with Global Infrastructure at a National Level
Working with Global Infrastructure at a National Level
National Institute of Informatics (NII)598 views
Activities of JaLC as a national serviceActivities of JaLC as a national service
Activities of JaLC as a national service
National Institute of Informatics (NII)377 views
Development and Application of Agriculture Ontologies Development and Application of Agriculture Ontologies
Development and Application of Agriculture Ontologies
National Institute of Informatics (NII)1.1K views
Design Process of Agriculture OntologiesDesign Process of Agriculture Ontologies
Design Process of Agriculture Ontologies
National Institute of Informatics (NII)1.3K views
AIの未来~技術と社会の関係のダイナミクス~AIの未来~技術と社会の関係のダイナミクス~
AIの未来 ~技術と社会の関係のダイナミクス~
National Institute of Informatics (NII)2.4K views
Towards Knowledge-Enabled SocietyTowards Knowledge-Enabled Society
Towards Knowledge-Enabled Society
National Institute of Informatics (NII)998 views
研究データ利活用に関する国内活動及び国際動向について研究データ利活用に関する国内活動及び国際動向について
研究データ利活用に関する国内活動及び国際動向について
National Institute of Informatics (NII)1.8K views
オープンサイエンスとオープンデータオープンサイエンスとオープンデータ
オープンサイエンスとオープンデータ
National Institute of Informatics (NII)3.8K views

Recently uploaded(20)

CXL at OCPCXL at OCP
CXL at OCP
CXL Forum183 views
The Research Portal of Catalonia: Growing more (information) & more (services)The Research Portal of Catalonia: Growing more (information) & more (services)
The Research Portal of Catalonia: Growing more (information) & more (services)
CSUC - Consorci de Serveis Universitaris de Catalunya51 views
ChatGPT and AI for Web DevelopersChatGPT and AI for Web Developers
ChatGPT and AI for Web Developers
Maximiliano Firtman152 views
METHOD AND SYSTEM FOR PREDICTING OPTIMAL LOAD FOR WHICH THE YIELD IS MAXIMUM ...METHOD AND SYSTEM FOR PREDICTING OPTIMAL LOAD FOR WHICH THE YIELD IS MAXIMUM ...
METHOD AND SYSTEM FOR PREDICTING OPTIMAL LOAD FOR WHICH THE YIELD IS MAXIMUM ...
Prity Khastgir IPR Strategic India Patent Attorney Amplify Innovation23 views
Java Platform Approach 1.0 - Picnic MeetupJava Platform Approach 1.0 - Picnic Meetup
Java Platform Approach 1.0 - Picnic Meetup
Rick Ossendrijver23 views

Some thoughts about the gaps across languages and domains through the experience on building the core common vocabularies

  • 1. Some thoughts about the gaps across languages and domains through the experience on building the core common vocabularies Hideaki Takeda National Institute of Informatics takeda@nii.ac.jp Glocal KO Workshop, Thursday August 13, 2015, Copenhagen
  • 2. Who am I? Hideaki Takeda, Dr., Eng. • Professor, National Institute of Informatics – Research Institute mainly for Computer Science • Background: Computer Science, in particular, Artificial Intelligence • Current interest: Semantic Web, Ontology, Linked Open Data (LOD), Social Media Analysis • Social activities – President, Linked Open Data Initiative (NPO) – Founder, Dbpedia Japanese Chapter – Specialist, Information-technology Promotion Agency, Japan (IPA) – Chair, Japan Link Center (Registration Agency of International DOI Foundation) – Board, ORCID
  • 3. Core Vocabularies • Background – Everything is on infosphere, i.e., web – Lots of information, lots of data, lots of systems • Problems – Misunderstanding/mis-matching/”missing links“ across different domains – Gap between human and machines (computers)
  • 4. Core Vocabularies • Aim – Increase interoperability of information/data – Bridge human and machine understanding • Target – Governmental documents/data • Method – Define a set of concepts which bridge (human- readable) terms and (computer-processable) symbols (URIs) – Starting from the most common concepts
  • 5. Core Vocabularies • Activities worldwide – USA: NIEM Core • NIEM (National Information Exchange Model) – Europe: ISA Core Vocabularies – UN: United Nations Centre for Trade Facilitation and Electronic Business (UN/CEFACT) • Core Components Library (UN/CCL) – Japan: IMI Core Vocabulary
  • 10. IMI Project • Supported by – Ministry of Economy, Trade, and Industry, Japan • Technical Framework – Data Model – Core Vocabulary – Design Rules • Support Framework – Tools • for data developer • for schema developer – Database • schema / tools / templates/ … Person Type Name Gender Gender Code Birth Date Address … Name Type Type Name Family Name Given Name … Address Type Type Notation Zip Code Prefecture City … String String String Code TypeString String String String String String Code Type Type Value Name Type Address Type Codelist Type String Thing Type 10
  • 11. IMI as a template for schema Registration form for Confere Name: Address: Gender: Affiliation: Affiliation Address: Attending date: - M / Person Type Name Gender Gender Code Birth Date Address … Name Type Type Name Family Name Given Name … Address Type Type Notation Zip Code Prefecture City … String String String Code Type String String String String String String Code Type Type Value Name Type Address Type Codelist Type String Thing Type IMI Individual Form Person Type Name Gender Address Affiliation Name Type Name Address Type Notation Zip-code String String String String Name Address Org. Person Date Event Participation Type Participant Date Design Schema Remove unnecessary items Add necessary items
  • 12. Roles of IMI • Structured concept dictionary – Concept dictionary • Terms as notation of concepts – The entry is concept, not term • Class concept and relation concept • General-specific relation – Structured dictionary • Concepts form a network of concepts which in tern represents meaning of individual concepts • A class concept consists of relation concepts representing attributes and general/specific relations • A relation concept consists of class concepts connected as domains and ranges and general/specific relations • Template for schemata – Add or remove items for the specific needs
  • 13. Use of IMI • Define the concept model • “Serialize” it into specific “physical” forms • Use suitable a physical form IMI Concept Model RDF XML Natural Language Form For Open Data For data exchange For spread sheets and documents • Relax definition • Interoperability with other open data schemata • Strict definition • Interoperability with DB schemata • Relax definition with simple structure • Readability by humans
  • 14. IMI Core vocabulary v2.2 • Published on Feb.3 2015 • 48 core class terms – person, address, facility, location, date, … • 206 core property terms – name of person, birth date, birth country, … • Multi format – rdf schema, xml schema and documents for human http://imi.ipa.go.jp/ns/core/2/ 14
  • 16. Class definition (person class) person 人 説明:人の情報を表現するためのデータ型 Data Type to describe a person 継承(inherit from) : ic:実体型 property Data type cardinality 説明 (ja) Description (en) ID ID ic:ID型 0..n ID Identification of a Person Name of person 氏名 ic:氏名型 0..n 氏名 Name of a Person Gender 性別 xsd:string 0..1 性別の表記 Gender of a Person Gender code 性別コード ic:コード型 0..1 性別コード Gender of a Person Birth date 生年月日 ic:日付型 0..1 生年月日 Date of Birth of a Person Death date 死亡年月日 ic:日付型 0..1 死亡年月日 Date of Death of a Person Residence address 住所 ic:住所型 0..n 現住所 Present address of a Person Domicile of origin 本籍 ic:住所型 0..1 本籍 Legal residence address of a Person Contact information 連絡先 ic:連絡先型 0..n 連絡先 Contact information of a Person Nationality 国籍 xsd:string 0..n 国籍の表記 A county that assigns rights, duties, and privileges to a person because of the birth or naturalization of the person in that country. Nationality code 国籍コード ic:コード型 0..n 住民基本台帳で利用さ れている国籍コード A county that assigns rights, duties, and privileges to a person because of the birth or naturalization of the person in that country. Birth country 出生国 xsd:string 0..1 生まれた国名 A location where a person was born. Birth country code 出生国コード ic:コード型 0..1 生まれた国のコード A location where a person was born. Birth place 出生地 ic:住所型 0..1 生まれた場所 A location where a person was born. 16
  • 17. Class Structure person 人 name ic:氏名型 Contact ic:連絡先型 : : 氏名 Family name xsd:string Romanized Family name xsd:string : : contact 連絡先 Phone number ic:電話番号型 Address ic:住所型 : : 電話番号 : : address 住所 Country xsd:string Prefecture xsd:string : :  A class term has a property term as a sub element and the property term can refer a class term. Again, the class term has a list of property terms. That constructs a layered structure of terms as the following figure. phone number name
  • 18. Concept of the IMI framework International interoperability is highly considered in preparing IMI. Core Vocabulary Shelter Location Hospital Station Geographical Space /Facilities Transportation Disaster Prevention Finance Domain-specific Vocabularies Disaster Restoration Cost Cross Domain Vocabulary IMI Japanese Local government Standard (APPLIC) DE fact Standards (DC, foaf, etc) NIEM (US) ISA (EU) Schema.org 18
  • 19. Mapping between concepts in different core vocabularies • Difficulty of concept-concept mapping – Matching of meaning tends to be very abstract discussion Concept reference Ontology Real world Concept reference ?
  • 20. Mapping between concepts in different core vocabularies • Difficulty of concept-concept mapping – Matching of meaning tends to be very abstract discussion – Matching of references is easier Concept reference Ontology Real world Concept reference ?
  • 21. Mapping between concepts in different core vocabularies • Difficulty of concept-concept mapping – Syntactical mapping vs. semantic mapping • Just consider what it refers in the real world, not how it is represented in systems. Concept reference Ontology Concept reference ? Systems World Cognitive World
  • 22. Person person 人 説明:人の情報を表現するためのデータ型 Data Type to describe a person 継承(inherit from) : ic:実体型 prop erty Data type cardi nalit y 説明 (ja) Description (en) ID ID ic:ID型 0..n ID Identification of a Person Name of person 氏名 ic:氏名 型 0..n 氏名 Name of a Person Gender 性別 xsd:strin g 0..1 性別の表記 Gender of a Person ender code 性別 コード ic:コード 型 0..1 性別コード Gender of a Person Birth date 生年月 日 ic:日付 型 0..1 生年月日 Date of Birth of a Person Death date 死亡年 月日 ic:日付 型 0..1 死亡年月日 Date of Death of a Person Residence address 住所 ic:住所 型 0..n 現住所 Present address of a Person Domicile of origin 本籍 ic:住所 型 0..1 本籍 Legal residence address of a Person Contact nformation 連絡先 ic:連絡 先型 0..n 連絡先 Contact information of a Person Nationality 国籍 xsd:strin g 0..n 国籍の表記 A county that assigns rights, duties, and privileges to a person because of the birth or naturalization of the person in that country. 住民基本台帳 A county that assigns rights, duties, and privileges to a ? ? Systems World Cognitive World
  • 23. Postal Code ? ? “101-8430” ^^xsd:string “SW1A 0AA”@en (postal code in Japan) (postal code in Europe) Systems World Cognitive World
  • 24. Semantic Mapping • Semantic Mapping – Mapping on the cognitive layer – Two ways of judging mapping • Extensional Mapping – Check whether ‘things’ are shared – e.g., person – Mostly for Class Mapping • Intensional Mapping – Check whether ‘values’ are shared – e.g., postal-code – Mostly for Property Mapping • Syntactical Mapping – Mapping on the systems layer
  • 25. Types of matching: SKOS • Exact Match • Close Match • Broad/Narrow Match • Related Match
  • 26. Close match • Close match: nearly matched but not exactly matched. • Extensional mapping – Coverage of ‘things’ are overlapped so much • Coverage of ‘Country’ is slightly different – ‘things’ are close • Reference of ‘Person’ is slightly different (person vs. legal Person) • Intensional mapping – Coverage of ‘values’ are overlapped so much
  • 27. Broad match/narrow match • Broad/narrow match – One subsumes the other • Extensional mapping – Coverage of ‘things’ are subsumed, i.e., the subset is exact match • Intensional mapping – Coverage of ‘values’ are subsumed, i.e., the subset is exact match
  • 28. More different matching • Complicated match – An element of a system matches a combination of two or more elements. – “Pathway” match • A single property matches the combination of two or more properties – “Conditional” match • An element matches the other element if some condition is hold IdentifierIssuingAuthority Link Has related match IMI ic:ID型.ic:ID体系.ic:発行者 LegalEntityRegisteredAddress Link Has broad match IMI ic:法人型.ic:住所 It is exact match if the value of ic:住所.種別 should be "登記住所".
  • 29. Results Core Vocabulary Identifier Link Mapping relation Data model Identifier Address Link Has exact match IMI ic:住所型 AddressAddressArea Link Has narrow match IMI ic:住所型.ic:町名 AddressAddressArea Link Has narrow match IMI ic:住所型.ic:丁目 AddressAddressArea Link Has narrow match IMI ic:住所型.ic:番地補足 AddressAddressArea Link Has narrow match IMI ic:住所型.ic:番地 AddressAddressArea Link Has narrow match IMI ic:住所型.ic:号 AddressAddressID Link Has exact match IMI ic:住所型.ic:ID AddressAdminUnitL1 Link Has exact match IMI ic:住所型.ic:国 AddressAdminUnitL2 Link Has narrow match IMI ic:住所型.ic:都道府県 AddressFullAddress Link Has exact match IMI ic:住所型.ic:表記 AddressLocatorDesignator Link Has narrow match IMI ic:住所型.ic:ビル番号 AddressLocatorDesignator Link Has narrow match IMI ic:住所型.ic:部屋番号 AddressLocatorName Link Has narrow match IMI ic:住所型.ic:ビル名 AddressPOBox Link Has related match IMI ic:住所型.ic:方書 AddressPostCode Link Has exact match IMI ic:住所型.ic:郵便番号 AddressPostName Link Has narrow match IMI ic:住所型.ic:市区町村 AddressPostName Link Has narrow match IMI ic:住所型.ic:区 AddressThoroughfare Link Has no match IMI Agent Link Has exact match IMI ic:実体型
  • 30. Results
    Identifier | Link | Has exact match | IMI | ic:ID型
    IdentifierIdentifier | Link | Has exact match | IMI | ic:ID型.ic:識別値
    IdentifierIssueDate | Link | Has no match | IMI |
    IdentifierIssuingAuthority | Link | Has related match | IMI | ic:ID型.ic:ID体系.ic:発行者
    IdentifierIssuingAuthorityURI | Link | Has exact match | IMI | ic:ID型.ic:ID体系.ic:URI
    IdentifierType | Link | Has no match | IMI |
    JurisdictionIdentifier | Link | Has related match | IMI | ic:国籍コード
    JurisdictionName | Link | Has related match | IMI | ic:国籍
    LegalEntity | Link | Has exact match | IMI | ic:法人型
    LegalEntityAddress | Link | Has broad match | IMI | ic:法人型.ic:住所
    LegalEntityAlternativeName | Link | Has no match | IMI |
    LegalEntityCompanyActivity | Link | Has close match | IMI | ic:法人型.ic:事業種目
    LegalEntityCompanyStatus | Link | Has related match | IMI | ic:法人型.ic:活動状況
    LegalEntityCompanyType | Link | Has exact match | IMI | ic:法人型.ic:組織種別
    LegalEntityIdentifier | Link | Has exact match | IMI | ic:法人型.ic:ID
    LegalEntityLegalIdentifier | Link | Has no match | IMI |
    LegalEntityLegalName | Link | Has broad match | IMI | ic:法人型.ic:名称.表記
    LegalEntityLocation | Link | Has related match | IMI | ic:法人型.ic:地物.説明
    LegalEntityRegisteredAddress | Link | Has broad match | IMI | ic:法人型.ic:住所
    Location | Link | Has exact match | IMI | ic:場所型
    LocationAddress | Link | Has exact match | IMI | ic:場所型.ic:住所
    LocationGeographicIdentifier | Link | Has broad match | IMI | ic:場所型.ic:地理識別子
    LocationGeographicName | Link | Has exact match | IMI | ic:場所型.ic:名称.ic:表記
    LocationGeometry | Link | Has exact match | IMI | ic:場所型.ic:地理座標
  • 31. Results
    Person | Link | Has exact match | IMI | ic:人型
    PersonAddress | Link | Has exact match | IMI | ic:人型.ic:住所
    PersonAlternativeName | Link | Has broad match | IMI | ic:人型.ic:氏名.ic:姓名
    PersonBirthName | Link | Has broad match | IMI | ic:人型.ic:氏名.ic:姓名
    PersonCitizenship | Link | Has no match | IMI |
    PersonCountryOfBirth | Link | Has exact match | IMI | ic:人型.ic:出生国
    PersonCountryOfDeath | Link | Has no match | IMI |
    PersonDateOfBirth | Link | Has exact match | IMI | ic:人型.ic:生年月日
    PersonDateOfDeath | Link | Has exact match | IMI | ic:人型.ic:死亡年月日
    PersonFamilyName | Link | Has exact match | IMI | ic:人型.ic:氏名.ic:姓
    PersonFullName | Link | Has exact match | IMI | ic:人型.ic:氏名.ic:姓名
    PersonGender | Link | Has exact match | IMI | ic:人型.ic:性別コード
    PersonGivenName | Link | Has exact match | IMI | ic:人型.ic:氏名.ic:名
    PersonIdentifier | Link | Has broad match | IMI | ic:人型.ic:ID
    PersonPatronymicName | Link | Has no match | IMI | ic:人型.ic:氏名.ic:姓名
    PersonPlaceOfBirth | Link | Has narrow match | IMI | ic:人型.ic:出生地
  • 32. Bridging core and domain vocabularies (work in progress) • Aim: The core vocabulary is to be extended to domain vocabularies – Agriculture – Finance – Traffic – … • Task: – Can concepts really be shared between the core and the domains?
  • 33. Agricultural Activity Ontology (AAO) • Agricultural activity: crop production activity; activity for propagation; activity in the vegetative growth stage; activity in the reproductive growth stage; activity for environment control; activity for soil control; activity for climate control; activity for water control; activity for biotic control; activity for chemical control; post production activity; activity for harvesting; activity for processing; activity for extending shelf-life; activity for wrapping; indirect activity; activity for preparing materials; activity for cleaning; activity for transport; activity for monitoring; activity for maintaining farm equipment; administrative activity; activity for business administration • http://cavoc.org/aao/
  • 34. An example: “activity” (and “event”) • S: (n) activity (any specific behavior) "they avoided all recreational activity" – direct hyponym / full hyponym – direct hypernym / inherited hypernym / sister term • S: (n) act, deed, human action, human activity (something that people do or cause to happen) – S: (n) event (something that happens at a given place and time) – [WordNet] • Each activity is a Happening which involves volition and participants. It has temporal dimension. It is distinguished from Events by the fact that the activity does not trigger change of state and does not have a conceptual end point. – [PROTON Extent module (a lightweight upper-level ontology)] • Activity: This class represents the abstract content of an event, which may be repeated many times, once or never. For example a training course, or a play. – [The Event Programme Vocabulary (prog)] • E5 Event – Subclass of: E4 Period – Superclass of: E7 Activity, E63 Beginning of Existence, E64 End of Existence • E7 Activity – Subclass of: E5 Event – Superclass of: E8 Acquisition, E9 Move, E10 Transfer of Custody, E11 Modification, E13 Attribute Assignment, E65 Creation … – [CIDOC Conceptual Reference Model]
  • 35. Summary • Sharing concepts is a long road • No ground truth – Step-by-step understanding of the world – Careful consensus making • A more flexible framework is needed – Simple mapping is not enough