2012 INTERNATIONAL ASIAN SUMMER SCHOOL IN LINKED DATA
            IASLOD 2012, August 13-17, 2012, KAIST, Daejeon, Korea




Identity and schema for Linked Data



             Hideaki Takeda
     National Institute of Informatics
            takeda@nii.ac.jp
                                      Hideaki Takeda / National Institute of Informatics
How to put data into a computer?
• How to describe the data?
  – The way to describe individual data
     • Schema/Class/Concept
  – The way to describe relationship among
    schema/class/concept
     • Ontology/Taxonomy/Thesaurus


• How to refer to the data?
  – The way to identify individual data
     • Identifier
  – Relationship among identifiers
Architecture for the Semantic Web
   The world of classes (Ontologies)
   The world of instances
    (Linked Data)




           Tim Berners-Lee http://www.w3.org/2002/Talks/09-lcs-sweb-tbl/
Layers of Semantic Web
• Ontology
   – Descriptions on classes
   – RDFS, OWL
   – Challenges for ontology building
       • Ontology building is difficult by nature
           – Consistency, comprehensiveness, logicality
       • Alignment of ontologies is more difficult


  Descriptions on classes
                                     Ontology

  Descriptions on instances
                                     Linked Data


                            Tim Berners-Lee http://www.w3.org/2002/Talks/09-lcs-sweb-tbl/
Layers of Semantic Web
• Linked Data
   – Descriptions on instances (individuals)
   – RDF + (RDFS, OWL)
   – Pros for Linked Data
       • Easy to write (mainly fact description)
       • Easy to link (fact to fact link)
   – Cons for Linked Data
       • Difficult to describe complex structures
       • Still need for class description (-> ontology)
   Descriptions on classes
                                      Ontology

  Description on instances
                                      Linked Data


                             Tim Berners-Lee http://www.w3.org/2002/Talks/09-lcs-sweb-tbl/
Importance of Identifiers for Entities
• Everything should be identifiable!
• Humans can identify things with vague
  identifiers, or even without identifiers, with
  help from the context around things
• On the web, the context is usually not
  available and the computer can seldom
  understand the context even if it exists
• So we need identifiers for all things

Identification System
• Identification is one of the primary functions for
  human information processing
   – Naming: e.g., names for people, pets, and some daily
     things
      • OK if the number of things is not so big
   – Systematic Identification
      • e.g., phone number, post-code, passport number, product number,
        ISBN
      • If the number of things is big enough
• Requirements for Systematic Identification
   – Identifier is stable and sustainable
   – Uniqueness is guaranteed
   – Identifier publisher is reliable and sustainable

Identification system for Web
• Not so different from conventional identification systems
• Difference
   – Cross-system use
   – Truly digitized

• Requirements for Systematic Identification for web
   – Identifier is stable and sustainable (even after an entity may
     disappear)
   – Uniqueness is guaranteed over all systems
   – A description should be associated with each identifier
       • since entities may not be accessible
   – Identifier publisher is reliable and sustainable



Solutions for the Requirements by LOD
• Requirements for Systematic Identification for
  web
  – 1. Identifier is stable and sustainable (even after an
    entity may disappear)
     • (up to each identifier publisher)
  – 2. Uniqueness is guaranteed over all systems
     • URI (not URN)
  – 3. A description should be associated with each identifier
     • Dereferenceable URI
         – If URI is accessed, a description associated to it should be
           returned
  – 4. Identifier publisher is reliable and sustainable
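
The dereferencing requirement (3) can be sketched as follows. This is only an illustration: it builds the HTTP request without sending it, and the example URI and the `Accept` media types are assumptions that the slides do not prescribe.

```python
from urllib.request import Request


def build_rdf_request(uri: str) -> Request:
    """Build (but do not send) an HTTP GET asking for an RDF description of `uri`."""
    # Content negotiation: prefer Turtle, fall back to RDF/XML.
    accept = "text/turtle, application/rdf+xml;q=0.9"
    return Request(uri, headers={"Accept": accept})


# Hypothetical example URI; any dereferenceable Linked Data URI would do.
req = build_rdf_request("http://dbpedia.org/resource/Daejeon")
print(req.get_method(), req.full_url)
print(req.get_header("Accept"))
```

Passing `req` to `urllib.request.urlopen` would perform the actual lookup; a publisher that satisfies requirement 3 is expected to answer with a description of the identified entity.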


Some examples
ISBN(International Standard Book Number)
 •   Abstract
      – a unique numeric commercial book identifier
      – 13 digits
           • Prefix: 978 or 979 (for compatibility with EAN code)
           • Group(language-sharing country group): 1 to 5 digits
           • Publisher code:
           • Item number:
           • Check digit: 1 digit
      – Management: two layers
           • National ISBN Agency – Publisher
 •   Requirement Satisfaction
      – 1. (Stable ID) Maybe (versioning often matters, and publishers sometimes
         re-use ISBNs)
      – 2. (Unique ID) Uniqueness is guaranteed, but an ISBN is not a URI
      – 3. (Dereferenceable) No mechanism (Amazon provides one instead!)
      – 4. (Reliable publisher) Yes
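
The one-digit check at the end of an ISBN-13 can be computed with the standard EAN-13 weighting (weights 1 and 3 alternating over the first 12 digits). The sketch below is a minimal illustration; the function names are made up for this example.

```python
def isbn13_check_digit(first12: str) -> int:
    """Compute the ISBN-13 check digit from the first 12 digits (hyphens ignored)."""
    digits = [int(c) for c in first12 if c.isdigit()]
    if len(digits) != 12:
        raise ValueError("expected 12 digits before the check digit")
    # Weights alternate 1, 3, 1, 3, ... starting from the leftmost digit.
    total = sum(d * (1 if i % 2 == 0 else 3) for i, d in enumerate(digits))
    return (10 - total % 10) % 10


def is_valid_isbn13(isbn: str) -> bool:
    """Check length, the 978/979 prefix, and the check digit."""
    digits = "".join(c for c in isbn if c.isdigit())
    if len(digits) != 13 or digits[:3] not in ("978", "979"):
        return False
    return isbn13_check_digit(digits[:12]) == int(digits[12])


print(isbn13_check_digit("978-0-306-40615"))  # 7
print(is_valid_isbn13("978-0-306-40615-7"))   # True
```

The check digit only guards against transcription errors; it says nothing about whether the number was ever assigned or re-used.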
Some examples
           DOI (Digital Object Identifier)
•   Abstract
     – An identifier for scientific digital objects (mostly scientific articles)
     – An unfixed string: “prefix/suffix”
          • Prefix: assigned for publishers
          • Suffix: assigned for each object
     – Management: three layers
          • IDF (International DOI Foundation) – Registration Agency – Publisher
•   Requirement Satisfaction
     – 1. (Stable ID) Yes (not re-usable)
     – 2. (Unique ID) Uniqueness is guaranteed and URI accessible
        (http://dx.doi.org/"DOI")
     – 3. (Dereferenceable) Mapping to object pages but no RDF
     – 4. (Reliable publisher) Maybe
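
A rough sketch of how a "prefix/suffix" DOI maps onto the resolver URI mentioned above. The helper names are hypothetical, and the percent-encoding policy (keep "/" unescaped) is an assumption made for this example.

```python
from urllib.parse import quote


def split_doi(doi: str):
    """Split a DOI into (prefix, suffix) at the first slash."""
    prefix, sep, suffix = doi.partition("/")
    if not sep or not prefix or not suffix:
        raise ValueError("a DOI has the form 'prefix/suffix'")
    return prefix, suffix


def doi_to_uri(doi: str, resolver: str = "http://dx.doi.org/") -> str:
    """Turn a DOI string into an HTTP URI served by the resolver."""
    split_doi(doi)  # validates the overall shape
    return resolver + quote(doi, safe="/")


print(split_doi("10.1000/182"))
print(doi_to_uri("10.1000/182"))
```

The resolver redirects such a URI to the publisher's page for the object, which is why requirement 3 is only partly met: a human-readable page is returned, not an RDF description.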



Some examples
                 DBpedia (as Identifier)
•   Abstract
     – A Wikipedia page
     – Name of the Wikipedia page
          • Maintained manually
               – Disambiguation page
               – Redirect page
•   Requirement Satisfaction
     – 1. (Stable ID) Maybe (pages sometimes disappear, sometimes change names,
        sometimes change contents)
     – 2. (Unique ID) Uniqueness is mostly guaranteed and URI accessible
     – 3. (Dereferenceable) RDF
     – 4. (Reliable publisher) Maybe

Identification of relationship between identifiers
• Co-existence of multiple identification systems in a field
     – Difference of coverage
     – Difference of viewpoint
• An entity can have multiple identifiers
• Need for mapping between identifiers in different
  identification systems
• Method: use special properties
     – owl:sameAs, (rdfs:seeAlso, skos:exactMatch)
     – http://sameas.org
• Some problems
     – Logical inconsistency with owl:sameAs
     – Maintenance
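
Because owl:sameAs is symmetric and transitive, a set of sameAs links partitions identifiers into equivalence classes. The following is a minimal union-find sketch of that idea; the class name is hypothetical, and the DBpedia and VIAF URIs are illustrative placeholders rather than verified identifiers.

```python
class SameAsIndex:
    """Group identifiers into equivalence classes induced by owl:sameAs links."""

    def __init__(self):
        self._parent = {}

    def _find(self, x):
        # Register unseen identifiers as their own class representative.
        self._parent.setdefault(x, x)
        while self._parent[x] != x:
            # Path halving keeps later lookups short.
            self._parent[x] = self._parent[self._parent[x]]
            x = self._parent[x]
        return x

    def add_same_as(self, a, b):
        ra, rb = self._find(a), self._find(b)
        if ra != rb:
            self._parent[rb] = ra

    def same(self, a, b):
        return self._find(a) == self._find(b)


lc = "http://id.loc.gov/authorities/names/n79084664"
dbp = "http://dbpedia.org/resource/Natsume_Soseki"  # illustrative placeholder
viaf = "http://viaf.org/viaf/EXAMPLE"               # illustrative placeholder

idx = SameAsIndex()
idx.add_same_as(lc, dbp)
idx.add_same_as(dbp, viaf)
print(idx.same(lc, viaf))  # True, by transitivity
```

The same mechanics show why a single wrong link is costly: one mistaken `add_same_as` call merges two whole classes, which is one source of the logical inconsistency listed above.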



LOD Cloud
(Linking Open Data)




Summary for ID
• Identification is the crucial part in LOD
  – Data availability
  – Data inconsistency
  – Data interoperability
• Establishment of a good identification system
  leads to reliable and sustainable LOD.




Structuring Information
•   A wide range of structuring information
     – Keywords, tags
         • A freely chosen word or phrase just indicating some features
     – Controlled vocabulary
         • Mapping to the fixed set of words or phrases
         • e.g., the list of countries, the name authorities
     – Classification
         • System for classifying entities. Often hierarchical. Class may not carry meaning.
     – Taxonomy
         • Hierarchical term system for classification. Upper/lower relation usually means
            general/specific relation
         • e.g., the subject headings of LC
     – Thesaurus
         • System for semantics. More types of relations: (hypernym, hyponym),
            synonym, antonym, homonym, holonym, meronym
     – Ontology
         • System of concepts. Concepts rather than words. More various relations, the
            definitions of concepts
Examples in Library Science
• Many systems in the library community
• Classification
   – Universal Decimal Classification (UDC)
• Controlled Vocabulary
   – the authority files for person names, organizations, location names
       • Library of Congress: 8 million records, MADS & SKOS
       • British Library: 2.6 million records, foaf & BIO (A vocabulary for
         biographical information)
       • National Diet Library (Japan): 1 million records, foaf
       • Deutsche Nationalbibliothek (DNB, Germany): 1.8 & 1.3 million records
         (names & organization),
       • Virtual International Authority File (VIAF): 4 million records
• Taxonomy
   – Subject Heading: LC, NDL,
       •   Library of Congress: MADS & SKOS
       •   British Library:
       •   National Diet Library (Japan): 0.1 million records, SKOS
       •   Deutsche Nationalbibliothek (DNB, Germany): 0.16 million records
UDC as Linked Data

UDC ELEMENT              SKOS TERM        UDC SUBPROPERTY       DEFINITION
UDC number (notation)    skos:notation    ---                   UDC notation is a combination of symbols (numerals, signs and letters) that
                                                                represent a class, its position in the hierarchy and its relation to other
                                                                classes. Notation is a language-independent indexing term that enables
                                                                mechanical sorting and filing of subjects. Also called 'UDC number' and
                                                                'UDC classmark'
class identifier (URI)   skos:Concept     ---                   A unique identifier assigned to each UDC class. It identifies the relationship
                                                                between a class' meaning and its notational representation
broader class (URI)      skos:broader     ---                   Superordinate class: the class hierarchically above the class in question
caption                  skos:prefLabel   ---                   Verbal description of the class content
including note           skos:note        udc:includingNote     Extension of the caption containing verbal examples of the class content
                                                                (usually a selection of important terms that do not appear in the subdivision)
application note         skos:note        udc:applicationNote   Instructions for number building, further extension and specification of the
                                                                class
scope note               skos:scopeNote   ---                   Note explaining the extent and the meaning of a UDC class. Used to resolve
                                                                ambiguity or to distinguish this class from other similar classes
examples                 skos:example     ---                   Examples of combination are used to illustrate UDC class building, i.e. complex
                                                                subject statements
see also reference       skos:related     ---                   Indication of conceptual relationship between UDC classes from different
                                                                hierarchies

http://udcdata.info/ (69,000 records, 40 languages)

<skos:Concept rdf:about="http://udcdata.info/025553">
  <skos:inScheme rdf:resource="http://udcdata.info/udc-schema"/>
  <skos:broader rdf:resource="http://udcdata.info/025461"/>
  <skos:notation rdf:datatype="http://udcdata.info/UDCnotation">510.6</skos:notation>
  <skos:prefLabel xml:lang="en">Mathematical logic</skos:prefLabel>
  <skos:prefLabel xml:lang="ja">記号論理学</skos:prefLabel>
  <skos:related rdf:resource="http://udcdata.info/000016"/>
</skos:Concept>
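
As a rough illustration of how such a SKOS record can be consumed, the sketch below reads the concept with Python's standard XML parser. The enclosing `rdf:RDF` element and its namespace declarations are added here as an assumption so that the fragment parses on its own; a real application would more likely use an RDF library.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
SKOS = "http://www.w3.org/2004/02/skos/core#"
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

# The slide's fragment, wrapped in an rdf:RDF root (the wrapper is an assumption).
DOC = f"""<rdf:RDF xmlns:rdf="{RDF}" xmlns:skos="{SKOS}">
  <skos:Concept rdf:about="http://udcdata.info/025553">
    <skos:broader rdf:resource="http://udcdata.info/025461"/>
    <skos:notation rdf:datatype="http://udcdata.info/UDCnotation">510.6</skos:notation>
    <skos:prefLabel xml:lang="en">Mathematical logic</skos:prefLabel>
    <skos:prefLabel xml:lang="ja">記号論理学</skos:prefLabel>
  </skos:Concept>
</rdf:RDF>"""

root = ET.fromstring(DOC)
concept = root.find(f"{{{SKOS}}}Concept")

uri = concept.get(f"{{{RDF}}}about")
notation = concept.findtext(f"{{{SKOS}}}notation")
broader = concept.find(f"{{{SKOS}}}broader").get(f"{{{RDF}}}resource")
# One preferred label per language tag.
labels = {el.get(XML_LANG): el.text for el in concept.findall(f"{{{SKOS}}}prefLabel")}

print(uri, notation, broader)
print(labels["en"])
```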
http://id.loc.gov/authorities/names/n79084664.html

<http://id.loc.gov/authorities/names/n79084664> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://www.loc.gov/mads/rdf/v1#PersonalName> .
<http://id.loc.gov/authorities/names/n79084664> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://www.loc.gov/mads/rdf/v1#Authority> .
<http://id.loc.gov/authorities/names/n79084664> <http://www.loc.gov/mads/rdf/v1#authoritativeLabel> "Natsume, Sōseki, 1867-1916"@en .
<http://id.loc.gov/authorities/names/n79084664> <http://www.loc.gov/mads/rdf/v1#elementList> _:bnode7authoritiesnamesn79084664 .
_:bnode7authoritiesnamesn79084664 <http://www.w3.org/1999/02/22-rdf-syntax-ns#first> _:bnode8authoritiesnamesn79084664 .
_:bnode7authoritiesnamesn79084664 <http://www.w3.org/1999/02/22-rdf-syntax-ns#rest> _:bnode010 .
_:bnode8authoritiesnamesn79084664 <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://www.loc.gov/mads/rdf/v1#FullNameElement> .
_:bnode8authoritiesnamesn79084664 <http://www.loc.gov/mads/rdf/v1#elementValue> "Natsume, Sōseki,"@en .
_:bnode010 <http://www.w3.org/1999/02/22-rdf-syntax-ns#first> _:bnode11authoritiesnamesn79084664 .
_:bnode010 <http://www.w3.org/1999/02/22-rdf-syntax-ns#rest> <http://www.w3.org/1999/02/22-rdf-syntax-ns#nil> .
_:bnode11authoritiesnamesn79084664 <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://www.loc.gov/mads/rdf/v1#DateNameElement> .
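
The `elementList` above is an RDF collection: a linked list of blank nodes chained by `rdf:first` and `rdf:rest` and terminated by `rdf:nil`. A minimal sketch of walking it follows; the triples are hard-coded as Python tuples, and the blank node labels are shortened for readability (both are simplifications made for this example).

```python
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
MADS = "http://www.loc.gov/mads/rdf/v1#"
SUBJECT = "http://id.loc.gov/authorities/names/n79084664"

# (subject, predicate, object) tuples mirroring the excerpt above.
TRIPLES = [
    (SUBJECT, MADS + "elementList", "_:b7"),
    ("_:b7", RDF + "first", "_:b8"),
    ("_:b7", RDF + "rest", "_:b10"),
    ("_:b8", RDF + "type", MADS + "FullNameElement"),
    ("_:b8", MADS + "elementValue", "Natsume, Sōseki,"),
    ("_:b10", RDF + "first", "_:b11"),
    ("_:b10", RDF + "rest", RDF + "nil"),
    ("_:b11", RDF + "type", MADS + "DateNameElement"),
]


def objects(subject, predicate):
    """Return all objects of triples matching the given subject and predicate."""
    return [o for (s, p, o) in TRIPLES if s == subject and p == predicate]


def rdf_list_items(head):
    """Follow rdf:first / rdf:rest from `head` until rdf:nil is reached."""
    items = []
    node = head
    while node != RDF + "nil":
        items.append(objects(node, RDF + "first")[0])
        node = objects(node, RDF + "rest")[0]
    return items


head = objects(SUBJECT, MADS + "elementList")[0]
elements = rdf_list_items(head)
kinds = [objects(e, RDF + "type")[0].split("#")[1] for e in elements]
print(kinds)  # ['FullNameElement', 'DateNameElement']
```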
http://id.loc.gov/authorities/subjects/sh85008180.html




http://data.bnf.fr/11932084/intelligence_artificielle/




Some examples
    Scientific Names for Species and Taxa
• Abstract
   – Names for biological species and other taxa (kingdom, division, class,
     order, family, tribe, genus)
   – A string
       • Binomial name for species
       • Academic societies maintain taxon names individually
   – E.g., Papilio xuthus (Asian Swallowtail, ナミアゲハ, 호랑나비)

•   Requirement Satisfaction
     – 1. (Stable ID) Mostly yes (names sometimes disappear, change, or change in content)
     – 2. (Unique ID) Uniqueness is generally guaranteed but, precisely speaking, some
       ambiguity remains because names change
     – 3. (Dereferenceable) No. Many systems exist but none covers all species
     – 4. (Reliable publisher) Maybe



Taxon                     Plants      Algae       Fungi        Animals
Domain
Kingdom
Division/Phylum           -phyta      -phyta      -mycota
Subdivision/Subphylum     -phytina    -phytina    -mycotina
Class                     -opsida     -phyceae    -mycetes
Subclass                  -idae       -phycidae   -mycetidae
Order                     -ales       -ales       -ales
Suborder                  -ineae      -ineae      -ineae
Superfamily               -acea       -acea       -acea        -oidea
Family                    -aceae      -aceae      -aceae       -idae
Subfamily                 -oideae     -oideae     -oideae      -inae
Tribe                     -eae        -eae        -eae         -ini
Subtribe                  -inae       -inae       -inae        -ina
Genus
Subgenus
Species
Subspecies
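
The suffix conventions in the table can be read as a small lookup problem. The sketch below is a heuristic illustration only (the function name and the restriction to the plant and animal columns are choices made for this example); it also shows why the name string alone is ambiguous: "-idae" marks a subclass for plants but a family for animals.

```python
# Suffix-to-rank tables copied from the plant and animal columns above.
SUFFIXES = {
    "plants": {
        "phyta": "Division", "phytina": "Subdivision", "opsida": "Class",
        "idae": "Subclass", "ales": "Order", "ineae": "Suborder",
        "acea": "Superfamily", "aceae": "Family", "oideae": "Subfamily",
        "eae": "Tribe", "inae": "Subtribe",
    },
    "animals": {
        "oidea": "Superfamily", "idae": "Family", "inae": "Subfamily",
        "ini": "Tribe", "ina": "Subtribe",
    },
}


def guess_rank(name: str, group: str):
    """Guess the rank of a taxon name from its suffix; None if no suffix matches."""
    table = SUFFIXES[group]
    # Try longer suffixes first so that "-aceae" wins over "-eae".
    for suffix in sorted(table, key=len, reverse=True):
        if name.lower().endswith(suffix):
            return table[suffix]
    return None


print(guess_rank("Rosaceae", "plants"))       # Family
print(guess_rank("Rosidae", "plants"))        # Subclass
print(guess_rank("Papilionidae", "animals"))  # Family
print(guess_rank("Papilio", "animals"))       # None (genus names carry no suffix)
```

Ranks without a standard suffix (genus, species) cannot be recognized this way, and an unrelated name may end in one of these suffixes by coincidence, so the result is a guess rather than an identification.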
Ontology
An ontology is an explicit specification of a
  conceptualization [Gruber]
      An ontology is an explicit specification of a conceptualization. The
       term is borrowed from philosophy, where an Ontology is a systematic
       account of Existence. For AI systems, what "exists" is that which can
       be represented. When the knowledge of a domain is represented in a
       declarative formalism, the set of objects that can be represented is
       called the universe of discourse. This set of objects, and the
       describable relationships among them, are reflected in the
       representational vocabulary with which a knowledge-based program
       represents knowledge. Thus, in the context of AI, we can describe the
       ontology of a program by defining a set of representational terms. In
       such an ontology, definitions associate the names of entities in the
       universe of discourse (e.g., classes, relations, functions, or other
       objects) with human-readable text describing what the names mean,
       and formal axioms that constrain the interpretation and well-formed
       use of these terms. Formally, an ontology is the statement of a logical
       theory.



Conceptualization
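[Figure: three alternative conceptualizations of a world of colored boxes on a desk]
  1. Classes: object > box > {red box, blue box, yellow box}
     Relations: on_desk(A), on(A, B), put(A, B)
  2. Classes: object > box, where box has the attribute color:{red, blue, yellow}
     Relations: on_desk(A), on(A, B), put(A, B)
  3. Classes: object > {box, desk}, where box has the attribute color:{red, blue, yellow}
     Relations: on(A/box, B/object), put(A/box, B/object)

There are many possible ways to conceptualize the target world
                 Trade-off between generality and efficiency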
Types of Ontologies
• Upper (top-level) ontology vs. Domain ontology
   – Upper Ontology: A common ontology throughout all domains
   – Domain Ontology: An ontology which is meaningful in a specific
     domain
• Object ontology vs. Task ontology
   – Object Ontology: An ontology on “things” and “events”
   – Task Ontology: An ontology on “doing”
• Heavy-weight ontology vs. light-weight ontology
   – Heavy-weight ontology: fully described ontology including
     concept definitions and relations, in particular in a logical way
   – Light-weight ontology: partially described ontology including
     typically only is-a relations
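
A light-weight ontology in this sense can be sketched as nothing more than an is-a table plus a transitive lookup. The concept names below are borrowed from the top levels of SUMO, and the single-parent dictionary is a simplifying assumption made for this example.

```python
# child -> parent; each concept has at most one parent in this simplified sketch.
IS_A = {
    "Physical": "Entity",
    "Abstract": "Entity",
    "Object": "Physical",
    "Process": "Physical",
}


def ancestors(concept: str):
    """Return the chain of superclasses, nearest first."""
    chain = []
    while concept in IS_A:
        concept = IS_A[concept]
        chain.append(concept)
    return chain


def is_a(sub: str, sup: str) -> bool:
    """True if `sub` equals `sup` or is (transitively) subsumed by it."""
    return sub == sup or sup in ancestors(sub)


print(ancestors("Object"))          # ['Physical', 'Entity']
print(is_a("Process", "Entity"))    # True
print(is_a("Abstract", "Physical")) # False
```

A heavy-weight ontology would add what this table lacks: definitions of the concepts and logical axioms that constrain how they may be used.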



Top-level ontology
• Ontology which covers all of the world!
• Very... difficult
  – e.g., how does a thing exist?
     • Is a thing a four-dimensional existence?
     • Does a thing exist three-dimensionally over time?
• Common requirements
  – A small number of concepts can cover the world
  – Concepts can be used in lower ontologies

  – Concepts should be general and abstract

Top-level ontology
•   Three approaches
     – Formal approach
          • Logical formalization
          • Fully Abstract
          • Pros: clean
          • Cons: hardly understandable
          • e.g., Sowa’s top-level ontology, DOLCE
     – Linguistic approach
          • Use and extension of linguistic concepts
          • Partially abstract and partially general
          • Pros: understandable
          • Cons: limitation to the linguistic world
          • e.g., Penman Upper Model, WordNet
     – Empirical Approach
          • Use and extension of everyday concepts
          • Mostly general
          • Pros: understandable and applicable to all the world
          • Cons: lack of solid foundation
          • e.g. SUMO, Cyc, EDR


Empirical top-level ontology
• SUMO (Suggested Upper Merged Ontology)
  – Collection and organization of concepts used frequently
  – Simple relationship between concepts

[Figure: excerpt of the SUMO concept hierarchy. Entity divides into Physical and Abstract;
 Physical divides into Object and Process.
 Object-side nodes shown: SelfConnectedObject, Substance, CorpuscularObject, Organic,
 Inorganic, Collection.
 Process-side nodes shown: NaturalProcess, BiologicalProcess, PhysiologicProcess,
 PathologicProcess, IntentionallyCausedProcess, ChangeOfPossession, Searching,
 SocialInteraction, Communication, Cooperation, Contest, Meeting, Motion, Transfer,
 Transportation, Impelling, Impacting, Putting, Removing, BringingTogether, Separating,
 ChangeOfState.]
Formal Ontology: DOLCE
• DOLCE(a Descriptive Ontology for Linguistic
  and Cognitive Engineering)
  – Intended as a reference system for top-level
    ontologies
  – Logical definition
  – Particular (DOLCE) vs. Universal
     • Particular: ontology about things, phenomena, quality…
     • Universal: ontology for describing particulars, such as
       categories and attributes


Formal Ontology: DOLCE
•   Concepts
     – Endurant / Perdurant / Quality / Abstract
          • Endurant:
              – “Things”
              – An existence over time
              – May change its attributes
          • Perdurant:
              – “Process”
              – No change over time
              – May switch a part to the other
•   Relations
     – Parthood (abstract or perdurant)
     – Temporary parthood (endurant)
     – Constitution (endurant or perdurant)
     – Participation between perdurant and endurant

[Figure: DOLCE taxonomy of basic categories]
ALL Entity
  ED Endurant
    PED Physical Endurant
      M Amount of Matter
      F Feature
      POB Physical Object
        APO Agentive Physical Object
        NAPO Non-agentive Physical Object
    NPED Non-physical Endurant
      NPOB Non-physical Object
        MOB Mental Object
        SOB Social Object
    AS Arbitrary Sum
  PD Perdurant (Occurrence)
    EV Event
      ACH Achievement
      ACC Accomplishment
    STV Stative
      ST State
      PRO Process
  Q Quality
    TQ Temporal Quality
      TL Temporal Location
    PQ Physical Quality
      SL Spatial Location
    AQ Abstract Quality
  AB Abstract
    Fact
    Set
    R Region
      TR Temporal Region
        T Time Interval
      PR Physical Region
        S Space Region
      AR Abstract Region
Linguistic top-level ontology
• WordNet
   – A lexical reference system
         • “Link-based electronic dictionary”
 http://www.cogsci.princeton.edu/cgi-bin/webwn



   – Concepts
         • synset
               – Noun 79,689
               – Verb 13,508
   – Relations
         • synonym
         • hypernym/hyponym (is-a)
         • holonym/meronym (a-part-of)
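The hypernym/hyponym (is-a) relation listed above can be followed step by step from a synset toward the top level. A minimal sketch, assuming a hand-written toy table rather than real WordNet data; the entries and the function names `hypernym_chain` and `is_a` are illustrative only:

```python
# Toy is-a table: each synset name maps to its direct hypernym.
# These entries are illustrative and were NOT extracted from WordNet.
HYPERNYM = {
    "dog": "canine",
    "canine": "carnivore",
    "carnivore": "mammal",
    "mammal": "animal",
    "animal": "entity",
}


def hypernym_chain(synset):
    """Return the hypernyms of `synset`, nearest first, up to the root."""
    chain = []
    seen = {synset}
    while synset in HYPERNYM:
        synset = HYPERNYM[synset]
        if synset in seen:  # guard against cycles in malformed data
            break
        seen.add(synset)
        chain.append(synset)
    return chain


def is_a(synset, ancestor):
    """True if `ancestor` is reachable from `synset` via hypernym links."""
    return ancestor in hypernym_chain(synset)
```

In this toy table every synset has a single hypernym; a real lexical network may give a synset several, in which case the chain becomes a graph traversal.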


Linguistic top-level ontology
• WordNet
    – Top-level
        • { entity, physical thing (that which is perceived or known or inferred to
          have its own physical existence (living or nonliving)) }
        • { psychological_feature, (a feature of the mental life of a living organism) }
        • { abstraction, (a general concept formed by extracting common features
          from specific examples) }
        • { state, (the way something is with respect to its main attributes; "the
          current state of knowledge"; "his state of health"; "in a weak financial
          state") }
        • { event, (something that happens at a given place and time) }
        • { act, human_action, human_activity, (something that people do or cause
          to happen) }
        • { group, grouping, (any number of entities (members) considered as a
          unit) }
        • { possession, (anything owned or possessed) }
        • { phenomenon, (any state or process known through the senses rather
          than by intuition or reasoning) }


Summary for structuring information
• Keywords, tags / Controlled vocabulary / Classification / Taxonomy / Thesaurus / Ontology
  – The differences among them are not clear-cut, and not important
  – The trend is toward the more structured ones
  – The same requirements apply as for identification systems




Summary
• Requirements for Successful Structuring Systems
  – 1. Entity is stable and sustainable
  – 2. Uniqueness is guaranteed over all systems
  – 3. Description should be associated with the entity
  – 4. System publisher is reliable and sustainable
     • Learn from success in the library community
• LOD Tech. can help




Schema/Vocabulary for LOD
• Class/Concept description
  – Axiom of a concept in ontology
  – Database schema for a table in Relational database
  – Object definition in Object-Oriented Programming/DB
• Class description in Semantic Web
  – RDFS/OWL description for a class
     • RDFS: Simple class system
     • OWL: Description Logic-based
• Class description in Linked Data
  – Mostly RDFS-based (exception: owl:sameAs)
  – Simple Structure (mostly property-value pair)
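The "property-value pair" structure mentioned above can be sketched with plain triples. This is only an illustration under stated assumptions: the `ex:` names, the literal values, and the helper `describe` are hypothetical, and prefixed names are kept as plain strings instead of full IRIs.

```python
# Each statement is a (subject, property, value) triple.
# All ex: names and literal values are made up for illustration.
TRIPLES = [
    # A simple RDFS-style class description
    ("ex:Book", "rdf:type", "rdfs:Class"),
    ("ex:Book", "rdfs:label", "Book"),
    # An instance described by property-value pairs
    ("ex:book1", "rdf:type", "ex:Book"),
    ("ex:book1", "dc:title", "An Example Title"),
    ("ex:book1", "dc:creator", "An Example Author"),
]


def describe(subject, triples):
    """Collect the property-value pairs stated about `subject`."""
    pairs = {}
    for s, p, o in triples:
        if s == subject:
            # A property may occur several times, so keep a list of values
            pairs.setdefault(p, []).append(o)
    return pairs
```

Calling `describe("ex:book1", TRIPLES)` returns the instance's properties grouped by name, which is roughly the flat record most Linked Data descriptions amount to.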

Schema/Vocabulary for LOD
• The importance of sharing schema
  – Interoperability
  – Generic applications
• Some well-known and frequently used schemata
  – Dublin Core
  – FOAF (Friend-Of-A-Friend)
  – SKOS (Simple Knowledge Organization System)



Usage of Common Vocabularies
Prefix     Namespace                                   Used by
dc         http://purl.org/dc/elements/1.1/            66 (31.88 %)
foaf       http://xmlns.com/foaf/0.1/                  55 (26.57 %)
dcterms    http://purl.org/dc/terms/                   38 (18.36 %)
skos       http://www.w3.org/2004/02/skos/core#        29 (14.01 %)
akt        http://www.aktors.org/ontology/portal#      17 (8.21 %)
geo        http://www.w3.org/2003/01/geo/wgs84_pos#    14 (6.76 %)
mo         http://purl.org/ontology/mo/                13 (6.28 %)
bibo       http://purl.org/ontology/bibo/               8 (3.86 %)
vcard      http://www.w3.org/2006/vcard/ns#             6 (2.90 %)
frbr       http://purl.org/vocab/frbr/core#             5 (2.42 %)
sioc       http://rdfs.org/sioc/ns#                     4 (1.93 %)
                               LDOW2011 Presentation, Christian Bizer (Freie Universität Berlin), 2011
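Each prefix in the table stands for a namespace IRI, so a prefixed name such as foaf:name is shorthand for the namespace followed by the local name. A minimal sketch of that expansion, using a subset of the namespaces from the table; the function name `expand` is an assumption made for this example:

```python
# Prefix-to-namespace mapping, copied from a few rows of the table above.
NAMESPACES = {
    "dc": "http://purl.org/dc/elements/1.1/",
    "foaf": "http://xmlns.com/foaf/0.1/",
    "dcterms": "http://purl.org/dc/terms/",
    "skos": "http://www.w3.org/2004/02/skos/core#",
    "geo": "http://www.w3.org/2003/01/geo/wgs84_pos#",
}


def expand(prefixed_name, namespaces=NAMESPACES):
    """Expand 'prefix:local' into a full IRI using the namespace table."""
    prefix, sep, local = prefixed_name.partition(":")
    if not sep or prefix not in namespaces:
        raise KeyError(f"unknown prefix in {prefixed_name!r}")
    return namespaces[prefix] + local
```

Because two datasets that both use `foaf:name` expand it to the same IRI, a generic application can treat the property identically in both, which is the interoperability argument for sharing schemata.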
(Simple) Dublin Core
• Started from the library community
• Now maintained by DCMI (Dublin Core Metadata Initiative)
• (Simple) Dublin Core
   – Just 15 elements
   – Simple is best
   – No range restriction
   – http://purl.org/dc/elements/1.1/
• 15 elements
   – Title
   – Creator
   – Subject
   – Description
   – Publisher
   – Contributor
   – Date
   – Type
   – Format
   – Identifier
   – Source
   – Language
   – Relation
   – Coverage
   – Rights
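Since Simple Dublin Core has just 15 elements and no range restriction, a record is little more than a mapping from element names to free-form values. A minimal sketch under that reading; the helper `unknown_elements` and the dictionary layout are assumptions of this example, not part of Dublin Core itself:

```python
# The 15 Simple Dublin Core elements, in lowercase as used in the
# http://purl.org/dc/elements/1.1/ namespace.
DC_ELEMENTS = {
    "title", "creator", "subject", "description", "publisher",
    "contributor", "date", "type", "format", "identifier",
    "source", "language", "relation", "coverage", "rights",
}


def unknown_elements(record):
    """Return the keys of `record` that are not Simple Dublin Core elements."""
    return sorted(set(record) - DC_ELEMENTS)


# No range restriction: any value is accepted, only the element names are checked.
record = {
    "title": "Identity and schema for Linked Data",
    "creator": "Hideaki Takeda",
    "language": "en",
}
```

Here `unknown_elements(record)` returns an empty list, while a record with a key such as `"author"` would have that key reported.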




dc terms
    • Qualified Dublin Core
           – Domain & Range
           – More precise terms
                  • Extension of simple dc

Properties in the /terms/ namespace
    abstract, accessRights, accrualMethod, accrualPeriodicity, accrualPolicy, alternative, audience,
    available, bibliographicCitation, conformsTo, contributor, coverage, created, creator, date,
    dateAccepted, dateCopyrighted, dateSubmitted, description, educationLevel, extent, format,
    hasFormat, hasPart, hasVersion, identifier, instructionalMethod, isFormatOf, isPartOf,
    isReferencedBy, isReplacedBy, isRequiredBy, issued, isVersionOf, language, license, mediator,
    medium, modified, provenance, publisher, references, relation, replaces, requires, rights,
    rightsHolder, source, spatial, subject, tableOfContents, temporal, title, type, valid
Properties in the /elements/1.1/ namespace
    contributor, coverage, creator, date, description, format, identifier, language, publisher,
    relation, rights, source, subject, title, type
Vocabulary Encoding Schemes
    DCMIType, DDC, IMT, LCC, LCSH, MESH, NLM, TGN, UDC
Syntax Encoding Schemes
    Box, ISO3166, ISO639-2, ISO639-3, Period, Point, RFC1766, RFC3066, RFC4646, RFC5646, URI, W3CDTF
Classes
    Agent, AgentClass, BibliographicResource, FileFormat, Frequency, Jurisdiction, LicenseDocument,
    LinguisticSystem, Location, LocationPeriodOrJurisdiction, MediaType, MediaTypeOrExtent,
    MethodOfAccrual, MethodOfInstruction, PeriodOfTime, PhysicalMedium, PhysicalResource, Policy,
    ProvenanceStatement, RightsStatement, SizeOrDuration, Standard
DCMI Type Vocabulary
    Collection, Dataset, Event, Image, InteractiveResource, MovingImage, PhysicalObject, Service,
    Software, Sound, StillImage, Text
Terms related to the DCMI Abstract Model
    memberOf, VocabularyEncodingScheme
Dcterms                 subPropertyOf                         Domain                          Range
contributor             dc:contributor                        rdfs:Resource                   dcterms:Agent
creator                 dc:creator, dcterms:contributor       rdfs:Resource                   dcterms:Agent
coverage                dc:coverage                           rdfs:Resource                   dcterms:LocationPeriodOrJurisdiction
spatial                 dc:coverage, dcterms:coverage         rdfs:Resource                   dcterms:Location
temporal                dc:coverage, dcterms:coverage         rdfs:Resource                   dcterms:PeriodOfTime
date                    dc:date                               rdfs:Resource                   rdfs:Literal
available               dc:date, dcterms:date                 rdfs:Resource                   rdfs:Literal
created                 dc:date, dcterms:date                 rdfs:Resource                   rdfs:Literal
dateAccepted            dc:date, dcterms:date                 rdfs:Resource                   rdfs:Literal
dateCopyrighted         dc:date, dcterms:date                 rdfs:Resource                   rdfs:Literal
dateSubmitted           dc:date, dcterms:date                 rdfs:Resource                   rdfs:Literal
issued                  dc:date, dcterms:date                 rdfs:Resource                   rdfs:Literal
modified                dc:date, dcterms:date                 rdfs:Resource                   rdfs:Literal
valid                   dc:date, dcterms:date                 rdfs:Resource                   rdfs:Literal
description             dc:description                        rdfs:Resource                   rdfs:Resource
abstract                dc:description, dcterms:description   rdfs:Resource                   rdfs:Resource
tableOfContents         dc:description, dcterms:description   rdfs:Resource                   rdfs:Resource
format                  dc:format                             rdfs:Resource                   dcterms:MediaTypeOrExtent
extent                  dc:format, dcterms:format             rdfs:Resource                   dcterms:SizeOrDuration
medium                  dc:format, dcterms:format             dcterms:PhysicalResource        dcterms:PhysicalMedium
identifier              dc:identifier                         rdfs:Resource                   rdfs:Literal
bibliographicCitation   dc:identifier, dcterms:identifier     dcterms:BibliographicResource   rdfs:Literal
language                dc:language                           rdfs:Resource                   dcterms:LinguisticSystem
publisher               dc:publisher                          rdfs:Resource                   dcterms:Agent
relation                dc:relation                           rdfs:Resource                   rdfs:Resource
source                  dc:source, dcterms:relation           rdfs:Resource                   rdfs:Resource
conformsTo              dc:relation, dcterms:relation         rdfs:Resource                   dcterms:Standard
hasFormat               dc:relation, dcterms:relation         rdfs:Resource                   rdfs:Resource
hasPart                 dc:relation, dcterms:relation         rdfs:Resource                   rdfs:Resource
hasVersion              dc:relation, dcterms:relation         rdfs:Resource                   rdfs:Resource
isFormatOf              dc:relation, dcterms:relation         rdfs:Resource                   rdfs:Resource
isPartOf                dc:relation, dcterms:relation         rdfs:Resource                   rdfs:Resource
isReferencedBy          dc:relation, dcterms:relation         rdfs:Resource                   rdfs:Resource
isReplacedBy            dc:relation, dcterms:relation         rdfs:Resource                   rdfs:Resource
isRequiredBy            dc:relation, dcterms:relation         rdfs:Resource                   rdfs:Resource
isVersionOf             dc:relation, dcterms:relation         rdfs:Resource                   rdfs:Resource
references              dc:relation, dcterms:relation         rdfs:Resource                   rdfs:Resource
replaces                dc:relation, dcterms:relation         rdfs:Resource                   rdfs:Resource
requires                dc:relation, dcterms:relation         rdfs:Resource                   rdfs:Resource
rights                  dc:rights                             rdfs:Resource                   dcterms:RightsStatement
accessRights            dc:rights, dcterms:rights             rdfs:Resource                   dcterms:RightsStatement
license                 dc:rights, dcterms:rights             rdfs:Resource                   dcterms:LicenseDocument
subject                 dc:subject                            rdfs:Resource                   rdfs:Resource
title                   dc:title                              rdfs:Resource                   rdfs:Resource / rdfs:Literal
alternative             dc:title, dcterms:title               rdfs:Resource                   rdfs:Resource / rdfs:Literal
type                    dc:type                               rdfs:Resource                   rdfs:Class
audience                (none)                                rdfs:Resource                   dcterms:AgentClass
educationLevel          dcterms:audience                      rdfs:Resource                   dcterms:AgentClass
mediator                dcterms:audience                      rdfs:Resource                   dcterms:AgentClass
accrualMethod           (none)                                dcmitype:Collection             dcterms:MethodOfAccrual
accrualPeriodicity      (none)                                dcmitype:Collection             dcterms:Frequency
accrualPolicy           (none)                                dcmitype:Collection             dcterms:Policy
instructionalMethod     (none)                                rdfs:Resource                   dcterms:MethodOfInstruction
provenance              (none)                                rdfs:Resource                   dcterms:ProvenanceStatement
rightsHolder            (none)                                rdfs:Resource                   dcterms:Agent

http://dublincore.org/documents/dcmi-terms/
http://www.kanzaki.com/docs/sw/dc-domain-range.html
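The subPropertyOf and Range columns are what allow a consumer to derive extra statements from a single dcterms triple. A minimal sketch of that RDFS-style inference, using only the creator and contributor rows of the table; the `ex:` resources and the function name `infer` are hypothetical, and prefixed names are kept as plain strings:

```python
# Declarations taken from the creator and contributor rows of the table.
SUBPROPERTY_OF = {
    "dcterms:creator": ["dc:creator", "dcterms:contributor"],
    "dcterms:contributor": ["dc:contributor"],
}
RANGE = {
    "dcterms:creator": "dcterms:Agent",
    "dcterms:contributor": "dcterms:Agent",
}


def infer(triples):
    """Apply subPropertyOf and range rules until no new triple appears."""
    inferred = set(triples)
    changed = True
    while changed:
        changed = False
        for s, p, o in list(inferred):
            # subPropertyOf: the same subject and object also hold for each super-property
            new = [(s, sup, o) for sup in SUBPROPERTY_OF.get(p, [])]
            # range: the object is an instance of the declared range class
            if p in RANGE:
                new.append((o, "rdf:type", RANGE[p]))
            for t in new:
                if t not in inferred:
                    inferred.add(t)
                    changed = True
    return inferred
```

Starting from the single triple `("ex:doc", "dcterms:creator", "ex:alice")`, the sketch also yields the dc:creator, dcterms:contributor and dc:contributor statements and types `ex:alice` as a dcterms:Agent. Domain inference is left out because most rows have the uninformative domain rdfs:Resource.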
The Friend of a Friend (FOAF)
• Metadata describing persons and their relationships
• Voluntary project

Classes:
  Agent | Document | Group | Image | LabelProperty | OnlineAccount | OnlineChatAccount |
  OnlineEcommerceAccount | OnlineGamingAccount | Organization | Person |
  PersonalProfileDocument | Project

Properties:
  account | accountName | accountServiceHomepage | age | aimChatID | based_near | birthday |
  currentProject | depiction | depicts | dnaChecksum | familyName | family_name | firstName |
  focus | fundedBy | geekcode | gender | givenName | givenname | holdsAccount | homepage |
  icqChatID | img | interest | isPrimaryTopicOf | jabberID | knows | lastName | logo | made |
  maker | mbox | mbox_sha1sum | member | membershipClass | msnChatID | myersBriggs | name |
  nick | openid | page | pastProject | phone | plan | primaryTopic | publications |
  schoolHomepage | sha1 | skypeID | status | surname | theme | thumbnail | tipjar | title |
  topic | topic_interest | weblog | workInfoHomepage | workplaceHomepage | yahooChatID

Example (Turtle):
    @prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
    @prefix foaf: <http://xmlns.com/foaf/0.1/> .
    @prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .

    <#JW>
      a foaf:Person ;
      foaf:name "Jimmy Wales" ;
      foaf:mbox <mailto:jwales@bomis.com> ;
      foaf:homepage <http://www.jimmywales.com/> ;
      foaf:nick "Jimbo" ;
      foaf:depiction <http://www.jimmywales.com/aus_img_small.jpg> ;
      foaf:interest <http://www.wikimedia.org> ;
      foaf:knows [
        a foaf:Person ;
        foaf:name "Angela Beesley"
      ] .

    <http://www.wikimedia.org>
      rdfs:label "Wikipedia" .
SKOS (Simple Knowledge Organization System)
• Metadata for taxonomy
  – Hierarchical structure of concepts
     • Invented to represent taxonomies such as subject headings
     • Not the same as the subclass relationship among classes
• W3C Recommendation, 18 August 2009




SKOS (Simple Knowledge Organization System)
• SKOS Core (hierarchical concept structure)
  – skos:semanticRelation
  – skos:broaderTransitive (subPropertyOf skos:semanticRelation)
  – skos:narrowerTransitive (subPropertyOf skos:semanticRelation)
  – skos:broader (subPropertyOf skos:broaderTransitive)
  – skos:narrower (subPropertyOf skos:narrowerTransitive)
  – skos:related (subPropertyOf skos:semanticRelation)
  – skos:prefLabel
  – skos:altLabel
  – skos:hiddenLabel
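In SKOS, skos:broader is not itself transitive; skos:broaderTransitive is its transitive super-property, so the full set of ancestors of a concept has to be computed from the direct links. A minimal sketch of that computation, assuming a toy concept scheme; the `ex:` concepts and the function name `broader_transitive` are hypothetical:

```python
# Direct skos:broader assertions: concept -> directly broader concepts.
# The ex: concepts are made up for illustration.
BROADER = {
    "ex:cats": {"ex:mammals"},
    "ex:mammals": {"ex:animals"},
    "ex:animals": set(),
}


def broader_transitive(concept, broader=BROADER):
    """Return every concept reachable from `concept` via skos:broader links."""
    result = set()
    stack = list(broader.get(concept, ()))
    while stack:
        current = stack.pop()
        if current not in result:  # also protects against cycles
            result.add(current)
            stack.extend(broader.get(current, ()))
    return result
```

With this data, `ex:cats` has only `ex:mammals` as a direct broader concept, but `ex:mammals` and `ex:animals` as transitive ones.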
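SKOS (Simple Knowledge Organization System)
• SKOS Mapping
  – skos:mappingRelation
  – skos:closeMatch (subPropertyOf skos:mappingRelation)
  – skos:exactMatch (subPropertyOf skos:closeMatch)
  – skos:broadMatch (subPropertyOf skos:mappingRelation)
  – skos:narrowMatch (subPropertyOf skos:mappingRelation)
  – skos:relatedMatch (subPropertyOf skos:mappingRelation)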



Linked Open Vocabularies (LOV)
• A technical platform for search and quality assessment within the vocabulary ecosystem
  – Register schemata
  – Search schemata
• http://labs.mondeca.com/dataset/lov/




More Info.
• http://www.w3.org/2005/Incubator/lld/wiki/Vocabulary_and_Dataset




Summary for schema
• Some major schemata
  – DC, DC terms, FOAF, SKOS …
• More domain-specific schemata
  – CIDOC CRM
  – PRISM
  –…
• Reuse is highly recommended
  – LOV

Summary
• Three layers
  – Ontology/Thesaurus/Taxonomy
  – Schema
  – Identification
• Not just top-down, but rather bottom-up
• Each layer has its own role
• Do not pursue the value of each layer alone; rather, make a good combination of them


More Related Content

Viewers also liked

Next Generation Embryology
Next Generation EmbryologyNext Generation Embryology
Next Generation EmbryologyJano van Hemert
 
Bradley witham lesson 9
Bradley witham lesson 9Bradley witham lesson 9
Bradley witham lesson 9Brad Witham
 
Suffixes of the Nervous System
Suffixes of the Nervous SystemSuffixes of the Nervous System
Suffixes of the Nervous Systemcpreis
 
US Air Force, NASA, Russian Accademy of Science Letters
US Air Force, NASA, Russian Accademy of Science LettersUS Air Force, NASA, Russian Accademy of Science Letters
US Air Force, NASA, Russian Accademy of Science LettersThane Heins
 
Message from the Mountaintop #MCSS17
Message from the Mountaintop #MCSS17Message from the Mountaintop #MCSS17
Message from the Mountaintop #MCSS17mshomakerteach
 
Christian funny stories jokes
Christian funny stories jokesChristian funny stories jokes
Christian funny stories jokesJoelalisen
 
Glaciation in the Nant Ffrancon Valley
Glaciation in the Nant Ffrancon ValleyGlaciation in the Nant Ffrancon Valley
Glaciation in the Nant Ffrancon ValleyRCha
 
Mycotoxins and mycetism
Mycotoxins and mycetismMycotoxins and mycetism
Mycotoxins and mycetismDr. Komal Lohi
 
Must vs Don't have to IIº ppt
Must  vs  Don't have to  IIº   pptMust  vs  Don't have to  IIº   ppt
Must vs Don't have to IIº pptmluisa007
 
Cardiac development final
Cardiac development finalCardiac development final
Cardiac development finalSandip Gupta
 
Multimodal Semiotics
Multimodal SemioticsMultimodal Semiotics
Multimodal Semioticssvngl
 

Viewers also liked (20)

e-Science Research
e-Science Researche-Science Research
e-Science Research
 
Next Generation Embryology
Next Generation EmbryologyNext Generation Embryology
Next Generation Embryology
 
Musicography
MusicographyMusicography
Musicography
 
Anatomia cardiaca
Anatomia cardiacaAnatomia cardiaca
Anatomia cardiaca
 
Bradley witham lesson 9
Bradley witham lesson 9Bradley witham lesson 9
Bradley witham lesson 9
 
Suffixes of the Nervous System
Suffixes of the Nervous SystemSuffixes of the Nervous System
Suffixes of the Nervous System
 
US Air Force, NASA, Russian Accademy of Science Letters
US Air Force, NASA, Russian Accademy of Science LettersUS Air Force, NASA, Russian Accademy of Science Letters
US Air Force, NASA, Russian Accademy of Science Letters
 
Message from the Mountaintop #MCSS17
Message from the Mountaintop #MCSS17Message from the Mountaintop #MCSS17
Message from the Mountaintop #MCSS17
 
Equinos
EquinosEquinos
Equinos
 
Christian funny stories jokes
Christian funny stories jokesChristian funny stories jokes
Christian funny stories jokes
 
Muttawali Waqf
Muttawali Waqf Muttawali Waqf
Muttawali Waqf
 
Glaciation in the Nant Ffrancon Valley
Glaciation in the Nant Ffrancon ValleyGlaciation in the Nant Ffrancon Valley
Glaciation in the Nant Ffrancon Valley
 
Development of occlusion1
Development of occlusion1Development of occlusion1
Development of occlusion1
 
Mycotoxins and mycetism
Mycotoxins and mycetismMycotoxins and mycetism
Mycotoxins and mycetism
 
Myelophthisic
MyelophthisicMyelophthisic
Myelophthisic
 
Must vs Don't have to IIº ppt
Must  vs  Don't have to  IIº   pptMust  vs  Don't have to  IIº   ppt
Must vs Don't have to IIº ppt
 
Cardiac development final
Cardiac development finalCardiac development final
Cardiac development final
 
Pteropsida
PteropsidaPteropsida
Pteropsida
 
Lecture 08
Lecture 08Lecture 08
Lecture 08
 
Multimodal Semiotics
Multimodal SemioticsMultimodal Semiotics
Multimodal Semiotics
 

Similar to Schema and Identity for Linked Data

Resource and Metadata Management with a Linked Data perspective
Resource and Metadata Management with a Linked Data perspectiveResource and Metadata Management with a Linked Data perspective
Resource and Metadata Management with a Linked Data perspectiveHannes Ebner
 
Knowledge Representation, Semantic Web
Knowledge Representation, Semantic WebKnowledge Representation, Semantic Web
Knowledge Representation, Semantic WebSerendipity Seraph
 
Large-Scale Semantic Search
Large-Scale Semantic SearchLarge-Scale Semantic Search
Large-Scale Semantic SearchRoi Blanco
 
Web 3 final(1)
Web 3 final(1)Web 3 final(1)
Web 3 final(1)Venky Dood
 
Building OBO Foundry ontology using semantic web tools
Building OBO Foundry ontology using semantic web toolsBuilding OBO Foundry ontology using semantic web tools
Building OBO Foundry ontology using semantic web toolsMelanie Courtot
 
Practical Information Architecture
Practical Information ArchitecturePractical Information Architecture
Practical Information ArchitectureRob Bogue
 
An Introduction to NOSQL, Graph Databases and Neo4j
An Introduction to NOSQL, Graph Databases and Neo4jAn Introduction to NOSQL, Graph Databases and Neo4j
An Introduction to NOSQL, Graph Databases and Neo4jDebanjan Mahata
 
It19 20140721 linked data personal perspective
It19 20140721 linked data personal perspectiveIt19 20140721 linked data personal perspective
It19 20140721 linked data personal perspectiveJanifer Gatenby
 
Semantic Web: introduction & overview
Semantic Web: introduction & overviewSemantic Web: introduction & overview
Semantic Web: introduction & overviewAmit Sheth
 
A review of the state of the art in Machine Learning on the Semantic Web
A review of the state of the art in Machine Learning on the Semantic WebA review of the state of the art in Machine Learning on the Semantic Web
A review of the state of the art in Machine Learning on the Semantic WebSimon Price
 
Intro to Linked Open Data in Libraries, Archives & Museums
Intro to Linked Open Data in Libraries, Archives & MuseumsIntro to Linked Open Data in Libraries, Archives & Museums
Intro to Linked Open Data in Libraries, Archives & MuseumsJon Voss
 
MARC and BIBFRAME; Linking libraries and archives
MARC and BIBFRAME; Linking libraries and archivesMARC and BIBFRAME; Linking libraries and archives
MARC and BIBFRAME; Linking libraries and archivesDorothea Salo
 
Semantic web xml-rdf-dom parser
Semantic web xml-rdf-dom parserSemantic web xml-rdf-dom parser
Semantic web xml-rdf-dom parserSerdar Sönmez
 
OER for repository managers
OER for repository managersOER for repository managers
OER for repository managersNick Sheppard
 

Similar to Schema and Identity for Linked Data (20)

General Introduction for Semantic Web and Linked Open Data
General Introduction for Semantic Web and Linked Open DataGeneral Introduction for Semantic Web and Linked Open Data
General Introduction for Semantic Web and Linked Open Data
 
Creating Order Out of the Chaos
Creating Order Out of the ChaosCreating Order Out of the Chaos
Creating Order Out of the Chaos
 
Open Science and Identifiers
Open Science and IdentifiersOpen Science and Identifiers
Open Science and Identifiers
 
Resource and Metadata Management with a Linked Data perspective
Resource and Metadata Management with a Linked Data perspectiveResource and Metadata Management with a Linked Data perspective
Resource and Metadata Management with a Linked Data perspective
 
Knowledge Representation, Semantic Web
Knowledge Representation, Semantic WebKnowledge Representation, Semantic Web
Knowledge Representation, Semantic Web
 
Knowledge mangement
Knowledge mangementKnowledge mangement
Knowledge mangement
 
Large-Scale Semantic Search
Large-Scale Semantic SearchLarge-Scale Semantic Search
Large-Scale Semantic Search
 
Web 3 final(1)
Web 3 final(1)Web 3 final(1)
Web 3 final(1)
 
Building OBO Foundry ontology using semantic web tools
Building OBO Foundry ontology using semantic web toolsBuilding OBO Foundry ontology using semantic web tools
Building OBO Foundry ontology using semantic web tools
 
Practical Information Architecture
Practical Information ArchitecturePractical Information Architecture
Practical Information Architecture
 
An Introduction to NOSQL, Graph Databases and Neo4j
An Introduction to NOSQL, Graph Databases and Neo4jAn Introduction to NOSQL, Graph Databases and Neo4j
An Introduction to NOSQL, Graph Databases and Neo4j
 
Implementing Linked Data in Low-Resource Conditions
Implementing Linked Data in Low-Resource ConditionsImplementing Linked Data in Low-Resource Conditions
Implementing Linked Data in Low-Resource Conditions
 
It19 20140721 linked data personal perspective
It19 20140721 linked data personal perspectiveIt19 20140721 linked data personal perspective
It19 20140721 linked data personal perspective
 
Semantic Web: introduction & overview
Semantic Web: introduction & overviewSemantic Web: introduction & overview
Semantic Web: introduction & overview
 
A review of the state of the art in Machine Learning on the Semantic Web
A review of the state of the art in Machine Learning on the Semantic WebA review of the state of the art in Machine Learning on the Semantic Web
A review of the state of the art in Machine Learning on the Semantic Web
 
Intro to Linked Open Data in Libraries, Archives & Museums
Intro to Linked Open Data in Libraries, Archives & MuseumsIntro to Linked Open Data in Libraries, Archives & Museums
Intro to Linked Open Data in Libraries, Archives & Museums
 
MARC and BIBFRAME; Linking libraries and archives
MARC and BIBFRAME; Linking libraries and archivesMARC and BIBFRAME; Linking libraries and archives
MARC and BIBFRAME; Linking libraries and archives
 
Decoder Ring
Decoder RingDecoder Ring
Decoder Ring
 
Semantic web xml-rdf-dom parser
Semantic web xml-rdf-dom parserSemantic web xml-rdf-dom parser
Semantic web xml-rdf-dom parser
 
OER for repository managers
OER for repository managersOER for repository managers
OER for repository managers
 

Schema and Identity for Linked Data

  • 1. 2012 INTERNATIONAL ASIAN SUMMER SCHOOL IN LINKED DATA IASLOD 2012, August 13-17, 2012, KAIST, Daejeon, Korea Identity and schema for Linked Data Hideaki Takeda National Institute of Informatics takeda@nii.ac.jp Hideaki Takeda / National Institute of Informatics
  • 2. How to put data into a computer? • How to describe the data? – The way to describe individual data • Schema/Class/Concept – The way to describe relationships among schemas/classes/concepts • Ontology/Taxonomy/Thesaurus • How to refer to the data? – The way to identify individual data • Identifier – Relationships among identifiers
  • 3. Architecture for the Semantic Web • The world of classes (Ontologies) • The world of instances (Linked Data) [Figure source: Tim Berners-Lee http://www.w3.org/2002/Talks/09-lcs-sweb-tbl/]
  • 4. Layers of Semantic Web • Ontology – Descriptions of classes – RDFS, OWL – Challenges for ontology building • Ontology building is difficult by nature – Consistency, comprehensiveness, logicality • Alignment of ontologies is even more difficult [Figure: Ontology = descriptions of classes; Linked Data = descriptions of instances. Source: Tim Berners-Lee http://www.w3.org/2002/Talks/09-lcs-sweb-tbl/]
  • 5. Layers of Semantic Web • Linked Data – Descriptions of instances (individuals) – RDF + (RDFS, OWL) – Pros for Linked Data • Easy to write (mainly fact description) • Easy to link (fact-to-fact links) – Cons for Linked Data • Difficult to describe complex structures • Class descriptions are still needed (-> ontology) [Figure: Ontology = descriptions of classes; Linked Data = descriptions of instances. Source: Tim Berners-Lee http://www.w3.org/2002/Talks/09-lcs-sweb-tbl/]
  • 6. Importance of Identifiers for Entities • Everything should be identifiable! • Humans can identify things with vague identifiers, or even without identifiers, with help from the context around the things • On the web, the context is usually not available, and the computer can seldom understand the context even if it exists • So we need identifiers for all things
  • 7. Identification System • Identification is one of the primary functions of human information processing – Naming: e.g., names for people, pets, and some everyday things • OK if the number of things is not so big – Systematic Identification • e.g., phone number, post-code, passport number, product number, ISBN • Needed when the number of things is large • Requirements for Systematic Identification – Identifier is stable and sustainable – Uniqueness is guaranteed – Identifier publisher is reliable and sustainable
  • 8. Identification system for Web • Not so different from conventional identification systems • Difference – Cross-system use – Truly digitized • Requirements for Systematic Identification for web – Identifier is stable and sustainable (even after an entity may disappear) – Uniqueness is guaranteed over all systems – Descriptions should be associated with identifiers • since entities may not be accessible – Identifier publisher is reliable and sustainable
  • 9. Solutions for the Requirements by LOD • Requirements for Systematic Identification for web – 1. Identifier is stable and sustainable (even after an entity may disappear) • (up to each identifier publisher) – 2. Uniqueness is guaranteed over all systems • URI (not URN) – 3. Descriptions should be associated with identifiers • Dereferenceable URI – If a URI is accessed, a description associated with it should be returned – 4. Identifier publisher is reliable and sustainable
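A minimal Python sketch of what requirement 3 looks like from the client side: the client asks the identifier's URI for a machine-readable description through HTTP content negotiation. The DBpedia URI, the `text/turtle` media type and the helper name `describe_request` are assumptions chosen for illustration; the request is only constructed here, not sent.

```python
from urllib.request import Request


def describe_request(uri: str, media_type: str = "text/turtle") -> Request:
    """Build a GET request that asks the URI for an RDF description."""
    # The Accept header tells the server which representation the client wants.
    return Request(uri, headers={"Accept": media_type}, method="GET")


# Illustrative identifier; any dereferenceable URI could be used instead.
req = describe_request("http://dbpedia.org/resource/Daejeon")
print(req.get_method(), req.full_url, req.get_header("Accept"))
```

Passing `req` to `urllib.request.urlopen` would perform the lookup; whether RDF comes back depends on the identifier publisher.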
  • 10. Some examples: ISBN (International Standard Book Number) • Abstract – A unique numeric commercial book identifier – 13 digits • Prefix: 978 or 979 (for compatibility with EAN code) • Group (language-sharing country group): 1 to 5 digits • Publisher code: • Item number: • Check digit: 1 digit – Management: two layers • National ISBN Agency – Publisher • Requirement Satisfaction – 1. (Stable ID) Maybe (versioning often matters, and sometimes a publisher may re-use an ISBN) – 2. (Unique ID) Uniqueness is guaranteed, but it is not a URI – 3. (Dereferenceable) No mechanism (Amazon does this instead!) – 4. (Reliable publisher) Yes
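The slide only says that the last digit is a check digit. As a sketch, the standard EAN-13 rule (weights 1 and 3 alternating over the first twelve digits) can be written as follows; the function names and the sample number 978-0-306-40615-7 are illustrative and do not come from the slides.

```python
def isbn13_check_digit(first12: str) -> int:
    """Compute the check digit for the first 12 digits of an ISBN-13."""
    if len(first12) != 12 or not first12.isdigit():
        raise ValueError("expected exactly 12 digits")
    # Digits at even positions (0-based) weigh 1, the others weigh 3.
    total = sum(int(c) * (1 if i % 2 == 0 else 3) for i, c in enumerate(first12))
    return (10 - total % 10) % 10


def is_valid_isbn13(isbn: str) -> bool:
    """Check length, the 978/979 prefix and the check digit."""
    compact = isbn.replace("-", "").replace(" ", "")
    if len(compact) != 13 or not compact.isdigit():
        return False
    if compact[:3] not in ("978", "979"):
        return False
    return isbn13_check_digit(compact[:12]) == int(compact[12])
```

A valid check digit only shows that the string is well formed; it says nothing about whether the ISBN was ever assigned, which is why requirement 3 still needs an external lookup service.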
  • 11. Some examples: DOI (Digital Object Identifier) • Abstract – An identifier for scientific digital objects (mostly scientific articles) – A variable-length string: "prefix/suffix" • Prefix: assigned to publishers • Suffix: assigned to each object – Management: three layers • IDF (International DOI Foundation) – Registration Agency – Publisher • Requirement Satisfaction – 1. (Stable ID) Yes (not re-usable) – 2. (Unique ID) Uniqueness is guaranteed and the identifier is accessible as a URI (http://dx.doi.org/ followed by the DOI) – 3. (Dereferenceable) Maps to object pages, but no RDF – 4. (Reliable publisher) Maybe
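A small Python sketch of the "prefix/suffix" structure and of the URI form mentioned on the slide. The helper names are hypothetical, the sample DOI `10.1000/182` is used only as an illustration, and percent-encoding of unusual characters in the suffix is left out.

```python
from typing import Tuple


def split_doi(doi: str) -> Tuple[str, str]:
    """Split a DOI into (prefix, suffix) at the first slash."""
    prefix, sep, suffix = doi.partition("/")
    if not sep or not prefix or not suffix:
        raise ValueError("a DOI has the form prefix/suffix")
    return prefix, suffix


def doi_to_uri(doi: str, resolver: str = "http://dx.doi.org/") -> str:
    """Turn a DOI into the resolver URI form shown on the slide."""
    split_doi(doi)  # validate the shape before building the URI
    return resolver + doi


print(split_doi("10.1000/182"))
print(doi_to_uri("10.1000/182"))
```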
  • 12. Some examples: DBpedia (as Identifier) • Abstract – A Wikipedia page – Name of the Wikipedia page • Maintained manually – Disambiguation pages – Redirect pages • Requirement Satisfaction – 1. (Stable ID) Maybe (pages sometimes disappear, sometimes change names, sometimes change contents) – 2. (Unique ID) Uniqueness is mostly guaranteed and the identifier is accessible as a URI – 3. (Dereferenceable) RDF – 4. (Reliable publisher) Maybe
  • 13. Identification of relationships between identifiers • Multiple identification systems co-exist in a field – Differences in coverage – Differences in viewpoint • An entity can have multiple identifiers • Mappings are needed between identifiers in different identification systems • Method: use special properties – owl:sameAs, (rdfs:seeAlso, skos:exactMatch) – http://sameas.org • Some problems – Logical inconsistency with owl:sameAs – Maintenance
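To make the mapping idea concrete, here is a sketch in Python that groups identifiers connected by owl:sameAs links into clusters with a small union-find structure. The example.org URIs and the function name are hypothetical. Because owl:sameAs is transitive, one wrong link merges two clusters entirely, which is one reading of the "logical inconsistency" problem noted on the slide.

```python
from collections import defaultdict

OWL_SAME_AS = "http://www.w3.org/2002/07/owl#sameAs"


def same_as_clusters(triples):
    """Group subjects and objects linked by owl:sameAs into sets."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for s, p, o in triples:
        if p == OWL_SAME_AS:
            parent[find(s)] = find(o)  # union the two clusters

    clusters = defaultdict(set)
    for node in list(parent):
        clusters[find(node)].add(node)
    return list(clusters.values())


# Hypothetical identifiers for one book in three identification systems.
triples = [
    ("http://example.org/isbn/9780306406157", OWL_SAME_AS, "http://example.org/library/b1"),
    ("http://example.org/library/b1", OWL_SAME_AS, "http://example.org/shop/item9"),
    ("http://example.org/library/b1", "http://purl.org/dc/terms/title", "Some title"),
]
clusters = same_as_clusters(triples)
```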
  • 14. LOD Cloud (Linking Open Data)
  • 15. Summary for ID • Identification is the crucial part of LOD – Data availability – Data inconsistency – Data interoperability • Establishing a good identification system leads to a reliable and sustainable LOD.
  • 16. Structuring Information • A wide range of ways to structure information – Keywords, tags • A freely chosen word or phrase that just indicates some feature – Controlled vocabulary • Mapping to a fixed set of words or phrases • e.g., the list of countries, name authorities – Classification • System for classifying entities. Often hierarchical. A class may not carry meaning. – Taxonomy • Hierarchical term system for classification. The upper/lower relation usually means a general/specific relation • e.g., the subject headings of LC – Thesaurus • System for semantics. More types of relations: (hypernym, hyponym), synonym, antonym, homonym, holonym, meronym – Ontology • System of concepts. Concepts rather than words. More varied relations, plus the definitions of concepts
  • 17. Examples in Library Science • Many systems in the library community • Classification – Universal Decimal Classification (UDC) • Controlled Vocabulary – Authority files for person names, organizations, location names • Library of Congress: 8 million records, MADS & SKOS • British Library: 2.6 million records, FOAF & BIO (a vocabulary for biographical information) • National Diet Library (Japan): 1 million records, FOAF • Deutsche Nationalbibliothek (DNB, Germany): 1.8 & 1.3 million records (names & organizations) • Virtual International Authority File (VIAF): 4 million records • Taxonomy – Subject Headings: LC, NDL • Library of Congress: MADS & SKOS • British Library: • National Diet Library (Japan): 0.1 million records, SKOS • Deutsche Nationalbibliothek (DNB, Germany): 0.16 million records
  • 20. UDC as Linked Data (http://udcdata.info/; 69,000 records, 40 languages)
    UDC element | SKOS term | UDC subproperty | Definition
    UDC number (notation) | skos:notation | --- | UDC notation is a combination of symbols (numerals, signs and letters) that represent a class, its position in the hierarchy and its relation to other classes. Notation is a language-independent indexing term that enables mechanical sorting and filing of subjects. Also called 'UDC number' and 'UDC classmark'
    class identifier (URI) | skos:Concept | --- | A unique identifier assigned to each UDC class. It identifies the relationship between a class' meaning and its notational representation
    broader class (URI) | skos:broader | --- | Superordinate class: the class hierarchically above the class in question
    caption | skos:prefLabel | --- | Verbal description of the class content
    including note | skos:note | udc:includingNote | Extension of the caption containing verbal examples of the class content (usually a selection of important terms that do not appear in the subdivision)
    application note | skos:note | udc:applicationNote | Instructions for number building, further extension and specification of the class
    scope note | skos:scopeNote | --- | Note explaining the extent and the meaning of a UDC class. Used to resolve disambiguation or to distinguish this class from other similar classes
    examples | skos:example | --- | Examples of combination are used to illustrate UDC class building, i.e. complex subject statements
    see also reference | skos:related | --- | Indication of conceptual relationship between UDC classes from different hierarchies
    Example record:
    <skos:Concept rdf:about="http://udcdata.info/025553">
      <skos:inScheme rdf:resource="http://udcdata.info/udc-schema"/>
      <skos:broader rdf:resource="http://udcdata.info/025461"/>
      <skos:notation rdf:datatype="http://udcdata.info/UDCnotation">510.6</skos:notation>
      <skos:prefLabel xml:lang="en">Mathematical logic</skos:prefLabel>
      <skos:prefLabel xml:lang="ja">記号論理学</skos:prefLabel>
      <skos:related rdf:resource="http://udcdata.info/000016"/>
    </skos:Concept>
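As a sketch of how such a record can be consumed, the following Python reads the UDC example with the standard library's XML parser. The surrounding `rdf:RDF` element with its namespace declarations is added here as an assumption so that the fragment parses on its own, and `read_concept` is a hypothetical helper. A real application would more likely use an RDF library than plain XML parsing.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
SKOS = "http://www.w3.org/2004/02/skos/core#"
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

# The skos:Concept element is the one on the slide; the wrapper is assumed.
DOC = """<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:skos="http://www.w3.org/2004/02/skos/core#">
  <skos:Concept rdf:about="http://udcdata.info/025553">
    <skos:inScheme rdf:resource="http://udcdata.info/udc-schema"/>
    <skos:broader rdf:resource="http://udcdata.info/025461"/>
    <skos:notation rdf:datatype="http://udcdata.info/UDCnotation">510.6</skos:notation>
    <skos:prefLabel xml:lang="en">Mathematical logic</skos:prefLabel>
    <skos:prefLabel xml:lang="ja">記号論理学</skos:prefLabel>
    <skos:related rdf:resource="http://udcdata.info/000016"/>
  </skos:Concept>
</rdf:RDF>"""


def read_concept(xml_text: str) -> dict:
    """Extract URI, notation, broader class and labels of the first concept."""
    root = ET.fromstring(xml_text)
    concept = root.find(f"{{{SKOS}}}Concept")
    labels = {
        el.get(XML_LANG): el.text
        for el in concept.findall(f"{{{SKOS}}}prefLabel")
    }
    return {
        "uri": concept.get(f"{{{RDF}}}about"),
        "notation": concept.findtext(f"{{{SKOS}}}notation"),
        "broader": concept.find(f"{{{SKOS}}}broader").get(f"{{{RDF}}}resource"),
        "labels": labels,
    }


concept = read_concept(DOC)
```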
  • 21. http://id.loc.gov/authorities/names/n79084664.html
    <http://id.loc.gov/authorities/names/n79084664> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://www.loc.gov/mads/rdf/v1#PersonalName> .
    <http://id.loc.gov/authorities/names/n79084664> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://www.loc.gov/mads/rdf/v1#Authority> .
    <http://id.loc.gov/authorities/names/n79084664> <http://www.loc.gov/mads/rdf/v1#authoritativeLabel> "Natsume, Sōseki, 1867-1916"@en .
    <http://id.loc.gov/authorities/names/n79084664> <http://www.loc.gov/mads/rdf/v1#elementList> _:bnode7authoritiesnamesn79084664 .
    _:bnode7authoritiesnamesn79084664 <http://www.w3.org/1999/02/22-rdf-syntax-ns#first> _:bnode8authoritiesnamesn79084664 .
    _:bnode7authoritiesnamesn79084664 <http://www.w3.org/1999/02/22-rdf-syntax-ns#rest> _:bnode010 .
    _:bnode8authoritiesnamesn79084664 <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://www.loc.gov/mads/rdf/v1#FullNameElement> .
    _:bnode8authoritiesnamesn79084664 <http://www.loc.gov/mads/rdf/v1#elementValue> "Natsume, Sōseki,"@en .
    _:bnode010 <http://www.w3.org/1999/02/22-rdf-syntax-ns#first> _:bnode11authoritiesnamesn79084664 .
    _:bnode010 <http://www.w3.org/1999/02/22-rdf-syntax-ns#rest> <http://www.w3.org/1999/02/22-rdf-syntax-ns#nil> .
    _:bnode11authoritiesnamesn79084664 <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://www.loc.gov/mads/rdf/v1#DateNameElement> .
  • 22. http://id.loc.gov/authorities/subjects/sh85008180.html
  • 23. http://data.bnf.fr/11932084/intelligence_artificielle/
  • 24. Some examples: Scientific Names for Species and Taxa • Abstract – Names for biological species and other taxa (kingdom, division, class, order, family, tribe, genus) – A string • Binomial name for species • Academic societies maintain taxon names individually – E.g., Papilio xuthus (Asian Swallowtail, ナミアゲハ, 호랑나비) • Requirement Satisfaction – 1. (Stable ID) Mostly yes (names sometimes disappear, change, or change in content) – 2. (Unique ID) Uniqueness is generally guaranteed but, strictly speaking, some ambiguity arises from changes – 3. (Dereferenceable) No. Many systems exist but none covers all species – 4. (Reliable publisher) Maybe
  • 25. Name suffixes by taxon rank
    Taxon (Japanese) | Taxon | Plants | Algae | Fungi | Animals
    ドメイン | Domain | | | |
    界 | Kingdom | | | |
    門 | Division/Phylum | -phyta | -phyta | -mycota |
    亜門 | Subdivision/Subphylum | -phytina | -phytina | -mycotina |
    綱 | Class | -opsida | -phyceae | -mycetes |
    亜綱 | Subclass | -idae | -phycidae | -mycetidae |
    目 | Order | -ales | -ales | -ales |
    亜目 | Suborder | -ineae | -ineae | -ineae |
    上科 | Superfamily | -acea | -acea | -acea | -oidea
    科 | Family | -aceae | -aceae | -aceae | -idae
    亜科 | Subfamily | -oideae | -oideae | -oideae | -inae
    族/連 | Tribe | -eae | -eae | -eae | -ini
    亜族/亜連 | Subtribe | -inae | -inae | -inae | -ina
    属 | Genus | | | |
    亜属 | Subgenus | | | |
    種 | Species | | | |
    亜種 | Subspecies | | | |
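The suffix table can be read as a lookup from name endings to ranks. The Python sketch below encodes the rows that carry suffixes and returns every (group, rank) pair that fits a name, preferring the longest matching suffix. The function name and the sample names Papilionidae and Rosaceae are illustrative and not taken from the slides. The result for a name ending in -idae shows why a name string alone is an ambiguous identifier without knowing the group.

```python
GROUPS = ("Plants", "Algae", "Fungi", "Animals")

# (rank, plants, algae, fungi, animals); None where the table has no suffix.
ROWS = [
    ("Division/Phylum", "phyta", "phyta", "mycota", None),
    ("Subdivision/Subphylum", "phytina", "phytina", "mycotina", None),
    ("Class", "opsida", "phyceae", "mycetes", None),
    ("Subclass", "idae", "phycidae", "mycetidae", None),
    ("Order", "ales", "ales", "ales", None),
    ("Suborder", "ineae", "ineae", "ineae", None),
    ("Superfamily", "acea", "acea", "acea", "oidea"),
    ("Family", "aceae", "aceae", "aceae", "idae"),
    ("Subfamily", "oideae", "oideae", "oideae", "inae"),
    ("Tribe", "eae", "eae", "eae", "ini"),
    ("Subtribe", "inae", "inae", "inae", "ina"),
]


def candidate_ranks(name: str):
    """Return (group, rank) pairs whose suffix is the longest one matching."""
    lowered = name.lower()
    matches = {}
    for rank, *suffixes in ROWS:
        for group, suffix in zip(GROUPS, suffixes):
            if suffix and lowered.endswith(suffix):
                matches.setdefault(suffix, []).append((group, rank))
    if not matches:
        return []
    # Prefer e.g. "-aceae" over the shorter "-eae" that it also ends with.
    return matches[max(matches, key=len)]


print(candidate_ranks("Papilionidae"))
print(candidate_ranks("Rosaceae"))
```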
  • 26. Ontology: "An ontology is an explicit specification of a conceptualization" [Gruber] • "The term is borrowed from philosophy, where an Ontology is a systematic account of Existence. For AI systems, what "exists" is that which can be represented. When the knowledge of a domain is represented in a declarative formalism, the set of objects that can be represented is called the universe of discourse. This set of objects, and the describable relationships among them, are reflected in the representational vocabulary with which a knowledge-based program represents knowledge. Thus, in the context of AI, we can describe the ontology of a program by defining a set of representational terms. In such an ontology, definitions associate the names of entities in the universe of discourse (e.g., classes, relations, functions, or other objects) with human-readable text describing what the names mean, and formal axioms that constrain the interpretation and well-formed use of these terms. Formally, an ontology is the statement of a logical theory."
  • 27. Conceptualization [Figure: alternative conceptualizations of the same scene of boxes on a desk. Labels: predicates on_desk(A), on(A, B), put(A, B); typed variants on(A/box, B/object), put(A/box, B/object); classes object, desk, box; colors modeled either as separate classes (red box, blue box, yellow box) or as an attribute color:{red, blue, yellow}] • There are many possible ways to conceptualize the target world • Trade-off between generality and efficiency
  • 28. Types of Ontologies • Upper (top-level) ontology vs. Domain ontology – Upper Ontology: A common ontology throughout all domains – Domain Ontology: An ontology which is meaningful in a specific domain • Object ontology vs. Task ontology – Object Ontology: An ontology on “things” and “events” – Task Ontology: An ontology on “doing” • Heavy-weight ontology vs. light-weight ontology – Heavy-weight ontology: fully described ontology including concept definitions and relations, in particular in a logical way – Light-weight ontology: partially described ontology including typically only is-a relations
  • 29. Top-level ontology • An ontology which covers all of the world! • Very difficult – e.g., how does a thing exist? • Is a thing a four-dimensional existence? • Or does a thing exist three-dimensionally over time? • Common requirements – A small number of concepts can cover the world – Concepts can be used in lower ontologies – Concepts should be general and abstract
  • 30. Three approaches to top-level ontology – Formal approach • Logical formalization • Fully abstract • Pros: clean • Cons: hardly understandable • e.g., Sowa’s top-level ontology, DOLCE – Linguistic approach • Use and extension of linguistic concepts • Partially abstract and partially general • Pros: understandable • Cons: limited to the linguistic world • e.g., Penman Upper Model, WordNet – Empirical approach • Use and extension of everyday concepts • Mostly general • Pros: understandable and applicable to all the world • Cons: lack of a solid foundation • e.g., SUMO, Cyc, EDR
  • 31. Empirical top-level ontology • SUMO (Suggested Upper Merged Ontology) – Collection and organization of concepts used frequently – Simple relationships between concepts [Figure: part of the SUMO class hierarchy. Node labels: Entity, Physical, Abstract, Object, Process, Substance, SelfConnectedObject, CorpuscularObject, Collection, Organic, Inorganic, NaturalProcess, Biological Process, Physiologic Process, Pathologic Process, Intentionally Caused Process, ChangeOfPossession, Searching, Social Interaction, Communication, Cooperation, Contest, Meeting, Motion, Transfer, Impelling, Putting, Impacting, Removing, BringingTogether, Separating, Transportation, ChangeOfState]
  • 32. Formal Ontology: DOLCE • DOLCE (a Descriptive Ontology for Linguistic and Cognitive Engineering) – Intended as a reference system for top-level ontologies – Logical definition – Particular (DOLCE) vs. Universal • Particular: ontology about things, phenomena, quality… • Universal: ontology for describing particulars, such as categories and attributes
  • 33. Formal Ontology: DOLCE • Concepts – Endurant / Perdurant / Quality / Abstract • Endurant: – “Things” – An existence over time – May change its attributes • Perdurant: – “Process” – No change over time – May switch a part to the other • Relations – Parthood (abstract or perdurant) – Temporary parthood (endurant) – Constitution (endurant or perdurant) – Participation between perdurant and endurant [Figure: DOLCE taxonomy. Node labels: ALL Entity; ED Endurant; PED Physical Endurant; M Amount of Matter; F Feature; POB Physical Object; APO Agentive Physical Object; NAPO Non-agentive Physical Object; NPED Non-physical Endurant; NPOB Non-physical Object; MOB Mental Object; SOB Social Object; AS Arbitrary Sum; PD Perdurant (Occurrence); EV Event; ACH Achievement; ACC Accomplishment; STV Stative; ST State; PRO Process; Q Quality; TQ Temporal Quality; TL Temporal Location; PQ Physical Quality; SL Spatial Location; AQ Abstract Quality; AB Abstract; Fact; Set; R Region; TR Temporal Region; T Time Interval; PR Physical Region; S Space Region; AR Abstract Region]
  • 34. Linguistic top-level ontology • WordNet – A lexical reference system • “Link-based electronic dictionary” http://www.cogsci.princeton.edu/cgi-bin/webwn – Concepts • synset – Noun 79,689 – Verb 13,508 – Relations • synonym • hypernym/hyponym (is-a) • holonym/meronym (a-part-of)
  • 35. Linguistic top-level ontology • WordNet – Top-level • { entity, physical thing (that which is perceived or known or inferred to have its own physical existence (living or nonliving)) } • { psychological_feature, (a feature of the mental life of a living organism) } • { abstraction, (a general concept formed by extracting common features from specific examples) } • { state, (the way something is with respect to its main attributes; "the current state of knowledge"; "his state of health"; "in a weak financial state") } • { event, (something that happens at a given place and time) } • { act, human_action, human_activity, (something that people do or cause to happen) } • { group, grouping, (any number of entities (members) considered as a unit) } • { possession, (anything owned or possessed) } • { phenomenon, (any state or process known through the senses rather than by intuition or reasoning) }
  • 36. Summary for structuring information • Keywords, tags / Controlled vocabulary / Classification / Taxonomy / Thesaurus / Ontology – The differences among them are not clear-cut, and not important – The trend is toward more structured systems – The same requirements apply as for identification systems
  • 37. Summary • Requirements for Successful Structuring Systems – 1. Entities are stable and sustainable – 2. Uniqueness is guaranteed across all systems – 3. Descriptions should be associated with entities – 4. The system publisher is reliable and sustainable • LOD technology can help • Learn from success in the library community
  • 38. Schema/Vocabulary for LOD • Class/Concept description – Axiom of a concept in an ontology – Database schema for a table in a relational database – Object definition in object-oriented programming/DB • Class description in the Semantic Web – RDFS/OWL description for a class • RDFS: simple class system • OWL: Description Logic-based • Class description in Linked Data – Mostly RDFS-based (exception: owl:sameAs) – Simple structure (mostly property-value pairs)
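To make the "simple class system" of RDFS a little more concrete, here is a minimal sketch of how rdfs:subClassOf lets an application derive additional types for an instance. It is plain Python over tuples rather than an RDF library; `ex:Student` and `ex:alice` are hypothetical names, and `superclasses` and `inferred_types` are illustrative helpers, not part of any standard API.

```python
# (subclass, superclass) pairs standing in for rdfs:subClassOf statements.
# foaf:Person rdfs:subClassOf foaf:Agent comes from the FOAF vocabulary;
# ex:Student is a hypothetical class added for illustration.
SUBCLASS_OF = {
    ("ex:Student", "foaf:Person"),
    ("foaf:Person", "foaf:Agent"),
}


def superclasses(cls, subclass_of=SUBCLASS_OF):
    """Return every class reachable from `cls` via rdfs:subClassOf."""
    found = set()
    pending = [cls]
    while pending:
        current = pending.pop()
        for sub, sup in subclass_of:
            if sub == current and sup not in found:
                found.add(sup)
                pending.append(sup)
    return found


def inferred_types(asserted_types, subclass_of=SUBCLASS_OF):
    """Asserted rdf:type values plus all their superclasses."""
    result = set(asserted_types)
    for cls in asserted_types:
        result |= superclasses(cls, subclass_of)
    return result


# ex:alice is asserted to be an ex:Student only.
alice_types = inferred_types({"ex:Student"})
```

Here `alice_types` also contains `foaf:Person` and `foaf:Agent`, which is the kind of lightweight inference that an RDFS-based schema supports without the full machinery of OWL.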
  • 39. Schema/Vocabulary for LOD • The importance of sharing schemata – Interoperability – Generic applications • Some well-known and frequently used schemata – Dublin Core – FOAF (Friend-Of-A-Friend) – SKOS (Simple Knowledge Organization System)
  • 40. Usage of Common Vocabularies
      Prefix    Namespace                                    Used by
      dc        http://purl.org/dc/elements/1.1/             66 (31.88 %)
      foaf      http://xmlns.com/foaf/0.1/                   55 (26.57 %)
      dcterms   http://purl.org/dc/terms/                    38 (18.36 %)
      skos      http://www.w3.org/2004/02/skos/core#         29 (14.01 %)
      akt       http://www.aktors.org/ontology/portal#       17 (8.21 %)
      geo       http://www.w3.org/2003/01/geo/wgs84_pos#     14 (6.76 %)
      mo        http://purl.org/ontology/mo/                 13 (6.28 %)
      bibo      http://purl.org/ontology/bibo/                8 (3.86 %)
      vcard     http://www.w3.org/2006/vcard/ns#              6 (2.90 %)
      frbr      http://purl.org/vocab/frbr/core#              5 (2.42 %)
      sioc      http://rdfs.org/sioc/ns#                      4 (1.93 %)
      Source: LDOW2011 Presentation, Christian Bizer (Freie Universität Berlin), 2011
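The prefix/namespace pairs on this slide are what allow a short name such as foaf:name to stand for a full URI. The following is a small sketch of that expansion, using a subset of the prefixes listed above; `expand` and `compact` are hypothetical helper names, not functions from an RDF library.

```python
# A subset of the prefix-to-namespace pairs from the slide.
PREFIXES = {
    "dc": "http://purl.org/dc/elements/1.1/",
    "foaf": "http://xmlns.com/foaf/0.1/",
    "dcterms": "http://purl.org/dc/terms/",
    "skos": "http://www.w3.org/2004/02/skos/core#",
    "geo": "http://www.w3.org/2003/01/geo/wgs84_pos#",
}


def expand(prefixed_name):
    """Turn 'prefix:local' into a full URI; raise KeyError for unknown prefixes."""
    prefix, _, local = prefixed_name.partition(":")
    return PREFIXES[prefix] + local


def compact(uri):
    """Turn a full URI back into 'prefix:local' when a known namespace matches."""
    for prefix, namespace in PREFIXES.items():
        if uri.startswith(namespace):
            return f"{prefix}:{uri[len(namespace):]}"
    return uri  # no known namespace: leave the URI unchanged
```

For example, `expand("foaf:name")` yields the URI under the FOAF namespace, and `compact` reverses the mapping. Two datasets that both use `http://xmlns.com/foaf/0.1/name` are using the same property regardless of which local prefix each one chose.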
  • 41. (Simple) Dublin Core • Started from the library community • Now maintained by DCMI (Dublin Core Metadata Initiative) • (Simple) Dublin Core – Just 15 elements – Simple is best – No range restriction – http://purl.org/dc/elements/1.1/ • 15 elements – Title – Creator – Subject – Description – Publisher – Contributor – Date – Type – Format – Identifier – Source – Language – Relation – Coverage – Rights
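Because Simple Dublin Core is just 15 elements with no range restriction, a record can be sketched as a mapping from element names to free-form values. The example below assumes that representation; the subject URI `http://example.org/slides/iaslod2012` is hypothetical, and `unknown_elements` and `to_triples` are illustrative helpers rather than part of any Dublin Core tooling.

```python
DC_NS = "http://purl.org/dc/elements/1.1/"

# The 15 Simple Dublin Core elements, written as they appear in the namespace.
DC_ELEMENTS = {
    "title", "creator", "subject", "description", "publisher",
    "contributor", "date", "type", "format", "identifier",
    "source", "language", "relation", "coverage", "rights",
}


def unknown_elements(record):
    """Return the keys of `record` that are not Simple Dublin Core elements."""
    return sorted(set(record) - DC_ELEMENTS)


def to_triples(subject, record):
    """Flatten a record into (subject, property URI, value) triples."""
    triples = []
    for element, values in record.items():
        if element not in DC_ELEMENTS:
            raise ValueError(f"not a Dublin Core element: {element}")
        for value in values:  # every element may repeat
            triples.append((subject, DC_NS + element, value))
    return triples


# Hypothetical subject URI; the values describe this slide deck.
SUBJECT = "http://example.org/slides/iaslod2012"
record = {
    "title": ["Identity and schema for Linked Data"],
    "creator": ["Hideaki Takeda"],
    "date": ["2012"],
}
triples = to_triples(SUBJECT, record)
```

Values are kept as plain strings here, which mirrors the "no range restriction" point: nothing in Simple Dublin Core says whether a creator is a literal name or a resource.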
  • 42. dc terms • Qualified Dublin Core – Domain & Range – More precise terms • Extension of simple dc
      Properties in the /terms/ namespace: abstract, accessRights, accrualMethod, accrualPeriodicity, accrualPolicy, alternative, audience, available, bibliographicCitation, conformsTo, contributor, coverage, created, creator, date, dateAccepted, dateCopyrighted, dateSubmitted, description, educationLevel, extent, format, hasFormat, hasPart, hasVersion, identifier, instructionalMethod, isFormatOf, isPartOf, isReferencedBy, isReplacedBy, isRequiredBy, issued, isVersionOf, language, license, mediator, medium, modified, provenance, publisher, references, relation, replaces, requires, rights, rightsHolder, source, spatial, subject, tableOfContents, temporal, title, type, valid
      Properties in the /elements/1.1/ namespace: contributor, coverage, creator, date, description, format, identifier, language, publisher, relation, rights, source, subject, title, type
      Vocabulary Encoding Schemes: DCMIType, DDC, IMT, LCC, LCSH, MESH, NLM, TGN, UDC
      Syntax Encoding Schemes: Box, ISO3166, ISO639-2, ISO639-3, Period, Point, RFC1766, RFC3066, RFC4646, RFC5646, URI, W3CDTF
      Classes: Agent, AgentClass, BibliographicResource, FileFormat, Frequency, Jurisdiction, LicenseDocument, LinguisticSystem, Location, LocationPeriodOrJurisdiction, MediaType, MediaTypeOrExtent, MethodOfAccrual, MethodOfInstruction, PeriodOfTime, PhysicalMedium, PhysicalResource, Policy, ProvenanceStatement, RightsStatement, SizeOrDuration, Standard
      DCMI Type Vocabulary: Collection, Dataset, Event, Image, InteractiveResource, MovingImage, PhysicalObject, Service, Software, Sound, StillImage, Text
      Terms related to the DCMI Abstract Model: memberOf, VocabularyEncodingScheme
  • 43. dcterms properties: subPropertyOf, domain, and range
      dcterms property        subPropertyOf                        Domain                          Range
      contributor             dc:contributor                       rdfs:Resource                   dcterms:Agent
      creator                 dc:creator, dcterms:contributor      rdfs:Resource                   dcterms:Agent
      coverage                dc:coverage                          rdfs:Resource                   dcterms:LocationPeriodOrJurisdiction
      spatial                 dc:coverage, dcterms:coverage        rdfs:Resource                   dcterms:Location
      temporal                dc:coverage, dcterms:coverage        rdfs:Resource                   dcterms:PeriodOfTime
      date                    dc:date                              rdfs:Resource                   rdfs:Literal
      available               dc:date, dcterms:date                rdfs:Resource                   rdfs:Literal
      created                 dc:date, dcterms:date                rdfs:Resource                   rdfs:Literal
      dateAccepted            dc:date, dcterms:date                rdfs:Resource                   rdfs:Literal
      dateCopyrighted         dc:date, dcterms:date                rdfs:Resource                   rdfs:Literal
      dateSubmitted           dc:date, dcterms:date                rdfs:Resource                   rdfs:Literal
      issued                  dc:date, dcterms:date                rdfs:Resource                   rdfs:Literal
      modified                dc:date, dcterms:date                rdfs:Resource                   rdfs:Literal
      valid                   dc:date, dcterms:date                rdfs:Resource                   rdfs:Literal
      description             dc:description                       rdfs:Resource                   rdfs:Resource
      abstract                dc:description, dcterms:description  rdfs:Resource                   rdfs:Resource
      tableOfContents         dc:description, dcterms:description  rdfs:Resource                   rdfs:Resource
      format                  dc:format                            rdfs:Resource                   dcterms:MediaTypeOrExtent
      extent                  dc:format, dcterms:format            rdfs:Resource                   dcterms:SizeOrDuration
      medium                  dc:format, dcterms:format            dcterms:PhysicalResource        dcterms:PhysicalMedium
      identifier              dc:identifier                        rdfs:Resource                   rdfs:Literal
      bibliographicCitation   dc:identifier, dcterms:identifier    dcterms:BibliographicResource   rdfs:Literal
      language                dc:language                          rdfs:Resource                   dcterms:LinguisticSystem
      publisher               dc:publisher                         rdfs:Resource                   dcterms:Agent
      relation                dc:relation                          rdfs:Resource                   rdfs:Resource
      source                  dc:source, dcterms:relation          rdfs:Resource                   rdfs:Resource
      conformsTo              dc:relation, dcterms:relation        rdfs:Resource                   dcterms:Standard
      hasFormat               dc:relation, dcterms:relation        rdfs:Resource                   rdfs:Resource
      hasPart                 dc:relation, dcterms:relation        rdfs:Resource                   rdfs:Resource
      hasVersion              dc:relation, dcterms:relation        rdfs:Resource                   rdfs:Resource
      isFormatOf              dc:relation, dcterms:relation        rdfs:Resource                   rdfs:Resource
      isPartOf                dc:relation, dcterms:relation        rdfs:Resource                   rdfs:Resource
      isReferencedBy          dc:relation, dcterms:relation        rdfs:Resource                   rdfs:Resource
      isReplacedBy            dc:relation, dcterms:relation        rdfs:Resource                   rdfs:Resource
      isRequiredBy            dc:relation, dcterms:relation        rdfs:Resource                   rdfs:Resource
      isVersionOf             dc:relation, dcterms:relation        rdfs:Resource                   rdfs:Resource
      references              dc:relation, dcterms:relation        rdfs:Resource                   rdfs:Resource
      replaces                dc:relation, dcterms:relation        rdfs:Resource                   rdfs:Resource
      requires                dc:relation, dcterms:relation        rdfs:Resource                   rdfs:Resource
      rights                  dc:rights                            rdfs:Resource                   dcterms:RightsStatement
      accessRights            dc:rights, dcterms:rights            rdfs:Resource                   dcterms:RightsStatement
      license                 dc:rights, dcterms:rights            rdfs:Resource                   dcterms:LicenseDocument
      subject                 dc:subject                           rdfs:Resource                   rdfs:Resource
      title                   dc:title                             rdfs:Resource                   rdfs:Literal
      alternative             dc:title, dcterms:title              rdfs:Resource                   rdfs:Literal
      type                    dc:type                              rdfs:Resource                   rdfs:Class
      audience                -                                    rdfs:Resource                   dcterms:AgentClass
      educationLevel          dcterms:audience                     rdfs:Resource                   dcterms:AgentClass
      mediator                dcterms:audience                     rdfs:Resource                   dcterms:AgentClass
      accrualMethod           -                                    dcmitype:Collection             dcterms:MethodOfAccrual
      accrualPeriodicity      -                                    dcmitype:Collection             dcterms:Frequency
      accrualPolicy           -                                    dcmitype:Collection             dcterms:Policy
      instructionalMethod     -                                    rdfs:Resource                   dcterms:MethodOfInstruction
      provenance              -                                    rdfs:Resource                   dcterms:ProvenanceStatement
      rightsHolder            -                                    rdfs:Resource                   dcterms:Agent
      Sources: http://dublincore.org/documents/dcmi-terms/ and http://www.kanzaki.com/docs/sw/dc-domain-range.html
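Domain and range declarations like those tabulated on this slide are what distinguish dcterms from Simple Dublin Core: under RDFS semantics, using a property lets an application infer a class for its subject (domain) or its object (range). A minimal sketch follows, using a few rows from the table; the `ex:` resources are hypothetical and `infer_types` is an illustrative helper, not a library function.

```python
# A few rdfs:range and rdfs:domain declarations taken from the table.
# Rows whose domain or range is the catch-all rdfs:Resource are omitted,
# as are literal-valued ranges.
RANGE = {
    "dcterms:creator": "dcterms:Agent",
    "dcterms:publisher": "dcterms:Agent",
    "dcterms:license": "dcterms:LicenseDocument",
    "dcterms:language": "dcterms:LinguisticSystem",
}
DOMAIN = {
    "dcterms:bibliographicCitation": "dcterms:BibliographicResource",
    "dcterms:accrualPolicy": "dcmitype:Collection",
}


def infer_types(triples):
    """Map each resource to the classes implied by domain/range declarations."""
    types = {}
    for subject, predicate, obj in triples:
        if predicate in DOMAIN:  # the subject belongs to the domain class
            types.setdefault(subject, set()).add(DOMAIN[predicate])
        if predicate in RANGE:   # the object belongs to the range class
            types.setdefault(obj, set()).add(RANGE[predicate])
    return types


# Hypothetical data: ex:paper1 and ex:takeda are made-up identifiers.
triples = [
    ("ex:paper1", "dcterms:creator", "ex:takeda"),
    ("ex:paper1", "dcterms:bibliographicCitation", "IASLOD 2012, KAIST, Daejeon"),
]
inferred = infer_types(triples)
```

With this input, `ex:takeda` is inferred to be a `dcterms:Agent` and `ex:paper1` a `dcterms:BibliographicResource`, even though neither type was stated explicitly.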
  • 44. The Friend of a Friend (FOAF) • Metadata for describing persons and their relationships • Voluntary project
      Classes: Agent | Document | Group | Image | LabelProperty | OnlineAccount | OnlineChatAccount | OnlineEcommerceAccount | OnlineGamingAccount | Organization | Person | PersonalProfileDocument | Project
      Properties: account | accountName | accountServiceHomepage | age | aimChatID | based_near | birthday | currentProject | depiction | depicts | dnaChecksum | familyName | family_name | firstName | focus | fundedBy | geekcode | gender | givenName | givenname | holdsAccount | homepage | icqChatID | img | interest | isPrimaryTopicOf | jabberID | knows | lastName | logo | made | maker | mbox | mbox_sha1sum | member | membershipClass | msnChatID | myersBriggs | name | nick | openid | page | pastProject | phone | plan | primaryTopic | publications | schoolHomepage | sha1 | skypeID | status | surname | theme | thumbnail | tipjar | title | topic | topic_interest | weblog | workInfoHomepage | workplaceHomepage | yahooChatID
      Example:
          @prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
          @prefix foaf: <http://xmlns.com/foaf/0.1/> .
          @prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .

          <#JW>
              a foaf:Person ;
              foaf:name "Jimmy Wales" ;
              foaf:mbox <mailto:jwales@bomis.com> ;
              foaf:homepage <http://www.jimmywales.com/> ;
              foaf:nick "Jimbo" ;
              foaf:depiction <http://www.jimmywales.com/aus_img_small.jpg> ;
              foaf:interest <http://www.wikimedia.org> ;
              foaf:knows [
                  a foaf:Person ;
                  foaf:name "Angela Beesley"
              ] .

          <http://www.wikimedia.org>
              rdfs:label "Wikipedia" .
  • 45. SKOS (Simple Knowledge Organization System) • Metadata for taxonomies – Hierarchical structure of concepts • Invented to represent taxonomies such as subject headings • Not the same as the subclass relationship among classes • W3C Recommendation 18 August 2009
  • 46. SKOS (Simple Knowledge Organization System) • SKOS Core (hierarchical concept structure) – skos:semanticRelation – skos:broaderTransitive – skos:narrowerTransitive – skos:broader – skos:narrower – skos:related – skos:prefLabel – skos:altLabel – skos:hiddenLabel [Diagram label: subPropertyOf links among the relation properties]
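In the SKOS specification, skos:broader is a sub-property of skos:broaderTransitive, so direct broader links imply a transitive "ancestor" relation without making skos:broader itself transitive. The following sketch illustrates that idea with plain Python sets; the `ex:` concepts are hypothetical, and `broader_transitive` and `narrower` are illustrative helper names.

```python
# Direct skos:broader statements as (narrower concept, broader concept) pairs.
# The concepts are made up for illustration.
BROADER = {
    ("ex:cats", "ex:mammals"),
    ("ex:mammals", "ex:animals"),
    ("ex:dogs", "ex:mammals"),
}


def broader_transitive(concept, broader=BROADER):
    """All concepts reachable from `concept` by following skos:broader links."""
    seen = set()
    pending = [concept]
    while pending:
        current = pending.pop()
        for narrow, broad in broader:
            # The `seen` check also guards against cycles in the data.
            if narrow == current and broad not in seen:
                seen.add(broad)
                pending.append(broad)
    return seen


def narrower(concept, broader=BROADER):
    """Direct skos:narrower concepts, read as the inverse of skos:broader."""
    return {narrow for narrow, broad in broader if broad == concept}
```

`broader_transitive("ex:cats")` returns both `ex:mammals` and `ex:animals`, while `narrower("ex:mammals")` returns only the directly linked concepts. Keeping the direct and transitive relations separate lets a thesaurus browser show one level at a time while a search tool expands across the whole hierarchy.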
  • 47. SKOS (Simple Knowledge Organization System) • SKOS Mapping – skos:mappingRelation – skos:closeMatch – skos:exactMatch – skos:broadMatch – skos:narrowMatch – skos:relatedMatch [Diagram label: subPropertyOf links among the mapping properties]
  • 48. Linked Open Vocabulary (LOV) • A technical platform for search and quality assessment across the vocabulary ecosystem – Register schemata – Search schemata • http://labs.mondeca.com/dataset/lov/
  • 50. More Info. • http://www.w3.org/2005/Incubator/lld/wiki/Vocabulary_and_Dataset
  • 51. Summary for schema • Some major schemata – DC, DC terms, FOAF, SKOS … • More domain-specific schemata – CIDOC CRM – PRISM – … • Reuse is highly recommended – LOV
  • 52. Summary • Three layers – Ontology/Thesaurus/Taxonomy – Schema – Identification • Not just top-down, but also bottom-up • Each layer has its own role • Do not pursue the value of each layer in isolation; rather, combine them well