OKFN Korea Hackathon Day (2013. 06. 22.): Toward Open Data World
20130622 okfn hackathon t2

Published in: Technology, Education
    1. OKFN Korea Hackathon Day, 2013. 06. 22.: Toward Open Data World
    2. What is linked data, open data? Topics: Refine, Modelling, Access, Triple Storage, other topics. (Image: Leo Oosterloo @ flickr.com)
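    3. Seoul City Data Enrichment
       • Goal: ontology design or mapping to add detail to Seoul city data
          – Structuring, adding semantics, and linking: model Seoul city data (unstructured data) with an ontology and link it to external data
          – English localization: provide Seoul city data that non-Korean-speaking users can use
       • Scope
          – About 40 Seoul city datasets
          – Cultural heritage: domestic cultural properties collected by the Cultural Heritage Administration (National Treasures, Treasures, designated cultural properties, intangible cultural properties, etc.)
       • Methodology: data modelling through reuse of existing RDF vocabularies
          1) Data selection: select the datasets to model from the Seoul Open Data Plaza
          2) Dataset field review: review how each dataset field maps to the DBpedia ontology (classes, properties)
             • DBpedia ontology: contains concepts for things as well as Wikipedia infobox fields
    4. Seoul City Data Enrichment (continued)
       • For example, when modelling a museum:
          – Select the museum infobox template from Wikipedia
          – Select the vocabulary that DBpedia maps to the museum infobox
          – Map the vocabulary to the dataset fields
          – Decide whether to model unmapped fields (including classes and properties): a modelling tool still has to be chosen
          – Apply a URI scheme (needs to be designed separately)
          – Complete the ontology schema design
          3) Data cleaning
             • Clean the data with Google Refine
             • Work to do before loading into Refine
                – Location data: convert or add location values in the source data (Seoul)
                – English names: convert and map the Korean names (manual work required)
             • Work to do in Refine
                – Add Korean and English Wikipedia URLs
                – Add DBpedia and Freebase URLs using Refine reconciliation
                – Build the RDF conversion mapping skeleton
                – Export RDF and Excel
          4) Data upload (RDF or Excel)
             • Choose a data store: Jena, 4Store, …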
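The field-mapping step in slides 3 and 4 can be sketched roughly as follows. This is only an illustration under stated assumptions, not the project's actual mapping: the base URI, the column names (`id`, `name_ko`, `name_en`, `lat`, `long`, `dbpedia_url`), and the sample row are all hypothetical, and the choice of `rdfs:label`, the W3C WGS84 `lat`/`long` properties, and `owl:sameAs` is one plausible reading of "reuse existing RDF vocabularies".

```python
# Hypothetical URI scheme; the slides say it must be designed separately.
BASE = "http://example.org/seoul/museum/"

RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
OWL_SAMEAS = "http://www.w3.org/2002/07/owl#sameAs"
DBO_MUSEUM = "http://dbpedia.org/ontology/Museum"

# Hypothetical dataset columns mapped to (predicate URI, language tag).
FIELD_MAP = {
    "name_ko": ("http://www.w3.org/2000/01/rdf-schema#label", "ko"),
    "name_en": ("http://www.w3.org/2000/01/rdf-schema#label", "en"),
    "lat": ("http://www.w3.org/2003/01/geo/wgs84_pos#lat", None),
    "long": ("http://www.w3.org/2003/01/geo/wgs84_pos#long", None),
}


def row_to_triples(row):
    """Turn one cleaned dataset row (a dict) into (subject, predicate, object) triples.

    Literal objects are (text, language) tuples; URI objects are plain strings.
    """
    subject = BASE + row["id"]
    triples = [(subject, RDF_TYPE, DBO_MUSEUM)]
    for column, (predicate, lang) in FIELD_MAP.items():
        value = row.get(column)
        if value:  # unmapped or empty fields are simply skipped
            triples.append((subject, predicate, (value, lang)))
    if row.get("dbpedia_url"):
        # Link found by reconciliation, expressed here as owl:sameAs.
        triples.append((subject, OWL_SAMEAS, row["dbpedia_url"]))
    return triples


# Made-up sample row, for illustration only.
SAMPLE_ROW = {"id": "m001", "name_ko": "예시 박물관", "name_en": "Example Museum"}
SAMPLE_TRIPLES = row_to_triples(SAMPLE_ROW)
```

Keeping the mapping in a plain dictionary means that fields left unmapped after the review step are easy to spot: they are the columns missing from `FIELD_MAP`.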
    5. Contents
       1. Modeling Issues
       2. Management Issues
    6. Modelling – RDF
       • Subject, Predicate, Object
    7. Modelling – RDF
       • Subject: some school
       • Predicate: has a name/label
       • Object: some literal
    8. Modelling – RDF
       • Subject: http://education.data.gov.uk/id/school/401874
       • Predicate: has a name/label
       • Object: "Cardiff High School"
    9. Modelling – RDF
       • Subject: http://education.data.gov.uk/id/school/401874
       • Predicate: http://www.w3.org/2000/01/rdf-schema#label
       • Object: "Cardiff High School"
    10. Modelling – RDF
       • school:401874 rdfs:label "Cardiff High School"
       • where school: = http://education.data.gov.uk/id/school/
       • and rdfs: = http://www.w3.org/2000/01/rdf-schema#
    11. Modelling – RDF
       • school:401874 rdfs:label "Cardiff High School"
       • school:401874 ont:districtAdministrative la:00PT
       • la:00PT rdfs:label "Cardiff"
    12. Modelling – RDF
       • school:401874 rdfs:label "Cardiff High School"
       • school:401874 ont:districtAdministrative la:00PT
       • la:00PT rdfs:label "Cardiff"
       • Graph view of the same triples:
          – school:401874 --rdfs:label--> "Cardiff High School"
          – school:401874 --ont:districtAdministrative--> la:00PT
          – la:00PT --rdfs:label--> "Cardiff"
    13. Modelling – RDF
       • school:401874 rdfs:label "Cardiff High School"
       • school:401874 ont:districtAdministrative la:00PT
       • la:00PT rdfs:label "Cardiff"
       • la:00PT rdfs:label "Caerdydd"@cy
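As a rough illustration of slides 6 to 13, the triples can be held as plain Python tuples with a prefix table for expanding names. This is a minimal sketch rather than any particular RDF library: the `school:` and `rdfs:` namespaces come from slide 10, while the `ont:` and `la:` namespace URIs are not given in the slides and are placeholders, and `expand` and `labels` are hypothetical helper names.

```python
# Prefix table. school: and rdfs: are taken from slide 10;
# ont: and la: are NOT defined in the slides and are placeholders.
PREFIXES = {
    "school": "http://education.data.gov.uk/id/school/",
    "rdfs": "http://www.w3.org/2000/01/rdf-schema#",
    "ont": "http://example.org/ont/",  # hypothetical
    "la": "http://example.org/la/",    # hypothetical
}


def expand(term):
    """Expand a prefixed name such as 'rdfs:label' into a full URI.

    Literals are modelled as (text, language) tuples and are returned unchanged.
    """
    if isinstance(term, tuple):
        return term
    prefix, local = term.split(":", 1)
    return PREFIXES[prefix] + local


# The four triples from slide 13. A language of None means a plain literal.
TRIPLES = [
    ("school:401874", "rdfs:label", ("Cardiff High School", None)),
    ("school:401874", "ont:districtAdministrative", "la:00PT"),
    ("la:00PT", "rdfs:label", ("Cardiff", None)),
    ("la:00PT", "rdfs:label", ("Caerdydd", "cy")),
]


def labels(subject, lang=None):
    """Return the rdfs:label texts of a subject for one language tag."""
    return [
        o[0]
        for s, p, o in TRIPLES
        if s == subject and p == "rdfs:label" and o[1] == lang
    ]
```

For example, `labels("la:00PT", "cy")` picks out the Welsh label, while `expand("school:401874")` yields the full URI shown on slide 8.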
    14. Modelling – vocabularies
       • Logical modelling
          – modelling the domain, not a particular data structure
          – what exists, what is asserted? what can you deduce from that?
          – not about constraints as such
          – monotonic, open world
       • Diagram labels: controlled vocabulary, taxonomy, thesaurus, ontology
    15. Modelling – vocabularies
       • unfamiliar terminology, but related to:
          – information architecture and conceptual modelling
          – domain-driven design
          – ... and yes, knowledge representation
    16. RDF Schema is…
       • Elements of:
          – Vocabulary (defining terms): I define a relationship called "prescribed dose."
          – Schema (defining types): "prescribed dose" relates "treatments" to "dosages"
          – Taxonomy (defining hierarchies): any "doctor" is a "medical professional"
    17. Modelling – RDFS
       • RDF vocabulary description language
       • classes, types and type hierarchy
       • Diagram:
          – ont:School rdf:type rdfs:Class
          – ont:School rdfs:label "School"
    18. Modelling – RDFS
       • RDF vocabulary description language
       • classes, types and type hierarchy
       • Diagram:
          – ont:School rdf:type rdfs:Class
          – ont:School rdfs:label "School"
          – ont:WelshEstablishment rdf:type rdfs:Class
          – ont:WelshEstablishment rdfs:subClassOf ont:School
    19. Modelling – RDFS
       • RDF vocabulary description language
       • classes, types and type hierarchy
       • Diagram: as on slide 18, plus
          – school:401874 rdf:type ont:WelshEstablishment
    20. Modelling – RDFS
       • RDF vocabulary description language
       • classes, types and type hierarchy
       • Diagram: as on slide 19, plus the inferred triple
          – school:401874 rdf:type ont:School
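The inference shown on slide 20 can be sketched as a small fixed-point loop over the triples. This is an illustrative sketch, not a full RDFS reasoner: it applies only the two standard RDFS entailment patterns for `rdfs:subClassOf` (type propagation and transitivity), and the function name `infer_types` is hypothetical.

```python
def infer_types(triples):
    """Return the input triples plus those entailed by rdfs:subClassOf.

    Two patterns are applied until nothing new appears:
      x rdf:type C,        C rdfs:subClassOf D  =>  x rdf:type D
      B rdfs:subClassOf C, C rdfs:subClassOf D  =>  B rdfs:subClassOf D
    """
    closed = set(triples)
    while True:
        new = set()
        for cls, p, parent in closed:
            if p != "rdfs:subClassOf":
                continue
            for s2, p2, o2 in closed:
                if o2 == cls and p2 in ("rdf:type", "rdfs:subClassOf"):
                    new.add((s2, p2, parent))
        if new <= closed:  # fixed point reached
            return closed
        closed |= new


# The asserted triples from slide 19.
ASSERTED = {
    ("ont:School", "rdf:type", "rdfs:Class"),
    ("ont:WelshEstablishment", "rdf:type", "rdfs:Class"),
    ("ont:WelshEstablishment", "rdfs:subClassOf", "ont:School"),
    ("school:401874", "rdf:type", "ont:WelshEstablishment"),
}
INFERRED = infer_types(ASSERTED)
```

The result keeps every asserted triple and adds `school:401874 rdf:type ont:School`, which is the extra edge drawn on slide 20.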
    21. Modelling – RDFS
       • RDF vocabulary description language
       • properties, property hierarchy
       • Diagram:
          – ont:headOf rdf:type rdf:Property
          – ont:headOf rdfs:subPropertyOf ont:staffAt
          – person:JoeBloggs ont:headOf school:401874
          – inferred: person:JoeBloggs ont:staffAt school:401874
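The property hierarchy on slide 21 works the same way as the class hierarchy. The sketch below is a simplified illustration, with `infer_subproperties` as a hypothetical name: whenever a statement uses a property, the same statement is added for each declared super-property, and the loop repeats so that chains of `rdfs:subPropertyOf` are followed.

```python
def infer_subproperties(triples):
    """Return the input triples plus those entailed by rdfs:subPropertyOf.

    Pattern applied until nothing new appears:
      s P o,  P rdfs:subPropertyOf Q  =>  s Q o
    """
    closed = set(triples)
    while True:
        # (sub-property, super-property) pairs declared so far
        pairs = {(s, o) for s, p, o in closed if p == "rdfs:subPropertyOf"}
        new = {
            (s, sup, o)
            for s, p, o in closed
            for sub, sup in pairs
            if p == sub
        }
        if new <= closed:  # fixed point reached
            return closed
        closed |= new


# Triples as read from the slide 21 diagram.
STAFF = {
    ("ont:headOf", "rdf:type", "rdf:Property"),
    ("ont:headOf", "rdfs:subPropertyOf", "ont:staffAt"),
    ("person:JoeBloggs", "ont:headOf", "school:401874"),
}
STAFF_INFERRED = infer_subproperties(STAFF)
```

Here the only new triple is `person:JoeBloggs ont:staffAt school:401874`: being head of a school implies being on its staff.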
    22. Modelling – RDFS
       • RDF vocabulary description language
       • class/property relations: domain, range
       • Already have the power to do some vocabulary mapping: declare classes or properties from different vocabularies to be equivalent
          – A rdfs:subClassOf B
          – B rdfs:subClassOf A
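The mapping trick on slide 22 (two terms that are each declared a subclass of the other are equivalent) can be detected with a few lines of code. This is a minimal sketch under assumptions: `equivalent_pairs` is a hypothetical helper, the `ex:` and `other:` terms are made up for illustration, and passing `rdfs:subPropertyOf` as the predicate gives the analogous check for properties.

```python
def equivalent_pairs(triples, predicate="rdfs:subClassOf"):
    """Return the set of term pairs that are declared in both directions.

    Each pair is a frozenset {A, B} such that both
    'A predicate B' and 'B predicate A' are present.
    """
    declared = {(s, o) for s, p, o in triples if p == predicate}
    return {
        frozenset((a, b))
        for a, b in declared
        if a != b and (b, a) in declared
    }


# Made-up vocabularies, for illustration only.
MAPPING = [
    ("ex:School", "rdfs:subClassOf", "other:EducationalInstitution"),
    ("other:EducationalInstitution", "rdfs:subClassOf", "ex:School"),
    ("ex:PrimarySchool", "rdfs:subClassOf", "ex:School"),
]
EQUIVALENT = equivalent_pairs(MAPPING)
```

Only the first two declarations form a mutual pair, so `ex:PrimarySchool` stays a strict subclass.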
    23. OWL is…
       • the Web Ontology Language (abbreviated OWL, not "WOL")
    24. OWL is…
       • Elements of ontology
          – Same/different identity
             • "author" and "auteur" are the same relation
             • two resources with the same "ISBN" are the same "book"
          – More expressive type definitions
             • A "cycle" is a "vehicle" with at least one "wheel"
             • A "bicycle" is a "cycle" with exactly two "wheels"
          – More expressive relation definitions
             • "sibling" is a symmetric predicate
             • the value of the "favorite dwarf" relation must be one of "happy", "sleepy", "sneezy", "grumpy", "dopey", "bashful", "doc"
    25. What can we do with OWL?
       • Answer questions of:
          – Consistency: are there any contradictions in this model?
          – Classification: what are all the inferred types of this resource?
          – Satisfiability: are there any classes in this ontology that cannot possibly have any members?
    26. Building Useful Ontologies
       • Developing and maintaining quality ontologies is very challenging
       • Users need tools and services, e.g., to help check if an ontology is:
          – Meaningful: all named classes can have instances
       • Source: http://www.aber.ac.uk/compsci/public/media/presentations/OUCL-seminar.ppt
    27. Building Useful Ontologies
       • Developing and maintaining quality ontologies is very challenging
       • Users need tools and services, e.g., to help check if an ontology is:
          – Meaningful: all named classes can have instances
          – Correct: captures the intuitions of domain experts
    28. Building Useful Ontologies
       • Developing and maintaining quality ontologies is very challenging
       • Users need tools and services, e.g., to help check if an ontology is:
          – Meaningful: all named classes can have instances
          – Correct: captures the intuitions of domain experts
          – Minimally redundant: no unintended synonyms (e.g. "Banana split" and "Banana sundae")
    29. Modelling – OWL
       • richer modelling and semantics
       • axioms on properties
          – transitive, symmetric, inverseOf, ...
          – functional, inverse functional
          – equivalent property
       • axioms on classes
          – intersection, union, disjoint, equivalent
       • restrictions on classes
          – some values from, all values from, cardinality, has value, one of, keys
       • axioms on individuals
          – same as, different from, all different
       • imports
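Three of the property axioms listed on slide 29 (symmetric, transitive, inverseOf) can be illustrated with a small closure function. This is a hedged sketch, not an OWL reasoner: `apply_property_axioms` and all `ex:` names are hypothetical, and the axioms are passed in as plain Python arguments instead of being read from an ontology.

```python
def apply_property_axioms(facts, symmetric=(), transitive=(), inverse_of=None):
    """Close a set of (s, p, o) facts under three kinds of property axioms.

    symmetric:  properties P where  s P o           =>  o P s
    transitive: properties P where  s P o, o P z    =>  s P z
    inverse_of: dict {P: Q} where   s P o           =>  o Q s  (and back)
    """
    inverses = dict(inverse_of or {})
    # inverseOf works in both directions, so add the reversed entries too.
    inverses.update({q: p for p, q in list(inverses.items())})
    closed = set(facts)
    while True:
        new = set()
        for s, p, o in closed:
            if p in symmetric:
                new.add((o, p, s))
            if p in inverses:
                new.add((o, inverses[p], s))
            if p in transitive:
                new.update(
                    (s, p, o2) for s2, p2, o2 in closed if p2 == p and s2 == o
                )
        if new <= closed:  # fixed point reached
            return closed
        closed |= new


# Made-up facts, for illustration only.
FACTS = {
    ("ex:alice", "ex:sibling", "ex:bob"),
    ("ex:wheel", "ex:partOf", "ex:bicycle"),
    ("ex:spoke", "ex:partOf", "ex:wheel"),
    ("ex:joe", "ex:headOf", "ex:school"),
}
CLOSED = apply_property_axioms(
    FACTS,
    symmetric={"ex:sibling"},
    transitive={"ex:partOf"},
    inverse_of={"ex:headOf": "ex:hasHead"},
)
```

The closure adds exactly three facts: the reversed sibling statement, `ex:spoke ex:partOf ex:bicycle`, and `ex:school ex:hasHead ex:joe`.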
    30. Modelling – OWL
       • supports much richer modelling
       • consistency checking of model
       • consistency checking of data
          – some surprises if you are used to schema languages
          – open world, no unique name assumption
          – can extend to closed world checking
       • inference
          – classification
          – inferred relationships
    31. Modelling: spectrum of goals and styles
       • Lightweight vocabularies
          – simple modelling
          – just enough agreement to get useful work done
          – removing boundaries to enable information to be found and connected
          – global consistency not possible
          – a little semantics goes a long way
       • Rich ontological models
          – rich domain models
          – need expressivity
          – consistency is critical
          – make complex inferences you can rely on, across data you trust
          – knowledge is power
    32. Modelling: ontology reuse
       • invest in a complete ontology for a domain
          – rich but general model, may be modular inside
          – strong "ontological commitment"
          – e.g. medical ontologies
       • reuse small, common vocabularies
          – FOAF, SIOC, Dublin Core, Org ...
          – pick and choose the classes and properties you need
          – fill in a few missing links for your domain
       • generic reusable vocabularies
          – Data cube vocabulary
    33. Reusable, public ontologies
       • Measurement Units Ontology
       • The Event Ontology
       • FOAF
    34. Schema.org
       • schema.org is one of a number of microdata vocabularies
       • it is a shared collection of microdata schemas for use by webmasters
       • includes a type hierarchy, like an RDFS schema
          – starts with top-level Thing and DataType types
          – properties are inherited by descendant types
    35. microdata properties
       • annotate an item with text-valued properties using the "itemprop" attribute

          <div itemscope>
            <p>My name is <span itemprop="name">Daniel</span>.</p>
          </div>

          <div itemscope>
            <p>Flavors in my favorite ice cream:</p>
            <ul>
              <li itemprop="flavor">Lemon sorbet</li>
              <li itemprop="flavor">Apricot sorbet</li>
            </ul>
          </div>
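To show how the `itemprop` annotations on slide 35 can be read back, the sketch below uses Python's standard `html.parser`. It is a deliberately simplified illustration, not a conforming microdata parser: `ItempropCollector` is a hypothetical class, it assumes every element has an explicit closing tag, and it does not handle nested `itemscope` elements or value attributes such as `href` and `content`.

```python
from html.parser import HTMLParser


class ItempropCollector(HTMLParser):
    """Collect text-valued itemprop properties, one dict per itemscope."""

    def __init__(self):
        super().__init__()
        self.items = []    # one {property: [values]} dict per itemscope
        self._open = []    # itemprop name (or None) for each open element
        self._text = None  # text buffer while inside an itemprop element

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if "itemscope" in attrs:
            self.items.append({})
        prop = attrs.get("itemprop")
        if prop is not None:
            self._text = []
        self._open.append(prop)

    def handle_data(self, data):
        if self._text is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if not self._open:
            return
        prop = self._open.pop()
        if prop is not None:
            value = "".join(self._text or []).strip()
            if self.items:
                self.items[-1].setdefault(prop, []).append(value)
            self._text = None


# The two snippets from slide 35.
SLIDE_HTML = """
<div itemscope><p>My name is <span itemprop="name">Daniel</span>.</p></div>
<div itemscope><p>Flavors in my favorite ice cream:</p>
<ul><li itemprop="flavor">Lemon sorbet</li>
<li itemprop="flavor">Apricot sorbet</li></ul></div>
"""

collector = ItempropCollector()
collector.feed(SLIDE_HTML)
ITEMS = collector.items
```

Each `itemscope` becomes one item, and a property that appears twice (like `flavor`) simply gets two values.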
    36. Why should you use schema.org?
       • Google, Yahoo, Bing
    37. Top types
    38. Schema.rdfs.org
       • maintains schema.org ↔ RDF mappings
          – there are mappings for BIBO, DBpedia, Dublin Core, FOAF, GoodRelations, SIOC, and WordNet
       • also provides examples, tutorials, and data dumps
    39. Triple Store
    40. Triple Store & RDB
       • http://blog.gniewoslaw.pl/2012/11/relational-databases-vs-triple-stores/
    41. Storage Solutions for RDF Data
       • Triple Table (Basic Idea)
          – Store all RDF triples in a single table
          – Create indexes on combinations of S, P, and O
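The triple-table idea on slide 41 can be tried out with Python's built-in `sqlite3` module. This is a minimal sketch, not how Jena or 4Store are implemented: the table and index names, the choice of which column combinations to index, and the `match` helper are all assumptions made for illustration.

```python
import sqlite3


def make_triple_store():
    """Create an in-memory store: one table, indexes on S/P/O combinations."""
    conn = sqlite3.connect(":memory:")
    # The primary key doubles as an (s, p, o) index.
    conn.execute(
        "CREATE TABLE triples ("
        "s TEXT NOT NULL, p TEXT NOT NULL, o TEXT NOT NULL, "
        "PRIMARY KEY (s, p, o))"
    )
    # Extra orderings so lookups by predicate or by object can use an index.
    conn.execute("CREATE INDEX idx_pos ON triples (p, o, s)")
    conn.execute("CREATE INDEX idx_osp ON triples (o, s, p)")
    return conn


def match(conn, s=None, p=None, o=None):
    """Return all triples matching a pattern; None acts as a wildcard."""
    clauses, params = [], []
    for column, value in (("s", s), ("p", p), ("o", o)):
        if value is not None:
            clauses.append(column + " = ?")
            params.append(value)
    where = " WHERE " + " AND ".join(clauses) if clauses else ""
    query = "SELECT s, p, o FROM triples" + where + " ORDER BY s, p, o"
    return conn.execute(query, params).fetchall()


store = make_triple_store()
store.executemany(
    "INSERT INTO triples VALUES (?, ?, ?)",
    [
        ("school:401874", "rdfs:label", "Cardiff High School"),
        ("school:401874", "ont:districtAdministrative", "la:00PT"),
        ("la:00PT", "rdfs:label", "Cardiff"),
    ],
)
LABEL_ROWS = match(store, p="rdfs:label")
```

Every query pattern is just a combination of bound and unbound columns, which is why the slide suggests indexing several orderings of S, P, and O.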
    42. The Internet Map
       • http://internet-map.net/
    43. Credits
       • These slides are partially based on "Linked data and its role in the semantic web" by Dave Reynolds, Epimorphics Ltd.
    44. OKFN Korea