06 gioca-ontologies

OWL representation of Cultural Inheritance

Published in: Education, Technology

Transcript

  • 1. Chapter 6: Metadata and ontologies for digital cultural heritage documentation
      Information Technology and Arts Organizations, A.A. 2010-2011
  • 2. Syllabus (3/3)
      5. Databases
         1. Entities, attributes and relations
         2. Primary key and foreign key, data domain, query language (SQL)
         3. Examples using Access DBMS
         4. Spatial access to digital content: GIS and GPS
         5. GIS examples using ESRI ArcView
      6. Metadata and ontologies for digital cultural heritage documentation
         1. XML, RDF
         2. Dublin Core
         3. Semantic Web
         4. OWL, ontologies
         5. CIDOC-CRM
  • 3. Motivations
      • Digital data are stored in files and databases
      • Data representation matters: if common conventions are adopted, different applications can cooperate, communicate and process data to provide advanced services (interoperability)
      • The Internet is an ideal medium for spreading digital information about art and culture
      • Many standards exist, so a high level of interoperability is difficult to achieve
      • Information can be written in many ways (different languages, synonyms, ...)
      • META-DATA means “data about data”
  • 4. Looking for a film budget...
      [Figure: two searches compared, labelled "WEB 1.0 / SQL" and "WEB 2.0 / SPARQL"; one returns 326.000 results, the other 1 result ☺]
  • 5. A cultural search engine... http://e-culture.multimedian.nl/demo/session/search
  • 6. MultimediaN N9C Eculture project
  • 7. A step toward ontologies...
      Text:
        “during these few lectures I’ve understood the importance of new technological devices in the arts: in particular way, they could really help the public in better understand the context and the history of a piece of art. The concepts I’ve learned make me think about new opportunities and new systems which can be implemented using basic devices and means of communication. I would like to learn more about ICTs in the fields of music and live performances, if there are some important steps forward in this direction, because they are the scenarios in which I’m more interested in. I’m also interested in archives and in query formulation. I personally have some difficulties with the computer language and with its working mechanisms. I also have some problems with abbreviations: I would like to know literary the meaning of these acronyms in order to better understand their applicability and functions. In these sessions I’ve found interesting the fact that even if technology seems very complicated, in reality it is a sum of different small and simple components which can be all be leaded by a fundamental rationale.”
        • No structure
        • For a computer it is just a sequence of chars
      XML (eXtensible Markup Language):
        <feedback>
          <description>
            This survey takes less than ten minutes to be completed. The first section is composed of 10 multiple choices questions followed by 5 open questions. Thank you for your feedback.
          </description>
          <course>
            Please insert here the name of the module you are going to evaluate
          </course>
          <student>
            <name>Put here your name</name>
            <ID>What we can use for this?</ID>
          </student>
        </feedback>
        • TAG
        • Structured data
        • TAGs can be nested
        • Nesting does not represent any “relationship”
        • It can be represented by a tree
        • TAG names are free
        • No “common meaning” associated to each tag
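As a minimal sketch of the "it can be represented by a tree" point, the feedback form can be parsed with Python's standard `xml.etree.ElementTree` module and its nesting printed as an outline. The description text is abridged here and the `outline` helper is a hypothetical name introduced only for this illustration.

```python
import xml.etree.ElementTree as ET

# Abridged copy of the feedback form from the slide.
FEEDBACK_XML = """<feedback>
  <description>This survey takes less than ten minutes to be completed.</description>
  <course>Please insert here the name of the module you are going to evaluate</course>
  <student>
    <name>Put here your name</name>
    <ID>What we can use for this?</ID>
  </student>
</feedback>"""


def outline(element, depth=0):
    """Return one indented line per tag, mirroring the nesting of the document."""
    lines = ["  " * depth + element.tag]
    for child in element:
        lines.extend(outline(child, depth + 1))
    return lines


root = ET.fromstring(FEEDBACK_XML)
tree_lines = outline(root)
print("\n".join(tree_lines))
```

The parser recovers the structure but attaches no meaning to tag names such as `student`: that is the gap the following slides address.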
  • 8. ...second step...
      RDF (Resource Description Framework)
        • Triples (subject-predicate-object)
        • Statements
        • We can relate different resources
        • It can be represented by a graph
        • Everything is uniquely identified (URI)
        • Namespaces, vocabularies
      General pattern:
        <rdf:RDF>
          <rdf:Description rdf:about="subject">
            <predicate rdf:resource="object" />
            <predicate>literalvalue</predicate>
          </rdf:Description>
          ...
          <rdf:Description .... />
        </rdf:RDF>
      Example (the slide labels the namespace declarations, the URI, the literal values and the statement):
        <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
                 xmlns:dc="http://purl.org/dc/elements/1.1/">
          <rdf:Description rdf:about="http://en.wikipedia.org/wiki/Oxford">
            <dc:title>Oxford</dc:title>
            <dc:coverage>Oxfordshire</dc:coverage>
            <dc:publisher rdf:resource="http://en.wikipedia.org" />
          </rdf:Description>
        </rdf:RDF>
      DC (Dublin Core)
        • Just 15 elements
        • A DC resource can be represented using RDF/XML
        • It can be seen as a namespace for resource description
        • It can be used to describe a single resource
        • To describe a complex domain we need something different
        • ISO Standard 15836:2009
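To make the subject-predicate-object reading of the Oxford example concrete, here is a small sketch that extracts triples from it using only the Python standard library. It assumes the flat `rdf:Description` layout shown on the slide and is not a general RDF/XML parser; `extract_triples` is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"

RDF_XML = """<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:dc="http://purl.org/dc/elements/1.1/">
  <rdf:Description rdf:about="http://en.wikipedia.org/wiki/Oxford">
    <dc:title>Oxford</dc:title>
    <dc:coverage>Oxfordshire</dc:coverage>
    <dc:publisher rdf:resource="http://en.wikipedia.org" />
  </rdf:Description>
</rdf:RDF>"""


def extract_triples(rdf_xml):
    """Collect (subject, predicate, object) from flat rdf:Description blocks."""
    triples = []
    root = ET.fromstring(rdf_xml)
    for description in root.findall(f"{{{RDF_NS}}}Description"):
        subject = description.get(f"{{{RDF_NS}}}about")
        for prop in description:
            # ElementTree stores qualified names as "{namespace}local".
            namespace, local = prop.tag[1:].split("}")
            predicate = namespace + local
            resource = prop.get(f"{{{RDF_NS}}}resource")
            # The object is either another resource (URI) or a literal value.
            obj = resource if resource is not None else (prop.text or "").strip()
            triples.append((subject, predicate, obj))
    return triples


triples = extract_triples(RDF_XML)
for triple in triples:
    print(triple)
```

Each printed tuple is one statement: the first two relate Oxford to literal values, the third relates it to another resource.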
  • 9. ...third step!
      RDFS (RDF Schema)
        • RDF + meaning to “special resources”
        • Concept of Class
        • Predicate is also known as Property
      OWL (Web Ontology Language)
        • Can be written in RDF
        • Added “new properties”
          • intersectionOf, unionOf, complementOf
          • someValuesFrom, allValuesFrom
        • Cardinality of a property
        • Different types of property
          • Symmetric
          • Functional
          • InverseFunctional
      CIDOC-CRM (Conceptual Reference Model)
        “The CIDOC Conceptual Reference Model (CRM) provides definitions and a formal structure for describing the implicit and explicit concepts and relationships used in cultural heritage documentation.”
        • An ontology of 80 classes and 132 properties
        • ISO Standard 21127:2006
  • 10. The Semantic Web project
      “Most of the Web's content today is designed for humans to read, not for computer programs to manipulate meaningfully”
      Berners-Lee, Tim; James Hendler and Ora Lassila (May 17, 2001). "The Semantic Web". Scientific American Magazine. http://www.sciam.com/article.cfm?id=the-semantic-web&print=true. Retrieved March 26, 2008.
      http://en.wikipedia.org/wiki/Semantic_Web
      http://en.wikipedia.org/wiki/DBpedia
  • 11. XML: gives a structure to data (syntax)
        <TagNameA attrName1="AttrValueX" … attrNameN="AttrValueY">Text</TagNameA>
      OR
        <TagNameB attrName1="AttrValueX" … attrNameN="AttrValueY" />
      Example:
        <root>
          <author>
            <name>Luca</name>
            <email>luca.roffia@unibo.it</email>
          </author>
          <author2 name="Luca" email="luca.roffia@unibo.it"/>
        </root>
      • Tags can be nested: the first opened is the last to be closed, so the structure of an XML document can be represented by a tree
      • Tag names are case-sensitive: the closing tag must match the opening tag exactly
      • A well-formed document has only one root element
      • At the beginning of the document (before the root element) there is a line that declares the language, the version, the encoding and other characteristics, e.g. <?xml version="1.0" encoding="UTF-8"?>
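The well-formedness rules can be checked mechanically. The sketch below uses the standard `xml.etree.ElementTree` parser, which raises `ParseError` on badly nested tags, mismatched tag names, or more than one root element; `is_well_formed` is a hypothetical helper name for this illustration.

```python
import xml.etree.ElementTree as ET


def is_well_formed(text):
    """Return True if the parser accepts the text as a single XML document."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


print(is_well_formed("<root><author><name>Luca</name></author></root>"))
print(is_well_formed("<root><a></root></a>"))        # closed in the wrong order
print(is_well_formed("<TagnameA>Text</TagNameA>"))   # tag names differ in case
print(is_well_formed("<a/><b/>"))                    # two root elements
```

Only the first document passes. Well-formedness is purely syntactic, so a document can pass this check and still carry no agreed meaning.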
  • 12. Semantics in XML
      Same meaning but different structure:
        <Works>
          <Work1>Monnalisa</Work1>
          <Work2>Last Supper</Work2>
        </Works>
      and
        <Works Work1="Monnalisa" Work2="Last Supper"/>
      Same structure but different meaning:
        <Works>
          <Work1>Monnalisa</Work1>
          <Work2>Last Supper</Work2>
        </Works>
      and
        <Europe>
          <Nation1>Italy</Nation1>
          <Nation2>France</Nation2>
        </Europe>
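A short sketch of why structure alone does not carry meaning: the two "Works" documents differ in shape yet hold the same values, while "Works" and "Europe" share a shape but describe unrelated things. The `shape` and `values` helpers are hypothetical names introduced for this example.

```python
import xml.etree.ElementTree as ET

AS_ELEMENTS = "<Works><Work1>Monnalisa</Work1><Work2>Last Supper</Work2></Works>"
AS_ATTRIBUTES = '<Works Work1="Monnalisa" Work2="Last Supper"/>'
SAME_SHAPE = "<Europe><Nation1>Italy</Nation1><Nation2>France</Nation2></Europe>"


def shape(xml_text):
    """Describe only the structure: (number of children, number of attributes)."""
    root = ET.fromstring(xml_text)
    return (len(root), len(root.attrib))


def values(xml_text):
    """Collect the values, whether stored as child text or as attributes."""
    root = ET.fromstring(xml_text)
    return sorted([child.text for child in root] + list(root.attrib.values()))


print(shape(AS_ELEMENTS), shape(AS_ATTRIBUTES), shape(SAME_SHAPE))
print(values(AS_ELEMENTS) == values(AS_ATTRIBUTES))
```

A program has to be told about both layouts in advance to treat the first two documents as equivalent, and nothing in the syntax tells it that the third one is about nations rather than artworks.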
  • 13. The next step... RDF
      • Resource: everything we want to identify. Identification is done by using a URI (Uniform Resource Identifier): a URL (called namespace or prefix) + suffix
      • Statement: a triple of the form subject - predicate - object
        - Subject and predicate are resources, i.e. they are identified by a URI
        - Object can be a resource or a primitive type, e.g. number, string
      Example URIs:
        http://dbpedia.org/resource/ (Namespace) + Rio_de_Janeiro (Suffix)
          URI: http://dbpedia.org/resource/Rio_de_Janeiro
        http://dbpedia.org/property/ (Namespace) + populationTotal (Suffix)
          URI: http://dbpedia.org/property/populationTotal
        http://dbpedia.org/ontology/ (Namespace) + birthPlace (Suffix)
          URI: http://dbpedia.org/ontology/birthPlace
        http://dbpedia.org/resource/ (Namespace) + Paulo_Coelho (Suffix)
          URI: http://dbpedia.org/resource/Paulo_Coelho
      Statements:
        http://dbpedia.org/resource/Rio_de_Janeiro - http://dbpedia.org/property/populationTotal - 6093472
        http://dbpedia.org/resource/Rio_de_Janeiro - http://dbpedia.org/ontology/birthPlace - http://dbpedia.org/resource/Paulo_Coelho
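The namespace + suffix construction and the two statements can be sketched as plain data. This is only an illustration: the `Statement` type and the helper names are assumptions of this example, the statements are copied as written on the slide, and treating any string starting with `http://` as a resource is a deliberate simplification.

```python
from typing import NamedTuple, Union

RESOURCE = "http://dbpedia.org/resource/"
PROPERTY = "http://dbpedia.org/property/"
ONTOLOGY = "http://dbpedia.org/ontology/"


class Statement(NamedTuple):
    subject: str
    predicate: str
    obj: Union[str, int]  # a resource URI or a primitive value


def uri(namespace, suffix):
    """A URI is the namespace followed by the suffix."""
    return namespace + suffix


def object_is_resource(statement):
    """Simplified test: URIs are strings starting with http://, anything else is a literal."""
    return isinstance(statement.obj, str) and statement.obj.startswith("http://")


statements = [
    Statement(uri(RESOURCE, "Rio_de_Janeiro"), uri(PROPERTY, "populationTotal"), 6093472),
    Statement(uri(RESOURCE, "Rio_de_Janeiro"), uri(ONTOLOGY, "birthPlace"), uri(RESOURCE, "Paulo_Coelho")),
]

for statement in statements:
    kind = "resource" if object_is_resource(statement) else "literal"
    print(statement.predicate, "->", kind)
```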
  • 14. RDF graph
      • A set of RDF statements can be used to describe a domain. This set is called an RDF knowledge base
      • An RDF knowledge base can be represented by a labelled graph: each node represents a resource, i.e. a subject or an object, and each edge represents a predicate
      [Figure: the Drupal 7 data model drawn as an RDF graph, with subject, predicate and object labels and the namespaces used]
  • 15. RDF/XML statement
      • An RDF statement can be expressed by using the XML syntax
      • In order to make an RDF statement more concise, a namespace can be specified by using this convention: @prefix namespace:URL
      Examples:
        @prefix dbpedia-owl:http://dbpedia.org/ontology/
        @prefix dbpprop:http://dbpedia.org/property/
        @prefix dbpedia:http://dbpedia.org/resource/
      The statement
        http://dbpedia.org/resource/Rio_de_Janeiro - http://dbpedia.org/ontology/birthPlace - http://dbpedia.org/resource/Paulo_Coelho
      becomes:
        dbpedia:Rio_de_Janeiro - dbpedia-owl:birthPlace - dbpedia:Paulo_Coelho
      RDF/XML:
        <rdf:Description about="dbpedia:Rio_de_Janeiro">
          <dbpedia-owl:birthPlace>"dbpedia:Paulo_Coelho"</dbpedia-owl:birthPlace>
        </rdf:Description>
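The prefix convention amounts to a lookup table from short names to namespace URLs. A minimal sketch, using the three prefixes from the slide and two hypothetical helpers (`expand`, `abbreviate`):

```python
PREFIXES = {
    "dbpedia-owl": "http://dbpedia.org/ontology/",
    "dbpprop": "http://dbpedia.org/property/",
    "dbpedia": "http://dbpedia.org/resource/",
}


def expand(name):
    """Turn prefix:suffix into the full URI."""
    prefix, _, suffix = name.partition(":")
    return PREFIXES[prefix] + suffix


def abbreviate(uri):
    """Turn a full URI into prefix:suffix when a known namespace matches."""
    for prefix, namespace in PREFIXES.items():
        if uri.startswith(namespace):
            return prefix + ":" + uri[len(namespace):]
    return uri  # unknown namespace: leave the URI untouched


full = (
    "http://dbpedia.org/resource/Rio_de_Janeiro",
    "http://dbpedia.org/ontology/birthPlace",
    "http://dbpedia.org/resource/Paulo_Coelho",
)
short = tuple(abbreviate(term) for term in full)
print(" - ".join(short))
```

Expanding the short form again gives back the original URIs, so the abbreviation loses no information.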
  • 16. Dublin Core: a famous “namespace”
      Dublin Core (DC) was born in Dublin (Ohio) in 1995. It was created by a research group organized by the Online Computer Library Center (OCLC) and the National Center for Supercomputing Applications (NCSA).
      Motivations:
      • Museums organize and present their resources in different ways
      • Even when the structures used to handle information are compatible, data are often hard to interpret because of differing terminology and semantics
      • Efforts to integrate cultural resources across institutions are still limited. Integrating resources owned by different cultural sites could be very helpful for users, who could search different kinds of resources, available in different formats (from a real object to a digital representation), through a unified interface
      • The main obstacle is the structural and semantic incompatibility between the information systems hosted by institutions
      It is important to adopt a common, standard data interchange format: a standard to represent information
  • 17. CIMI (Computer Interchange of Museum Information)
      CIMI is “a consortium of cultural heritage institutions and organizations working together to remove barriers to sharing our most valuable cultural information.”
      The consortium develops relevant standards and encourages open, standards-based approaches to creating and sharing digital information.
      CIMI worked on the application of Dublin Core to museum resources and supplied guidelines for implementing this standard in the cultural heritage domain.
      ISO Standard 15836:2009 of February 2009
  • 18. Dublin Core goals
      • Simplicity of creation and maintenance
        The Dublin Core element set has been kept as small and simple as possible to allow a non-specialist to create simple descriptive records for information resources easily and inexpensively, while providing for effective retrieval of those resources in the networked environment
      • Commonly understood semantics
        The Dublin Core can help a non-specialist searcher to find her way by supporting a common set of elements, the semantics (meaning) of which are universally understood and supported
      • Extensibility
        Dublin Core developers have recognized the importance of providing a mechanism for extending the DC element set for additional resource discovery needs
      Dublin Core aims to allow the exchange of information across different environments and to simplify the discovery of resources: communicate, collaborate, exchange
  • 19. One-to-one principle
      In general, Dublin Core metadata describes one manifestation or version of a resource, rather than assuming that manifestations stand in for one another.
      • Surrogates are described separately from the original object
        A JPEG image of the Mona Lisa has much in common with the original painting, but it is not the same as the painting, so the digital image should be described as itself. The distinction between originals and surrogates matters for museums, where the originals are exhibited and the surrogates have to be described accurately but also efficiently
      • This principle, in many cases, simplifies the resource description
        The author of the original Mona Lisa is the painter, while the author of the photo is the photographer
  • 20. The 15 DC ELEMENTS
      RESOURCE
        • TITLE: A name given to the resource.
        • DESCRIPTION: Description may include but is not limited to: an abstract, a table of contents, a graphical representation, or a free-text account of the resource.
        • TYPE: Recommended best practice is to use a controlled vocabulary such as the DCMI Type Vocabulary [DCMITYPE]. To describe the file format, physical medium, or dimensions of the resource, use the Format element.
        • COVERAGE: Spatial topic and spatial applicability may be a named place or a location specified by its geographic coordinates. A jurisdiction may be a named administrative entity or a geographic place to which the resource applies. Recommended best practice is to use a controlled vocabulary such as the Thesaurus of Geographic Names [TGN]. Temporal topic may be a named period, date, or date range. Where appropriate, named places or time periods can be used in preference to numeric identifiers such as sets of coordinates or date ranges.
        • SUBJECT: Typically, the subject will be represented using keywords, key phrases, or classification codes. Recommended best practice is to use a controlled vocabulary. To describe the spatial or temporal topic of the resource, use the Coverage element.
      RELATIONSHIPS
        • RELATION: Recommended best practice is to identify the related resource by means of a string conforming to a formal identification system. Relationships may be expressed reciprocally (if the resources on both ends of the relationship are being described) or in one direction only, even when there is a refinement available to allow reciprocity. If text strings are used instead of identifying numbers, the reference should be appropriately specific. For instance, a formal bibliographic citation might be used to point users to a particular resource.
        • SOURCE: The described resource may be derived from the related resource in whole or in part. Recommended best practice is to identify the related resource by means of a string conforming to a formal identification system. In general, include in this area information about a resource that is related intellectually to the described resource but does not fit easily into a Relation element, e.g. Image from page 54 of the 1922 edition of Romeo and Juliet.
      INTELLECTUAL PROPERTIES
        • CREATOR: Examples of a Creator include a person, an organization, or a service. Typically, the name of a Creator should be used to indicate the entity.
        • CONTRIBUTOR: Examples of a Contributor include a person, an organization, or a service. Typically, the name of a Contributor should be used to indicate the entity.
        • PUBLISHER: Examples of a Publisher include a person, an organization, or a service. Typically, the name of a Publisher should be used to indicate the entity.
        • RIGHTS: Typically, rights information includes a statement about various property rights associated with the resource, including intellectual property rights.
      IDENTIFICATION
        • DATE: Date may be used to express temporal information at any level of granularity. Recommended best practice is to use an encoding scheme, such as the W3CDTF profile of ISO 8601 [W3CDTF].
        • FORMAT: Examples of dimensions include size and duration. Recommended best practice is to use a controlled vocabulary such as the list of Internet Media Types [MIME].
        • IDENTIFIER: Recommended best practice is to identify the resource by means of a string conforming to a formal identification system.
        • LANGUAGE: Recommended best practice is to use a controlled vocabulary such as RFC 4646 [RFC4646].
      http://dublincore.org/documents/usageguide/elements.shtml
      http://dublincore.org/documents/dcmi-terms/
  • 21. AN EXAMPLE OF DUBLIN CORE IN RDF
        <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
                 xmlns:dc="http://purl.org/metadata/dublin_core#">
          <rdf:Description rdf:about="http://www.dlib.org">
            <dc:Title>D-Lib Program</dc:Title>
            <dc:Description>
              The D-Lib program supports the community of people with research interests in digital libraries and electronic publishing.
            </dc:Description>
            <dc:Publisher>Corporation For National Research Initiatives</dc:Publisher>
            <dc:Date>1995-01-07</dc:Date>
            <dc:Subject>Research; statistical methods</dc:Subject>
            <dc:Type>World Wide Web Home Page</dc:Type>
            <dc:Format>text/html</dc:Format>
            <dc:Language>en</dc:Language>
          </rdf:Description>
        </rdf:RDF>
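A record of this kind can also be generated programmatically. The sketch below builds a reduced version of the D-Lib record with Python's standard `xml.etree.ElementTree`. Note two assumptions: it uses the `http://purl.org/dc/elements/1.1/` namespace with lowercase element names (as in the Oxford example earlier in these slides) rather than the older namespace shown in this slide, and `dublin_core_record` is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
DC_NS = "http://purl.org/dc/elements/1.1/"  # assumption: DC 1.1 namespace, not the slide's older one


def dublin_core_record(about, fields):
    """Serialize one resource description as RDF/XML with Dublin Core elements."""
    ET.register_namespace("rdf", RDF_NS)
    ET.register_namespace("dc", DC_NS)
    root = ET.Element(f"{{{RDF_NS}}}RDF")
    description = ET.SubElement(
        root, f"{{{RDF_NS}}}Description", {f"{{{RDF_NS}}}about": about}
    )
    for name, value in fields.items():
        ET.SubElement(description, f"{{{DC_NS}}}{name}").text = value
    return ET.tostring(root, encoding="unicode")


record = dublin_core_record(
    "http://www.dlib.org",
    {"title": "D-Lib Program", "format": "text/html", "language": "en"},
)
print(record)
```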
  • 22. Considerations
      • Dublin Core is a standard useful in the mapping phase across different domains
      • It cannot be applied to model complex domains
      • Attributes called qualifiers can be added to improve the quality and detail of the information
        - “Dumb-Down Principle”: every application can ignore the qualifiers it cannot interpret
      • The data model describes each resource on its own
      • Relationships among resources are not well specified
      • The DC namespace can be used in RDF/XML statements
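The Dumb-Down Principle can be illustrated with a small sketch. It assumes qualified fields are written as `Element.Qualifier` (the notation used in the `Title.Subtitle` field of the Yalta example later in these slides) and that an application falls back to the plain element by dropping the qualifier. The `dumb_down` function and the `Shelfmark` field are hypothetical, introduced only for this example.

```python
DC_ELEMENTS = {
    "title", "creator", "subject", "description", "publisher",
    "contributor", "date", "type", "format", "identifier",
    "source", "language", "relation", "coverage", "rights",
}


def dumb_down(record):
    """Reduce qualified fields to the 15 simple elements, ignoring qualifiers."""
    simple = {}
    for name, value in record:
        element = name.split(".", 1)[0].lower()  # "Title.Subtitle" -> "title"
        if element in DC_ELEMENTS:               # non-DC fields are dropped
            simple.setdefault(element, []).append(value)
    return simple


qualified = [
    ("Title", "Protocol of Proceedings of Crimea Conference"),
    ("Title.Subtitle", "II. Declaration of Liberated Europe"),
    ("Date", "February 11, 1945"),
    ("Shelfmark", "hypothetical local field"),
]
simple = dumb_down(qualified)
print(simple)
```

The qualified subtitle survives as a second plain title, so an application that knows nothing about qualifiers can still index it.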
  • 23. RDF Vocabulary Description Language: RDF Schema (RDFS)
      • It extends RDF to include the basic features needed to define ontologies
        - Everything is called a resource (subject, predicate, object)
        - The predicate is also called a property
      • It allows giving a meaning to “special” resources
      • It introduces the concept of Class
      • The rdfs:Class resource is the class of all the RDF classes
      [Figure: Tree of Porphyry] The technique of inheritance is the process of merging the differentiae along the path above any category: Living is defined as animate material Substance, and Human is rational sensitive animate material Substance.
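  • 24. RDFS graph example
      [Figure: RDFS graph with a class level and an instance level]
      Classes:
        rdfs:Class
        WorkOfArt - rdfs:subClassOf - rdfs:Class
        Artist - rdfs:subClassOf - rdfs:Class
        (the class nodes are annotated "URI?")
      Instances:
        http://en.wikipedia.org/wiki/David_%28Donatello%29 - rdf:type - WorkOfArt
        http://en.wikipedia.org/wiki/Donatello - rdf:type - Artist
        http://en.wikipedia.org/wiki/David_%28Donatello%29 - dc:creator - http://en.wikipedia.org/wiki/Donatello
        http://en.wikipedia.org/wiki/David_%28Michelangelo%29 - rdf:type - WorkOfArt
        http://en.wikipedia.org/wiki/Michelangelo - rdf:type - Artist
        http://en.wikipedia.org/wiki/David_%28Michelangelo%29 - dc:creator - http://en.wikipedia.org/wiki/Michelangelo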
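What the class level adds can be shown with a toy inference: if a resource has an `rdf:type` and that class is an `rdfs:subClassOf` another, the resource also belongs to the superclass. The sketch below is plain Python, not an RDFS reasoner. The `ex:` and `wiki:` prefixes and the `ex:Sculpture` class are hypothetical additions, used only to give the hierarchy one more level than the slide shows.

```python
TYPE = "rdf:type"
SUBCLASS = "rdfs:subClassOf"

triples = {
    ("ex:Sculpture", SUBCLASS, "ex:WorkOfArt"),  # hypothetical extra class
    ("wiki:David_(Donatello)", TYPE, "ex:Sculpture"),
    ("wiki:Donatello", TYPE, "ex:Artist"),
    ("wiki:David_(Donatello)", "dc:creator", "wiki:Donatello"),
}


def superclasses(cls, triples):
    """Follow rdfs:subClassOf edges upward and return every class reached."""
    found = set()
    frontier = [cls]
    while frontier:
        current = frontier.pop()
        for subject, predicate, obj in triples:
            if subject == current and predicate == SUBCLASS and obj not in found:
                found.add(obj)
                frontier.append(obj)
    return found


def types_of(resource, triples):
    """Direct rdf:type classes plus everything inherited through subClassOf."""
    direct = {o for s, p, o in triples if s == resource and p == TYPE}
    inferred = set(direct)
    for cls in direct:
        inferred |= superclasses(cls, triples)
    return inferred


print(sorted(types_of("wiki:David_(Donatello)", triples)))
```

The statement that David is a work of art is never written down; it follows from the class hierarchy.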
  • 25. OWL (Web Ontology Language)
      • It is integrated with RDF, so OWL is directly accessible to web applications
      • It allows creating a knowledge base about a domain of interest in terms of:
        - individuals: the basic elements of the domain, e.g. Donatello
        - concepts (classes): sets of individuals having similar characteristics, e.g. Artist
        - roles (properties): relationships between pairs of individuals, e.g. dc:creator
      • RDFS can model:
        - Hierarchies of classes and properties
        - Domain and range of properties
      • OWL extends RDFS with:
        - Logical operations on classes
        - Additional property characteristics: Transitive, Symmetric, Functional
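The three property characteristics can be sketched as operations on sets of (subject, object) pairs. This is an illustration in plain Python, not an OWL reasoner; the individual names and the function names are hypothetical. In particular, a real reasoner faced with two values for a functional property would infer that the two values denote the same individual, whereas the check below only detects that situation.

```python
def symmetric_closure(pairs):
    """Symmetric property: if (a, b) holds then (b, a) holds too."""
    return set(pairs) | {(b, a) for a, b in pairs}


def transitive_closure(pairs):
    """Transitive property: if (a, b) and (b, c) hold then (a, c) holds too."""
    closure = set(pairs)
    while True:
        new = {(a, d) for a, b in closure for c, d in closure if b == c} - closure
        if not new:
            return closure
        closure |= new


def has_single_values(pairs):
    """Functional property: each subject should be related to at most one value."""
    seen = {}
    for subject, value in pairs:
        if seen.setdefault(subject, value) != value:
            return False
    return True


siblings = symmetric_closure({("Anna", "Marco")})
ancestors = transitive_closure({("Anna", "Bruno"), ("Bruno", "Carla")})
print(sorted(siblings))
print(sorted(ancestors))
print(has_single_values([("Anna", "Maria"), ("Anna", "Mary")]))
```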
  • 26. OWL Properties examples
      Figures from: A Practical Guide To Building OWL Ontologies Using Protégé 4 and CO-ODE Tools, Edition 1.2, Matthew Horridge, The University Of Manchester. Copyright © The University Of Manchester, March 13, 2009
  • 27. CIDOC-CRM
      • Semantic interoperability in culture can be achieved by an “extensible ontology” and explicit event modeling, which provides a shared explanation rather than the prescription of a common data structure
      • “The intended scope of the CIDOC CRM may be defined as all information required for the scientific documentation of cultural heritage collections, with a view to enabling wide area information exchange and integration of heterogeneous sources”
      • “The CIDOC Conceptual Reference Model (CRM) provides definitions and a formal structure for describing the implicit and explicit concepts and relationships used in cultural heritage documentation.”
      • The CIDOC-CRM models actors, events, objects in space and time
      The ontology is a language that IT experts and cultural experts can share
  • 28. From DC to CIDOC-CRM
      Metadata (Dublin Core):
        Type: Text
        Title: Protocol of Proceedings of Crimea Conference
        Title.Subtitle: II. Declaration of Liberated Europe
        Date: February 11, 1945
        Creator: The Premier of the Union of Soviet Socialist Republics; The Prime Minister of the United Kingdom; The President of the United States of America
        Publisher: State Department
        Subject: Postwar division of Europe and Japan
      Document (what the metadata is about):
        “The following declaration has been approved: The Premier of the Union of Soviet Socialist Republics, the Prime Minister of the United Kingdom and the President of the United States of America have consulted with each other in the common interests of the people of their countries and those of liberated Europe. They jointly declare their mutual agreement to concert…and to ensure that Germany will never again be able to disturb the peace of the world……”
  • 29. CIDOC-CRM example
      From Martin Doerr, Steve Stead, “The CIDOC CRM, a Standard for the Integration of Cultural Information”
      [Figure: CIDOC-CRM graph of the Yalta Agreement]
        Entities: E7 Activity “Crimea Conference”; E65 Creation Event; E31 Document “Yalta Agreement”; E39 Actor (three nodes); E53 Place; E38 Image; E52 Time-Span “February 1945”; E52 Time-Span “11-2-1945”
        Properties: P7 took place at; P11 participated in; P14 performed; P67 is referred to by; P81 ongoing throughout; P82 at some time within; P86 falls within; P94 has created
