Lecture semantic dataaccess_presentation

  • "Web of data" refers to interconnected, structured datasets distributed all over the world. It lets machines traverse the links between these datasets without noise. The noise meant here arises when web sites mix metadata with the actual data.
  • The figure illustrates the layers of the Semantic Web stack, i.e. the standardized technologies that together form the Semantic Web. This lecture covers the querying, data interchange, syntax and identifier layers. Identifiers: URIs identify Semantic Web resources in a dereferenceable way. Syntax: Semantic Web resources are represented in different formats, e.g. XML. Data interchange: RDF is the language used to represent Semantic Web resources; several serializations of RDF are available, e.g. RDF/XML and Turtle. Querying: this layer provides methods to retrieve Semantic Web resources; SPARQL is the most common query language.
  • An RDF triple contains three components: the subject, which is an RDF URI reference or a blank node; the predicate, which is an RDF URI reference; and the object, which is an RDF URI reference, a literal or a blank node. An RDF triple is conventionally written in the order subject, predicate, object. The predicate is also known as the property of the triple. From Wikipedia: the subject denotes the resource, and the predicate denotes traits or aspects of the resource and expresses a relationship between the subject and the object. For example, one way to represent the notion "The sky has the color blue" in RDF is as the triple: a subject denoting "the sky", a predicate denoting "has the color", and an object denoting "blue".
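As a minimal sketch of the triple structure described in this note, a triple can be modeled as an immutable value with three components. The names `TripleSketch` and `Triple` and the `http://example.org/` URIs are illustrative assumptions and do not belong to any RDF library.

```java
// Hypothetical minimal model of an RDF triple; not the API of any RDF library.
final class TripleSketch {
    // Components in the conventional order: subject, predicate, object.
    record Triple(String subject, String predicate, String object) {
        @Override
        public String toString() {
            return "(" + subject + ", " + predicate + ", " + object + ")";
        }
    }

    // "The sky has the color blue" as one triple (the example.org URIs are made up).
    static Triple skyIsBlue() {
        return new Triple(
                "<http://example.org/sky>",
                "<http://example.org/hasColor>",
                "\"blue\"");
    }

    public static void main(String[] args) {
        System.out.println(skyIsBlue());
    }
}
```

Here the subject and predicate are URIs, while the object is a literal, which is one of the combinations the definition allows.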
  • An XML model can be used to store triple-like data by rewriting the triples into simple three-part XML element structures and then using existing XML query systems. However, the XML data model is a tree of elements and attributes, whereas RDF data forms a directed graph that may contain cycles and has no proper hierarchical structure.
  • Storing and querying semantic data through XML databases and query languages would not work, because: (1) only simple manipulations can be handled through XML query languages; (2) RDF Schema processing and inference are not possible; (3) the standard RDF/XML mapping is unsuitable, since multiple XML serializations are possible for the same RDF graph, which makes retrieval complex.
  • In the monolithic approach, two tables store the data: a triples table and a resources table. The resources table stores only the URIs and the identifiers associated with them. In the triples table, each row stores one reference for the subject and one for the predicate. If the object value is also a URI, it is represented by a reference as well. These references are used to fetch the corresponding URI from the resources table. If the object value is not a URI, i.e. it is a literal, its value is stored directly in the triples table. However, collecting all data in two tables does not scale and does not support complex operations such as reasoning and querying.
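The two-table layout in this note can be sketched in memory as follows. This is only an illustration under stated assumptions: the class and method names are hypothetical, the tables are plain Java collections rather than database tables, and `http://example.org/resource9` is a made-up URI.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Sketch of the monolithic layout: one resources table and one triples table.
final class MonolithicStoreSketch {
    // Triples table row; objId is null when the object is a literal held in objValue.
    record Row(int predId, int subId, Integer objId, String objValue) {}

    private final Map<Integer, String> resources = new HashMap<>(); // id -> URI
    private final List<Row> triples = new ArrayList<>();

    void addResource(int id, String uri) { resources.put(id, uri); }

    void addUriTriple(int subId, int predId, int objId) {
        triples.add(new Row(predId, subId, objId, null));
    }

    void addLiteralTriple(int subId, int predId, String literal) {
        triples.add(new Row(predId, subId, null, literal));
    }

    // Resolve every row to readable form by looking the ids up in the resources table.
    List<String> dump() {
        List<String> out = new ArrayList<>();
        for (Row r : triples) {
            String object = (r.objId() != null)
                    ? "<" + resources.get(r.objId()) + ">"
                    : "\"" + r.objValue() + "\"";
            out.add("<" + resources.get(r.subId()) + "> <"
                    + resources.get(r.predId()) + "> " + object);
        }
        return out;
    }

    static MonolithicStoreSketch sample() {
        MonolithicStoreSketch store = new MonolithicStoreSketch();
        store.addResource(1, "http://www.iks.og/topics.rdfs#Hotel");
        store.addResource(3, "http://www.oclc.org/dublincore.rdfs#title");
        store.addResource(5, "http://www.w3.org/1999/02/22-rdf-syntax-ns#type");
        store.addResource(8, "http://www.w3.org/2000/01/rdf-schema#Class");
        store.addResource(9, "http://example.org/resource9"); // hypothetical URI
        store.addUriTriple(1, 5, 8);              // Hotel rdf:type rdfs:Class
        store.addLiteralTriple(9, 3, "Sunscale"); // literal stored in the triples table
        return store;
    }
}
```

Every lookup of a readable triple needs a join against the resources table for the subject, the predicate and (for URI objects) the object, which is one reason the note calls the approach hard to scale.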
  • Overall architecture of RDFSuite. It separates logical and physical data by allowing queries over the stored semantic data through a high-level query language (RQL). For storage, RDFSuite uses an ORDBMS. Resources are loaded into the system by exploiting the available RDF schema knowledge. Database representations can be customized according to the schemas employed.
  • A non-monolithic approach is used: separate tables are created to store classes. Indices are constructed on attributes of the created tables, such as URI, source and target, in order to speed up joins and the selection of specific tuples.
  • An example database structure that is formed from an RDF schema. The core schema is represented by four tables, namely Class, Property, SubClass and SubProperty. This approach is more flexible than the monolithic approach because the physical representation of the data in the underlying database can be customized.
  • The most prominent feature of Sesame is that it offers an application programming interface (the SAIL API) on top of the actual data storage. This makes it possible to implement the interface on top of different repositories. The other components are clients of the SAIL API.
  • It is difficult to add a table in PostgreSQL. When a new subClassOf relation is added between two existing classes, the complete class hierarchy starting from the subclass needs to be broken down and rebuilt, because subtable relations cannot be added to an existing table. The subtable relations have to be specified when a table is created; once created, they are fixed.
  • Jena provides a simple, minimalist view of the RDF graph that exposes the data as triples. Users interact with the abstract Model. The Model interface delegates high-level operations to low-level operations on triples stored in an RDF graph. Jena2 storage provides three graph operations: add, delete and find.
  • As already said, the persistence layer presents a Graph interface to the higher levels of Jena. Each logical graph is implemented using an ordered list of specialized graphs. An operation on the entire logical graph, such as add, delete or find, is processed by invoking add, delete or find on each specialized graph.
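The delegation described in the two notes above can be sketched as follows. All names are hypothetical and this is not the Jena API. The sketch also assumes that `add` stops at the first specialized graph that accepts the triple, while `delete` and `find` visit every specialized graph.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;
import java.util.function.Predicate;

// Illustrative sketch only; every name here is hypothetical and not part of the Jena API.
final class LogicalGraphSketch {
    record Triple(String s, String p, String o) {}

    interface Graph {
        boolean add(Triple t);      // true if this graph accepted the triple
        boolean delete(Triple t);   // true if the triple was present
        List<Triple> find(String s, String p, String o); // null matches anything
    }

    // A specialized graph that only accepts triples whose predicate passes a filter.
    static final class SpecializedGraph implements Graph {
        private final Set<Triple> triples = new LinkedHashSet<>();
        private final Predicate<String> accepts;

        SpecializedGraph(Predicate<String> accepts) { this.accepts = accepts; }

        public boolean add(Triple t) {
            if (!accepts.test(t.p())) return false;
            triples.add(t);
            return true;
        }

        public boolean delete(Triple t) { return triples.remove(t); }

        public List<Triple> find(String s, String p, String o) {
            List<Triple> out = new ArrayList<>();
            for (Triple t : triples) {
                if ((s == null || s.equals(t.s()))
                        && (p == null || p.equals(t.p()))
                        && (o == null || o.equals(t.o()))) {
                    out.add(t);
                }
            }
            return out;
        }
    }

    // The logical graph forwards each operation to its ordered list of specialized graphs.
    static final class LogicalGraph implements Graph {
        private final List<Graph> parts;

        LogicalGraph(List<Graph> parts) { this.parts = parts; }

        public boolean add(Triple t) {
            for (Graph g : parts) {
                if (g.add(t)) return true; // assumption: the first accepting graph stores it
            }
            return false;
        }

        public boolean delete(Triple t) {
            boolean removed = false;
            for (Graph g : parts) removed |= g.delete(t);
            return removed;
        }

        public List<Triple> find(String s, String p, String o) {
            List<Triple> out = new ArrayList<>();
            for (Graph g : parts) out.addAll(g.find(s, p, o));
            return out;
        }
    }

    static LogicalGraph sample() {
        LogicalGraph g = new LogicalGraph(List.of(
                new SpecializedGraph(p -> p.equals("name")), // a property-specific graph
                new SpecializedGraph(p -> true)));           // a catch-all graph
        g.add(new Triple("person1", "name", "Alice"));
        g.add(new Triple("person1", "twinOf", "person2"));
        return g;
    }
}
```

Callers only see the `Graph` interface, so specialized graphs can be added or reordered without changing the code that uses the logical graph.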
  • Jena2 uses denormalized schemas, because in a normalized schema every find operation requires multiple joins between the resources table and the triples table. In a denormalized schema, URIs and simple literal values are stored directly in the statement tables. They are exemplified in the next slide. There are also multiple statement tables, because a single statement table does not scale to large data sets and cannot benefit from the locality among subjects and predicates. Jena2 also uses property tables. These tables store patterns of RDF statements and are database tables that are independent of the actual triple store framework. A statement is stored either in the triple store or in a property table, but not in both.
  • Let us compare the triple store and an application-specific schema with an example. Suppose we want to store information about people, each of whom has properties such as name, age, and so on. The triple store approach needs to store 10 records. With an application-specific schema, if we know that most people have a name, an age and a gender, we group these three properties into one table, called a property table. The remaining multi-valued properties are still stored in the triple store. In this way we reduce the number of records to be stored from 10 to 7. Also, suppose users often query people's names by their age. With a property table, once a row with the requested age is found, the name can be read from the same row. In the triple store approach, we first have to find the subject with the given value of the age property and then use that subject to look up the name, which is less efficient.
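The record counts in this example can be checked with a short sketch. The data is taken from the property table slide; the class and method names are hypothetical, and in-memory maps stand in for database tables.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Sketch of splitting statements between a property table and a triple store.
final class PropertyTableSketch {
    record Triple(String s, String p, String o) {}

    // Properties assumed to be single-valued and common enough to cluster.
    static final Set<String> CLUSTERED = Set.of("name", "age", "gender");

    static List<Triple> allTriples() {
        return List.of(
                new Triple("person1", "name", "Alice"),
                new Triple("person1", "age", "32"),
                new Triple("person1", "twinOf", "person2"),
                new Triple("person1", "faxPhone", "x1234"),
                new Triple("person1", "adminPh", "x5678"),
                new Triple("person2", "name", "Bob"),
                new Triple("person2", "age", "35"),
                new Triple("person2", "adopteeOf", "person6"),
                new Triple("person2", "friendOf", "person8"),
                new Triple("person2", "gender", "male"));
    }

    // Property table: one row per subject, one column per clustered property.
    static Map<String, Map<String, String>> propertyTable(List<Triple> triples) {
        Map<String, Map<String, String>> rows = new LinkedHashMap<>();
        for (Triple t : triples) {
            if (CLUSTERED.contains(t.p())) {
                rows.computeIfAbsent(t.s(), k -> new LinkedHashMap<>()).put(t.p(), t.o());
            }
        }
        return rows;
    }

    // Statements that stay in the generic triple store.
    static List<Triple> remaining(List<Triple> triples) {
        List<Triple> out = new ArrayList<>();
        for (Triple t : triples) {
            if (!CLUSTERED.contains(t.p())) out.add(t);
        }
        return out;
    }

    static int storedRecords(List<Triple> triples) {
        return propertyTable(triples).size() + remaining(triples).size();
    }

    // "Name by age" needs a single row: no second lookup by subject.
    static String nameByAge(Map<String, Map<String, String>> table, String age) {
        for (Map<String, String> row : table.values()) {
            if (age.equals(row.get("age"))) return row.get("name");
        }
        return null;
    }
}
```

With this data, the property table holds two rows and five statements remain in the triple store, which gives the seven records mentioned in the note.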
  • Provides for scalable storage and query of RDF datasets using conventional SQL databases for use in standalone applications, J2EE and other application frameworks.
  • All quads are stored in one table, which may be indexed differently depending on the expected query load. Triples should be locatable given the subject (S) or a value of the object (O), so there are two covering indices: G, S, P, O and O, G, P, S. Any triple store that supports named graphs is more than likely a quad store. Many triple stores are in fact quad stores, because of the need to maintain the provenance of RDF data.
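The idea of two covering indices can be sketched with sorted sets of composite keys. This is an illustration only and says nothing about how Virtuoso implements its indices; the class name, the tab-separated key encoding, and the assumption that no value contains a tab character are all choices made for this sketch.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeSet;

// Sketch of one quad collection with two covering indices: G,S,P,O and O,G,P,S.
final class QuadIndexSketch {
    private static final String SEP = "\t"; // assumes no component contains a tab

    private final TreeSet<String> gspo = new TreeSet<>();
    private final TreeSet<String> ogps = new TreeSet<>();

    void add(String g, String s, String p, String o) {
        gspo.add(String.join(SEP, g, s, p, o));
        ogps.add(String.join(SEP, o, g, p, s));
    }

    // Range scan: all keys of an index that start with the given leading components.
    private static List<String[]> scan(TreeSet<String> index, String... prefix) {
        String lo = String.join(SEP, prefix) + SEP;
        String hi = lo + Character.MAX_VALUE; // upper bound for keys sharing the prefix
        List<String[]> out = new ArrayList<>();
        for (String key : index.subSet(lo, hi)) {
            out.add(key.split(SEP, -1));
        }
        return out;
    }

    // Quads of one subject in one graph; each array is ordered g, s, p, o.
    List<String[]> bySubject(String g, String s) { return scan(gspo, g, s); }

    // Quads with a given object value; each array is ordered o, g, p, s.
    List<String[]> byObject(String o) { return scan(ogps, o); }

    static QuadIndexSketch sample() {
        QuadIndexSketch idx = new QuadIndexSketch();
        idx.add("g1", "s1", "p1", "o1");
        idx.add("g1", "s1", "p2", "o2");
        idx.add("g2", "s2", "p1", "o1");
        return idx;
    }
}
```

Both indices contain all four components, so either lookup can be answered from the index alone, which is what "covering" means here.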
  • SPARQL is the de facto query language for expressing queries over RDF data sources. It allows querying RDF graph patterns together with their conjunctions and disjunctions. Other languages are more proprietary and are used in narrower scopes. Several open source projects, such as Apache Clerezza and Jena, provide knowledge management functionality. They offer APIs for storing and accessing semantic data. As organizations published their data in RDF, opportunities arose to interlink related content. As a result, once a user obtains a resource from the linked data cloud, he or she can traverse related data through the links.
  • Different organizations provide querying services over their RDF data through SPARQL endpoints. SPARQL endpoints are machine friendly interfaces towards underlying knowledge bases. See http://www.w3.org/wiki/SparqlEndpoints for several SPARQL endpoints.
  • This figure represents the 4 design principles of linked data in a stack like architecture. URIs are used as names of the resources on the web. HTTP URIs are used so that others can access the actual data represented by the URI. RDF is the actual representation of the resources represented by URIs and lastly SPARQL is used to obtain desired information over the RDF data.
  • SWEO: Semantic Web Education and Outreach… This was an interest group within W3C. SWEO Interest Group had been established to develop strategies and materials to increase awareness among the Web community of the need and benefit for the Semantic Web, and educate the Web community regarding related solutions and technologies.

    1. Semantic Data Access
       Semantic CMS Community
       Lecturer, Organization, Date of presentation
       Co-funded by the European Union. Copyright IKS Consortium.
    2. Part I: Foundations
         (1) Introduction of Content Management
         (2) Foundations of Semantic Web Technologies
       Part II: Semantic Content Management
         (3) Knowledge Interaction and Presentation
         (4) Knowledge Representation and Reasoning
         (5) Semantic Lifting
         (6) Storing and Accessing Semantic Data
       Part III: Methodologies
         (7) Requirements Engineering for Semantic CMS
         (8) Designing Semantic CMS
         (9) Semantifying your CMS
         (10) Designing Interactive Ubiquitous IS
       www.iks-project.eu
    3. What is this Lecture about?
       • We have learned ...
         • ... which languages can be used to model knowledge.
         • ... how to extract knowledge from content in an automatic way (semantic lifting).
       • We need a way ...
         • ... to store the extracted knowledge technically in an accessible way.
       (This lecture is unit 6 of Part II: Storing and Accessing Semantic Data.)
    4. Outline
       • Semantic Data
         • Semantic Web
         • RDF
       • Semantic Data Storage
         • Triple Stores
       • Semantic Data Access
         • SPARQL
         • RQL
         • API Calls
    5. Semantic Data
       • Stands for machine-understandable information
       • Allows computers to figure out the data without user interference
       • Allows computers to act intelligently without programming for each task
    6. Semantic Data
       • Provides infrastructure to get practical results
         • Applications find out subsequent information based on the previous relations (e.g. Eiffel Tower -> Paris -> France)
       • Allows reasoning capabilities
         • Providing extraction of related information which is not directly linked
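The Eiffel Tower example on this slide can be sketched as following a chain of relations. The relation name `LOCATED_IN` and the class name are assumptions made for illustration; a real application would read such relations from a triple store.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Sketch: derive indirectly linked information by following relations step by step.
final class LinkFollowingSketch {
    // Hypothetical "located in" relations: Eiffel Tower -> Paris -> France.
    static final Map<String, String> LOCATED_IN = Map.of(
            "EiffelTower", "Paris",
            "Paris", "France");

    // Follow the relation from a starting resource until no further link exists.
    static List<String> chain(String start) {
        List<String> path = new ArrayList<>();
        String current = start;
        while (current != null && !path.contains(current)) { // stop on cycles
            path.add(current);
            current = LOCATED_IN.get(current);
        }
        return path;
    }

    public static void main(String[] args) {
        System.out.println(chain("EiffelTower"));
    }
}
```

No statement links the Eiffel Tower to France directly; the connection is found only by traversing the two stored relations.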
    7. Semantic Web
       • A classical generic description: "Web of data"
       • Extends the World Wide Web by encouraging:
         • A common language for representing data, transformable to/from disparate sources such as relational databases, XML, etc. (RDF)
         • A common reusable data model to represent data from different domains in common terms (RDFS, OWL, etc.)
         • Rules to enable applications to reason over the information (SWRL)
    8. Semantic Web Layer Cake
       [Figure: Semantic Web Layer Cake. Image source: http://www.w3.org/2007/03/layerCake.svg]
    9. Semantic Web
       • Many organizations publish their data in different domains
         • Media
         • Geographic
         • Government
         • …
       • The whole set contains approximately 30 billion triples
       • One of the largest collections is DBpedia
         • Semantified version of Wikipedia
         • Example: obtain the cities of China that have a population over 20 million
       • Needs efficient storage and querying for semantic data
    10. Representation of Semantic Data
       • RDF
         • The common data format
         • An abstract model with several serialization formats
         • Consists of statements, referred to as triples, of the form (subject, predicate, object), where
           • Subject: any resource identifier
           • Predicate: a resource identifier of any property
           • Object: either a resource identifier or a literal value
    11. Storing Semantic Data
       • Need for specialized designs for triple collections
       • Two modalities:
         • Relational databases
         • Triple stores
           • Mostly used for storage
           • Lots of implementations
           • They can also be RDB based
    12. Triple Store
       • A purpose-built database for the storage and retrieval of RDF data
         • Optimized place to add, remove and query triples
       • Each triple in the triple store has the form (subject, predicate, object)
    13. Considering XML Databases
       • XML databases are existing storage systems for semi-structured data
         • Idea: transform RDF to XML and store it in XML databases
         • Yet the XML data model is not exactly the same as that of semantic data
           • The XML data model is a tree-like structure
           • RDF data is represented as a graph without a hierarchy
    14. Considering XML Databases
       • XML databases are not suitable for storing and querying RDF
         • Only simple manipulations can be handled through XML query languages
         • RDF Schema processing and inference are not possible
         • The standard RDF/XML mapping is unsuitable
    15. Monolithic approach for DB-Based Triple Stores
       • Generic representation for all RDF schemas
       • Only two tables are used
         • Resources table
         • Triples table
    16. Monolithic approach for DB-Based Triple Stores
       Triples table:
         predid  subid  objid  objvalue
         6       2      1
         5       3      7
         5       1      8
         5       9      2
         3       9             Sunscale
       Resources table:
         id  uri
         1   http://www.iks.og/topics.rdfs#Hotel
         2   http://www.iks.og/topics.rdfs#HotelDirections
         3   http://www.oclc.org/dublincore.rdfs#title
         4   http://www.iks.og/schema.rdf#Ext.Resource
         5   http://www.w3.org/1999/02/22-rdf-syntax-ns#type
         6   http://www.w3.org/2000/01/rdf-schema#subClassOf
         7   http://www.w3.org/1999/02/22-rdf-syntax-ns#Property
         8   http://www.w3.org/2000/01/rdf-schema#Class
         9   (URI truncated in the source)
    17. Triple Stores
       • Can be categorized into three categories:
         • In-memory triple stores
           • Used for certain operations like benchmarking, caching, etc.
         • Native triple stores
           • Provide their own storage implementations (Virtuoso, Mulgara, AllegroGraph, …)
         • Non-memory, non-native triple stores
           • Built on third-party databases (Jena SDB, Kaon, …)
    18. Functionalities provided by Triple Stores
       • RDBMS support
       • General RDF model access
       • Query language support in the store, such as RQL and SPARQL
       • Some stores provide:
         • Provenance: tracking of who said what
         • APIs for accessing the triple store over a network
       • Very few stores provide:
         • Full-text search
         • Inference and rule languages
    19. Example Triple Store implementations
       • RDFSuite
         • Sofia Alexaki, Vassilis Christophides, Gregory Karvounarakis, Dimitris Plexousakis, Karsten Tolle. The ICS-FORTH RDFSuite: Managing Voluminous RDF Description Bases, SemWeb, 2001
         • Based on an ORDBMS model
       • Sesame
         • http://www.openrdf.org/
         • Relational databases (MySQL, PostgreSQL, Oracle)
       • Jena
         • http://www.hpl.hp.com/semweb/jena2.htm
         • Relational databases (MySQL, PostgreSQL, Oracle)
       • Virtuoso
         • http://virtuoso.openlinksw.com/
         • Native RDF quad storage (physical quads)
    20. RDFSuite (ICS-FORTH)
       [Figure: RDFSuite architecture. Projects: IST-1999-13479 C-Web, IST-2000-26074 Mesmuses]
    21. How triples are stored and accessed in RDFSuite
       • Separate tables are created to store resources
         • Properties, subclasses, subproperties and instances
       • Indices on attributes like URI, source and target
       • Querying is possible through RQL
    22. How triples are stored and accessed in RDFSuite
       [Figure from Alexaki et al., The ICS-FORTH RDFSuite: Managing Voluminous RDF Description Bases, SemWeb, 2001]
    23. Sesame Architecture
       • DBMS-independent API for accessing triple repositories
         • SAIL API
           • A set of Java interfaces between the other modules and the repository
           • Abstracts from the actual storage mechanism
       • Query module
         • RQL support
       • Different ways to communicate with clients
         • Through protocol handlers
       Source: Jeen Broekstra, Arjohn Kampman and Frank van Harmelen, Sesame: A Generic Architecture for Storing and Querying RDF and RDF Schema, Proceedings of the First International Semantic Web Conference, 2002
    24. SAIL API over PostgreSQL
       • PostgreSQL
         • Object-relational DBMS
         • Supports subtable relations between its tables, which are used to provide RDF Schema class and property subsumption
         • Individuals are represented in separate tables created for resources
         • Difficult to add a table
       Source: Broekstra et al., 2002
    25. SAIL API over MySQL
       • MySQL
         • The database schema does not change when the RDFS changes
         • Has an advantage where the RDFS is unstable
       Source: Broekstra et al., 2002
    26. Jena2 Architecture
       [Figure]
    27. Jena2 Architecture
       [Figure from Kevin Wilkinson, Craig Sayers, Harumi A. Kuno, Dave Reynolds: Efficient RDF Storage and Retrieval in Jena2, Proceedings of SWDB03, The First International Workshop on Semantic Web and Databases]
    28. Jena2
       • Denormalized schema
         • Avoids unnecessary joins by merging URIs and literals into the statements table
       • Multiple statement tables
         • Better locality and caching
       • Property tables
    29. Normalized vs Denormalized Tables
       [Figure]
    30. Property Tables
       Triple Store Only:
         Subject  Property   Object
         person1  name       Alice
         person1  age        32
         person1  twinOf     person2
         person1  faxPhone   x1234
         person1  adminPh    x5678
         person2  name       Bob
         person2  age        35
         person2  adopteeOf  person6
         person2  friendOf   person8
         person2  gender     male
       Person Property Table:
         ID  name   age  gender
         p1  Alice  32   -
         p2  Bob    35   male
       Triple Store (remaining statements):
         Subject  Property   Object
         person1  twinOf     person2
         person1  faxPhone   x1234
         person1  adminPh    x5678
         person2  adopteeOf  person6
         person2  friendOf   person8
       Source: Wilkinson et al., Efficient RDF Storage and Retrieval in Jena2, SWDB03
    31. Jena Persistence Options
       • SDB
         • Scalable storage and query for RDF
         • Specifically designed for SPARQL support
         • Supports MySQL, PostgreSQL, Oracle 11g, Microsoft SQL Server and IBM DB2
         • Scales to graphs of 100 million triples
    32. Jena Persistence Options
       • TDB
         • Provides large-scale storage and query of RDF datasets using a pure Java engine
         • Supports SPARQL
         • A non-transactional, faster database solution for use by a single system
         • Scales well beyond SDB and is simpler to set up
    33. Virtuoso
       • General-purpose RDBMS with extensive RDF adaptations
       • RDF data is stored as RDF quads, i.e. it supports RDF with named graphs
         • i.e. (graph, subject, predicate, object) tuples
         • The columns are G for graph, P for predicate, S for subject and O for object
    34. Querying Semantic Data
       • Semantic data can be queried from triple stores by
         • Various query languages
           • SPARQL
             • Different endpoints provided
           • RQL
           • RDQL
           • SeRQL …
         • API calls
           • Through proprietary APIs of different projects
         • Linked Data
    35. SPARQL
       • Is an RDF query language
         • Standardized by the W3C
         • Similar in concept to SQL for databases
         • Syntactically resembles SQL
         • RDF graphs instead of databases
    36. SPARQL Endpoints
       • Provide functionality to query the knowledge base via the SPARQL language
       • Accept queries and return results over HTTP
       • Query results can be in different formats such as
         • RDF
         • XML
         • HTML
         • JSON
         • CSV
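As a hedged sketch of how a client might talk to such an endpoint over HTTP, the following builds a GET request that carries the query in the `query` parameter and asks for JSON results through the `Accept` header. The query corresponds to the "cities of China" example from the DBpedia slide; the properties `dbo:country` and `dbo:populationTotal` and the class `dbo:City` are assumptions about the DBpedia vocabulary, and the class name `SparqlEndpointSketch` is hypothetical. The request is only built here, not sent.

```java
import java.net.URI;
import java.net.URLEncoder;
import java.net.http.HttpRequest;
import java.nio.charset.StandardCharsets;

// Sketch: build an HTTP request for a SPARQL endpoint (nothing is sent here).
final class SparqlEndpointSketch {
    static final String ENDPOINT = "http://dbpedia.org/sparql";

    // Assumed vocabulary: dbo:City, dbo:country and dbo:populationTotal.
    static final String QUERY =
            "PREFIX dbo: <http://dbpedia.org/ontology/>\n"
          + "PREFIX dbr: <http://dbpedia.org/resource/>\n"
          + "SELECT ?city ?population WHERE {\n"
          + "  ?city a dbo:City ;\n"
          + "        dbo:country dbr:China ;\n"
          + "        dbo:populationTotal ?population .\n"
          + "  FILTER (?population > 20000000)\n"
          + "}";

    static HttpRequest buildRequest(String endpoint, String sparql) {
        // The query text travels URL-encoded in the "query" parameter.
        String encoded = URLEncoder.encode(sparql, StandardCharsets.UTF_8);
        return HttpRequest.newBuilder(URI.create(endpoint + "?query=" + encoded))
                .header("Accept", "application/sparql-results+json") // requested result format
                .GET()
                .build();
    }
}
```

To run the query, the request could be passed to `HttpClient.newHttpClient().send(request, HttpResponse.BodyHandlers.ofString())`; a different `Accept` value would select one of the other result formats listed on the slide, provided the endpoint supports it.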
    37. Semantic Data Access With API Calls
       • Open source projects provide APIs to manipulate RDF data
         • Jena
         • Apache Clerezza
         • Sesame
         • JRDF
    38. Jena
       • Jena provides a rich API to manipulate the RDF stored in the underlying triple store
         • Model to represent graphs
         • CRUD methods for triples
         • Querying methods for existing resources
       • See the next slide for the code snippet
    39. Jena Code Snippet
       // Package names as in the Jena 2 releases from jena.sourceforge.net;
       // later Apache Jena releases use org.apache.jena instead.
       import com.hp.hpl.jena.rdf.model.Model;
       import com.hp.hpl.jena.rdf.model.ModelFactory;
       import com.hp.hpl.jena.rdf.model.Resource;
       import com.hp.hpl.jena.vocabulary.VCARD;

       String personURI = "http://somewhere/JohnSmith";
       String givenName = "John";
       String familyName = "Smith";
       String fullName = givenName + " " + familyName;

       // create an empty Model, which represents an RDF graph
       Model model = ModelFactory.createDefaultModel();

       // create the resource; this produces the triples listed on the next slide
       Resource johnSmith = model.createResource(personURI)
           .addProperty(VCARD.FN, fullName)
           .addProperty(VCARD.N, model.createResource()   // blank node for the structured name
               .addProperty(VCARD.Given, givenName)
               .addProperty(VCARD.Family, familyName));
    40. Jena
       • Triples created by the code snippet on the previous slide:
         (<http://somewhere/JohnSmith>, VCARD.FN, "John Smith")
         (<http://somewhere/JohnSmith>, VCARD.N, _)
         (_, VCARD.Given, "John")
         (_, VCARD.Family, "Smith")
       • Note that the _ symbol represents a blank node
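    41. Apache Clerezza
       • Provides an API that is independent of the different triple stores it supports
       • Its API provides a model to represent RDF graphs and to manipulate those graphs
       • Also provides a SPARQL endpoint to query the stored knowledge
    42. Apache Clerezza Code Snippet
       • Simple code snippet adding two triples to the graph:
       import org.apache.clerezza.rdf.core.LiteralFactory;
       import org.apache.clerezza.rdf.core.MGraph;
       import org.apache.clerezza.rdf.core.UriRef;
       import org.apache.clerezza.rdf.core.impl.SimpleMGraph;
       import org.apache.clerezza.rdf.core.impl.TripleImpl;

       String base = "http://www.example.org#";
       MGraph g = new SimpleMGraph();

       // (JohnSmith, rdf:type, foaf:Person); the prefixed names are written out as full URIs
       g.add(new TripleImpl(
           new UriRef(base + "JohnSmith"),
           new UriRef("http://www.w3.org/1999/02/22-rdf-syntax-ns#type"),
           new UriRef("http://xmlns.com/foaf/0.1/Person")));

       // (JohnSmith, vcard:FN, "John") with a typed literal as the object
       g.add(new TripleImpl(
           new UriRef(base + "JohnSmith"),
           new UriRef("http://www.w3.org/2001/vcard-rdf/3.0#FN"),
           LiteralFactory.getInstance().createTypedLiteral("John")));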
    43. Linked Data
       • Interrelated datasets on the Web, published so that computers can explore them
       • Has a standard format to be accessed and managed
       • Provides integration and reasoning over a huge amount of data on the Web
    44. Linked Data
       • Four famous principles of linked data, presented by Tim Berners-Lee:
         • Use URIs as names of things
         • Use HTTP URIs to provide dereferenceable data to people
         • When a URI is dereferenced, provide useful information in a standard format (RDF, SPARQL)
         • Provide links to other URIs to make the discovery of related data possible
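The second and third principles can be sketched as an HTTP request that dereferences the URI of a thing and asks for an RDF serialization through content negotiation. The resource URI `http://dbpedia.org/resource/Paris` and the class name are assumptions made for illustration, and whether a given server honors the requested media types depends on that server. The request is only built, not sent.

```java
import java.net.URI;
import java.net.http.HttpRequest;

// Sketch: dereference an HTTP URI and ask for RDF via the Accept header.
final class DereferenceSketch {
    static HttpRequest buildRequest(String thingUri) {
        return HttpRequest.newBuilder(URI.create(thingUri))
                // Prefer Turtle, fall back to RDF/XML.
                .header("Accept", "text/turtle, application/rdf+xml;q=0.9")
                .GET()
                .build();
    }

    public static void main(String[] args) {
        HttpRequest request = buildRequest("http://dbpedia.org/resource/Paris");
        System.out.println(request.method() + " " + request.uri());
    }
}
```

The RDF returned for such a request typically contains further URIs, which is how the fourth principle lets a client discover related data.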
    45. Linked Data
       [Figure]
    46. Linking Open Data Project
       • Is a W3C SWEO project
       • Aims to make data freely available to everyone
       • Aims to publish open data sets as RDF and to set semantic relationships between them
         • Serves information in a machine-readable format
         • Enriches content
         • Reduces duplication
       • The number of linked datasets is increasing rapidly
         • A large number of datasets are linked already
    47. Linked Datasets as of October 2008
       [Figure]
    48. Linked Datasets as of September 2010
       [Figure]
    49. 2011
       [Figure]
    50. Access Data In The Cloud
       • Follow the RDF links representing the "things"
       • SPARQL endpoints
       • Ready-to-use software to discover linked data (see the next slide)
    51. Linked Data Applications
       • Lots of applications on top of the linked data
         • Tabulator
         • Marbles
         • OpenLink RDF Browser
         • …
       • Or search the web for
         • RDF crawlers
         • RDF browsers
       • Also see the following link, which lists a number of linked data applications:
         • http://www.w3.org/wiki/SweoIG/TaskForces/CommunityProjects/LinkingOpenData/Applications
    52. Available SPARQL Endpoints
       • http://dbpedia.org/sparql
       • http://www4.wiwiss.fu-berlin.de/dblp/
       • To see possible SPARQL endpoints providing a certain URI, see
         • http://void.rkbexplorer.com/endpoint-search/
       • See also a list of live SPARQL endpoints
         • http://www.w3.org/wiki/SparqlEndpoints
    53. References
       • http://www.w3.org/TR/rdf-sparql-query
       • http://jena.sourceforge.net/tutorial/RDF_API/index.html
       • http://www.slideshare.net/ldodds/sparql-tutorial
       • http://www.slideshare.net/shamod/a-hands-on-overview-of-the-semantic-web?src=related_normal&rel=1702851
       • http://www.cambridgesemantics.com/2008/09/sparql-by-example
       • http://linkeddata-specs.info/
       • http://www.w3.org/wiki/SweoIG/TaskForces/CommunityProjects/LinkingOpenData
       • http://www.bioontology.org/wiki/images/6/6a/Triple_Stores.pdf
       • Sofia Alexaki, Vassilis Christophides, Gregory Karvounarakis, Dimitris Plexousakis, Karsten Tolle. The ICS-FORTH RDFSuite: Managing Voluminous RDF Description Bases, SemWeb, 2001
       • Jeen Broekstra, Arjohn Kampman and Frank van Harmelen. Sesame: A Generic Architecture for Storing and Querying RDF and RDF Schema, Proceedings of the First International Semantic Web Conference, 2002
       • Kevin Wilkinson, Craig Sayers, Harumi A. Kuno, Dave Reynolds. Efficient RDF Storage and Retrieval in Jena2, Proceedings of SWDB03, The First International Workshop on Semantic Web and Databases
       • http://jena.sourceforge.net/DB/index.html
       • http://virtuoso.openlinksw.com/
