Comparative Survey-Java APIs For RDF

Comparative survey ‐ Java APIs for RDF

Anisoara Sava, anisoara.sava@infoiasi.ro, OC2

Marcela Daniela Mihai, marcela.mihai@infoiasi.ro, OC2

2009

Table of Contents

1 Introduction
2 Jena
2.1 The Model
2.2 Creating and serializing an RDF model
2.2.1 Adding more complex structures
2.3 Parsing and querying an RDF document
2.4 In memory versus persistent model storage
2.5 Jena query mechanism evaluation
3 SESAME
3.1 Introduction
3.2 Licence
3.3 Architecture
3.4 The Repository API
3.5 Transactions
3.6 Query Language
4 Jena RDF storage
5 Sesame RDF storage
6 Conclusions
Bibliography


1 Introduction

The Semantic Web is essentially a metadata management system. Metadata can
describe anything: entities have sets of properties and relationships to
various other classes. Languages such as RDF, RDFS and DAML provide markup
tags that can be used to express (or assert) these relationships.

Many tools exist today for creating and manipulating large-scale
ontologies. They fall into two groups: turnkey products and API/library
implementations. Turnkey solutions, such as OntoMat and Protégé, can be
used immediately to author metadata visually. To build our own
applications, however, we need general-purpose APIs and libraries. Jena and
Sesame are two such APIs for Java.

2 Jena

Java programmers who want to develop semantic web applications have a
growing range of tools and libraries to choose from. One such tool, the Jena
platform, is an open source API and toolkit for processing Resource
Description Framework (RDF), Web Ontology Language (OWL) and other
semantic web data.

Jena is available at SourceForge (http://sourceforge.net/projects/jena) or at
http://www.hpl.hp.com/semweb/jena.htm. In addition, there is a Jena
developers' discussion forum at http://groups.yahoo.com/group/jena-dev/ .

The Jena platform was developed by Hewlett-Packard's Semantic Web team.
Brian McBride, one of the creators of Jena, is also co-chair of the RDF
Working Group. Jena is derived from the SiRPAC API. It can parse, create and
search RDF models.

The Jena Framework includes:

an RDF API
reading and writing RDF in RDF/XML, N3 and N-Triples (the RDF parser ARP,
Another RDF Parser, is also accessible as a standalone product)
an OWL API
in-memory and persistent storage and
a SPARQL query engine

Figure 1 Jena API implementation architecture

The implementation has been designed to permit the easy integration of
alternative processing modules such as parsers, serializers, stores and query
processors.

The most important Jena packages are:

jena.model: key package for the application developer; it contains
interfaces for model, resource…
jena.mem: contains an implementation of the Jena API which stores all
model state in main memory
jena.common: contains implementation classes


The API itself consists of a collection of Java interfaces for accessing and
manipulating RDF statements:

Model: a set of statements
Statement: a triple of {R, P, O}
Resource: subject, URI
Property: "item" of resource
Object: may be a resource or a literal
Literal: non-nested "object"
Container: special resource, collection of things
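
The roles in the list above can be made concrete with a small plain-Java
sketch. It does not use Jena: the class names RdfNode, Triple and TinyModel
are hypothetical stand-ins for Resource/Literal, Statement and Model, and the
rendering only resembles N-Triples (no escaping is done).

```java
import java.util.Collections;
import java.util.LinkedHashSet;
import java.util.Objects;
import java.util.Set;

// Hypothetical stand-in for Resource and Literal: a node is a URI or a literal.
final class RdfNode {
    final String value;
    final boolean isLiteral;

    private RdfNode(String value, boolean isLiteral) {
        this.value = Objects.requireNonNull(value);
        this.isLiteral = isLiteral;
    }

    static RdfNode resource(String uri) { return new RdfNode(uri, false); }
    static RdfNode literal(String text) { return new RdfNode(text, true); }

    @Override public boolean equals(Object o) {
        if (!(o instanceof RdfNode)) return false;
        RdfNode n = (RdfNode) o;
        return isLiteral == n.isLiteral && value.equals(n.value);
    }

    @Override public int hashCode() { return Objects.hash(value, isLiteral); }

    // N-Triples-like rendering (no escaping): <uri> or "text".
    @Override public String toString() {
        return isLiteral ? "\"" + value + "\"" : "<" + value + ">";
    }
}

// Stand-in for Statement: a triple {subject, predicate, object}.
final class Triple {
    final RdfNode subject, predicate, object;

    Triple(RdfNode subject, RdfNode predicate, RdfNode object) {
        // Only the object position may hold a literal.
        if (subject.isLiteral || predicate.isLiteral) {
            throw new IllegalArgumentException("subject and predicate must be resources");
        }
        this.subject = subject;
        this.predicate = predicate;
        this.object = Objects.requireNonNull(object);
    }

    @Override public boolean equals(Object o) {
        if (!(o instanceof Triple)) return false;
        Triple t = (Triple) o;
        return subject.equals(t.subject) && predicate.equals(t.predicate)
                && object.equals(t.object);
    }

    @Override public int hashCode() { return Objects.hash(subject, predicate, object); }

    @Override public String toString() {
        return subject + " " + predicate + " " + object + " .";
    }
}

// Stand-in for Model: a set of statements, so adding a duplicate has no effect.
final class TinyModel {
    private final Set<Triple> triples = new LinkedHashSet<>();

    TinyModel add(String subjectUri, String propertyUri, RdfNode object) {
        triples.add(new Triple(RdfNode.resource(subjectUri),
                RdfNode.resource(propertyUri), object));
        return this;
    }

    int size() { return triples.size(); }

    Set<Triple> statements() { return Collections.unmodifiableSet(triples); }
}
```

Because the model is a set, asserting the same statement twice leaves a
single triple, which mirrors the "Model: a set of statements" definition.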

Figure 2 Jena API structure

We present below an example of an RDF schema, where Jena's concept of
interfaces is highlighted:


Figure 3 Jena interfaces - example

2.1 The Model

The Jena API can be used to create and manipulate RDF graphs. In Jena a graph
is called a model and is represented by the Model interface.

The sample code for creating a model is very simple:

// some definitions
static String tutorialURI = "http://hostname/rdf/tutorial/";
static String author = "Sava Anisoara";
static String title = "Comparative survey of Java APIs for RDF";
static String date = "30/11/2009";
// create an empty graph
Model model = new ModelMem();
// create the resource
Resource tutorial = model.createResource(tutorialURI);
// add some properties
tutorial.addProperty(DC.creator, author);
tutorial.addProperty(DC.title, title);
tutorial.addProperty(DC.date, date);

The ModelMem class creates an RDF model in memory.

ModelRDB is another model implementation, used to manipulate RDF persisted
within a relational database such as MySQL or Oracle. Because this model
opens and maintains a connection to a relational database, the RDF data
persisted with ModelRDB can be accessed again later. With this model we can
also specify how the RDF model is stored within the relational database: as
a flat table of statements, as a hash, or through stored procedures.

2.2 Creating and serializing an RDF model

We can create an RDF model very simply: create resources, add a couple of
properties and then serialize the model (as shown before).

However, creating the Property and Resource objects directly in the
application that builds the model has a drawback: we have to duplicate this
functionality across all applications that want to use the vocabulary.

A possible solution is to build a vocabulary object, using a Java wrapper
class.

Jena already provides such wrapper classes, for example for Dublin Core (DC)
RDF and VCARD RDF. If we use a wrapper class for the properties and
resources of our vocabulary, we define them in one place, an approach that
simplifies both implementation and maintenance.

We will now create a vocabulary class for PostCon, using Jena's existing
vocabulary wrapper classes as a template. The PostCon wrapper class
contains a set of static strings holding property or resource labels and a set
of associated RDF properties (see below).

The PostCon vocabulary wrapper class


package com.burningbird.postcon.vocabulary;

import com.hp.hpl.mesa.rdf.jena.common.ErrorHelper;
import com.hp.hpl.mesa.rdf.jena.common.PropertyImpl;
import com.hp.hpl.mesa.rdf.jena.common.ResourceImpl;
import com.hp.hpl.mesa.rdf.jena.model.Model;
import com.hp.hpl.mesa.rdf.jena.model.Property;
import com.hp.hpl.mesa.rdf.jena.model.Resource;
import com.hp.hpl.mesa.rdf.jena.model.RDFException;

public class POSTCON extends Object {
    // URI for vocabulary elements
    protected static final String uri =
        "http://burningbird.net/postcon/elements/1.0/";

    // Return URI for vocabulary elements
    public static String getURI( )
    {
        return uri;
    }

    // Define the property labels and objects
    static final String nbio = "bio";
    public static Property bio = null;
    static final String nrelevancy = "relevancy";
    public static Property relevancy = null;
    static final String npresentation = "presentation";
    public static Resource presentation = null;
    static final String nhistory = "history";
    public static Property history = null;
    static final String nmovementtype = "movementType";
    public static Property movementtype = null;
    static final String nreason = "reason";
    public static Property reason = null;
    static final String nstatus = "currentStatus";
    public static Property status = null;
    static final String nrelated = "related";
    public static Property related = null;
    static final String ntype = "type";
    public static Property type = null;
    static final String nrequires = "requires";
    public static Property requires = null;

    // Instantiate the properties and the resource
    static {
        try {
            // Instantiate the properties
            bio = new PropertyImpl(uri, nbio);
            relevancy = new PropertyImpl(uri, nrelevancy);
            presentation = new PropertyImpl(uri, npresentation);
            history = new PropertyImpl(uri, nhistory);
            related = new PropertyImpl(uri, nrelated);
            type = new PropertyImpl(uri, ntype);
            requires = new PropertyImpl(uri, nrequires);
            movementtype = new PropertyImpl(uri, nmovementtype);
            reason = new PropertyImpl(uri, nreason);
            status = new PropertyImpl(uri, nstatus);
        } catch (RDFException e) {
            ErrorHelper.logInternalError("POSTCON", 1, e);
        }
    }
}

At this point the PostCon vocabulary is ready to use; we only need to import
the class in our applications.

import com.burningbird.postcon.vocabulary.POSTCON;

We present below an example of using the PostCon wrapper class to add
properties to a resource:

import com.burningbird.postcon.vocabulary.POSTCON;
import com.hp.hpl.mesa.rdf.jena.mem.ModelMem;
import com.hp.hpl.mesa.rdf.jena.model.*;
import com.hp.hpl.mesa.rdf.jena.vocabulary.*;
import java.io.FileOutputStream;
import java.io.PrintWriter;

public class pracRDF extends Object {

    public static void main (String args[]) {
        // Resource names
        String sResource = "http://burningbird.net/articles/monsters1.htm";
        String sRelResource1 =
            "http://burningbird.net/articles/monsters2.htm";
        // Placeholder value: the URI of the second related resource
        // is not given in this survey
        String sRelResource2 =
            "http://burningbird.net/articles/monsters3.htm";
        try {
            // Create an empty graph
            Model model = new ModelMem( );
            // Create the resource
            // and add the properties cascading style
            Resource article = model.createResource(sResource)
                .addProperty(POSTCON.related,
                    model.createResource(sRelResource1))
                .addProperty(POSTCON.related,
                    model.createResource(sRelResource2));
            // Print RDF/XML of model to system output
            model.write(new PrintWriter(System.out));
        } catch (Exception e) {
            System.out.println("Failed: " + e);
        }
    }
}

We can easily observe that using the wrapper class simplified the code
considerably.


The generated RDF/XML from the serialized PostCon submodel looks like
this:

<rdf:RDF
xmlns:rdf='http://www.w3.org/1999/02/22-rdf-syntax-ns#'
xmlns:NS0='http://burningbird.net/postcon/elements/1.0/'
>
<rdf:Description
rdf:about='http://burningbird.net/articles/monsters1.htm'>
<NS0:related
rdf:resource='http://burningbird.net/articles/monsters2.htm'/>
<NS0:related
</rdf:Description>
</rdf:RDF>

2.2.1 Adding more complex structures

However, many RDF models use more complex structures, including nested
resources that follow the RDF node-edge-node pattern. We will look at how
Jena handles these more complex RDF model structures.

If we have the following RDF graph,

<rdf:RDF
xmlns:rdf='http://www.w3.org/1999/02/22-rdf-syntax-ns#'
xmlns:NS0='http://burningbird.net/postcon/elements/1.0/'
xmlns:dc='http://purl.org/dc/elements/1.0/'
>
<rdf:Description rdf:nodeID='A0'>
<dc:creator>Shelley Powers</dc:creator>
<dc:publisher>Burningbird</dc:publisher>
<dc:title xml:lang='en'>Tale of Two Monsters: Legends</dc:title>
</rdf:Description>
<rdf:Description rdf:about='http://burningbird.net/articles/monsters1.htm'>


<NS0:related
<NS0:related
<NS0:bio rdf:nodeID='A0'/>
</rdf:Description>
</rdf:RDF>

it is very easy to generate it with Jena:

import com.burningbird.postcon.vocabulary.POSTCON;
import com.hp.hpl.mesa.rdf.jena.mem.ModelMem;
import com.hp.hpl.mesa.rdf.jena.model.*;
import com.hp.hpl.mesa.rdf.jena.vocabulary.*;
import java.io.FileOutputStream;
import java.io.PrintWriter;

public class pracRDFSecond extends Object {

    public static void main (String args[]) {
        // Resource names
        String sResource = "http://burningbird.net/articles/monsters1.htm";
        String sRelResource1 =
            "http://burningbird.net/articles/monsters2.htm";
        // Placeholder value: the URI of the second related resource
        // is not given in this survey
        String sRelResource2 =
            "http://burningbird.net/articles/monsters3.htm";
        String sType =
            "http://burningbird.net/postcon/elements/1.0/Resource";
        try {
            // Create an empty graph
            Model model = new ModelMem( );
            // Create the resource
            // and add the properties cascading style
            Resource article
                = model.createResource(sResource)
                .addProperty(POSTCON.related,
                    model.createResource(sRelResource1))
                .addProperty(POSTCON.related,
                    model.createResource(sRelResource2));
            // Create the bio bnode (blank node) resource
            // and add properties
            Resource bio
                = model.createResource( )
                .addProperty(DC.creator, "Mihai Marcela")
                .addProperty(DC.publisher, "Infoiasi")
                .addProperty(DC.title, model.createLiteral(
                    "Tale of Two Monsters: Legends", "en"));
            // Attach to main resource
            article.addProperty(POSTCON.bio, bio);
            // Print RDF/XML of model to system output
            model.write(new PrintWriter(System.out));
        } catch (Exception e) {
            System.out.println("Failed: " + e);
        }
    }
}

The bio property is a resource that does not have a specific URI (a blank
node). Since it is not a literal, we create a new resource for bio and add
several properties to it. These properties are defined in the DC vocabulary,
so we use the DC wrapper class to create them.

Jena offers support for creating typed nodes (rdf:type) and containers.

An RDF container is a grouping of related items. It represents a collection of
things:

Bag – unordered collection
Alt – set of alternatives; unordered, except that the first element is the
default choice
Seq – ordered collection
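
In RDF a container is itself a resource: it is typed as rdf:Bag, rdf:Alt or
rdf:Seq and its members are attached through the numbered membership
properties rdf:_1, rdf:_2 and so on. The following plain-Java sketch prints
the triples behind a container. It does not use Jena; the class name
ContainerSketch, the method toTriples and the example.org URIs are
hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

// Sketch only: shows which statements describe an RDF container.
final class ContainerSketch {
    static final String RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#";

    // containerId is a blank node label or a <uri>; kind is "Bag", "Alt" or "Seq";
    // members are resource URIs, listed in the order they should be numbered.
    static List<String> toTriples(String containerId, String kind, List<String> members) {
        List<String> out = new ArrayList<>();
        // The container is typed with rdf:Bag, rdf:Alt or rdf:Seq.
        out.add(containerId + " <" + RDF + "type> <" + RDF + kind + "> .");
        // Each member hangs off a numbered membership property rdf:_1, rdf:_2, ...
        for (int i = 0; i < members.size(); i++) {
            out.add(containerId + " <" + RDF + "_" + (i + 1) + "> <" + members.get(i) + "> .");
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> members = new ArrayList<>();
        members.add("http://example.org/a");
        members.add("http://example.org/b");
        for (String line : toTriples("_:c1", "Seq", members)) {
            System.out.println(line);
        }
    }
}
```

The three kinds produce the same triple shape; only the rdf:type differs, and
it is that type which tells a consumer whether the numbering is meaningful.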


2.3 Parsing and querying an RDF document

After we store the data in a model, our next concern is how we can query it.

The data stored in an RDF model can be accessed directly through specific API
functions or via RDQL, an RDF query language. Querying with an SQL-like
syntax is a very effective way of pulling data from an RDF model (stored in
memory or in a relational database).

Jena’s RDQL is implemented as an object called Query. This object is passed
to a query engine (QueryEngine) and the results are stored in a query result
(QueryResult).

As soon as data is retrieved from the RDF/XML we can iterate through it
using any number of iterators (NodeIterator – for general RDF nodes,
ResIterator, StmtIterator).

Examples of RDF iterating:

Go through statements
A statement is often called a triple, because of its three parts. An
RDF graph is represented as a set of statements. The Statement
interface provides accessor methods for the subject, predicate and
object of a statement.

// list the statements in a graph
StmtIterator iter = model.listStatements();
// print out the subject, predicate and object of each statement
while (iter.hasNext()) {
    Statement stmt = iter.next();              // get the next statement
    Resource subject = stmt.getSubject();      // get the subject
    Property predicate = stmt.getPredicate();  // get the predicate
    RDFNode object = stmt.getObject();         // get the object
    System.out.print(subject.toString());
    System.out.print(" " + predicate.toString() + " ");
    if (object instanceof Resource) {
        System.out.print(object.toString());
    } else {
        // object is a literal: print it in quotes
        System.out.print(" \"" + object.toString() + "\"");
    }
}
Instead of listStatements we could also list only the subjects
(listSubjects, ResIterator) or, more generally, the objects (listObjects,
NodeIterator).

Also, instead of listing all statements or all objects, we can fine-tune
the code to list only the subjects, statements, or objects matching specific
properties, using the property implementations created within the wrapper
class.
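
The idea of listing only the statements that match given values can be
sketched without Jena as a triple pattern in which null acts as a wildcard.
The class name PatternMatchSketch, the method find and the prefixed names used
as data are hypothetical; Jena's own selector classes are not reproduced here.

```java
import java.util.ArrayList;
import java.util.List;

// Sketch only: filter statements with a (subject, predicate, object) pattern.
final class PatternMatchSketch {

    // Each statement is a String[3] {subject, predicate, object}.
    // A null in the pattern matches any value in that position.
    static List<String[]> find(List<String[]> statements,
                               String subject, String predicate, String object) {
        List<String[]> hits = new ArrayList<>();
        for (String[] t : statements) {
            boolean matches = (subject == null || subject.equals(t[0]))
                    && (predicate == null || predicate.equals(t[1]))
                    && (object == null || object.equals(t[2]));
            if (matches) {
                hits.add(t);
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        List<String[]> data = new ArrayList<>();
        data.add(new String[] {"ex:tutorial", "dc:creator", "Sava Anisoara"});
        data.add(new String[] {"ex:tutorial", "dc:title", "Comparative survey"});
        data.add(new String[] {"ex:other", "dc:creator", "Mihai Marcela"});
        // List every statement whose predicate is dc:creator.
        for (String[] t : find(data, null, "dc:creator", null)) {
            System.out.println(t[0] + " " + t[1] + " " + t[2]);
        }
    }
}
```

This linear scan is only meant to show the matching rule; a real store would
use indexes on subject, predicate and object instead of visiting every
statement.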

Navigating a model

Property email = model.createProperty(tutorialURI, "emailAddress");
Resource tutorial = model.getResource(tutorialURI);
Resource author = tutorial.getProperty(DC.creator).getResource();
// list all of the author's email addresses
StmtIterator iter = author.listProperties(email);
while (iter.hasNext()) {
    System.out.println(" " + iter.next().getObject().toString());
}

Querying a model

ResIterator iter = model.listSubjectsWithProperty(DC.date, date);
while (iter.hasNext()) {
    System.out.println(iter.next().getProperty(DC.title).toString());
}
NodeIterator iter2 = model.listObjectsOfProperty(DC.creator);
while (iter2.hasNext()) {
    // the objects are general RDF nodes (resources or literals),
    // so they are printed directly
    System.out.println(iter2.next().toString());
}


2.4 In memory versus persistent model storage

As mentioned before, Jena can also persist data to relational database
storage. The supported databases are MySQL, PostgreSQL, Interbase and
Oracle. Within each database system, Jena also supports different storage
layouts:

Generic: all statements are stored in a single table, and resources
and literals are indexed using integer identifiers generated by the
database
GenericProc: similar to generic, but data access is through stored
procedures
MMGeneric: similar to generic but can store multiple models
Hash: similar to generic but uses MD5 hashes to generate the
identifiers
MMHash: similar to hash but can store multiple models
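
The difference between the generic and the hash layouts is where identifiers
come from: generated by the database, or computed from the value itself. The
sketch below illustrates only the second idea, using the standard
java.security.MessageDigest class. How Jena actually derives and stores its
hash identifiers is not described in this survey, so the class name
HashIdSketch and the hexadecimal encoding are assumptions.

```java
import java.math.BigInteger;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Sketch only: derive a fixed-length identifier from a URI or literal value.
final class HashIdSketch {

    // Returns the MD5 digest of the value as 32 lowercase hex digits.
    static String md5Hex(String value) {
        try {
            MessageDigest md = MessageDigest.getInstance("MD5");
            byte[] digest = md.digest(value.getBytes(StandardCharsets.UTF_8));
            // %032x left-pads with zeros so the identifier always has 32 digits.
            return String.format("%032x", new BigInteger(1, digest));
        } catch (NoSuchAlgorithmException e) {
            // MD5 is one of the algorithms every Java platform must provide.
            throw new IllegalStateException("MD5 not available", e);
        }
    }

    public static void main(String[] args) {
        // The same value always yields the same identifier, without a database lookup.
        System.out.println(md5Hex("http://burningbird.net/articles/monsters1.htm"));
    }
}
```

Because the identifier depends only on the value, a client can compute it
before talking to the database, whereas database-generated integers require a
lookup or an insert first.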

Before storing a model in a database we must create the structure that will
hold the data (the tables are created in an already existing database, which
is thereby formatted).

We present below an example of persisting a model to a database. We create
the JDBC connection ourselves. The model is based on a MySQL database, using
the MMGeneric layout. We do not use the slightly more efficient hash layout
(MMHash), because the generic layout is the better choice if the data is
accessed directly through JDBC rather than through Jena.

import com.hp.hpl.mesa.rdf.jena.rdb.ModelRDB;
import com.hp.hpl.mesa.rdf.jena.rdb.DBConnection;
import java.sql.*;

public class pracRDFThree extends Object {

    public static void main (String args[]) {
        // Pass one RDF document, connection string, user name and password
        String sUri = args[0];
        String sConn = args[1];
        String sUser = args[2];
        String sPass = args[3];
        try {
            // Load driver class
            Class.forName("com.mysql.jdbc.Driver").newInstance( );
            // Establish connection with the supplied connection info
            Connection con = DriverManager.getConnection(sConn,
                sUser, sPass);
            DBConnection dbcon = new DBConnection(con);
            // Format database
            ModelRDB.create(dbcon, "MMGeneric", "Mysql");
            // Create and read the model
            ModelRDB model1 = ModelRDB.createModel(dbcon,
                "modelOne");
            model1.read(sUri);
        } catch (Exception e) {
            System.out.println("Failed: " + e);
        }
    }
}

Once the model is persisted any number of applications can access it.

We also show how to access the model stored in a MySQL database and dump all
its objects:

import com.hp.hpl.mesa.rdf.jena.model.*;
import com.hp.hpl.mesa.rdf.jena.rdb.ModelRDB;
import com.hp.hpl.mesa.rdf.jena.rdb.DBConnection;
import java.sql.*;

public class pracRDFFour extends Object {

    public static void main (String args[]) {
        // Pass connection string, user name and password
        String sConn = args[0];
        String sUser = args[1];
        String sPass = args[2];
        try {
            // Load driver class
            Class.forName("com.mysql.jdbc.Driver").newInstance( );
            // Establish connection with the supplied connection info
            Connection con = DriverManager.getConnection(sConn,
                sUser, sPass);
            DBConnection dbcon = new DBConnection(con);
            // Open the existing model
            ModelRDB model1 = ModelRDB.open(dbcon, "modelOne");
            // Print out objects in the model using toString
            NodeIterator iter = model1.listObjects( );
            while (iter.hasNext( )) {
                System.out.println(" " + iter.next( ).toString( ));
            }
        } catch (Exception e) {
            System.out.println("Failed: " + e);
        }
    }
}

In addition to accessing the data through the Jena API, you can also access it
directly using whatever database connectivity you prefer.

As you can see the Jena API for managing RDF databases is very
straightforward and easy to use.

2.5 Jena query mechanism evaluation

Some tests have explored the benefits of semantic web technologies (RDF, RDF
Schema) over existing ones (XML, XML Schema) in the context of a scientific
application. The proposal was to replace the Chemistry Markup Language (CML,
a well-established XML format for exchanging information about molecules,
reactions and experiments) with RDF.

The specific technical goal of this study was to evaluate whether Jena would
be suitable for storing large numbers of molecules, and whether the RDQL
query functionality would support searches for substructures on the scale
that would be required in a realistic application.

The ability of RDF to support graph‐based query was identified as a potential
benefit in this application, allowing chemists to search a repository of
molecules for molecular sub‐structures.

However, queries for simple molecular substructures (such as N-C-N) within
modest repositories (100 molecules, 26118 statements) took a prohibitively
long time to complete (more than 24 hours). So, unless the efficiency of
graph query engines is improved, RDF technologies remain inadequate for
data-intensive applications. (For more details see:
www.w3c.rl.ac.uk/SWAD/papers/RDFMolecules_final.doc ).

Undoubtedly, RDF offers powerful and flexible alternatives for modelling
complex data structures, and its application in this area deserves further
study.


3 SESAME

3.1 Introduction

Sesame is an open source framework for storage, inferencing and querying
of RDF data. The community site for information about development with
Sesame is http://openrdf.org. It was originally developed by the Dutch
company Aduna as a research prototype for the EU research project
On-To-Knowledge. Today the Sesame framework is maintained and developed by
Aduna in collaboration with the NLnet Foundation and a number of volunteer
developers who contribute ideas, bug reports and fixes.

Sesame includes an excellent administration interface in its distribution,
is easy to install, and is considered one of the leading graph stores, with
excellent performance.

The most important characteristics of Sesame 2.x are:

completely targeted at Java 5; all APIs use Java 5 features such as
typed collections and iterators.
support for the SPARQL Query Language
support for context/provenance, which makes it possible to keep track of
individual RDF data units (files, for instance).
proper transaction/rollback support.

3.2 Licence

While version 1.x of Sesame was available under the terms of the GNU Lesser
General Public License (LGPL), version 2.1, version 2.x is available under a
BSD-style license. This version also includes software developed by the
Apache Software Foundation.


3.3 Architecture

Sesame has an architecture that allows persistent storage of RDF data and
schema information and subsequent querying of that information.

To keep Sesame DBMS-independent, all DBMS-specific code is concentrated in a
single architectural layer: the Storage And Inference Layer (SAIL).

The SAIL is an application programming interface (API) that offers
RDF-specific methods to its clients and translates these methods into calls
to its specific DBMS. An important advantage of introducing such a separate
layer is that it makes it possible to implement Sesame on top of a wide
variety of repositories without changing any of Sesame's other components.
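
The layering argument can be sketched in plain Java as an interface that the
functional modules depend on, with the storage technology hidden behind it.
The names StorageLayer, InMemoryStorage and ExportModule are hypothetical and
much simpler than the real SAIL API; the sketch only shows why a module
written against the interface needs no change when the storage is replaced.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical storage-layer contract; the real SAIL API is richer than this.
interface StorageLayer {
    void addStatement(String subject, String predicate, String object);

    // A null argument matches any value in that position.
    List<String[]> getStatements(String subject, String predicate, String object);
}

// One possible implementation; a DBMS-backed class could implement the same interface.
final class InMemoryStorage implements StorageLayer {
    private final List<String[]> statements = new ArrayList<>();

    @Override public void addStatement(String subject, String predicate, String object) {
        statements.add(new String[] {subject, predicate, object});
    }

    @Override public List<String[]> getStatements(String subject, String predicate, String object) {
        List<String[]> hits = new ArrayList<>();
        for (String[] t : statements) {
            if ((subject == null || subject.equals(t[0]))
                    && (predicate == null || predicate.equals(t[1]))
                    && (object == null || object.equals(t[2]))) {
                hits.add(t);
            }
        }
        return hits;
    }
}

// A functional module that only knows the interface, not the storage behind it.
final class ExportModule {
    private final StorageLayer store;

    ExportModule(StorageLayer store) {
        this.store = store;
    }

    // Dumps every statement, one per line, in insertion order.
    String export() {
        StringBuilder sb = new StringBuilder();
        for (String[] t : store.getStatements(null, null, null)) {
            sb.append(t[0]).append(' ').append(t[1]).append(' ').append(t[2]).append(" .\n");
        }
        return sb.toString();
    }
}
```

Swapping InMemoryStorage for another StorageLayer implementation leaves
ExportModule untouched, which is the property the separate layer is meant to
provide.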


The functional modules in Sesame are clients of the SAIL API. Currently, there
are three such modules: the RQL query engine, the RDF admin module
and the RDF export module.

Direct access to Sesame's functional modules is provided by the Access APIs.
These can also be used to reach the next component of Sesame's architecture,
the Sesame Server, which provides HTTP- and RMI-based access to Sesame's
APIs.

The Model API is represented by the Model interface, which holds a collection
of indexed statements and implements Set<Statement>. Unlike the MemoryStore,
Model preserves the statement ordering, but it does not provide concurrent
modification, transaction, or query support. The Model is a welcome addition
for programs that simply need to read and write RDF files.

To get access to the RDF parsers, writers and basic model types such as URI,
Literal, Statement and Model from Sesame, we only have to include
sesame-client.jar.

The following code demonstrates how to work with RDF files: it reads an RDF
file, validates it and then writes it back in an organized format:

File file = new File("somefile.rdf");
StatementCollector collector = new StatementCollector();
InputStream in = new FileInputStream(file);
try {
    RDFParser parser = new RDFXMLParser();
    parser.setRDFHandler(collector);
    parser.parse(in, file.toURI().toString());
} finally {
    in.close();
}
Model model = collector.getModel();
// perform some validation
for (Value obj : model.filter(null, RDFS.SUBCLASSOF, null).objects()) {
    if (!model.contains((Resource) obj, RDF.TYPE, RDFS.CLASS))
        throw new Exception("missing super class");
}
// sort the statements and write them back out
model = new ModelOrganizer(model).organize();
OutputStream out = new FileOutputStream(file);
try {
    RDFWriter writer = new RDFXMLPrettyWriter(out);
    writer.startRDF();
    for (String prefix : model.getNamespaces().keySet()) {
        writer.handleNamespace(prefix,
            model.getNamespace(prefix));
    }
    for (Statement st : model) {
        writer.handleStatement(st);
    }
    writer.endRDF();
} finally {
    // close the output stream (the input stream was already closed above)
    out.close();
}

3.4 The Repository API

The Repository API is the central access point for Sesame repositories. It can
be used to query and update the contents of both local and remote
repositories. The Repository API handles all the details of client‐server
communication, allowing you to handle remote repositories as easily as local
ones.

The main interfaces for the Repository API can be found in the package
org.openrdf.sesame.repository. The implementations of these interfaces
for local and remote repositories can be found in subpackages of this
package.

3.5 Transactions

Sesame 3.0 introduces isolation levels to the API. In 2.x each store
(MemoryStore, NativeStore, and RdbmsStore) used its own fixed isolation
level. In Sesame 3.0, each connection can choose its own isolation level.
These are similar to the ANSI SQL isolation levels, although Sesame has
created its own RDF-specific definitions and provides a few more isolation
levels that are relevant to distributed and optimistic stores.

Unlike the ANSI SQL isolation levels, Sesame distinguishes between snapshot
and serializable isolation levels. Furthermore, Sesame's definition of
snapshot isolation allows independent RDF stores to use eventual
consistency to manage the changes to the store. This makes clusters of RDF
stores easier to implement for connections that allow eventual consistency,
but still require a consistent non‐changing view of the store. Some of
Sesame's lower levels of isolation explicitly permit the use of HTTP caching
(with some restrictions). This allows connections that only need weak
consistency to operate at a significantly enhanced speed, by not having to
check for new modifications every time.

Future work for this framework could include:

Transaction rollback support. While the SAIL API has support for
transactions, it currently has no transaction rollback feature.
Transaction rollbacks, especially in the case of uploading
information, are crucial if we wish to guarantee database
consistency.
Versioning support. The current version of Sesame has no support for
versioning. A basic type of versioning will enable more elaborate
versioning schemes.
DAML+OIL support.

3.6 Query Language

Sesame currently supports two query languages: SeRQL and SPARQL.

SeRQL ("Sesame RDF Query Language", pronounced "circle") is a new
RDF/RDFS query language that is currently being developed by Aduna as part
of Sesame. It combines the best features of other (query) languages (RQL,
RDQL, N-Triples, N3) and adds some of its own.

Some of SeRQL's most important features are:

Graph transformation.
RDF Schema support.
XML Schema datatype support.
Expressive path expression syntax.
Optional path matching.

4 Jena RDF storage

The Jena architecture provides an abstract RDF model that manages an internal
graph storing the RDF data. Applications interact with the abstract Model,
which translates higher-level operations into low-level operations on the
triples stored in an RDF graph.

The Jena database subsystem implements persistence for RDF graphs using a
relational database through a JDBC connection.

Compared with the first version of Jena, Jena2 reduces response time by using
a denormalized relational schema; in other words, it trades space for time.
Both resource URIs and simple literal values are stored directly in the
statement table, which makes it possible to perform many find operations
without a join. Although this increases the size of the statement table,
Jena2 provides several options to reduce it, such as compressing namespaces
(defining a prefix and using it as a reference to the namespace) and storing
long values only once.
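
Namespace compression can be illustrated with a small sketch: the namespace
part of each URI is replaced by a short identifier that is stored once, and
only the local name is repeated in the statement table. The class name
NamespaceCompressor, the integer identifiers and the "id:local" encoding are
assumptions made for illustration and do not reproduce Jena2's actual table
format.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Sketch only: replace the namespace part of a URI by a small integer id.
final class NamespaceCompressor {
    // namespace -> id, in order of first appearance
    private final Map<String, Integer> prefixIds = new LinkedHashMap<>();

    // Splits the URI after its last '#' or '/' and returns "id:localName".
    String compress(String uri) {
        int cut = Math.max(uri.lastIndexOf('#'), uri.lastIndexOf('/')) + 1;
        String namespace = uri.substring(0, cut);
        String localName = uri.substring(cut);
        Integer id = prefixIds.get(namespace);
        if (id == null) {
            id = prefixIds.size();          // next free identifier
            prefixIds.put(namespace, id);
        }
        return id + ":" + localName;
    }

    // Reverses compress() for values produced by this instance.
    String expand(String compressed) {
        int colon = compressed.indexOf(':');
        int id = Integer.parseInt(compressed.substring(0, colon));
        for (Map.Entry<String, Integer> e : prefixIds.entrySet()) {
            if (e.getValue() == id) {
                return e.getKey() + compressed.substring(colon + 1);
            }
        }
        throw new IllegalArgumentException("unknown namespace id: " + id);
    }

    public static void main(String[] args) {
        NamespaceCompressor c = new NamespaceCompressor();
        System.out.println(c.compress("http://purl.org/dc/elements/1.1/title"));
        System.out.println(c.compress("http://purl.org/dc/elements/1.1/creator"));
    }
}
```

The saving comes from repetition: a vocabulary namespace that appears in
thousands of statements is stored once, while each row keeps only the short
reference and the local name.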


5 Sesame RDF storage

Sesame is an open source RDF database with support for RDF Schema
inferencing and querying. The development of Sesame has brought out several
features of RDF stores.

The performance of Sesame on top of an object-relational DBMS has been
examined in several studies [Jeen Broekstra, Arjohn Kampman, and Frank van
Harmelen. Sesame: A Generic Architecture for Storing and Querying RDF and
RDF Schema. http://wwwis.win.tue.nl/~jbroekst/papers/ISWC02.pdf ]. They all
reach the same conclusion: performance is very low if the database system
creates a table whenever a new class or property is added.

The study carried out with PostgreSQL in [Jeen Broekstra, Arjohn Kampman, and
Frank van Harmelen. Sesame: A Generic Architecture for Storing and
Querying RDF and RDF Schema.
http://wwwis.win.tue.nl/~jbroekst/papers/ISWC02.pdf] uses a different
approach (similar to Jena) in which all RDF statements are inserted in a single
table with three columns: "Subject, Predicate, Object". In scenarios where
the schema changes often, this approach is better than the previous one.

It has been shown that the performance of inference over large amounts of
data can be improved by using an object-oriented database (for both Jena and
Sesame), because it implements "on demand materialization" and concepts such
as "class extent". One of the most important features of an object-oriented
database is that objects in the database are accessed transparently, so that
interacting with persistent objects is no different from interacting with
in-memory objects.

6 Conclusions

Jena is the best-known RDF triple store. It provides a programmatic
environment for RDF, RDFS, OWL and SPARQL and includes a rule-based inference
engine.

The documentation is very good and it also comes with some extra tools
developed by third parties.

A plus for Jena is that it works very well on Linux, FreeBSD and Windows.

Because many programmers use it, the support for developing with Jena
(integration in existing IDEs) is very good.

A minus for Jena is that it lacks a web framework and a SPARQL interface.
Sesame, on the other hand, has both and also comes with a more modern
architecture, which is why the Semantic Web community has recently favoured
Sesame.

Both Jena and Sesame make use of the advantages of conventional database
systems, although Sesame supports only MySQL and PostgreSQL.

At this moment Sesame seems to be the most modern approach. It comes
with a plugin architecture which makes it very modular (in the second
version it offers interfaces to the Spring Framework).

A major advantage of this framework is that it offers fairly good
documentation for users and developers. Also, because it is used by many
programmers, it is not difficult to find code examples and opinions about
certain aspects of the framework. And because it is implemented in Java, it
is portable across many platforms.

One of the most popular aspects of Sesame is that it can be deployed on
Tomcat, so that one can upload an ontology and test queries in the browser.

Another important feature of the Sesame architecture is its abstraction from
the details of any particular repository used for the actual storage. This
makes it possible to port Sesame to a large variety of different repositories,
including relational databases, RDF triple stores, and even remote storage
services on the Web.


Like Jena, Sesame has a lively community and a lot of documentation, which
makes us believe that it will continue to be developed over the years and
that interest in it will only increase.

Currently, Jena and Sesame are the two most popular RDF store
implementations. Because there is no RDF API specification accepted by the
Java community, programmers use either the Jena API or the Sesame API to
publish, query, and reason over RDF triples. The resulting RDF application
source code is thus tightly coupled to either the Jena or the Sesame API.
Interoperability between Jena and Sesame becomes a problem when a Jena API
based application needs to access a Sesame backend, or a Sesame API based
application needs to access a Jena backend. This problem is partly solved by
the Sesame-Jena Adapter, which provides access to a Jena model through the
Sesame SAIL API.

Jena and Sesame are distributed under BSD-style licenses, which allow
developers to modify and distribute packages and sources.

In our opinion, Jena is a very strong and mature tool, which is a little
slower than Sesame because it follows the SPARQL specification to the letter.
That is why developers increasingly prefer the Sesame API, which comes with a
modern approach and somewhat better speed, but very low scalability.

So it is the developers' choice what to give up in their applications: speed
and specification conformance, or scalability.

Another appreciated semantic store is Mulgara, an open source, massively
scalable, transaction-safe, purpose-built database for the storage and
retrieval of RDF, written in Java.

Bibliography

An introduction to RDF and Jena RDF API
http://jena.sourceforge.net/tutorial/RDF_API/
Joe Verzulli, Using the Jena API to process RDF
http://www.xml.com/pub/a/2001/05/23/jena.html
Meet Jena, a semantic web platform for Java
http://www.devx.com/semantic/Article/34968
RDF with Jena
http://www.docstoc.com/docs/13613350/RDF-with-Jena
Shelley Powers, Practical RDF, O'Reilly
Brian McBride, Jena: Implementing the RDF Model and Syntax Specification
http://sunsite.informatik.rwth-aachen.de/Publications/CEUR-WS/Vol-40/mcbride.pdf
Jesús Soto, Oscar Sanjuan, Luis Joyanes, Semantic Web Servers: A new approach to query on big datasets of metadata
http://sites.google.com/site/wjfang2/jenasesamemodel
Create Scalable Semantic Applications with Database-Backed RDF Stores
http://www.devx.com/semantic/Article/35480/0/page/3
http://www.openrdf.org/
http://answers.oreilly.com/topic/447-how-to-use-the-sesame-java-api-to-power-a-web-or-client-server-application/
