.NET and RDF APIs


Published in: Education
.NET Framework RDF APIs

Lucian Nistor, Denis Recean
Universitatea “Alexandru Ioan Cuza”, Iasi

1 Introduction

In this paper we present a comparative study of how RDF, one of the most important building blocks of the Semantic Web, is processed in .NET. The usefulness of such a study is clear: .NET is one of the most widely used frameworks in software development (desktop or web based), and the Semantic Web, with RDF at its foundation, represents the next step in the evolution of the web, so the two have to interact. Before comparing the tools, we briefly present the main technologies.

Semantic Web

The Web has begun to “understand” the meaning of the information it is composed of, and this new phase of the Web is the Semantic Web. This “understanding” of the data is achieved using various formalisms, such as RDF (Resource Description Framework), RDFS (RDF Schema), interchangeable data formats (such as N3 or Turtle) and OWL (Web Ontology Language). The Semantic Web is like a living organism that is growing and evolving right in front of our eyes.

RDF

The Resource Description Framework (RDF) is a standard for storing data on the Semantic Web. Semantic Web compliant applications use structured information that is transmitted in a decentralized and distributed way. In order to store the information in small, discrete pieces, an abstract model was created: RDF. Each piece is a statement, or triple, made of a subject, a predicate and an object. This model can be serialized in many formats, of which the most popular is RDF/XML.
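
To make the "small, discrete pieces" of the RDF model concrete, the following minimal Python sketch represents statements as subject, predicate, object triples and writes them out in the line-based N-Triples syntax (chosen here only because it is simpler to show than RDF/XML). The `Triple` type, the `to_ntriples` helper and the `http://example.com/exampleOntology#` namespace are illustrative assumptions, not part of any of the APIs discussed in this paper.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    """One RDF statement (hypothetical helper type for this sketch)."""
    subject: str                  # URI of the resource being described
    predicate: str                # URI of the property
    obj: str                      # URI of another resource, or a literal value
    obj_is_literal: bool = False  # True when `obj` is a plain string literal


def to_ntriples(t: Triple) -> str:
    """Serialize one triple as a single N-Triples line."""
    if t.obj_is_literal:
        # Escape backslashes and double quotes inside the literal.
        escaped = t.obj.replace("\\", "\\\\").replace('"', '\\"')
        obj = f'"{escaped}"'
    else:
        obj = f"<{t.obj}>"
    return f"<{t.subject}> <{t.predicate}> {obj} ."


EX = "http://example.com/exampleOntology#"  # assumed example namespace

graph = [
    Triple(EX + "Cairo", EX + "cityname", "Cairo", obj_is_literal=True),
    Triple(EX + "Cairo", EX + "isCapitalOf", EX + "Egypt"),
]

lines = [to_ntriples(t) for t in graph]
for line in lines:
    print(line)
```

Each printed line is one self-contained statement, which is what allows RDF data to be split, merged and transmitted in a decentralized way.
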
Ontology

Even though there is no single definition of Semantic Web ontologies, they are very important for the Semantic Web. In philosophy, ontology is “the study of entities and their relations” (Clay Shirky). Extrapolating that definition to computer science, we can say that an ontology is a formal representation of a set of entities from a certain domain and of the relations between those entities.

SPARQL

SPARQL (SPARQL Protocol and RDF Query Language) is a query language for RDF. A SPARQL query consists of required and optional graph patterns (information stored as RDF forms a graph), which can be combined by conjunctions or disjunctions. SPARQL can be used to query any data source that is stored in RDF format or can be transformed into RDF. The result of a query can take the form of result sets or RDF graphs.

SPARQL query example (from http://en.wikipedia.org/wiki/SPARQL):

    PREFIX abc: <http://example.com/exampleOntology#>
    SELECT ?capital ?country
    WHERE {
      ?x abc:cityname ?capital ;
         abc:isCapitalOf ?y .
      ?y abc:countryname ?country ;
         abc:isInContinent abc:Africa .
    }

.NET Framework and VS

The .NET Framework is a software framework developed by Microsoft and used by recent software applications that run on Windows. The framework includes a large library with solutions to common programming problems and a virtual machine. Developers who write applications in .NET have several advantages: they can use one of many programming languages (C#, Visual Basic, C++, ...), they have access to the Base Class Library, they share a common development environment for desktop and web applications, and they have access to extensive documentation.
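
Returning briefly to the SPARQL example above, the idea of matching a conjunction of graph patterns can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not how a real SPARQL engine is implemented: terms are plain strings, anything starting with `?` is treated as a variable, optional patterns and disjunctions are left out, and the sample triples and the `match_pattern` and `query` helpers are invented for the example.

```python
def is_var(term):
    """In this sketch a term starting with '?' is a query variable."""
    return term.startswith("?")


def match_pattern(pattern, triple, binding):
    """Extend `binding` so that `pattern` matches `triple`; return None on mismatch."""
    new = dict(binding)
    for p, value in zip(pattern, triple):
        if is_var(p):
            # Bind the variable, or check it against an earlier binding.
            if new.setdefault(p, value) != value:
                return None
        elif p != value:
            return None
    return new


def query(patterns, triples):
    """Evaluate a conjunction of triple patterns and return all variable bindings."""
    bindings = [{}]
    for pattern in patterns:
        next_bindings = []
        for b in bindings:
            for t in triples:
                extended = match_pattern(pattern, t, b)
                if extended is not None:
                    next_bindings.append(extended)
        bindings = next_bindings
    return bindings


# Invented sample data using the prefixed names of the query above.
triples = [
    ("abc:Cairo", "abc:cityname", "Cairo"),
    ("abc:Cairo", "abc:isCapitalOf", "abc:Egypt"),
    ("abc:Egypt", "abc:countryname", "Egypt"),
    ("abc:Egypt", "abc:isInContinent", "abc:Africa"),
    ("abc:Paris", "abc:cityname", "Paris"),
    ("abc:Paris", "abc:isCapitalOf", "abc:France"),
    ("abc:France", "abc:countryname", "France"),
    ("abc:France", "abc:isInContinent", "abc:Europe"),
]

# The four triple patterns of the WHERE clause, written out in full.
patterns = [
    ("?x", "abc:cityname", "?capital"),
    ("?x", "abc:isCapitalOf", "?y"),
    ("?y", "abc:countryname", "?country"),
    ("?y", "abc:isInContinent", "abc:Africa"),
]

results = [(b["?capital"], b["?country"]) for b in query(patterns, triples)]
print(results)  # [('Cairo', 'Egypt')]
```

Each pattern narrows the set of candidate bindings, so only the capital whose country lies in `abc:Africa` survives all four patterns.
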
The Base Class Library is a component of the framework that provides features such as database connectivity, cryptography, web application development and so on. Visual Studio (VS) is the Microsoft IDE for software development. It includes the .NET Framework, and Microsoft encourages the use of the framework in software development. VS provides advanced features and RIA development support. Besides the Microsoft implementation there is a .NET Framework developed for Linux, called Mono, but it supports only .NET 1.0 and .NET 2.0, whereas the Windows version is at .NET 3.5 and is preparing for .NET 4.0.

2 API comparison

Even though the Semantic Web has evolved considerably in recent years and RDF has become a common data storage standard, Microsoft has not included native support for RDF processing. Recognizing the growing importance of RDF in new web software development, some independent developers have implemented solutions that support RDF processing. We discuss three such APIs: SemWeb, a library that provides low-level RDF interaction, and LinqToRdf and Rowlex, two APIs that use SemWeb internally and provide a flexible and easy-to-use API.

SemWeb

SemWeb was developed by Joshua Tauberer and, according to its author, it can be used to read and write RDF files in XML and N3 formats, to keep RDF in memory or in persistent storage such as SQL databases, and to query persistent storage or remote endpoints using SPARQL. It also provides limited RDFS support.

LinqToRdf

Developed by Andrew Matthews, this tool's main aim is to allow .NET programmers to use LINQ query technology to query an RDF graph with the help of classes that have been defined using RDFS or OWL. The tool includes extensions for Visual Studio that allow the user to model
ontologies using the VS.NET class designer. Its main features are converting LINQ queries to SPARQL and generating .NET classes that map ontologies.

Rowlex

Rowlex is a toolkit used for creating and browsing RDF documents. It uses an ontology to model classes and properties and then models RDF triples as instances of those classes. ROWLEX is the acronym for Relaxed OWL Experience. In other words, Rowlex brings the advantages of object-oriented programming to RDF processing using OWL (Web Ontology Language). It offers the ability to generate .NET classes from ontologies and ontologies from .NET classes. This API was developed by the NC3A Semantic Interoperability team.

3 RDF data storage

The way the RDF information, or the RDF itself, is stored is very important. It influences performance and interoperability with other platforms and applications.

SemWeb can work with RDF in XML and N3 formats. The SemWeb library abstracts an RDF triple with three classes: the Entity class, which represents a resource identified by a URI; the Literal class, which represents a literal value; and the Statement class, which combines a subject, a predicate and an object into an RDF triple.

The LinqToRdf API uses the N3 format to store RDF files. In .NET, LinqToRdf creates classes that map the ontology describing the RDF and then uses the LINQ mechanisms to query, delete or add information in a given RDF file. The classes are created using attributes to map ontology features. A triple is stored as an instance of a class, and the relations between classes are modeled by object-oriented means. For instance, the ontology class hierarchy is modeled with class derivation, and a one-to-many relation is modeled with a list of objects.
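
The object-oriented mapping described above (class derivation for the ontology hierarchy, a list of objects for a one-to-many relation) can be illustrated with a small Python sketch. It is only an analogy to what LinqToRdf and Rowlex do with .NET classes and attributes: the `Agent` and `Person` classes, the `rdf_type` class attribute, the `http://example.com/music#` namespace and both helper functions are hypothetical names invented for this example.

```python
from dataclasses import dataclass, field
from typing import List

EX = "http://example.com/music#"  # hypothetical ontology namespace


@dataclass
class Agent:
    uri: str
    name: str
    # Plain class attribute (no annotation), so it is not a dataclass field.
    rdf_type = EX + "Agent"


@dataclass
class Person(Agent):
    # One-to-many relation modeled as a list of resource URIs.
    albums: List[str] = field(default_factory=list)
    rdf_type = EX + "Person"


def subclass_triples(cls):
    """Derive rdfs:subClassOf statements from the Python class hierarchy."""
    return [
        (cls.rdf_type, "rdfs:subClassOf", base.rdf_type)
        for base in cls.__bases__
        if hasattr(base, "rdf_type")
    ]


def to_triples(obj):
    """Flatten one mapped object into (subject, predicate, object) triples."""
    triples = [
        (obj.uri, "rdf:type", obj.rdf_type),
        (obj.uri, EX + "name", obj.name),
    ]
    for album in getattr(obj, "albums", []):
        triples.append((obj.uri, EX + "hasAlbum", album))
    return triples


person = Person(EX + "mj", "Michael Jackson", [EX + "Thriller", EX + "Bad"])
for triple in subclass_triples(Person) + to_triples(person):
    print(triple)
```

One object expands into several triples, one per property value, which is why the class-based APIs hold more data in memory per resource than a plain list of statements.
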
ROWLEX uses XML and N3 formats to store RDF files. When it processes documents, the library stores RDF triples as instances of classes that map the ontology, in a way similar to LinqToRdf.

Because they are built on the .NET Framework, all these APIs can serialize data in the usual .NET ways. For instance, the RDF data can be stored in a SQL database or in binary format, like any .NET object.

4 SPARQL support

Only two of the three APIs, SemWeb and LinqToRdf, support SPARQL queries, both on local RDF files and on remote SPARQL endpoints.

In SemWeb the queries are represented by special objects: Query class objects for local queries and SparqlHttpSource objects for remote ones.

Example of a SPARQL query written using the SemWeb API:

    // Send the query to the remote DBpedia endpoint and write the results to the console.
    SparqlHttpSource source = new SparqlHttpSource("http://DBpedia.org/sparql");
    source.RunSparqlQuery("SELECT * WHERE { ?a ?b \"Michael Jackson\" . }", Console.Out);

LinqToRdf uses the LINQ mechanism to create queries. An RDF object is used as the data context for a query. The constructed LINQ query is then translated into SPARQL. In order to query remote data that is not in RDF format, special tools that transform it into RDF need to be used. For instance, in order to query OpenLink data the Virtuoso platform can be used.

Example of a LINQ query over RDF data, using LinqToRdf:

    // Point the triple store at a remote SPARQL endpoint.
    TripleStore ts = new TripleStore();
    ts.EndpointUri = @"http://DBpedia.org/sparql";
    ts.QueryType = QueryType.RemoteSparqlStore;
    var q = from p in new RDF(ts).ForType<Person>()
            where p.Name == "Michael Jackson"
            select p;

5 Support for developers

Two of the projects are one-man projects and the third is developed by an organization whose main interests lie in other fields of computer science, such as information security. As a result, the available information is scarce and the support is insufficient.

5.1 Documentation

All three APIs have documentation that shows their main features through examples, but all of them lack serious, detailed information. SemWeb is an older project, so its documentation is somewhat better structured. Forum activity concerning the three APIs is low because people who work with RDF and want a specialized API for it usually use other development platforms such as Java or C++.

5.2 Integration with VS

SemWeb is essentially a DLL library which is referenced in a VS project and used like any other assembly. Besides its DLLs, LinqToRdf provides extensions to create .NET classes that map ontologies and to create your own ontologies. It is the only tool that has an installation kit. Rowlex is integrated using the same DLL method, but it also provides two .exe files that can be used to generate an ontology from .NET classes and to generate .NET classes that map an existing ontology.

There is a problem that needs to be mentioned here. All three APIs were developed with .NET 2.0 and with Visual Studio 2008 without SP1, and they have problems when used with later versions of .NET or VS. LinqToRdf is
impossible to install on VS 2008 with SP1 or later because of its tool extensions.

5.3 Learning curve

The learning curve is almost the same for each of the three tools. Simple tasks are relatively quick to learn with all of them, but serious, complicated tasks that require a good understanding of the API are difficult because of the lack of documentation and the poor support. Rowlex is slightly easier to learn because it lacks SPARQL capability, and LinqToRdf is a bit easier than SemWeb if you know LINQ; otherwise it can be harder, as you have to learn LINQ as well. For a .NET programmer, however, it is easier to learn LINQ than SPARQL. Taking these considerations into account, Rowlex is the easiest to learn, SemWeb is the hardest and LinqToRdf is in between.

6 Performance

SemWeb has the best performance of the three because it stores the information in a lightweight manner (with three classes, Entity, Literal and Statement) and because SPARQL queries need no transformation, as they are passed to the Query object as a string. Another reason why SemWeb performs better is that the other two APIs use it for their low-level interaction with RDF. Rowlex performs worse than SemWeb because it uses more classes to store the triples during processing, and LinqToRdf performs worst of all because the classes it uses to map the ontology are LINQ compatible and because LINQ queries have to be translated into SPARQL queries before they are run.

7 Interoperability

In terms of interoperability, all the APIs benefit from two sides. One is the RDF format, which is specifically designed to be used by many web
development frameworks. The other is .NET, which allows interoperability with SQL databases or with other platforms via web services or other network communication protocols.

8 Project development and licensing

All the projects leave the impression that they are still in a beta phase. All have installation or integration problems, yet work on them ceased in 2008 or 2009. SemWeb is open source, LinqToRdf is under the New BSD License and Rowlex is under the GNU Lesser General Public License.

9 Conclusions

Before drawing any conclusions we would like to quote Joshua Tauberer, the author of SemWeb, the most stable of the three APIs, on the fate of his project:

“May 19, 2009. I'm taking an indefinite hiatus from this project. That means that while I'll try to apply any patches to fix existing bugs, I won't be actively developing the library further, and I won't be answering questions for help on the mail list. Over the last four years it's been fun to work on it, but I don't think there has been enough uptake of the Semantic Web in the .NET world (or otherwise) for me to justify spending more time on this when I have other things in life I'd rather be working on.”

Now to the conclusions. The three APIs were selected because they were the “loudest” on the internet, so we considered them the best candidates for our comparative study. The ideas were good and the work is outstanding, but all of them need better documentation, better support, bug fixing and further development in order to become usable and reliable tools for large, serious applications.
10 References

http://www.w3.org/TR/rdf-sparql-query/
http://en.wikipedia.org/wiki/SPARQL
http://rowlex.nc3a.nato.int/default.aspx
http://www.hookedonlinq.com/linqtordf.ashx
http://aabs.wordpress.com/LINQ/
http://razor.occams.info/code/semweb/
http://en.wikipedia.org/wiki/.NET_Framework
http://www.microsoft.com/NET/
http://en.wikipedia.org/wiki/Web_Ontology_Language
http://semanticweb.org/wiki/Ontology
http://www.w3.org/RDF/

This article was processed using Microsoft Word with Springer LNCS style and it is released under the Creative Commons Attribution-Share Alike 3.0 license, http://creativecommons.org/licenses/by-sa/3.0/