WebSpa

             Monica Macoveiciuc and Constantin Stan

  Faculty of Computer Science, Alexandru Ioan Cuza University, Iasi

Abstract. WebSpa is a tool that allows the quick, intuitive (and even fun) interrogation of arbitrary SPARQL endpoints. WebSpa runs in the web browser and does not require the installation of any additional software. The tool manages a large variety of pre-defined SPARQL endpoints and allows the addition of new ones. A user account makes it possible to save both the interrogation and its results on the local computer, as well as to edit the queries further. The application is written in both Java and Flex. It uses the Jena and ARQ application programming interfaces to perform the queries, and the results are processed and displayed using Flex.

Introduction


The Web is a universal medium for information and data exchange. Exploiting the huge amount of knowledge distributed on the Web is a significant challenge. Humans can understand the information, but it takes great effort to find and combine data from such a large number of sources; computers, on the other hand, can browse through millions of pages in very little time, but they are not capable of understanding the content.

The Semantic Web is an extension of the World Wide Web, “in which information is given well-defined meaning, better enabling computers and people to work in cooperation”[1].

RDF (Resource Description Framework), together with SPARQL, provides a powerful mechanism for describing and interchanging metadata on the Web. RDF is the W3C standard for encoding knowledge. It is a structure for describing and interchanging metadata on the Web in numerous forms and for numerous purposes.

SPARQL is both a query language and a remote access protocol. A SPARQL endpoint enables users (human or machine) to query a knowledge base via the SPARQL language. Results are typically returned in one or more machine-processable formats. Therefore, a SPARQL endpoint is mostly conceived as a machine-friendly interface towards a knowledge base.

WebSpa is a web-based application that manages SPARQL endpoints and allows users to perform interrogations over them. It provides pre-defined SPARQL endpoints and supports the addition of new ones. Anyone can use the application, but a user account is required in order to perform certain operations, such as adding endpoints, or saving and later editing a query. The results are displayed in the browser and they can also be stored locally, in XML format.

The application is written in both Java and Flex. It uses the Jena and ARQ application programming interfaces in order to perform the queries, and the results are processed and displayed using Flex.

This paper describes the WebSpa application, as well as the way it is built. The first chapter contains a brief explanation of the concepts of SPARQL and SPARQL endpoint, and of the technologies used for building the application - Java (Jena, ARQ) and Flex. The next chapter presents the interface, functionality and storage capabilities of the application. The front-end and back-end are then described. Conclusions are drawn and perspectives are suggested in the final chapter.
Technologies


1   SPARQL

SPARQL[2] (pronounced “sparkle”; a recursive acronym for SPARQL Protocol and RDF Query Language) is an RDF query language. It is a recent W3C Recommendation, one that Sir Tim Berners-Lee said “will make a huge difference”.

RDF is foundational to the Semantic Web. Until the launch of SPARQL, RDF had a data model, a formal semantics, and a concrete serialization (in XML), but it did not have a standard query language. SPARQL now offers the Semantic Web and Web 2.0 a common data manipulation language in the form of expressive queries against the RDF data model.

Using WSDL 2.0, the SPARQL Protocol for RDF describes a very simple web service with one operation, query, which is available with both HTTP and SOAP bindings. This operation is the way SPARQL queries are sent to other sites and the way the results come back. The HTTP bindings are REST-friendly, and a simple SPARQL protocol client takes little code to implement.

SPARQL consists of three separate specifications. The first is the query language specification (which makes up the core). The second is the query results XML format (which describes an XML format for serializing the results of SPARQL SELECT and ASK queries). The third is the data access protocol (which uses WSDL 2.0 to define simple SOAP and HTTP protocols for remotely querying RDF databases, or any data repository that can be mapped to the RDF model). Taken together, they provide a query language, a means of conveying a query to a query processor service, and the XML format in which the results arrive.

Some issues are not yet addressed by SPARQL. The most notable is that it cannot modify an RDF dataset (it is read-only). RDF itself is built on the triple (a 3-tuple consisting of subject, predicate, and object).
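The HTTP binding of the query operation described in this section can be illustrated with a short sketch. It only builds the GET request URL, placing the URL-encoded query text in the `query` parameter defined by the protocol; it does not send anything over the network. The class name `SparqlGetUrl`, the method `buildQueryUrl` and the endpoint `http://example.org/sparql` are hypothetical and are not part of WebSpa.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

public class SparqlGetUrl {

    /**
     * Builds the URL of an HTTP GET request that carries a SPARQL query
     * in the "query" parameter of the SPARQL Protocol.
     */
    public static String buildQueryUrl(String endpoint, String sparql) {
        // The query text must be URL-encoded before it is put in the URL.
        String encoded = URLEncoder.encode(sparql, StandardCharsets.UTF_8);
        // Append with '&' if the endpoint URL already has a query string.
        String separator = endpoint.contains("?") ? "&" : "?";
        return endpoint + separator + "query=" + encoded;
    }

    public static void main(String[] args) {
        // Hypothetical endpoint, used for illustration only.
        String url = buildQueryUrl("http://example.org/sparql",
                "SELECT ?s WHERE { ?s ?p ?o . } LIMIT 5");
        System.out.println(url);
    }
}
```

A client would then issue a GET request to the resulting URL and read the response body, which for SELECT and ASK queries is typically a document in the query results XML format.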
Similar to RDF, SPARQL is built on the triple pattern, which also consists of a subject, a predicate and an object. SPARQL matches patterns in an RDF graph using triple patterns, which are like triples except that they may contain variables in place of concrete values (the variables act as “wildcards” that match RDF terms in the dataset).

The SELECT query can be used to extract data from an RDF graph, returning it as a tabular result set. For more complex graph patterns one can combine required and OPTIONAL data. UNION queries are also a way of selecting alternatives from the dataset. It is possible to apply ordering to the results, jump forward through results using OFFSET, and LIMIT the amount of data returned.

The SPARQL Query Results XML Format specification includes several relevant examples. Given its simplicity and regular structure, manipulating this format with XSLT or XQuery is fairly straightforward.

The syntax shortcuts make writing queries much simpler. These are especially useful with repetitive graph patterns and long URIs. SPARQL presents itself as the missing and long-awaited part of the Semantic Web and Web 2.0.

A SPARQL endpoint is a conformant SPARQL protocol service as defined in the SPROT (SPARQL Protocol for RDF) specification. A SPARQL endpoint enables users (human or other) to query a knowledge base via the SPARQL language. Results are typically returned in one or more machine-processable formats. Therefore, a SPARQL endpoint is mostly conceived as a machine-friendly interface towards a knowledge base. Both the formulation of the queries and the human-readable presentation of the results should typically be implemented by the calling software, and not be done manually by human users.

At the time of writing, there is no agreed description format for a SPARQL endpoint. Endpoint descriptions can be used to announce endpoint capabilities and contents, support discovery through service directories, and supply browsing and federation hints.

2   Jena and ARQ

Jena[3] is an open source Semantic Web framework for Java. It provides an API to extract data from and write to RDF graphs, which are presented as an abstract “model”. This model can be queried through SPARQL and updated through SPARUL[4].

Jena uses the concept of graph for dealing with the data: the nodes correspond to resources (identified by URIs) or literals, while the edges correspond to the triples. The graphs are represented through the Model interface, which has different implementations: a memory-based one, one which uses a relational database, etc. The memory-based model is the simplest and the easiest to use.
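To make the idea of a memory-based graph of triples concrete, the following is a minimal sketch in plain Java. It does not use Jena's classes; the names `TinyGraph`, `Triple` and `match` are hypothetical and chosen only for illustration. A `null` argument to `match` plays the role of a variable in a SPARQL triple pattern, that is, a wildcard for that position.

```java
import java.util.ArrayList;
import java.util.List;

public class TinyGraph {

    /** One edge of the graph: subject --predicate--> object. */
    public static final class Triple {
        public final String subject;
        public final String predicate;
        public final String object;

        public Triple(String subject, String predicate, String object) {
            this.subject = subject;
            this.predicate = predicate;
            this.object = object;
        }
    }

    // The whole "model" is just a list of triples held in memory.
    private final List<Triple> triples = new ArrayList<>();

    public void add(String subject, String predicate, String object) {
        triples.add(new Triple(subject, predicate, object));
    }

    /** Returns every triple that matches the pattern; null matches anything. */
    public List<Triple> match(String subject, String predicate, String object) {
        List<Triple> result = new ArrayList<>();
        for (Triple t : triples) {
            boolean subjectOk = subject == null || subject.equals(t.subject);
            boolean predicateOk = predicate == null || predicate.equals(t.predicate);
            boolean objectOk = object == null || object.equals(t.object);
            if (subjectOk && predicateOk && objectOk) {
                result.add(t);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        TinyGraph graph = new TinyGraph();
        graph.add("ex:alice", "ex:knows", "ex:bob");
        graph.add("ex:alice", "ex:knows", "ex:carol");
        graph.add("ex:bob", "ex:name", "Bob");

        // Corresponds to the triple pattern: ex:alice ex:knows ?o
        for (Triple t : graph.match("ex:alice", "ex:knows", null)) {
            System.out.println(t.object);
        }
    }
}
```

A real model also distinguishes resources from literals and indexes its triples for faster lookup; the sketch ignores both for brevity.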
A triple is represented through an interface called Statement. A statement corresponds to an edge in the graph and consists of three parts:

 – the subject - the resource from which the arc leaves - implements the Resource interface;
 – the predicate - the property (the label of the arc) - implements the Property interface;
 – the object - the resource that the arc points to - implements the Resource or the Literal interface.

The components of the statement have a common base - the RDFNode interface. The object component is more complex. A statement can be used as the object component of the triple, since RDF allows nested statements. Objects implementing the Container, Alt, Bag, or Seq interface can also be used as objects.

An RDF Model is represented as a set of statements. The components of a statement can be accessed through the getSubject, getPredicate and getObject methods of the Statement interface.

The API provides methods for the most common operations:

 – addProperty - adds a new statement (triple) to the model;
 – listSubjects - lists the subject component of each triple from the model;
 – listObjects - lists the object component of each triple from the model;
 – write - writes the model in RDF XML format to the output stream given as parameter;
 – read - reads the statements in RDF XML format into a model.

ARQ is a query engine for Jena that supports the SPARQL language. In addition to implementing SPARQL, ARQ's query engine can also parse queries expressed in RDQL[5] or in its own internal query language.

3   Flex

The Flex framework provides the declarative language, application services, components, and data connectivity that developers need to rapidly build rich Internet applications (RIAs) for the browser or desktop. Flex 3 is a framework that provides enterprise-level components for the Flash Player platform in a markup language format recognizable to anyone with HTML or XML development experience. The Flex framework provides components for visual layout, visual effects, data grids, server communication, charts, and much more.

MXML is the language developers use to define the layout, appearance, and behaviors of a Flex application. ActionScript 3, an object-oriented language based on industry-standard ECMAScript, is the language that defines the client-side application logic. The MXML and ActionScript sources are compiled together into a single SWF file that makes up the Flex application. Because the compiler is available both as a standalone utility in the Flex 3 SDK and as part of the Adobe Flex Builder 3 software, developers can choose to develop in the Eclipse-based Flex Builder IDE or in an IDE of their choice.

Flex includes a pre-built class library and application services that help developers assemble and build RIAs. These services include data binding, drag-and-drop management, the display system that manages the interface layout, the style system that manages the look and feel of interface components, and the effects and animation system that manages motion and transitions. The component library provides the user interface controls that developers need, from simple buttons, checkboxes, and radio buttons to complex data grids, combo boxes, and rich text editors. The provided containers can be used to design complex, adaptive layouts, and the skins can be used as they are or modified to achieve the desired look and feel.

The Adobe AIR runtime extends web applications to the desktop, creating new opportunities for more engaging, higher performing online/offline applications. The Flex framework provides native support for the AIR APIs, and Flex Builder 3 provides the tools necessary to build, debug, package, and sign applications built on Adobe AIR.
Most Flex applications that are designed using just the Flex framework are structured as follows. On one side, there are properly decoupled and reusable view components that know nothing about the rest of the application, dispatching events and using data binding from parent views and to child views. On the other side, a main application container (the Application root tag) acts both as a controller and as a model, sometimes delegating tasks to some “utils” classes that handle things such as RPC communications. In other words, there is one big Master object, which can quickly turn into spaghetti code. If the application is quite simple, and no long-term maintenance is required, this might not be such a bad choice.

Using events and data binding, Flex can achieve very good decoupling of view components. But the main view container will either have to handle all the logic by itself, or explicitly delegate it to a controller class with which it will then be very tightly coupled. The main problem is that user interaction events dispatched by the views cannot directly reach a separate application controller, unless that controller is itself a view. To have a separate controller handle these events and take actions, the events have to climb the display list up to the main application container (the root application tag), which may then pass them to the controller.
WebSpa


4   Functionality

WebSpa is a tool with which one can interrogate arbitrary SPARQL endpoints in a more intuitive way. It provides pre-defined SPARQL endpoints and supports the addition of new ones. Anyone can use the application, but a user account is required in order to perform certain operations, such as adding endpoints, or saving and later editing a query. The results are displayed in the browser and they can also be stored locally, in XML format.

The interface is easy to use and intuitive. The top bar contains user-related information: the name of the currently logged-in user, and a form for registering, logging in or logging out. The top right menu can be used for both login and registration, by ticking or un-ticking the “I'm new” checkbox. The submit button changes accordingly, into either “create” or “log in”.

The menu contains three groups of actions:

 – Query
 – Result
 – More
The first group contains actions related to the queries. A logged-in user can choose to create, open or save a query. The same actions are present in the second menu group, but there they apply to the results of the query. The last menu item provides information about the application.

The screen is divided in two parts - the top window is bound to the query, while the bottom window is used for managing the results. The top window contains a menu which helps a less experienced user to write queries.

The “select endpoint” dropdown contains predefined SPARQL endpoints. A user who has created an account can add and remove his own endpoints. These actions can be performed through the group of options next to the menu - “Refresh”, “Save”, “Delete”.

The next dropdown contains predefined RDF Schemas that are automatically inserted in a query when the user selects one.

The “template” menu contains four options, corresponding to the four types of SPARQL queries supported - SELECT, ASK, DESCRIBE, CONSTRUCT. When a user selects one of these items, a pattern of the query is written in the query window. The user can then personalize this pattern. For example, choosing the “SELECT” option inserts the following content:

    SELECT DISTINCT ?s ?p ?o
    WHERE
    {
        ?s ?p ?o .
    }

The “Statement” menu contains code editing features (comment/uncomment or indent/remove indent), statements that can be applied to the current selection, and example (fill-in) statements.

The last item of the bar is a group of buttons that provide quick access to the most important actions - creating a new query, opening an existing one, saving a query.

A bar with options divides the screen in two windows. It contains items that help users manage the queries. The “Run Query” button runs a query, and the results are displayed in the window below. The query can be given a name and saved in the database for later use. A user can choose to load his saved queries, which can then be edited or saved as new queries.

5   Front-End

The front-end, built in Adobe Flex Builder, uses the Adobe Flex Framework 3.2.0 as its backbone. The framework does not impose a strong MVC constraint, which has both advantages and disadvantages, as discussed previously.

Like most projects built with Adobe Flex Builder, WebSpa is a Web application (meaning that it runs in Flash Player, the Adobe runtime for web browsers). The whole project targets version 10.0.0 of Flash Player. This is a technical requirement, because this version of the Player introduces an API for working with files: saving and opening local files on the client's machine. Previous versions of Flash Player made this possible only via server-side scripts (the server was used as a proxy).

From now on we will refer to Adobe Flex Builder simply as Flex.

The project has as “entry point” the WebSpa.mxml file. This is where things start to happen. This MXML file describes the layout and content of the header and of the footer of the application. This class also manages users (creates, logs in or logs out users) and offers an application menu.

Within the project we created a Config class which acts as a configuration data holder. This class holds data such as the server paths, project internal data and other types of data. Changing the location of the project's server-side application is much easier this way: after a one-line edit and a recompile, the application is “good to go”. All properties within this class are static.

All styles within the application (if altered) are managed from a *.css file (Main.css).

For the other pages within the project we used MXML components. This makes the project easy to extend in the future if desired. There are two such components, About.mxml and ClassicWebSpa.mxml. The About page is self-explanatory. ClassicWebSpa is our SPARQL interrogation tool, which interacts with the Java server side in order to provide results for the queries written by users.

Inside this page the user is presented with the option to refresh / save / delete endpoints or to save / load queries on the server side and access them via his account. This option is available only if the user is logged in. The user also has the option to create new / open / save queries or results locally. We offer two different file formats for saving (*.wsq, which stands for WebSpaQuery, and *.wsr, which stands for WebSpaResult). This helps the user manage and locate the files related to our application more easily.

Basically, our application helps the user execute a SPARQL query against an endpoint and retrieve the result of that query for later manipulation. The application menu offers much the same functionality as the ClassicWebSpa page, plus the connection to the About page.
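The idea behind the static configuration holder described in this section can be sketched as follows. In WebSpa the class is written in ActionScript and its source is not shown in this paper, so the Java version below is only an illustration: the class name `WebSpaConfig`, the field names, the URL and the servlet path are all hypothetical. The “fn” request parameter is the one the back-end uses to select the method to call.

```java
public final class WebSpaConfig {

    // Hypothetical location of the server-side application; moving the
    // back-end to another host means editing this single line.
    public static final String SERVER_BASE = "http://localhost:8080/webspa/";

    // Hypothetical path of the servlet that receives all front-end calls.
    public static final String SERVLET_PATH = "WebSpaServlet";

    private WebSpaConfig() {
        // Static holder only; never instantiated.
    }

    /** Builds the URL for a server call that selects a method via "fn". */
    public static String serviceUrl(String functionName) {
        return SERVER_BASE + SERVLET_PATH + "?fn=" + functionName;
    }

    public static void main(String[] args) {
        System.out.println(serviceUrl("create-user"));
    }
}
```

Keeping every server path in one place is what makes the single-line change possible: the rest of the code asks the holder for a URL instead of hard-coding it.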
For all communication between the front-end and the back-end we use a wrapper class that automates the server calls (it takes the server-side script, the GET parameters, the POST parameters and the callback to run on successful execution). This makes things much more readable and easier to manage in the rest of the classes.

For local file management we use two static classes, which handle saving and opening queries and results.

The project can be extended easily in the future because of the decoupled way in which its parts work.

6   Back-End

The back-end is written in Java, using the Jena framework and the ARQ support for SPARQL that it provides. The code is organized in three packages:

    webspa.persistance
    webspa.servlet
    webspa.sparql

The package webspa.servlet contains the central point of the application, the class WebSpaServlet. It is a servlet that communicates with the Flex front-end of the application and redirects each call to the appropriate method. The servlet is also responsible for starting the SQL transaction, as well as for committing the changes. The servlet contains a HibernateBean object, whose methods are called for each action. The requests coming from the front-end are analyzed in order to determine which method should be called, based on the value of the “fn” parameter. Once the method is identified, it is passed some general parameters - the request, the response and the EntityManager object (used for managing the communication with the database tables). Some of the methods require additional parameters, such as boolean values or certain parameters specified in the request. Each of the HibernateBean methods returns the response as a String, which is then passed to Flex for further processing, using the PrintWriter object of the HttpServletResponse:

    if (method.equalsIgnoreCase("create-user")) {
        response.getWriter().write(
                hBean.registerUser(request, response, entityManager));
    }

The method that executes the SPARQL query is accessed differently. A SparQL object is created and its executeSpaQuery method is called, with the following parameters: the text of the query, the URL of the endpoint and the PrintWriter object. The method itself is responsible for writing the response to this object.

The package webspa.persistance contains the classes used for the persistence layer of the application, which is implemented with Hibernate and the Java Persistence API (JPA). The data is stored in a MySQL database. There are five classes mapped to the database tables:

 – class User - table USERS - manages the users of the application.
 – class EndPoint - table ENDPOINT - stores endpoint information, such as the URL, the type (default or added by user), the status, etc.
 – class UserEndPoint - table USERENDPOINT - contains a one-to-many mapping between a user and his additional endpoints.
 – class Model - table MODEL - stores RDF Schemas.
 – class Query - table QUERY - stores saved queries.

Each class has an associated .hbm.xml file that specifies the mapping between class variables and table columns. For example, the User class has the following variables:

    @Id
    @GeneratedValue(strategy = GenerationType.AUTO)
    private Integer uid;
    private String username;
    private String password;
    private String lastIp;
    @Column(name="created")
    private Timestamp creationDate;
    @Column(name="lastlogin")
    private Timestamp lastLoginDate;

The tokens preceded by “@” are JPA annotations. The corresponding .hbm.xml file contains the mapping:

    <hibernate-mapping>
      <class dynamic-insert="false" dynamic-update="false"
             mutable="true" optimistic-lock="version"
             polymorphism="implicit" select-before-update="false"
             name="webspa.persistance.User" table="users">
        <id column="UID" name="uid">
          <generator class="increment"/>
        </id>
        <property column="username" name="username"/>
        <property column="password" name="password"/>
        <property column="lastip" name="lastIp"/>
        <property column="created" name="creationDate"/>
        <property column="lastlogin" name="lastLoginDate"/>
      </class>
    </hibernate-mapping>

A class named HibernateBean is responsible for the communication with the database. HibernateBean contains methods for retrieving, saving and updating endpoints, queries and/or results. Each of the methods follows a similar pattern - validation of the parameters, creation and execution of the query, formatting and sending the result. The methods implemented are:

    registerUser
    loginUser
    logoutUser
    createEndPoint
    hideShowEndPoint
    saveQuery
    updateQuery
    checkUserById
    checkField
    checkEndPointByUserUrl
    getLoggedUserId
    getEndPoints
    getModels
    getQueries

Along with the class that communicates with the database, the package webspa.sparql contains the class SparQL. Using the Jena and ARQ APIs, the class runs queries over SPARQL endpoints.

The process of sending a query and getting a result is simple. First, a QueryExecution object is obtained using the ARQ class QueryExecutionFactory and its sparqlService method. This method requires two arguments - the endpoint URL and the query, both provided by the user through the interface.

    QueryExecution queryExecution =
            QueryExecutionFactory.sparqlService(endpoint, query);

A pattern is used in order to determine the type of the query - SELECT, ASK, DESCRIBE, CONSTRUCT. Based on the type, different methods are called:

    execSelect
    execAsk
    execDescribe
    execConstruct

The static method asXMLString of the ResultSetFormatter class is used for returning the results of the SELECT and ASK queries as XML.

    ResultSet results = queryExecution.execSelect();
    printWriter.write(ResultSetFormatter.asXMLString(results));

The other two interrogation types retrieve the results as a Model object. These objects write directly to the PrintWriter that communicates with Flex, via their “write” method.

    Model model = queryExecution.execDescribe();
    model.write(printWriter);
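The paper does not show the pattern used to determine the type of a query, so the following is only one possible approach, sketched under that assumption. It uses a regular expression that skips an optional prologue of BASE and PREFIX declarations and then reads the first query form keyword. The class name `QueryTypeDetector`, the enum `QueryType` and the method `detect` are hypothetical; the sketch uses only the Java standard library and does not handle SPARQL comments.

```java
import java.util.Locale;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class QueryTypeDetector {

    public enum QueryType { SELECT, ASK, DESCRIBE, CONSTRUCT, UNKNOWN }

    // Optional prologue (any number of BASE <iri> or PREFIX name: <iri>
    // declarations), followed by the keyword that names the query form.
    private static final Pattern QUERY_FORM = Pattern.compile(
            "^\\s*(?:(?:BASE\\s*<[^>]*>|PREFIX\\s+[^\\s:]*:\\s*<[^>]*>)\\s*)*"
                    + "(SELECT|ASK|DESCRIBE|CONSTRUCT)\\b",
            Pattern.CASE_INSENSITIVE);

    /** Returns the form of the query, or UNKNOWN if none is recognized. */
    public static QueryType detect(String query) {
        Matcher matcher = QUERY_FORM.matcher(query);
        if (!matcher.find()) {
            return QueryType.UNKNOWN;
        }
        // Keywords are case-insensitive in SPARQL, so normalize the case.
        return QueryType.valueOf(matcher.group(1).toUpperCase(Locale.ROOT));
    }

    public static void main(String[] args) {
        String query = "PREFIX foaf: <http://xmlns.com/foaf/0.1/>\n"
                + "SELECT ?name WHERE { ?person foaf:name ?name . }";
        System.out.println(detect(query));
    }
}
```

The detected type can then select which of the four exec methods listed above is called, and whether the result is serialized through ResultSetFormatter or through the “write” method of the returned Model.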
Conclusion and Perspectives


WebSpa is a web-based application that manages SPARQL endpoints and allows users to perform interrogations over them. It provides pre-defined SPARQL endpoints and supports the addition of new ones. Anyone can use the application, but a user account is required in order to perform certain operations, such as adding endpoints, or saving and later editing a query. The results are displayed in the browser and they can also be stored locally, in XML format.

The application is written in both Java and Flex. It uses the Jena and ARQ application programming interfaces in order to perform the queries, and the results are processed and displayed using Flex.

There are two versions of WebSpa - the classic one, which has been presented in this paper, and the graphical one, which is still under development. The latter allows the user to handle the RDF graph by visualizing its nodes and arcs and directly manipulating them.

The limitations of the Classic WebSpa refer mainly to the way content is displayed; e.g. it might be a little difficult for users to manually browse to the XML file in order to view the results.

References

[1] Tim Berners-Lee, James Hendler and Ora Lassila. The Semantic Web. Scientific American, May 2001.
[2] W3C. SPARQL Query Language for RDF. http://www.w3.org/TR/rdf-sparql-query/
[3] Jena Project Page. http://jena.sourceforge.net/
[4] W3C. SPARQL Update. http://www.w3.org/Submission/SPARQL-Update/
[5] W3C. RDQL - A Query Language for RDF. http://www.w3.org/Submission/RDQL/
[6] Adobe Flex 3. http://www.adobe.com/ro/products/flex/
[7] ARQ - A SPARQL Processor for Jena. http://jena.sourceforge.net/ARQ/
[8] Monica Macoveiciuc, Constantin Stan. Flex Framework Presentation.
