Semantic User Profiles<br />Ancuta Ionel, Sorin Alexandru Damian<br />Abstract. This paper describes how the accompanying application works: the premises the authors started from, the techniques they applied, and the results obtained in semantically describing the profile of a user at the Faculty of Computer Science, Iasi. It also describes the security issues that can arise and how they could be overcome. A summary of the programming techniques is presented and applied in the case study, the application that accompanies the paper.<br />Keywords: SPARQL endpoint, RDFa, DCMI, FOAF, semantic web<br />Introduction<br />Exposing profile information for all Faculty of Computer Science accounts in a machine-readable format such as RDF/XML requires gathering information from multiple systems.<br />The application is based on information already available to all faculty members. One data source is the /etc/passwd file on the students’ server, which offers basic information about each account on that server. Detailed information about the LDAP accounts is not directly available, so it was gathered using the People Search application at http://students.info.uaic.ro/people . This application provides both HTML and XML output. The XML was parsed and cached locally in order to build a triple store that can be queried with SPARQL.<br />Implementation<br />The application that exposes the SPARQL endpoint and the RDFa-annotated profile information is built with ASP.NET MVC for the front end and uses the SEMWEB library to provide the triple store, RDF/XML serialization and the SPARQL query engine.<br />Input data is gathered by parsing a classic Unix passwd file to get all the user accounts and by crawling the HTML or XML output of the People Search application at http://students.info.uaic.ro/people <br />All profile information is linked to the user identity using RDF triples and stored in a local database for later queries. 
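As a rough illustration of the passwd parsing step described above, the following Python sketch reads passwd-formatted text and extracts the account name, the full name from the GECOS field, and the home directory of each entry. The application itself is built on ASP.NET MVC; Python is used here only for brevity. The function name `parse_passwd`, the `Account` type and the sample entries are hypothetical and not taken from the real server.

```python
from typing import List, NamedTuple


class Account(NamedTuple):
    """One passwd entry, reduced to the fields relevant for a profile."""
    login: str
    full_name: str
    home: str


def parse_passwd(text: str) -> List[Account]:
    """Parse passwd-formatted text (seven colon-separated fields per line)."""
    accounts = []
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines and comments.
        if not line or line.startswith("#"):
            continue
        fields = line.split(":")
        if len(fields) != 7:
            continue  # Malformed entry; ignored in this sketch.
        login, _password, _uid, _gid, gecos, home, _shell = fields
        # GECOS may hold comma-separated sub-fields; the first is the full name.
        full_name = gecos.split(",")[0]
        accounts.append(Account(login, full_name, home))
    return accounts


# Hypothetical sample data, used only to exercise the parser.
SAMPLE = """\
# sample passwd content
jdoe:x:1001:1001:John Doe,,,:/home/jdoe:/bin/bash
asmith:x:1002:1002:Alice Smith:/home/asmith:/bin/sh
broken:line
"""

accounts = parse_passwd(SAMPLE)
for account in accounts:
    print(account.login, "->", account.full_name)
```

Each resulting `Account` could then be turned into one triple per property, with the login used to build the profile URI.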
Let’s say that the profile of the user “sorin.damian” is crawled and needs to be stored. For each profile property we generate and store a triple like:<br /><http://profiles.sorindamian.ro/people/rdf/sorin.damian> foaf:name “Damian T. Sorin-Alexandru”<br />The relation to the original user profile exposed by the People Search application at http://students.info.uaic.ro/people is maintained using the sameAs property from the OWL vocabulary:<br /><http://profiles.sorindamian.ro/people/rdf/sorin.damian> owl:sameAs <http://students.info.uaic.ro/people/xml/uid/sorin.damian><br />The implementation of the profile crawler and of the profiles endpoint is decoupled from the implementation of the site, so automated unit tests can be used to check for regressions.<br />Exposing the SPARQL endpoint<br />To expose the endpoint we created a controller whose action returns the SPARQL query results either in a human-readable HTML format or in RDF/XML notation, depending on the content types in the “Accept” HTTP header sent by the requesting agent.<br />When a request is made to the /sparql URL using a regular web browser, the application responds with an HTML form that allows the user to write and submit a query. The query results are also displayed in HTML, by applying an XSL transformation to the XML produced by the query.<br />RDF-capable agents can make requests to the SPARQL endpoint in the standard way, which is to URL-encode the query string and send it through the query parameter (e.g. /sparql?query=url_encoded_query). <br />Content Negotiation<br />The application can serve the proper representation of the profiles to the users. 
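A minimal sketch of how a representation could be chosen from the “Accept” header, written in Python for brevity rather than in the application's own ASP.NET MVC code. The function names `parse_accept` and `choose_representation` are hypothetical, and the parsing is deliberately simplified: it honours q-values, but the `*/*` wildcard is simply treated as a request for HTML.

```python
from typing import List, Tuple

RDF_XML = "application/rdf+xml"
HTML = "text/html"


def parse_accept(header: str) -> List[Tuple[str, float]]:
    """Split an Accept header into (media type, q-value) pairs, best first."""
    entries = []
    for part in header.split(","):
        pieces = [p.strip() for p in part.split(";")]
        media_type = pieces[0].lower()
        if not media_type:
            continue
        quality = 1.0  # Default q-value when none is given.
        for param in pieces[1:]:
            if param.startswith("q="):
                try:
                    quality = float(param[2:])
                except ValueError:
                    quality = 0.0  # Unparsable q-value: treat as not acceptable.
        entries.append((media_type, quality))
    # sorted() is stable, so equal q-values keep their header order.
    return sorted(entries, key=lambda entry: entry[1], reverse=True)


def choose_representation(accept_header: str) -> str:
    """Return 'rdf' if RDF/XML is the best acceptable type, otherwise 'html'."""
    for media_type, quality in parse_accept(accept_header):
        if quality <= 0:
            continue
        if media_type == RDF_XML:
            return "rdf"
        if media_type in (HTML, "application/xhtml+xml", "*/*"):
            return "html"
    return "html"  # Fall back to the human-readable page.


print(choose_representation("application/rdf+xml"))
print(choose_representation("text/html,application/xhtml+xml;q=0.9,*/*;q=0.8"))
print(choose_representation("text/html;q=0.5, application/rdf+xml;q=0.9"))
```

Falling back to HTML when nothing matches keeps ordinary browsers on the human-readable page, while agents that explicitly prefer RDF/XML receive the machine-readable form.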
If a profile at a URL like <http://profiles.sorindamian.ro/people/rdf/ancuta.ionel> is requested and the client cannot accept the RDF/XML content type, then a 303 redirect is made to the human-readable resource URL (http://profiles.sorindamian.ro/people/html/ancuta.ionel).<br />Annotating profile descriptions with RDFa<br />The human-readable HTML page is annotated with linked data using the RDFa specification. A profile property such as the full name of the user is linked to the user entity with an XHTML fragment like:<br /><div class="
>Damian T. Sorin Alexandru</div><br />Usage<br />The SPARQL endpoint user interface<br />Query results formats<br />The results are returned in HTML format for regular web browsers and in RDF format for clients that accept the application/sparql-results+xml content type.<br />User profiles can be accessed in plain RDF or in RDFa-annotated HTML.<br />HTML profile page with embedded FOAF metadata using RDFa<br />RDF output automatically converted to human-readable HTML<br />RDF profile displayed in Twinkle<br />Security considerations<br />Exposing personal profile information in both human-readable and machine-readable formats raises additional privacy risks. Special care should be taken when exposing sensitive information such as email addresses and account names. Such detailed information should be available at most to authenticated users, or access should be granted based on digital signatures.<br />Issues such as trust, spam, phishing and verified semantic web statements are under discussion for RDFa and beyond. A group of enthusiasts discusses these concepts at http://rdfa.info/wiki/Security-and-trust .<br />Regarding the SPARQL endpoint, w3.org identifies denial-of-service attacks, against the endpoint itself or against other systems, as a security concern. <br />Error messages for query syntax errors could reveal sensitive data; this can be avoided by filtering the output returned to the user who issued the query and presenting a standard error page instead.<br />Future <br />The application could allow users to further annotate their profiles themselves with linked resources. Users could link their faculty account with their online identity, using OpenID for example. 
Also, the FOAF vocabulary could be used to link users who belong to the same groups or who are related to each other, using information from the social networking features of the SharePoint portal at https://portal.info.uaic.ro <br />References<br />http://razor.occams.info/code/semweb/<br />http://www.asp.net/mvc/<br />http://esw.w3.org/topic/SparqlEndpointDescription<br />http://dublincore.org/2008/01/14/dcelements.rdf<br />http://www.foaf-project.org/docs/specs<br />http://profs.info.uaic.ro/~busaco/teach/courses/wade/web-film.html<br />http://www.w3.org/TR/rdf-sparql-query/<br />http://www.ldodds.com/projects/twinkle/<br />http://www.w3.org/2009/sparql/wiki/Feature:Query_by_reference#Security_Issues<br />http://www4.wiwiss.fu-berlin.de/bizer/pub/LinkedDataTutorial/#ExampleHTTP<br />http://semanticweb.org/wiki/SPARQL_endpoint<br />