Semantic User Profiles
Ancuta Ionel, Sorin Alexandru Damian
Abstract. This paper describes how the accompanying application works: the
premises from which the authors started, the techniques they applied, and the
results obtained in specifying, from a semantic point of view, the profile of a
user from the Faculty of Computer Science Iasi. It also describes security
issues that can appear and how they could be overcome. A summary of programming
techniques is presented and applied in the case study, the application that
completes the paper.
Keywords: SPARQL endpoint, RDFa, DCMI, FOAF, semantic web
1 Introduction
Exposing profile information for all the Faculty of Computer Science accounts in a
machine-readable format like RDF/XML requires gathering information from
multiple systems.
The application is based on information already available to all faculty members. The
first data source is the /etc/passwd file on the students' server, which offers basic
information about each account on the server. Detailed information regarding the
LDAP accounts is not directly available, so it was gathered using the
People Search application at http://students.info.uaic.ro/people . This application
provides both HTML and XML output. The XML was parsed and cached locally in order to
build a triple store that can be queried with SPARQL.
2 Implementation
The application that exposes the SPARQL endpoint and the RDFa annotated profile
information is built using ASP.NET MVC for the front end and uses the SemWeb
library to provide a triple store, RDF/XML serialization and the SPARQL query
engine.
Input data is gathered by parsing a classic Unix passwd file to get all the user accounts
and by crawling the HTML or the XML from the People Search application at
http://students.info.uaic.ro/people .
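The passwd parsing step could look roughly like the sketch below. It is written in Python for brevity and is not the code of the application itself, which is built on ASP.NET MVC. The Account type, the parse_passwd function and the sample lines are hypothetical names and made-up data introduced only for illustration.

```python
from typing import Iterator, NamedTuple


class Account(NamedTuple):
    username: str
    uid: int
    full_name: str
    home: str


def parse_passwd(text: str) -> Iterator[Account]:
    """Yield one Account per well-formed passwd line, skipping the rest."""
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        fields = line.split(":")
        # passwd format: name:password:UID:GID:GECOS:home:shell
        if len(fields) != 7 or not fields[2].isdigit():
            continue
        name, _password, uid, _gid, gecos, home, _shell = fields
        # The GECOS field may hold comma-separated extras; keep the first part.
        yield Account(name, int(uid), gecos.split(",")[0], home)


# Hypothetical sample data, not taken from the real server.
SAMPLE = """\
# comment line
jane.doe:x:1001:1001:Jane Doe,Room 1:/home/jane.doe:/bin/bash
broken line without enough fields
john.roe:x:1002:1001:John Roe:/home/john.roe:/bin/sh
"""

accounts = list(parse_passwd(SAMPLE))
```

Malformed lines are skipped rather than raising an error, on the assumption that a crawler should tolerate a few bad entries instead of aborting the whole import.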
All profile information is linked to the user identity using RDF triples and stored in
a local database for later queries. Suppose that the profile of the user “sorin.damian”
is crawled and needs to be stored. For each profile property we generate and store
a triple like:
<http://profiles.sorindamian.ro/people/rdf/sorin.damian> foaf:name "Damian T. Sorin-Alexandru"
The relation to the original user profile exposed by the People Search application at
http://students.info.uaic.ro/people is maintained using the sameAs property from the
OWL vocabulary:
<http://profiles.sorindamian.ro/people/rdf/sorin.damian> owl:sameAs <http://students.info.uaic.ro/people/xml/uid/sorin.damian>
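The two statements above can be produced mechanically from a user id and a full name. The following Python sketch emits them in N-Triples syntax with the prefixes expanded. The function names are hypothetical, and the actual application stores its triples through the SemWeb library rather than as text lines.

```python
FOAF = "http://xmlns.com/foaf/0.1/"
OWL = "http://www.w3.org/2002/07/owl#"
# Base URIs taken from the examples in this paper.
PROFILE_BASE = "http://profiles.sorindamian.ro/people/rdf/"
SOURCE_BASE = "http://students.info.uaic.ro/people/xml/uid/"


def escape_literal(value: str) -> str:
    """Escape the characters that may not appear raw in an N-Triples literal."""
    return (
        value.replace("\\", "\\\\")
        .replace('"', '\\"')
        .replace("\n", "\\n")
        .replace("\r", "\\r")
    )


def profile_triples(uid: str, full_name: str) -> list[str]:
    """Return the foaf:name and owl:sameAs statements for one account."""
    subject = f"<{PROFILE_BASE}{uid}>"
    return [
        f'{subject} <{FOAF}name> "{escape_literal(full_name)}" .',
        f"{subject} <{OWL}sameAs> <{SOURCE_BASE}{uid}> .",
    ]


triples = profile_triples("sorin.damian", "Damian T. Sorin-Alexandru")
for triple in triples:
    print(triple)
```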
The implementation of the profiles crawler and of the profiles endpoint is decoupled
from the implementation of the site and automated unit tests can be used to check for
regressions.
2.1 Exposing the SPARQL endpoint
To expose the endpoint we created a controller whose action returns the SPARQL
query results in a human-readable HTML format or using RDF/XML notation,
depending on the content types in the “Accept” HTTP header sent by the requesting
agent.
When a request is made to the /sparql URL using a regular web browser,
the application responds with an HTML form that allows the user to write and submit a
query. The query results are also displayed in HTML by applying an XSL
transformation to the XML returned by the query.
RDF-capable agents can make requests to the SPARQL endpoint in the standard way,
which is by URL-encoding the query string and sending it
through the query parameter (e.g. /sparql?query=url_encoded_query).
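A client could build such a request URL as in the Python sketch below. The host name in ENDPOINT is an assumption (the paper gives only the /sparql path), and build_query_url is a hypothetical helper.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

# Assumed endpoint address: the profile host used in this paper plus /sparql.
ENDPOINT = "http://profiles.sorindamian.ro/sparql"

QUERY = """\
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
SELECT ?name WHERE {
  <http://profiles.sorindamian.ro/people/rdf/sorin.damian> foaf:name ?name
}"""


def build_query_url(endpoint: str, sparql: str) -> str:
    """Attach the URL-encoded query text as the 'query' parameter."""
    return f"{endpoint}?{urlencode({'query': sparql})}"


url = build_query_url(ENDPOINT, QUERY)
# Decoding the parameter again gives back the original query text.
decoded = parse_qs(urlsplit(url).query)["query"][0]
print(url)
```

An agent would then issue a GET request for this URL with an Accept header naming the result format it prefers.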
2.2 Content Negotiation
The application can serve the proper representation of the profiles to the users. If a
profile at a URL like <http://profiles.sorindamian.ro/people/rdf/ancuta.ionel> is
requested and the client cannot accept the RDF/XML content type, then a 303 redirect
is made to the human-readable resource URL
(http://profiles.sorindamian.ro/people/html/ancuta.ionel).
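The negotiation rule can be sketched in Python as follows. This is a simplified illustration, not the controller code of the application: it treats a client as accepting RDF/XML only when application/rdf+xml is listed explicitly with a non-zero q value, and it ignores wildcards such as */*, which is an assumption of the sketch. The function names are hypothetical.

```python
RDF_XML = "application/rdf+xml"
HTML_BASE = "http://profiles.sorindamian.ro/people/html/"


def accepts(accept_header: str, media_type: str) -> bool:
    """True if media_type is listed explicitly in the header with q > 0."""
    for item in accept_header.split(","):
        parts = [part.strip() for part in item.split(";")]
        if parts[0].lower() != media_type:
            continue
        quality = 1.0  # default weight when no q parameter is given
        for param in parts[1:]:
            name, _, value = param.partition("=")
            if name.strip().lower() == "q":
                try:
                    quality = float(value)
                except ValueError:
                    quality = 0.0
        return quality > 0
    return False


def respond_to_rdf_request(uid: str, accept_header: str) -> tuple[int, dict[str, str]]:
    """Return (status, headers) for a request to /people/rdf/<uid>."""
    if accepts(accept_header, RDF_XML):
        return 200, {"Content-Type": RDF_XML}
    # 303 See Other points the client at the human-readable page.
    return 303, {"Location": HTML_BASE + uid}


browser = respond_to_rdf_request("ancuta.ionel", "text/html,application/xhtml+xml")
agent = respond_to_rdf_request("ancuta.ionel", "application/rdf+xml;q=0.9, text/html;q=0.5")
```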
2.3 Annotating profile descriptions with RDFa
A human-readable HTML page is annotated with linked data using the RDFa
specification. A profile property like the full name of the user is linked to the user
entity with an XHTML fragment like:
<div class="display-field" about="http://profiles.sorindamian.ro/people/rdf/sorin.damian" property="foaf:name">Damian T. Sorin Alexandru</div>
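A fragment of this shape can be generated from the stored property values. The Python sketch below uses a hypothetical rdfa_field helper and escapes every inserted value so that profile data cannot break out of the markup. The view code of the application itself may differ.

```python
from html import escape


def rdfa_field(subject_uri: str, prop: str, value: str) -> str:
    """Render one profile property as an RDFa annotated div element."""
    return (
        f'<div class="display-field" about="{escape(subject_uri)}" '
        f'property="{escape(prop)}">{escape(value)}</div>'
    )


fragment = rdfa_field(
    "http://profiles.sorindamian.ro/people/rdf/sorin.damian",
    "foaf:name",
    "Damian T. Sorin Alexandru",
)
print(fragment)
```

The foaf prefix must still be declared on an enclosing element of the page, for example with an xmlns:foaf attribute, for RDFa parsers to resolve the property name.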
3 Usage
3.1 The SPARQL endpoint user interface
3.2 Query results formats
The results are returned in HTML format for regular web browsers and in XML format
for agents that accept the application/sparql-results+xml content type.
3.3 User profiles can be accessed in plain RDF or in RDFa annotated HTML
3.4 HTML profile page with embedded FOAF metadata using RDFa
3.5 RDF output automatically converted to human readable HTML
3.6 RDF profile displayed in Twinkle
4 Security considerations
Exposing personal profile information in both human-readable and machine-readable
formats raises additional privacy risks. Special care should be taken when exposing
sensitive information like email addresses and account names. Such detailed
information should be available at most to authenticated users or to requests
verified by digital signatures.
Issues like trust, spam, phishing, and verified semantic web statements are under
discussion for RDFa and for the semantic web in general. A community page dedicated
to these concepts is available at http://rdfa.info/wiki/Security-and-trust .
Regarding the SPARQL endpoint, the W3C identifies denial-of-service attacks, directed
at the endpoint itself or through it at other servers, as a security concern.
Error messages caused by syntax errors in a query could reveal sensitive data. This can
be avoided by intercepting such errors before they reach the user who issued the query
and presenting a standard error page instead.
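One way to apply this is to wrap the query execution, log the detailed error on the server and return only a fixed message. The Python sketch below is a minimal illustration under that assumption. The run_query_safely function, the broken_engine stand-in and the choice of status code 400 are hypothetical and do not describe the application's actual error handling.

```python
import logging
from typing import Callable

GENERIC_ERROR = "The query could not be processed."
log = logging.getLogger("sparql")


def run_query_safely(execute: Callable[[str], str], query: str) -> tuple[int, str]:
    """Run the query; on any failure, hide the details from the client."""
    try:
        return 200, execute(query)
    except Exception:
        # Full details (message and stack trace) stay in the server log only.
        log.exception("SPARQL query failed")
        return 400, GENERIC_ERROR


def broken_engine(query: str) -> str:
    # Stand-in for a query engine whose error message leaks internals.
    raise ValueError("parse error near line 1; store at /var/data/triples.db")


status, body = run_query_safely(broken_engine, "SELEC ?x WHERE { }")
```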
5 Future
The application could allow the users to further annotate the profile themselves with
linked resources. Users would link their faculty account with their online identity,
using OpenID for example. The FOAF vocabulary could also be used to associate
users from the same groups, or users that are related to each other, using information
from the social networking features of the SharePoint portal at
https://portal.info.uaic.ro .
6 References
http://razor.occams.info/code/semweb/
http://www.asp.net/mvc/
http://esw.w3.org/topic/SparqlEndpointDescription
http://dublincore.org/2008/01/14/dcelements.rdf
http://www.foaf-project.org/docs/specs
http://profs.info.uaic.ro/~busaco/teach/courses/wade/web-film.html
http://www.w3.org/TR/rdf-sparql-query/
http://www.ldodds.com/projects/twinkle/
http://www.w3.org/2009/sparql/wiki/Feature:Query_by_reference#Security_Issues
http://www4.wiwiss.fu-berlin.de/bizer/pub/LinkedDataTutorial/#ExampleHTTP
http://semanticweb.org/wiki/SPARQL_endpoint