Semantic Technologies to Support the User-Centric Analysis of Activity Data
 


Presentation at the Social Data on the Web workshop 2011

Presentation Transcript

  • Semantic Technologies to Support the User-Centric Analysis of Activity Data
    Mathieu d’Aquin, Salman Elahi, Enrico Motta
    Knowledge Media Institute, The Open University
  • Consumer/user centric data
  • Activity Data (diagram): an Event (Trace) realizes an Action by an Actor on a Resource
  • Usual Web Analytics (diagram): a Set of Events (Traces) realizes Actions by Groups of Actors on a Resource
  • User-centric Activity Data Analysis (diagram): a Set of Events (Traces) realizes Actions by an Actor on a Set of Resources
  • Challenges in user centric activity data
    • Activity data that sit in logs are
      – Heterogeneous – different models for different sites/systems
      – Raw – uninterpreted
      – Horribly big – thousands of pieces of information generated every minute
      – Hard to exploit, understand, analyze
  • User Centric Activity Data (diagram): the organisation holds Logs 1 to 4 from Websites 1 to 4 visited by users; the logs go through consolidation, integration and interpretation using ontologies, leading to activity analysis for and by individual users
  • User support
    Login dialog: User name: mathieu, Password: ******
    Prompt for an unknown, non-ambiguous setting: "It is the first time you log into UCIAD with this setting (detail). Do you want to attach it to your account?"
    Detail: "Your current setting is: Computer IP: 137.108.2x.1xx; User Agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US) AppleWebKit/525.13 (KHTML, like Gecko) Chrome/0.A.B.C Safari/525.13. This setting is not currently attached to a user, so it will be added to your known settings as you log into the system."
    Flowchart: User logging in or registering → Detect setting (agent+IP) → Check setting (unknown and non-ambiguous, known setting for user, or ambiguous) → yes: Add setting to known settings; no: Register setting as ambiguous → Display Activity Data related to all known settings of the user
    SPARQL query shown on the slide:
        PREFIX tr: <http://uciad.info/ontology/trace/>
        PREFIX actor: <http://uciad.info/ontology/actor/>
        construct {
          ?trace ?p ?x.
          ?x ?p2 ?x2.
          ?x2 ?p3 ?x3.
          ?x3 ?p4 ?x4
        } where {
          <http://uciad.info/actor/mathieu> actor:knownSetting ?set.
          ?trace tr:hasSetting ?set.
          ?trace ?p ?x.
          ?x ?p2 ?x2.
          ?x2 ?p3 ?x3.
          ?x3 ?p4 ?x4
        }
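The CONSTRUCT query on this slide follows each trace through four levels of linked resources. A minimal sketch of how such a query could be generated for any user and depth is given here; the function name `build_trace_query`, the `depth` parameter and the numbered variable names (`?p1`, `?x1`, ...) are illustrative assumptions, not part of the UCIAD code base.

```python
def build_trace_query(user_iri: str, depth: int = 4) -> str:
    """Build a CONSTRUCT query returning a user's traces and linked resources.

    Hypothetical helper: only the prefixes and the two properties
    actor:knownSetting and tr:hasSetting are taken from the slide.
    """
    if depth < 1:
        raise ValueError("depth must be at least 1")
    # Chain of triple patterns: ?trace ?p1 ?x1 . ?x1 ?p2 ?x2 . and so on.
    nodes = ["?trace"] + [f"?x{i}" for i in range(1, depth + 1)]
    hops = [f"{nodes[i]} ?p{i + 1} {nodes[i + 1]} ." for i in range(depth)]
    chain = "\n  ".join(hops)
    return (
        "PREFIX tr: <http://uciad.info/ontology/trace/>\n"
        "PREFIX actor: <http://uciad.info/ontology/actor/>\n"
        "CONSTRUCT {\n  " + chain + "\n}\n"
        "WHERE {\n"
        f"  <{user_iri}> actor:knownSetting ?set .\n"
        "  ?trace tr:hasSetting ?set .\n  " + chain + "\n}"
    )


if __name__ == "__main__":
    print(build_trace_query("http://uciad.info/actor/mathieu"))
```

With the default depth of 4 the generated patterns mirror the ones on the slide, up to variable naming.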
  • User support: Export my data (for graph http://uciad.info/users/mathieu), shown over the user support flowchart
    <rdf:RDF>
      <rdf:Description rdf:about="http://uciad.info/trace/kmi-web13/ede2ab38da27695eec1e0b375f9b20da">
        <rdf:type rdf:resource="http://uciad.info/ontology/trace/Trace"/>
        <hasAction rdf:resource="http://uciad.info/action/GET"/>
        <hasPageInvolved rdf:resource="http://uciad.info/page/0b9abc62fcf90afc53797b938af435dd"/>
        <hasResponse rdf:resource="http://uciad.info/response/ea95add1414aba134ff9e0482b921a33"/>
        <hasSetting rdf:resource="http://uciad.info/actorsetting/119696ec92c5acec29397dc7ef98817f"/>
        <hasTime rdf:datatype="http://www.w3.org/2001/XMLSchema#string">13/Jun/2011:01:37:23+0100</hasTime>
      </rdf:Description>
    </rdf:RDF>
    <rdf:Description rdf:about="http://uciad.info/page/0b9abc62fcf90afc53797b938af435dd">
      <rdf:type rdf:resource="http://uciad.info/ontology/sitemap/WebPage"/>
      <isPartOf rdf:resource="http://uciad.info/ontology/test1/dataopenacuk"/>
      <onServer rdf:resource="http://kmi-web13.open.ac.uk"/>
      <url rdf:datatype="http://www.w3.org/2001/XMLSchema#string">/resource/person/ext-718a372e10788bb58d562a8bf6fb864e</url>
    </rdf:Description>
    <rdf:Description rdf:about="http://uciad.info/ontology/test1/dataopenacuk">
      <rdf:type rdf:resource="http://uciad.info/ontology/sitemap/Website"/>
      <rdf:type rdf:resource="http://uciad.info/ontology/test1/LinkedDataPlatform"/>
      <onServer rdf:resource="http://kmi-web13.open.ac.uk"/>
      <urlPattern rdf:datatype="http://www.w3.org/2001/XMLSchema#string">/*</urlPattern>
    </rdf:Description>
    <rdf:Description rdf:about="http://uciad.info/response/ea95add1414aba134ff9e0482b921a33">
      <rdf:type rdf:resource="http://uciad.info/ontology/trace/HTTPResponse"/>
      <hasResponseCode rdf:resource="http://uciad.info/ontology/trace/200"/>
      <hasSizeInBytes rdf:datatype="http://www.w3.org/2001/XMLSchema#int">1085</hasSizeInBytes>
    </rdf:Description>
  • Technical infrastructure (diagram): on Server1, Server2 and Server3, each web server log and application log is processed by a Parser/RDF renderer into daily RDF traces; a Scheduler/Manager collects the daily RDF traces into a Semantic Triple Store
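To illustrate the Parser/RDF renderer step, here is a rough sketch that turns one access-log line into trace triples. It assumes the Apache combined log format and MD5-based identifiers (the identifiers on the export slide are 32 hexadecimal characters, but the actual hashing scheme is not stated). Apart from `hasSetting`, which the earlier query places in the trace namespace, the property namespaces are also assumptions.

```python
import hashlib
import re

RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
TRACE_NS = "http://uciad.info/ontology/trace/"

# Apache combined log format (an assumption about the input logs).
LOG_LINE = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) [^"]*" '
    r'(?P<status>\d{3}) (?P<size>\d+|-) "[^"]*" "(?P<agent>[^"]*)"'
)


def _digest(*parts):
    """Hypothetical identifier scheme: MD5 over the joined parts."""
    return hashlib.md5("|".join(parts).encode("utf-8")).hexdigest()


def render_trace(line, server):
    """Return (subject, predicate, object) triples for one log line."""
    m = LOG_LINE.match(line)
    if m is None:
        raise ValueError("not a combined-format log line")
    trace = "http://uciad.info/trace/%s/%s" % (server, _digest(line))
    page = "http://uciad.info/page/" + _digest(server, m["path"])
    setting = "http://uciad.info/actorsetting/" + _digest(m["ip"], m["agent"])
    return [
        (trace, RDF_TYPE, TRACE_NS + "Trace"),
        (trace, TRACE_NS + "hasAction", "http://uciad.info/action/" + m["method"]),
        (trace, TRACE_NS + "hasPageInvolved", page),
        (trace, TRACE_NS + "hasSetting", setting),
        (trace, TRACE_NS + "hasTime", m["time"]),
    ]
```

Because the setting identifier depends only on the agent and IP address, every trace produced from the same browser on the same machine points at the same setting resource, which is what lets traces be grouped per user later on.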
  • Ontologies
    Formal conceptual models of a domain: online user activity
    Key Concepts:
      – Actor: the things accessing resources (through agents)
      – Resources: Webpages, Websites
      – Actions: realized by actors on resources, e.g., requests
      – Events: an actor realizing an action on a resource
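The four key concepts can be pictured as plain data types. This is a simplified sketch for illustration only: the real model is an ontology, and the field names used here (`name`, `agent`, `url`, `kind`, `verb`, `time`) are assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Actor:
    name: str
    agent: str  # actors access resources through agents, e.g. a browser


@dataclass(frozen=True)
class Resource:
    url: str
    kind: str  # "WebPage" or "Website"


@dataclass(frozen=True)
class Action:
    verb: str  # e.g. an HTTP request method
    actor: Actor
    resource: Resource


@dataclass(frozen=True)
class Event:
    action: Action  # an event is an actor realizing an action on a resource
    time: str


def events_of(actor_name, events):
    """User-centric view: keep only the events realized by one actor."""
    return [e for e in events if e.action.actor.name == actor_name]
```

Filtering by actor rather than by resource is the shift from usual web analytics to the user-centric analysis described earlier.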
  • Ontologies
  • User support (flowchart)
    User logging in or registering → Detect setting (agent+IP) → Check setting
    Outcomes of the check: unknown and non-ambiguous setting, known setting for user, or ambiguous setting
    Prompt for an unknown, non-ambiguous setting: "It is the first time you log into UCIAD with this setting (detail). Do you want to attach it to your account?"
      – yes: Add setting to known settings
      – no: Register setting as ambiguous
    Display Activity Data related to all known settings of the user
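The flowchart can be read as a small decision procedure. The sketch below is one possible reading, not the UCIAD implementation: it assumes a setting is an (agent, IP) pair and that a setting already attached to a different user counts as ambiguous. The names `check_setting` and `resolve_unknown` are hypothetical.

```python
def check_setting(user, setting, known, ambiguous):
    """Classify an (agent, IP) setting for a user logging in.

    known maps each user name to the set of settings attached to them;
    ambiguous is the set of settings registered as ambiguous.
    """
    if setting in ambiguous:
        return "ambiguous"
    if setting in known.get(user, set()):
        return "known"
    # Assumption: a setting attached to another user is ambiguous.
    if any(setting in s for u, s in known.items() if u != user):
        return "ambiguous"
    return "unknown"


def resolve_unknown(user, setting, known, ambiguous, attach):
    """Apply the user's yes/no answer for an unknown setting."""
    if attach:
        known.setdefault(user, set()).add(setting)
    else:
        ambiguous.add(setting)


def visible_settings(user, known):
    """Activity data is displayed for all known settings of the user."""
    return set(known.get(user, set()))
```

Ambiguous settings are deliberately left out of `visible_settings`, so traces produced on a shared machine are not shown to any single user.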
  • Authenticated SPARQL (diagram): standard interface with SPARQL authentication
    1. The client sends to the protected SPARQL endpoint, over HTTP + basic auth, the query: Select ?x where {?x a uciad:Website}, with credentials User=mathieu, Pass=mypass
    2. The protected endpoint looks up the access right info (User->graphs): mathieu? → matgraph, onto
    3. The query forwarded to the SPARQL endpoint is: Select ?x From matgraph, onto where {?x a uciad:Website}
    4. The results are returned to the client
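The rewriting step performed by the protected endpoint can be sketched as follows. This is a naive illustration under stated assumptions: the graph IRIs are placeholders under example.org, keywords are located with a regular expression rather than a real SPARQL parser, and standard SPARQL needs one `FROM <iri>` clause per graph rather than the slide's shorthand `From matgraph, onto`.

```python
import re

# Hypothetical access-right table (user -> readable graphs).
ACCESS_RIGHTS = {
    "mathieu": [
        "http://example.org/graph/matgraph",
        "http://example.org/graph/onto",
    ],
}


def restrict_query(query, user, rights=ACCESS_RIGHTS):
    """Insert FROM clauses so the query only sees the user's graphs."""
    graphs = rights.get(user)
    if not graphs:
        raise PermissionError("no readable graph for user %r" % user)
    # Naive keyword checks; a real endpoint should parse the query.
    if re.search(r"\bfrom\b", query, flags=re.IGNORECASE):
        raise ValueError("queries with their own FROM clause are rejected")
    match = re.search(r"\bwhere\b", query, flags=re.IGNORECASE)
    if match is None:
        raise ValueError("expected a WHERE keyword")
    head = query[:match.start()]
    tail = query[match.start():]
    from_clauses = " ".join("FROM <%s>" % g for g in graphs)
    return head + from_clauses + " " + tail
```

Rejecting queries that already carry a FROM clause keeps a client from naming a graph it has no rights on; checking the HTTP basic auth credentials is assumed to happen before this function is called.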
  • Customizing the Ontologies = Customizing the Analysis
    The User Activity Ontologies form the basis to describe generic activity data in a sharable way
    Customized extensions:
      – Specific categories of resources, actions and events
      – Formally defined to allow inference
    Create customized aggregations, classifications and distributions in the data that allow for specific analyses
    Diagram: Base Activity Ontologies and User Activity Data → Inference → Specific Classifications, Distributions, Aggregations…
  • Examples
    In the ontology:
      1. vhs-wiki is a Wiki
      2. Data.open.ac.uk is a DataPlatform
      3. Actions on a Page which is part-of a Wiki are called usingWiki
      4. Similarly for usingDataPlatform
    And…
      1. Activities usingAWiki with a user-agent which is an RSS-Reader are checkingWikiUpdatesWithRSS
      2. Otherwise, they are usingWikiThroughBrowser
    Figure labels: Sub-classes of usingWiki; Pages involved in usingWikiThroughBrowser
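The effect of these definitions can be imitated with ordinary code. In UCIAD the classification comes from ontology definitions and inference, so the sketch below only mimics the outcome; the function names and the `agent_type` labels are assumptions.

```python
from collections import Counter

WIKIS = {"vhs-wiki"}  # rule 1: vhs-wiki is a Wiki


def classify_wiki_activity(website, agent_type):
    """Return the activity classes inferred for an action on a page."""
    if website not in WIKIS:
        return []
    classes = ["usingWiki"]  # rule 3: the page is part of a Wiki
    if agent_type == "RSS-Reader":
        classes.append("checkingWikiUpdatesWithRSS")
    else:
        classes.append("usingWikiThroughBrowser")
    return classes


def subclass_distribution(actions):
    """Count the actions per sub-class of usingWiki."""
    counts = Counter()
    for website, agent_type in actions:
        for cls in classify_wiki_activity(website, agent_type):
            if cls != "usingWiki":
                counts[cls] += 1
    return counts
```

The resulting counts correspond to the kind of sub-class distribution shown in the slide's figure.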
  • Examples
    In the Ontology:
      1. The page http://data.open.ac.uk/query is a SPARQLEndpoint
      2. An action on a SPARQLEndpoint with a query parameter is ExecutingASparqlQuery
      3. Pages of the form http://data.open.ac.uk/page/* are DataPages
      4. An action on a DataPage with a BrowserAgent is ConsultingADataPage
    Figure labels: Sub-classes of usingDataPlatform; Settings used in executingASparqlQuery (the most used is curl on the user’s laptop); Sub-classes of DataPages consulted by the user
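The URL-based definitions for the data platform could be approximated as below. The function name and the fallback to the general class usingDataPlatform are illustrative assumptions, and pattern matching with `fnmatch` stands in for the ontology-based definition of DataPages.

```python
from fnmatch import fnmatch
from urllib.parse import parse_qs, urlparse

SPARQL_ENDPOINT = "http://data.open.ac.uk/query"
DATA_PAGE_PATTERN = "http://data.open.ac.uk/page/*"


def classify_data_platform_action(url, agent_type):
    """Return the most specific activity class for an action, or None."""
    parsed = urlparse(url)
    if parsed.netloc != "data.open.ac.uk":
        return None  # not an action on the data platform
    base = "%s://%s%s" % (parsed.scheme, parsed.netloc, parsed.path)
    # Rules 1 and 2: SPARQL endpoint called with a query parameter.
    if base == SPARQL_ENDPOINT and "query" in parse_qs(parsed.query):
        return "ExecutingASparqlQuery"
    # Rules 3 and 4: a data page consulted with a browser agent.
    if fnmatch(base, DATA_PAGE_PATTERN) and agent_type == "BrowserAgent":
        return "ConsultingADataPage"
    return "usingDataPlatform"
```

A request such as `http://data.open.ac.uk/query?query=...` issued by curl falls under ExecutingASparqlQuery regardless of the agent, in line with the slide's remark that curl is the most used setting for that class.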
  • Browsing Interface: LDI
    Screenshot labels: Class; Sub-classes with distribution of instances; Properties with distribution of values; List of members (instances); Details of a member (instance)
  • Conclusion
    • The idea of the UCIAD project was to investigate and experiment with the use of semantic technologies for the user centric integration of activity data
    • Demonstrated the value of the approach, as well as current technical limitations:
      – Scalability
      – Flexible Access-control
      – Usability
  • Future Work/Next Steps
    • User studies: what can people do with their activity data? In which form?
    • Scenarios for user centric activity data
      – Project Danube, Higgins, Mydex, personal.com, … with semantics?
    • Licensing User Data?
  • Future Work/Next Steps
    Our previous work on using a local proxy to collect information on user generated Web traffic… and linking this information to web resources… to create online personal information/personal analytics interfaces…
    Poster: Personal Monitoring of Web Information Exchange: Towards Web Lifelogging. Mathieu d’Aquin, Salman Elahi and Enrico Motta – m.daquin@open.ac.uk
    With more and more services relying on the Web to communicate with their users, the amount of information exchanged daily by an individual through various Web channels has become difficult to control. While in principle this gives better possibilities to share and exchange information with various people and organizations, it also makes it more difficult for Web users to fully comprehend, explore and exploit exchanges of their own data. We developed a Web lifelogger, dedicated to tracking every exchange realized over the Web by an individual Web user, and to storing these logs using semantic technologies. We ran an experiment using such a tool for a period of 2.5 months for a particular user. The collected data (100M triples) can be used by the user to monitor and study his own online behavior, based in particular on basic analytics, on models of the perceived trust relationship this user has with different websites, and on what can be learnt from analyzing the use of Web search engines.
    Basic Analytics
      – Number of requests per hour of the day (sum). Helps identify events appearing on a typical day.
      – Map of the locations of the servers where requests have been sent. Helps identify the physical space of Web interactions.
      – Cloud of the most commonly accessed websites. Shows the impact of ‘implicit’ requests.
      – 49 different tools accessing the Web (User-Agent) can be identified, including Web browsers, twitter clients, e-mail clients, update utilities, social applications, etc.
    Trust in Domains and Criticality of Data
    A simple iterative model is defined to compute the perceived trust in websites (top), and the perceived criticality of personal data (bottom), based on observing the exchange of this data. The simple intuition on which we rely is that a trusted website receives critical data, and that critical data is shared only with a few trusted websites. Exposing this model to the user in an interactive way can help align the perceived behavior with the intended one, and detect possible conflicts between data exchange and personal privacy rules.
    Analyzing Search History
    Web search history is known to provide interesting indications of the user’s interests. Using OpenCalais SemanticProxy (http://semanticproxy.opencalais.com/), we detect general themes from the analysis of search keywords, directly pointing to additional resources. Also, we see patterns emerging from the use of search engines, in terms of navigational and informational searches.
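The poster does not give the formulas of the iterative trust and criticality model. The sketch below is therefore only one way to encode the stated intuition, not the authors' model: it assumes that a website's trust is the sum of the criticality of the data it receives, and that a data item's criticality is the mean trust of its receivers divided by their number, with both scores rescaled to a maximum of 1 at each round.

```python
from collections import defaultdict


def _normalize(scores):
    """Rescale scores so that the largest one equals 1."""
    top = max(scores.values())
    return {key: value / top for key, value in scores.items()}


def trust_and_criticality(exchanges, iterations=20):
    """Estimate website trust and data criticality from observed exchanges.

    exchanges is an iterable of (data_item, website) pairs. The update
    rules are illustrative assumptions, not the published model.
    """
    exchanges = list(exchanges)
    if not exchanges:
        return {}, {}
    sites_of = defaultdict(set)  # data item -> websites receiving it
    data_of = defaultdict(set)   # website -> data items received
    for item, site in exchanges:
        sites_of[item].add(site)
        data_of[site].add(item)
    trust = {site: 1.0 for site in data_of}
    crit = {item: 1.0 for item in sites_of}
    for _ in range(iterations):
        # Critical data is shared only with a few trusted websites.
        for item, sites in sites_of.items():
            mean_trust = sum(trust[s] for s in sites) / len(sites)
            crit[item] = mean_trust / len(sites)
        crit = _normalize(crit)
        # A trusted website receives critical data.
        for site, items in data_of.items():
            trust[site] = sum(crit[i] for i in items)
        trust = _normalize(trust)
    return trust, crit
```

For instance, if a password is sent to a single site while an e-mail address is sent to three, the password comes out as more critical and its only receiver as the most trusted site.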
  • Mock-up of a personal analytics interface, Friday 14th October 2011: panels for Resources, Sites, Time (number of requests, by hour, by week, by month), Entities (People, Organizations, Locations/Places, Other Keywords), Languages, Profile, Graph View and Filters
  • Same mock-up with an entity selected: Enrico Motta, Professor at the Knowledge Media Institute; relation to you: Colleague, Friend, Line Manager
  • More info
    UCIAD Blog: http://uciad.info
    Code base: http://github.com/uciad
    Twitter: #uciad @mdaquin