User Centric Integration of Activity Data. Mathieu d’Aquin, Stuart Brown, Salman Elahi, Enrico Motta. The Open University
Agenda: Introduction of the team; objectives and hypothesis; overview of the technical realization; challenges; summary of results so far and dissemination
Team: Dr Mathieu d’Aquin – Research fellow, KMi – project director; Stuart Brown – Web developments and online communities, communication services – member of the steering group, liaison with online services; Salman Elahi – Research assistant and PhD student, KMi – developer/researcher; Prof Enrico Motta – Professor of knowledge technologies, KMi – chair of the steering group
Objectives and Hypothesis. Hypothesis: Taking a user centric point of view can allow different types of analysis of logs/activity data, which are valuable to the organisation and the user. Ontologies and ontology-based reasoning can support the integration, consolidation and interpretation of activity data from multiple sources.
Organisation Centric Activity Data. Analytics = aggregated stats. [Diagram: Logs 1 to 4, from Websites 1 to 4 visited by the Users, are consolidated into analytics for the Organisation.]
At the Open University: An analytics system building aggregated data from the university’s various websites. Based on manually defined sitemaps. Good for website optimization, marketing campaigns, etc. But because the data is pre-aggregated, it is limited in what it can do: limited control, no user view.
User Centric Activity Data: Activity analysis for and by individual users. [Diagram: Logs 1 to 4, from Websites 1 to 4, go through consolidation, integration and interpretation, supported by ontologies, for the Organisation and the Users.]
Ontologies: Formal conceptual models of a domain. Here, the domain is online user activity. At the basis of Semantic Web technologies: standard languages for expressing ontologies and ontological data (RDF, OWL); tools to manipulate and work with ontologies and semantic data (NeOn Toolkit, OWLIM); many ontologies to reuse (cf. Watson). Adhere to a logical formalism. Enable inferences on the data.
Objectives and Deliverables. Build the technical infrastructure that can hold traces of activity data as semantic data: includes a triple store with reasoning capability, log parsers for different log formats, and renderers producing semantic data (RDF). Build the ontologies to interpret and reason upon activity data: covering various aspects of activity data in a way which is extensible. Build tools to support users in analyzing their own activity data: recognize a user from his/her different settings and provide a view on his/her own data; allow him/her to customize the view by customizing the ontology. Test, validate, deploy, distribute.
Technical infrastructure. [Diagram: on Server1, Server2 and Server3, applications write logs; for each log, a parser/RDF renderer produces daily RDF traces; a scheduler/manager feeds these traces into the semantic triple store.]
Technical infrastructure: Development of parsers for different kinds of log formats. Currently handles Apache web server log files, parameterized from the Apache configuration. Easily extensible for dedicated log formats. Provides a common data structure, serialized in RDF by the RDF renderer. Each server produces a daily extract from the logs in RDF, which is used to populate the semantic triple store. The triple store includes multiple repositories and sub-spaces depending on time/user/server.
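The parse-then-render step can be sketched as follows. This is only an illustration, not the UCIAD code base: it assumes the Apache "combined" log format with a fixed pattern (the actual parsers are parameterized from the Apache configuration), and the namespace `http://example.org/uciad/` and the property names are hypothetical, not the project's ontology terms.

```python
import re
from datetime import datetime

# Fixed pattern for the Apache "combined" log format (an assumption; the
# real parsers are driven by the LogFormat found in the Apache configuration).
COMBINED = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) [^"]*" (?P<status>\d{3}) \S+ '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
)

EX = "http://example.org/uciad/"  # hypothetical namespace, for illustration only


def parse_line(line):
    """Turn one log line into the common data structure (a dict), or None."""
    match = COMBINED.match(line)
    if match is None:
        return None
    record = match.groupdict()
    # Normalize the Apache timestamp to ISO 8601.
    parsed = datetime.strptime(record["time"], "%d/%b/%Y:%H:%M:%S %z")
    record["time"] = parsed.isoformat()
    return record


def literal(value):
    """Render a string as an N-Triples literal, escaping backslashes and quotes."""
    escaped = value.replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"'


def to_ntriples(record, trace_id):
    """Render one parsed record as N-Triples lines, one per field."""
    subject = f"<{EX}trace/{trace_id}>"
    return [
        f"{subject} <{EX}{key}> {literal(value)} ."
        for key, value in sorted(record.items())
    ]
```

A daily extract would then be the concatenation of `to_ntriples(...)` over all lines of one day's log, ready to be loaded into a triple store.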
Ontologies. Key concepts to be represented: Actors (human users and robots); Sitemaps; Traces (broad notion of logs); Activities. Reusing existing ontologies: FOAF for people and documents; Time Ontology for traces; Action ontology for traces and activities; (Planned) OPO: online presence; (Planned) SIOC: online communities.
Iterative and extensible construction of the ontologies: Provide a base with actors, sitemaps and traces. Specific extensions with typologies of activities, depending on user and site. Dynamically building and integrating
Tool for analysis. Need a tool which, given a set of ontologies and a data repository (which can be the overall one, one restricted by time, or one for a given user), can provide a meaningful and interactive overview of the activity data. To be used to: provide an ontology-specific view of data analytics; support the iterative development of the ontologies; provide a user centric view of the data.
Tools for analysis
Example. In the ontology: /robots.txt is a RobotTXT page. A Spider is a RobotAgent (ActorAgent). An agent used to access a RobotTXT is a Spider. An AutomaticActivity is a Trace realized by a RobotAgent. Result: Thousands of traces automatically classified as automatic activities.
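The chain of definitions in this example can be mimicked with two plain rules over parsed traces. The sketch below is not how the reasoner works internally (the project relies on ontology-based inference in the triple store); it only illustrates the effect of the definitions. The dictionary keys `agent` and `path` and the constant `ROBOT_TXT_PATH` (set here to the standard location of the robots exclusion file) are assumptions made for illustration.

```python
ROBOT_TXT_PATH = "/robots.txt"  # assumed location of the RobotTXT page


def classify_automatic(traces):
    """Return (spider agents, traces classified as AutomaticActivity).

    `traces` is a list of dicts with at least the keys 'agent' and 'path'.
    """
    # Rule 1: an agent used to access a RobotTXT page is a Spider.
    spiders = {t["agent"] for t in traces if t["path"] == ROBOT_TXT_PATH}
    # Rule 2: a Spider is a RobotAgent, and a trace realized by a
    # RobotAgent is an AutomaticActivity, whatever page it touches.
    automatic = [t for t in traces if t["agent"] in spiders]
    return spiders, automatic
```

Note that a single access to the robots file is enough to mark every other trace of the same agent as automatic, which is how a small number of definitions can classify a large number of traces.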
Example. In the ontology: UCIAD-Blog and LUCERO-Blog are Blogs (Website). A BlogPage is a page which is part of a Blog. An activity onBlog is an activity happening on a BlogPage. Result: Can look specifically at activities happening on a Blog and specialize them (same applies to Wikis, and other types of websites).
Example. In the ontology: A SPARQLEndpoint is a specific type of Webpage. AccessingSPARQLEndpoint is an activity on a SPARQLEndpoint. SPARQLQueryParameter is a parameter with the name “query” used in an AccessingSPARQLEndpoint activity. ExecutingSPARQLQuery is an AccessingSPARQLEndpoint activity attached to a SPARQLQueryParameter. Result: Can explore the specific activity of executing SPARQL queries and its parameters. Can combine: detect the activity of automatically accessing a SPARQL endpoint, i.e. an activity that is both an automatic activity and an access to a SPARQL endpoint.
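As a rough illustration of these definitions, the sketch below assigns activity classes to a single request URL. It is not the project's implementation: the set `SPARQL_ENDPOINT_PATHS` and the `is_robot` flag are hypothetical stand-ins for what the ontology and the reasoner would provide.

```python
from urllib.parse import parse_qs, urlsplit

SPARQL_ENDPOINT_PATHS = {"/sparql"}  # hypothetical list of SPARQLEndpoint pages


def classify_request(url, is_robot=False):
    """Return the set of activity classes that apply to one requested URL."""
    parts = urlsplit(url)
    classes = set()
    if parts.path in SPARQL_ENDPOINT_PATHS:
        classes.add("AccessingSPARQLEndpoint")
        # A parameter named "query" makes it an ExecutingSPARQLQuery activity.
        if "query" in parse_qs(parts.query):
            classes.add("ExecutingSPARQLQuery")
    if is_robot:
        classes.add("AutomaticActivity")
    return classes


def sparql_query_of(url):
    """Return the decoded value of the "query" parameter, or None."""
    return parse_qs(urlsplit(url).query).get("query", [None])[0]
```

The combined activity mentioned above corresponds to a result containing both `AutomaticActivity` and `AccessingSPARQLEndpoint`.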
Next step: User support. Allow users to log in, have their setting detected, bring up the relevant data, and explore it. But also: to customize the view of the data; to extend the ontologies to provide a personalized analysis of activity data; to export (interpreted) activity data for reuse.
User support. [Flowchart: The user logs in or registers. Detect the setting (agent + IP). Known setting for the user: display activity data related to all known settings of the user. Unknown setting: ask “It is the first time you log into UCIAD with this setting (detail), do you want to attach it to your account?” (yes/no). Check the setting: if non-ambiguous, add the setting to the known settings; if ambiguous, register the setting as ambiguous.]
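One possible reading of this flow is sketched below. The slides do not define when a setting counts as ambiguous; the sketch assumes it is ambiguous when another user has already attached it, and the class and method names are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class SettingRegistry:
    """Tracks which (agent, IP) settings are attached to which user."""

    known: dict = field(default_factory=dict)    # user -> set of settings
    ambiguous: set = field(default_factory=set)  # settings claimed by several users

    def login(self, user, setting, attach):
        """Handle one login; `attach` is the user's answer to the prompt."""
        mine = self.known.setdefault(user, set())
        if setting in mine:
            return "known"      # display data for all known settings
        if not attach:
            return "declined"   # user does not want this setting attached
        # Assumption: a setting already attached to someone else is ambiguous.
        others = {u for u, s in self.known.items() if u != user and setting in s}
        if others or setting in self.ambiguous:
            self.ambiguous.add(setting)
            return "ambiguous"
        mine.add(setting)
        return "added"
```

For example, on a shared computer the second user who tries to attach the same agent and IP gets `"ambiguous"`, and the setting is not added to that user's known settings.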
User support: data for a user. For a user <u>, the SPARQL query CONSTRUCT { ?trace ?p ?y . ?y ?q ?z } WHERE { <u> actor:hasKnownSetting ?s . ?trace trace:hasSetting ?s . ?trace ?p ?y . ?y ?q ?z } builds the traces of activities around the known settings of <u>. Used to populate a specific repository with sub-spaces for each registered user.
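To make the intent of this query concrete, here is a small in-memory sketch over triples represented as Python tuples. It follows one reading of the query: collect every trace linked to a known setting of the user, its direct properties, and the description of the objects those properties point to. Unlike a strict CONSTRUCT, it keeps a trace property even when its object has no further description. The prefixed names are used as plain strings, and the function name is hypothetical.

```python
def construct_user_traces(triples, user):
    """Extract the sub-graph of traces around the known settings of `user`.

    `triples` is a set of (subject, predicate, object) string tuples.
    """
    # <u> actor:hasKnownSetting ?s
    settings = {
        o for s, p, o in triples
        if s == user and p == "actor:hasKnownSetting"
    }
    # ?trace trace:hasSetting ?s
    traces = {
        s for s, p, o in triples
        if p == "trace:hasSetting" and o in settings
    }
    result = set()
    for s, p, o in triples:
        if s in traces:
            result.add((s, p, o))                           # ?trace ?p ?y
            result.update(t for t in triples if t[0] == o)  # ?y ?q ?z
    return result
```

The returned set is what would be loaded into the sub-space of the repository reserved for that user.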
Deployment, test, validation. At the moment, testing on websites of projects and events hosted on KMi servers: sssw.org, sssw09.org, loted.eu, lucero-project.info, uciad.info, data.open.ac.uk, lucero.open.ac.uk, … Next level up, websites/systems from the main Open University website: www.open.ac.uk, study at the OU, podcasts.open.ac.uk, VLE. Extend to deployment of instances for specific projects with distributed websites.
Challenges. Scalability: the OWLIM triple store can handle billions of triples, but struggles with millions when inference is “on”. 1 repository without inference with all historical data, 1 with inference with 1 week of data only, and 1 with inference for registered users. User management and privacy: ensuring that the user who logs in from a particular setting is the one having the activity is difficult (e.g., in the case of shared computers). Is this really a problem? Check ambiguity, ask verification questions, moderate? Distribution and IPR: code and ontologies under open licenses (small uncertainty regarding code developed in other projects). Overall data: privacy issues (is k-anonymity actually applicable? Would it work?). Overall data: institutional issues (can we show the traffic on our websites to everybody?). User data export: what license?
Summary and dissemination. Promising initial results: can create new ways of analysis at run-time by editing the ontologies! Mechanisms to provide personal views on own activity data across websites. First version of the ontologies: ongoing task. First version of the tools: test and validate! Dissemination: Blog / Twitter #uciad; KMi’s internal newsletter (KMi Planet); Salman’s paper at the ESWC 2011 PhD symposium: “Personal Semantics: Personal information management in the Web with Semantic Technologies”; position paper at the W3C Web tracking and privacy workshop: “Self-Tracking on the Web: Why and How”; submission to the Personal Semantic Data workshop at K-CAP 2011.
More info. UCIAD Blog: http://uciad.info; Code base: http://github.com/uciad; Twitter: #uciad, @mdaquin

 
From Knowledge Bases to Knowledge Infrastructures for Intelligent Systems
From Knowledge Bases to Knowledge Infrastructures for Intelligent SystemsFrom Knowledge Bases to Knowledge Infrastructures for Intelligent Systems
From Knowledge Bases to Knowledge Infrastructures for Intelligent Systems
 
Data analytics beyond data processing and how it affects Industry 4.0
Data analytics beyond data processing and how it affects Industry 4.0Data analytics beyond data processing and how it affects Industry 4.0
Data analytics beyond data processing and how it affects Industry 4.0
 

Recently uploaded

UiPath Test Automation using UiPath Test Suite series, part 3
UiPath Test Automation using UiPath Test Suite series, part 3UiPath Test Automation using UiPath Test Suite series, part 3
UiPath Test Automation using UiPath Test Suite series, part 3
DianaGray10
 
GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...
GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...
GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...
Sri Ambati
 
From Siloed Products to Connected Ecosystem: Building a Sustainable and Scala...
From Siloed Products to Connected Ecosystem: Building a Sustainable and Scala...From Siloed Products to Connected Ecosystem: Building a Sustainable and Scala...
From Siloed Products to Connected Ecosystem: Building a Sustainable and Scala...
Product School
 
State of ICS and IoT Cyber Threat Landscape Report 2024 preview
State of ICS and IoT Cyber Threat Landscape Report 2024 previewState of ICS and IoT Cyber Threat Landscape Report 2024 preview
State of ICS and IoT Cyber Threat Landscape Report 2024 preview
Prayukth K V
 
The Future of Platform Engineering
The Future of Platform EngineeringThe Future of Platform Engineering
The Future of Platform Engineering
Jemma Hussein Allen
 
De-mystifying Zero to One: Design Informed Techniques for Greenfield Innovati...
De-mystifying Zero to One: Design Informed Techniques for Greenfield Innovati...De-mystifying Zero to One: Design Informed Techniques for Greenfield Innovati...
De-mystifying Zero to One: Design Informed Techniques for Greenfield Innovati...
Product School
 
JMeter webinar - integration with InfluxDB and Grafana
JMeter webinar - integration with InfluxDB and GrafanaJMeter webinar - integration with InfluxDB and Grafana
JMeter webinar - integration with InfluxDB and Grafana
RTTS
 
Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...
Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...
Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...
UiPathCommunity
 
Transcript: Selling digital books in 2024: Insights from industry leaders - T...
Transcript: Selling digital books in 2024: Insights from industry leaders - T...Transcript: Selling digital books in 2024: Insights from industry leaders - T...
Transcript: Selling digital books in 2024: Insights from industry leaders - T...
BookNet Canada
 
GraphRAG is All You need? LLM & Knowledge Graph
GraphRAG is All You need? LLM & Knowledge GraphGraphRAG is All You need? LLM & Knowledge Graph
GraphRAG is All You need? LLM & Knowledge Graph
Guy Korland
 
Slack (or Teams) Automation for Bonterra Impact Management (fka Social Soluti...
Slack (or Teams) Automation for Bonterra Impact Management (fka Social Soluti...Slack (or Teams) Automation for Bonterra Impact Management (fka Social Soluti...
Slack (or Teams) Automation for Bonterra Impact Management (fka Social Soluti...
Jeffrey Haguewood
 
Bits & Pixels using AI for Good.........
Bits & Pixels using AI for Good.........Bits & Pixels using AI for Good.........
Bits & Pixels using AI for Good.........
Alison B. Lowndes
 
Mission to Decommission: Importance of Decommissioning Products to Increase E...
Mission to Decommission: Importance of Decommissioning Products to Increase E...Mission to Decommission: Importance of Decommissioning Products to Increase E...
Mission to Decommission: Importance of Decommissioning Products to Increase E...
Product School
 
Neuro-symbolic is not enough, we need neuro-*semantic*
Neuro-symbolic is not enough, we need neuro-*semantic*Neuro-symbolic is not enough, we need neuro-*semantic*
Neuro-symbolic is not enough, we need neuro-*semantic*
Frank van Harmelen
 
When stars align: studies in data quality, knowledge graphs, and machine lear...
When stars align: studies in data quality, knowledge graphs, and machine lear...When stars align: studies in data quality, knowledge graphs, and machine lear...
When stars align: studies in data quality, knowledge graphs, and machine lear...
Elena Simperl
 
Designing Great Products: The Power of Design and Leadership by Chief Designe...
Designing Great Products: The Power of Design and Leadership by Chief Designe...Designing Great Products: The Power of Design and Leadership by Chief Designe...
Designing Great Products: The Power of Design and Leadership by Chief Designe...
Product School
 
Knowledge engineering: from people to machines and back
Knowledge engineering: from people to machines and backKnowledge engineering: from people to machines and back
Knowledge engineering: from people to machines and back
Elena Simperl
 
PHP Frameworks: I want to break free (IPC Berlin 2024)
PHP Frameworks: I want to break free (IPC Berlin 2024)PHP Frameworks: I want to break free (IPC Berlin 2024)
PHP Frameworks: I want to break free (IPC Berlin 2024)
Ralf Eggert
 
Essentials of Automations: Optimizing FME Workflows with Parameters
Essentials of Automations: Optimizing FME Workflows with ParametersEssentials of Automations: Optimizing FME Workflows with Parameters
Essentials of Automations: Optimizing FME Workflows with Parameters
Safe Software
 
Software Delivery At the Speed of AI: Inflectra Invests In AI-Powered Quality
Software Delivery At the Speed of AI: Inflectra Invests In AI-Powered QualitySoftware Delivery At the Speed of AI: Inflectra Invests In AI-Powered Quality
Software Delivery At the Speed of AI: Inflectra Invests In AI-Powered Quality
Inflectra
 

Recently uploaded (20)

UiPath Test Automation using UiPath Test Suite series, part 3
UiPath Test Automation using UiPath Test Suite series, part 3UiPath Test Automation using UiPath Test Suite series, part 3
UiPath Test Automation using UiPath Test Suite series, part 3
 
GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...
GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...
GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...
 
From Siloed Products to Connected Ecosystem: Building a Sustainable and Scala...
From Siloed Products to Connected Ecosystem: Building a Sustainable and Scala...From Siloed Products to Connected Ecosystem: Building a Sustainable and Scala...
From Siloed Products to Connected Ecosystem: Building a Sustainable and Scala...
 
State of ICS and IoT Cyber Threat Landscape Report 2024 preview
State of ICS and IoT Cyber Threat Landscape Report 2024 previewState of ICS and IoT Cyber Threat Landscape Report 2024 preview
State of ICS and IoT Cyber Threat Landscape Report 2024 preview
 
The Future of Platform Engineering
The Future of Platform EngineeringThe Future of Platform Engineering
The Future of Platform Engineering
 
De-mystifying Zero to One: Design Informed Techniques for Greenfield Innovati...
De-mystifying Zero to One: Design Informed Techniques for Greenfield Innovati...De-mystifying Zero to One: Design Informed Techniques for Greenfield Innovati...
De-mystifying Zero to One: Design Informed Techniques for Greenfield Innovati...
 
JMeter webinar - integration with InfluxDB and Grafana
JMeter webinar - integration with InfluxDB and GrafanaJMeter webinar - integration with InfluxDB and Grafana
JMeter webinar - integration with InfluxDB and Grafana
 
Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...
Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...
Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...
 
Transcript: Selling digital books in 2024: Insights from industry leaders - T...
Transcript: Selling digital books in 2024: Insights from industry leaders - T...Transcript: Selling digital books in 2024: Insights from industry leaders - T...
Transcript: Selling digital books in 2024: Insights from industry leaders - T...
 
GraphRAG is All You need? LLM & Knowledge Graph
GraphRAG is All You need? LLM & Knowledge GraphGraphRAG is All You need? LLM & Knowledge Graph
GraphRAG is All You need? LLM & Knowledge Graph
 
Slack (or Teams) Automation for Bonterra Impact Management (fka Social Soluti...
Slack (or Teams) Automation for Bonterra Impact Management (fka Social Soluti...Slack (or Teams) Automation for Bonterra Impact Management (fka Social Soluti...
Slack (or Teams) Automation for Bonterra Impact Management (fka Social Soluti...
 
Bits & Pixels using AI for Good.........
Bits & Pixels using AI for Good.........Bits & Pixels using AI for Good.........
Bits & Pixels using AI for Good.........
 
Mission to Decommission: Importance of Decommissioning Products to Increase E...
Mission to Decommission: Importance of Decommissioning Products to Increase E...Mission to Decommission: Importance of Decommissioning Products to Increase E...
Mission to Decommission: Importance of Decommissioning Products to Increase E...
 
Neuro-symbolic is not enough, we need neuro-*semantic*
Neuro-symbolic is not enough, we need neuro-*semantic*Neuro-symbolic is not enough, we need neuro-*semantic*
Neuro-symbolic is not enough, we need neuro-*semantic*
 
When stars align: studies in data quality, knowledge graphs, and machine lear...
When stars align: studies in data quality, knowledge graphs, and machine lear...When stars align: studies in data quality, knowledge graphs, and machine lear...
When stars align: studies in data quality, knowledge graphs, and machine lear...
 
Designing Great Products: The Power of Design and Leadership by Chief Designe...
Designing Great Products: The Power of Design and Leadership by Chief Designe...Designing Great Products: The Power of Design and Leadership by Chief Designe...
Designing Great Products: The Power of Design and Leadership by Chief Designe...
 
Knowledge engineering: from people to machines and back
Knowledge engineering: from people to machines and backKnowledge engineering: from people to machines and back
Knowledge engineering: from people to machines and back
 
PHP Frameworks: I want to break free (IPC Berlin 2024)
PHP Frameworks: I want to break free (IPC Berlin 2024)PHP Frameworks: I want to break free (IPC Berlin 2024)
PHP Frameworks: I want to break free (IPC Berlin 2024)
 
Essentials of Automations: Optimizing FME Workflows with Parameters
Essentials of Automations: Optimizing FME Workflows with ParametersEssentials of Automations: Optimizing FME Workflows with Parameters
Essentials of Automations: Optimizing FME Workflows with Parameters
 
Software Delivery At the Speed of AI: Inflectra Invests In AI-Powered Quality
Software Delivery At the Speed of AI: Inflectra Invests In AI-Powered QualitySoftware Delivery At the Speed of AI: Inflectra Invests In AI-Powered Quality
Software Delivery At the Speed of AI: Inflectra Invests In AI-Powered Quality
 

UCIAD overview

  • 1. User Centric Integration of Activity Data. Mathieu d’Aquin, Stuart Brown, Salman Elahi, Enrico Motta. The Open University
  • 2. Agenda: Introduction of the team; Objectives and hypothesis; Overview of technical realization; Challenges; Summary of results so far and dissemination
  • 3. Team: Dr Mathieu d’Aquin (Research fellow, KMi): project director. Stuart Brown (Web developments and online communities, communication services): member of the steering group, liaison with online services. Salman Elahi (Research assistant and PhD student, KMi): developer/researcher. Prof Enrico Motta (Professor of knowledge technologies, KMi): chair of the steering group.
  • 4. Objectives and Hypothesis. Hypothesis: Taking a user centric point of view allows different types of analysis of logs/activity data, which are valuable to both the organisation and the user. Ontologies and ontology-based reasoning can support the integration, consolidation and interpretation of activity data from multiple sources.
  • 5. Organisation Centric Activity Data. Analytics = aggregated stats. [Diagram labels: Consolidation; Logs 1 to 4; Websites 1 to 4; Organisation; Users]
  • 6. At the Open University: An analytics system builds aggregated data from the university’s various websites. It is based on manually defined sitemaps. Good for website optimization, marketing campaigns, etc. But because the data is pre-aggregated, what can be done with it is limited: limited control, no user view.
  • 7. User Centric Activity Data. Activity analysis for and by individual users. [Diagram labels: Consolidation; Integration; Interpretation; Ontologies; Logs 1 to 4; Websites 1 to 4; Organisation; Users]
  • 8. Ontologies: Formal conceptual models of a domain. Here, the domain is online user activity. At the basis of Semantic Web technologies: standard languages for expressing ontologies and ontological data (RDF, OWL); tools to manipulate and work with ontologies and semantic data (NeOn Toolkit, OWLIM); many ontologies to reuse (cf. Watson). Adhere to a logical formalism: enable inferences on the data.
  • 9. Objectives and Deliverables: (1) Build the technical infrastructure that can hold traces of activity data as semantic data. This includes a triple store with reasoning capability, log parsers for different log formats, and renderers producing semantic data (RDF). (2) Build the ontologies to interpret and reason upon activity data, covering various aspects of activity data in an extensible way. (3) Build tools to support users in analyzing their own activity data: recognize a user across different settings and provide a view on his/her own data; allow him/her to customize the view by customizing the ontology. (4) Test, validate, deploy, distribute.
  • 10. Technical infrastructure. [Diagram labels: Semantic Triple Store; Scheduler/Manager; Daily RDF traces; Parser/RDF renderer; Log; Application; Server1, Server2, Server3]
  • 11. Technical infrastructure: Development of parsers for different kinds of log formats. They currently handle Apache web server log files, parameterized from the Apache configuration, and are easily extensible to dedicated log formats. The parsers provide a common data structure, which the RDF renderer serializes in RDF. Each server produces a daily extract from its logs in RDF, which is used to populate the semantic triple store. The triple store includes multiple repositories and sub-spaces depending on time/user/server.
  • 12. Ontologies. Key concepts to be represented: Actors (human users and robots); Sitemaps; Traces (broad notion of logs); Activities. Reusing existing ontologies: FOAF, for people and documents; Time Ontology, for traces; Action ontology, for traces and activities; (Planned) OPO, online presence; (Planned) SIOC, online communities.
  • 13.
  • 14. Iterative and extensible construction of the ontologies: Provide a base with actors, sitemaps and traces. Specific extensions with typologies of activities, depending on user and site. Dynamically building and integrating
  • 15. Tool for analysis: We need a tool which, given a set of ontologies and a data repository (which can be the overall one, one restricted by time, or one for a given user), can provide a meaningful and interactive overview of the activity data. To be used to: provide an ontology-specific view of data analytics; support the iterative development of the ontologies; provide a user centric view of the data.
  • 17. Example. In the ontology: /robots.txt is a RobotTXT page. A Spider is a RobotAgent (ActorAgent). An agent used to access a RobotTXT page is a Spider. An AutomaticActivity is a Trace realized by a RobotAgent. Result: thousands of traces automatically classified as automatic activities.
  • 18. Example. In the ontology: UCIAD-Blog and LUCERO-Blog are Blogs (Website). A BlogPage is a page which is part of a Blog. An activity onBlog is an activity happening on a BlogPage. Result: we can look specifically at activities happening on a Blog and specialize them (the same applies to wikis and other types of websites).
  • 19. Example. In the ontology: A SPARQLEndpoint is a specific type of Webpage. AccessingSPARQLEndpoint is an activity on a SPARQLEndpoint. SPARQLQueryParameter is a parameter with the name “query” used in an AccessingSPARQLEndpoint activity. ExecutingSPARQLQuery is an AccessingSPARQLEndpoint activity attached to a SPARQLQueryParameter. Result: we can explore the specific activity of executing SPARQL queries and its parameters. Definitions can also be combined: automatically accessing a SPARQL endpoint is detected as an activity that is both an automatic activity and an access to a SPARQL endpoint.
  • 20. Next step: User support. Allow users to: log in; detect setting; bring up the relevant data; explore it. But also: customize the view of the data; extend the ontologies to provide a personalized analysis of activity data; export (interpreted) activity data for reuse.
  • 21. User support [flowchart]: User logs in or registers. Detect setting (agent + IP). Known setting for user: display activity data related to all known settings of the user. Unknown setting: ask “It is the first time you log into UCIAD with this setting (detail); do you want to attach it to your account?” (yes / no), then check the setting. Non-ambiguous: add setting to known settings. Ambiguous: register setting as ambiguous.
  • 22. User support: data for a user. For a user <u>, the SPARQL query Construct {?trace ?p ?y. ?y ?q ?z} where {<u> actor:hasKnownSetting ?s. ?trace trace:hasSetting ?s. ?trace ?p ?y. ?y ?q ?z} builds the traces of activities around the known settings of <u>. It is used to populate a specific repository with sub-spaces for each registered user.
  • 23. Deployment, test, validation: At the moment, testing on websites of projects and events hosted on KMi servers: sssw.org, sssw09.org, loted.eu, lucero-project.info, uciad.info, data.open.ac.uk, lucero.open.ac.uk, … Next level up: websites/systems from the main Open University website: www.open.ac.uk, study at the OU, podcasts.open.ac.uk, VLE. Then extend to the deployment of instances for specific projects with distributed websites.
  • 24. Challenges. Scalability: the OWLIM triple store can handle billions of triples, but struggles with millions when inference is “on”. Hence: 1 repository without inference holding all historical data, 1 with inference holding 1 week of data only, and 1 with inference for registered users. User management and privacy: ensuring that the user who logs in from a particular setting is the one who carried out the activity is difficult (e.g., in the case of shared computers). Is this really a problem? Check ambiguity, ask verification questions, moderate? Distribution and IPR: code and ontologies under open licenses (small uncertainty regarding code developed in other projects). Overall data: privacy issues (is k-anonymity actually applicable? Would it work?). Overall data: institutional issues (can we show the traffic on our websites to everybody?). User data export: what license?
  • 25. Summary and dissemination. Promising initial results: we can create new ways of analysis at run-time by editing the ontologies! Mechanisms to provide personal views on one’s own activity data across websites. First version of the ontologies: ongoing task. First version of the tools: test and validate! Dissemination: Blog / Twitter #uciad; KMi’s internal newsletter (KMi Planet); Salman’s paper at the ESWC 2011 PhD symposium: “Personal Semantics: Personal information management in the Web with Semantic Technologies”; position paper at the W3C Web tracking and privacy workshop: “Self-Tracking on the Web: Why and How”; submission to the Personal Semantic Data workshop at K-CAP 2011.
  • 26. More info UCIAD Blog: http://uciad.info Code base: http://github.com/uciad Twitter: #uciad @mdaquin
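
The pipeline described in slides 11, 17 and 22 can be illustrated with a minimal Python sketch. It is not the UCIAD code base: the project stores traces as RDF in a triple store and classifies them by ontology reasoning, whereas this sketch assumes plain in-memory objects and hand-written rules. The `Trace` class, the function names and the sample log lines are hypothetical, and the regular expression assumes the Apache "combined" log format.

```python
import re
from dataclasses import dataclass

# Assumption: Apache "combined" log format
# host ident user [time] "request" status bytes "referer" "user-agent"
LOG_RE = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+)[^"]*" (?P<status>\d{3}) \S+ '
    r'"[^"]*" "(?P<agent>[^"]*)"'
)


@dataclass(frozen=True)
class Trace:
    """Hypothetical common data structure for one log entry (slide 11)."""
    ip: str
    time: str
    path: str
    agent: str

    @property
    def setting(self):
        # Slide 21 describes a setting as agent + IP.
        return (self.agent, self.ip)


def parse_line(line):
    """Parse one log line into a Trace, or return None if it does not match."""
    m = LOG_RE.match(line)
    if m is None:
        return None
    return Trace(m["ip"], m["time"], m["path"], m["agent"])


def spider_agents(traces):
    """Slide 17 rule: an agent used to access robots.txt is a Spider."""
    return {t.agent for t in traces if t.path == "/robots.txt"}


def automatic_activities(traces):
    """Slide 17 rule: a trace realized by a robot agent is an automatic activity."""
    spiders = spider_agents(traces)
    return [t for t in traces if t.agent in spiders]


def traces_for_user(traces, known_settings):
    """Slide 22 idea: keep only the traces whose setting is known for the user."""
    return [t for t in traces if t.setting in known_settings]


# Made-up sample lines using documentation IP ranges.
SAMPLE = [
    '192.0.2.1 - - [10/May/2011:10:00:00 +0100] "GET /robots.txt HTTP/1.1" 200 120 "-" "ExampleBot/1.0"',
    '192.0.2.1 - - [10/May/2011:10:00:02 +0100] "GET /blog/ HTTP/1.1" 200 5120 "-" "ExampleBot/1.0"',
    '198.51.100.7 - - [10/May/2011:10:01:00 +0100] "GET /sparql?query=SELECT HTTP/1.1" 200 900 "-" "Mozilla/5.0"',
]

if __name__ == "__main__":
    traces = [t for t in map(parse_line, SAMPLE) if t is not None]
    print(len(automatic_activities(traces)), "automatic activities")
    mine = traces_for_user(traces, {("Mozilla/5.0", "198.51.100.7")})
    print([t.path for t in mine])
```

In the approach the slides describe, these rules live in the ontology rather than in code, which is what lets a new kind of analysis be defined at run-time by editing the ontology instead of redeploying a program.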