Requirements for Processing Datasets for Recommender Systems
1. Requirements for Processing Datasets for Recommender Systems
Preliminary Experiences from Three Case Studies
Giannis Stoitsis
University of Alcala, Spain
Agro-Know Technologies, Greece
RecSys Challenge 2012, Dublin
2. the learning case
• technology-enhanced learning investigates how
information and communication technologies can
be used to support learning and teaching, and
competence development throughout life.
• various levels/contexts
– school
– higher education and research
– vocational education and training
– adult education
5. handling multiple, diverse sets & streams
• various types of social data
• different schemas and formats
• multiple languages and dimensions
[Figure labels: single-criteria ratings, multi-criteria ratings]
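A minimal sketch of what handling "different schemas and formats" could look like in practice: records from two portals are mapped onto one common schema and their ratings rescaled to a shared range. The field names (`item_uri`, `rating`), the two portal layouts, and the `normalise` helper are hypothetical and chosen only for illustration; they are not taken from the actual infrastructure.

```python
def normalise(record, field_map, scale_max):
    """Map one portal-specific rating record onto a common schema.

    field_map maps common field names to the portal's own field names
    (hypothetical names). Ratings are rescaled to the range [0, 1].
    """
    out = {common: record[local] for common, local in field_map.items()}
    out["rating"] = float(out["rating"]) / scale_max
    return out


# Two hypothetical portals describing the same resource differently.
portal_a_record = {"resource": "uri:1", "stars": 4}    # 5-star scale
portal_b_record = {"url": "uri:1", "score": "7"}       # 10-point scale, stored as text

common_a = normalise(portal_a_record, {"item_uri": "resource", "rating": "stars"}, 5)
common_b = normalise(portal_b_record, {"item_uri": "url", "rating": "score"}, 10)

print(common_a)
print(common_b)
```

Once both records share the same keys and scale, they can be aggregated per item URI regardless of which portal they came from.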
6. why?
• support various usage and recommendation
scenarios
• combining data from various sources may improve
how recommenders work in education
– bigger data
– federated recommender systems
– open science platform
7. a European social data infrastructure for learning
[Diagram: several portals, each holding metadata and social data and exposing them through an API; a layer labelled "Federated Aggregation of metadata, social and usage data"; recommendation services and resolution services; a store labelled "Social Data / Metadata per URI, Anonymised".]
9. challenges
• define common metadata schema
• harvest/crawl social data
• transform each social data schema
• URI resolution
• scalability
• anonymised approach
• develop item-based, non-personalized
algorithms that can perform well
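One simple way to approach the last challenge is item-to-item co-occurrence counting, which needs no user identity and therefore fits an anonymised setting. The sketch below is only an illustration under that assumption, not the algorithm used in the case studies; the session data, the `uri:` identifiers, and the function names are made up.

```python
from collections import Counter, defaultdict
from itertools import combinations


def build_cooccurrence(sessions):
    """Count how often two item URIs appear in the same anonymised session."""
    counts = defaultdict(Counter)
    for items in sessions:
        for a, b in combinations(sorted(set(items)), 2):
            counts[a][b] += 1
            counts[b][a] += 1
    return counts


def recommend(counts, item_uri, limit=5):
    """Return up to `limit` item URIs most often seen together with item_uri."""
    ranked = sorted(counts.get(item_uri, {}).items(),
                    key=lambda kv: (-kv[1], kv[0]))  # count desc, then URI
    return [uri for uri, _ in ranked[:limit]]


# Hypothetical anonymised sessions: each list holds the items used together.
sessions = [
    ["uri:a", "uri:b", "uri:c"],
    ["uri:a", "uri:b"],
    ["uri:b", "uri:c"],
    ["uri:a", "uri:d"],
]
counts = build_cooccurrence(sessions)
print(recommend(counts, "uri:a", limit=2))  # ['uri:b', 'uri:c']
```

Because the result depends only on the item URI, the same recommendation is returned to every visitor, which is what makes the approach non-personalized.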
11. web app for testing neighborhood-based recommendation algorithms with multi-criteria rating dataset
[Diagram: workflow for a RecSys researcher/developer. Boxes shown: Login; Import dataset (sql, csv, xml); Create dataset; Transform dataset; Refine data; Prepare dataset; Data characteristics; Visualize dataset; Visualize results; Export data (sql, csv); Export results. A callout reads "I need more!!!".]
12. architecture
[Diagram: layered architecture. Top layer: Web UI and Developers API. Components: Import, Refine and transform, Prepare/process, Visualize, Evaluate. Below the components: an API over the Cloud/Grid infra, which holds the Monte Carlo Simulator, several Social Data stores, and Recommender services.]
14. experience from multi-criteria rating dataset from a teachers portal
e.g. integration in classroom, relevance to topics, ability to help students learn
[Figure panels: Size of the neighborhood; Correlation Weight Threshold value]
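To make the two tuning parameters named above concrete, here is a minimal sketch of a neighborhood-based predictor for multi-criteria ratings. It assumes a user-based variant in which similarity is the Pearson correlation averaged over the criteria; this is one common choice and not necessarily the one evaluated in the case study. The ratings, the identifiers `t1` to `t3` and `r1` to `r4`, and all function names are hypothetical.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation of two equally long lists; 0.0 if undefined."""
    n = len(xs)
    if n < 2:
        return 0.0
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / sqrt(vx * vy)


def similarity(ratings, u, v):
    """Average the per-criterion correlation over items rated by both users."""
    common = sorted(set(ratings[u]) & set(ratings[v]))
    if len(common) < 2:
        return 0.0
    n_criteria = len(ratings[u][common[0]])
    sims = [pearson([ratings[u][i][c] for i in common],
                    [ratings[v][i][c] for i in common])
            for c in range(n_criteria)]
    return sum(sims) / n_criteria


def predict(ratings, user, item, k=2, threshold=0.0):
    """Predict per-criterion ratings from at most k neighbours whose
    similarity exceeds the correlation weight threshold."""
    candidates = []
    for other in ratings:
        if other == user or item not in ratings[other]:
            continue
        s = similarity(ratings, user, other)
        if s > threshold:
            candidates.append((s, other))
    neighbours = sorted(candidates, reverse=True)[:k]  # neighbourhood size k
    if not neighbours:
        return None
    total = sum(s for s, _ in neighbours)
    n_criteria = len(ratings[neighbours[0][1]][item])
    return [sum(s * ratings[o][item][c] for s, o in neighbours) / total
            for c in range(n_criteria)]


# Hypothetical ratings on three criteria: (integration in classroom,
# relevance to topics, ability to help students learn).
ratings = {
    "t1": {"r1": (5, 4, 5), "r2": (2, 1, 2), "r3": (4, 4, 3)},
    "t2": {"r1": (5, 5, 4), "r2": (1, 2, 1), "r3": (4, 3, 4), "r4": (5, 4, 4)},
    "t3": {"r1": (1, 2, 1), "r2": (5, 5, 4), "r3": (2, 2, 3), "r4": (1, 1, 2)},
}
print(predict(ratings, "t1", "r4", k=2, threshold=0.0))
```

In this toy data `t3` correlates negatively with `t1`, so the threshold removes it and the prediction rests on `t2` alone. Raising the threshold trades coverage for neighbour quality, while `k` caps how many neighbours contribute.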
Example of using the Recommendation API: recommend(itemURI, limit_of_resources), recommend(itemURI, user_tags).
Example of the social data API provided by the aggregator: get_tags(itemURI), get_reviews(itemURI), etc.
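The calls named in these notes could be exercised against an in-memory stand-in like the sketch below. Only the call names `recommend`, `get_tags` and `get_reviews` come from the notes; the classes, the tag-overlap (Jaccard) ranking, the data, and the return types are assumptions made for illustration and do not describe the real services.

```python
class InMemoryAggregator:
    """Hypothetical stand-in for the aggregator's social data API."""

    def __init__(self, tags, reviews):
        self._tags = tags          # item URI -> set of tags
        self._reviews = reviews    # item URI -> list of review texts

    def get_tags(self, item_uri):
        return sorted(self._tags.get(item_uri, set()))

    def get_reviews(self, item_uri):
        return list(self._reviews.get(item_uri, []))


class RecommendationService:
    """Hypothetical recommender that ranks items by shared tags."""

    def __init__(self, aggregator, catalogue):
        self._aggregator = aggregator
        self._catalogue = catalogue  # all known item URIs

    def recommend(self, item_uri, limit_of_resources=3):
        base = set(self._aggregator.get_tags(item_uri))
        scored = []
        for uri in self._catalogue:
            if uri == item_uri:
                continue
            other = set(self._aggregator.get_tags(uri))
            union = base | other
            score = len(base & other) / len(union) if union else 0.0
            if score > 0:
                scored.append((score, uri))
        scored.sort(key=lambda pair: (-pair[0], pair[1]))  # best overlap first
        return [uri for _, uri in scored[:limit_of_resources]]


tags = {
    "uri:1": {"soil", "organic"},
    "uri:2": {"organic", "compost"},
    "uri:3": {"soil", "organic", "farming"},
    "uri:4": {"physics"},
}
reviews = {"uri:1": ["Worked well with my class."]}

aggregator = InMemoryAggregator(tags, reviews)
service = RecommendationService(aggregator, sorted(tags))

print(aggregator.get_tags("uri:1"))    # ['organic', 'soil']
print(service.recommend("uri:1", 2))   # ['uri:3', 'uri:2']
```

Keeping the social data API separate from the recommendation service mirrors the split in the notes: the aggregator serves data per item URI, and recommenders are built on top of it.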
Here we present the architecture of such an environment and the proposed software stack. The Monte Carlo simulator will be a separate component that can also run on the Grid and that will be provided through an API. The API will be documented.