4. collected data
• Organic.Edunet social data schema (diagram):
– user: id, Name*, Email*
– item: URL, tags
– reviews: Value, Date
– ratings: Value, Date, Dimension
5. current service
• recommendation of potentially interesting
learning resources to users
– not very “loud” (recommendations not prominently shown)
• one recommendation algorithm based on
collaborative filtering
– rating history
– neighborhood-based
– multi-attribute over 3 criteria
[Subject Relevance, Educational Usefulness, Metadata]
– parameters defined & hard-coded
6. issues
• lots of parameters could be different
– selected recommendation methods
– neighborhood size
– similarity measures
• parameterization took place using a similar
dataset [but not the same]
– EUN’s Learning Resource Exchange (MELT) multi-
attribute ratings dump
• Organic.Edunet’s user/content base
continuously evolves
9. problem outline
• How do we know that the selected
algorithm is still(?) good for the given
portal?
– specific rating dimensions (criteria)
– selected parameterization
– alternative algorithms
– specific dataset & its expected evolution
11. approach
• carry out the same experiment: a simulation of
how multi-attribute collaborative
filtering algorithms perform
– real data from Organic.Edunet users
– simulated/synthetic data for the expected
future scenario (when more ratings have
been provided)
– base algorithms from 2007 vs.
additional/alternative algorithms
12. real data from Organic.Edunet
• 477 ratings
– 99 users (only 0.02% of registered ones)
– 345 items (only 0.03% of indexed resources)
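The figures above imply an extremely sparse rating matrix; a quick sanity check on the reported numbers:

```python
# Density of the Organic.Edunet rating matrix reported on the slide:
# 477 ratings spread over a 99-user x 345-item matrix.
ratings, users, items = 477, 99, 345

density = ratings / (users * items)  # fraction of matrix cells filled
sparsity = 1.0 - density

print(f"density:  {density:.2%}")   # ~1.40% of the matrix is filled
print(f"sparsity: {sparsity:.2%}")  # ~98.60% of the matrix is empty
```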
14. 2007 base algorithms
• Manouselis & Costopoulou (2006;2007)
• classic neighborhood-based
collaborative filtering
– extended for multi-criteria ratings
– prediction per criterion (PG)
– many parameters open for
tweaking/experimentation
• different algorithm variations
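The per-criterion prediction idea can be sketched as follows. This is a toy illustration, not the exact PG parameterization: cosine similarity and deviation-from-mean normalization are picked here as one of the examined combinations, and the data structure is invented.

```python
from math import sqrt

def cosine_sim(a, b):
    """Cosine similarity over the items both users rated (one criterion)."""
    common = set(a) & set(b)
    if not common:
        return 0.0
    num = sum(a[i] * b[i] for i in common)
    den = sqrt(sum(a[i] ** 2 for i in common)) * sqrt(sum(b[i] ** 2 for i in common))
    return num / den if den else 0.0

def predict_per_criterion(ratings, user, item, criterion, k=2):
    """Neighborhood prediction with deviation-from-mean normalization,
    computed independently for one criterion.

    `ratings[criterion][user]` maps item -> rating on that criterion;
    a sketch of the 'prediction per criterion' idea only.
    """
    r = ratings[criterion]
    mean_u = sum(r[user].values()) / len(r[user])
    neighbours = sorted(
        ((cosine_sim(r[user], r[v]), v) for v in r
         if v != user and item in r[v]),
        reverse=True)[:k]
    num = sum(s * (r[v][item] - sum(r[v].values()) / len(r[v]))
              for s, v in neighbours)
    den = sum(abs(s) for s, v in neighbours)
    return mean_u + num / den if den else mean_u

toy = {"Subject Relevance": {
    "u1": {"i1": 5, "i2": 3},
    "u2": {"i1": 4, "i2": 3, "i3": 4},
    "u3": {"i1": 2, "i2": 5, "i3": 1},
}}
print(predict_per_criterion(toy, "u1", "i3", "Subject Relevance"))  # ≈ 3.44
```

The same routine would be run once per criterion (Subject Relevance, Educational Usefulness, Metadata) and the per-criterion predictions combined afterwards.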
17. additional/alternative algorithms
• Adomavicius & Kwon (2007)
• similar approach, neighborhood-based
collaborative filtering extended for multi-
criteria ratings
– prediction weighted by the average (AS) or
minimum/worst (WS) similarity across criteria
– same parameters open for
tweaking/experimentation
• different algorithm variations
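A minimal sketch of the two aggregation modes; the per-criterion similarity values below are invented for illustration:

```python
# Per-criterion similarities between two users are collapsed into one
# overall similarity: the average across criteria (AS) or the
# worst/minimum one (WS).
def aggregate(per_criterion_sims, mode="AS"):
    sims = list(per_criterion_sims.values())
    return sum(sims) / len(sims) if mode == "AS" else min(sims)

sims = {"Subject Relevance": 0.9, "Educational Usefulness": 0.6, "Metadata": 0.3}
print(round(aggregate(sims, "AS"), 2))  # 0.6  (average similarity)
print(aggregate(sims, "WS"))            # 0.3  (worst-case similarity)
```

WS is the more conservative choice: two users only count as close neighbours if they agree on every criterion.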
18. overall experiment setting
• 18 variations of each examined algorithm (PG,
AS, WS)
– plus some base non-personalised ones
• various values for parameters defining the
neighborhood size
-> over 1,080 algorithmic variations executed
and compared over each dataset
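The grid can be enumerated programmatically. The exact option breakdown is not on the slide, so the factors below are assumed purely to show how 18 variations per algorithm and 20 neighborhood sizes multiply out to the reported scale:

```python
from itertools import product

# Hypothetical option grid -- the slide reports 18 variations per
# algorithm and "over 1,080" runs in total; every factor here is assumed.
algorithms = ["PG", "AS", "WS"]
similarities = ["Cosine", "Euclidean", "Pearson"]           # assumed
normalizations = ["Simple Mean", "Deviation-from-Mean"]     # assumed
weightings = ["none", "significance", "case-amplification"] # assumed
neighborhood_sizes = range(2, 41, 2)                        # 20 values, assumed

runs = list(product(algorithms, similarities, normalizations,
                    weightings, neighborhood_sizes))
per_algorithm = len(similarities) * len(normalizations) * len(weightings)
print(per_algorithm)  # 18 variations of each algorithm
print(len(runs))      # 1080 runs in this illustrative grid
```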
21. best over both
Algorithm  Similarity  Normalization method  AVG Coverage  AVG MAE
MNN variations
PG         Cosine      Deviation-from-Mean   61.33%        0.8855
PG         Euclidean   Simple Mean           61.33%        0.8626
CWT variations
PG         Cosine      Deviation-from-Mean   57.91%        0.8908
PG         Cosine      Simple Mean           57.91%        0.8673
2007:
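The two reported metrics can be computed per algorithm run roughly like this (toy predictions; `None` marks requested predictions the algorithm could not make, which is what lowers coverage):

```python
def mae_and_coverage(predictions, actuals):
    """MAE over the predictions that could be made, plus coverage: the
    share of requested predictions the algorithm could make at all."""
    covered = [(p, a) for p, a in zip(predictions, actuals) if p is not None]
    coverage = len(covered) / len(actuals)
    mae = sum(abs(p - a) for p, a in covered) / len(covered)
    return mae, coverage

preds  = [4.2, None, 3.1, 5.0, None]  # invented example values
actual = [5,   3,    3,   4,   2]
mae, cov = mae_and_coverage(preds, actual)
print(f"coverage: {cov:.0%}, MAE: {mae:.4f}")  # coverage: 60%, MAE: 0.6333
```

Note the trade-off visible in the table: the variations with the best (lowest) MAE are not necessarily the ones with the best coverage.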
22. implementation implications
• based on existing dataset and the
foreseen future scenario
– keep same algorithm (PG) for
recommendation service
– adapt selection of options and their
parameterization
– “actual” performance (vs. 2007) is probably
worse
24. lessons learnt
• after 2 years of service operation
– tried to repeat an offline experimental simulation
– candidate multi-criteria recommendation
algorithms
– data from real usage vs. synthetic data
• feeling better about algorithm choice
– some insight into expected performance
– no real impact on the actual service
25. to explore
• would be interesting to experiment with more
future scenarios
– make various estimations/projections about
dataset size and sparseness
– execute algorithms over synthetic datasets
simulating these projections
• would be interesting to make a service that is
really used
– get more ratings, on more items
– provide visible recommendations
– measure impact on search/discovery behaviour
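One way to set up such projections: generate synthetic multi-criteria datasets at assumed future sizes and densities, then rerun the algorithm grid over them. Everything below (scale, density target, 1-5 rating scale) is an invented illustration:

```python
import random

def synthetic_ratings(n_users, n_items, density, criteria, seed=0):
    """Generate a synthetic multi-criteria rating set for a projected
    portal size and sparseness (integer 1-5 rating scale assumed)."""
    rng = random.Random(seed)
    target = round(n_users * n_items * density)
    # sample distinct (user, item) cells until the target density is met
    cells = rng.sample([(u, i) for u in range(n_users)
                        for i in range(n_items)], target)
    return [(u, i, {c: rng.randint(1, 5) for c in criteria})
            for u, i in cells]

# Hypothetical projection: a larger user/item base at 5% density.
data = synthetic_ratings(200, 690, 0.05,
                         ["Subject Relevance", "Educational Usefulness",
                          "Metadata"])
print(len(data))  # 6900 synthetic multi-criteria ratings
```

A more faithful generator would also mimic the rating-value and popularity distributions observed in the real data rather than drawing uniformly.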
27. experiments beyond a single dataset
• combining data from various sources to boost
the way recommenders work
• design algorithms that could provide cross-
border recommendations
• provide many parallel/cascading/competing
options for recommendation algorithms
• not having to worry about data size & storage
28. a social data infrastructure for learning portals
(architecture diagram, www.opendiscoveryspace.eu)
• each portal exposes its metadata & social data through an API
• aggregation of metadata, social and usage data (anonymised)
• resolution services: social data & metadata per URI
• federated recommendation services on top
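A minimal sketch of the aggregation idea: merge social-data records from several portal APIs under one key per resolved resource URI, anonymising portal user ids on the way in. All portal names, record fields and the URI-normalisation rule here are invented; real URI resolution and anonymisation are considerably harder.

```python
import hashlib

def resolve_uri(record):
    """URI resolution sketch: normalise portal-local URLs so the same
    resource aggregates under one key."""
    return record["url"].rstrip("/").lower()

def anonymise(user_id, portal):
    """One-way hash so aggregated data carries no portal user ids."""
    return hashlib.sha256(f"{portal}:{user_id}".encode()).hexdigest()[:12]

def aggregate(portal_feeds):
    """Merge social-data records harvested from several portal APIs,
    keyed by resolved resource URI."""
    store = {}
    for portal, records in portal_feeds.items():
        for rec in records:
            store.setdefault(resolve_uri(rec), []).append(
                {"portal": portal,
                 "user": anonymise(rec["user"], portal),
                 "rating": rec.get("rating")})
    return store

feeds = {  # hypothetical per-portal API dumps
    "portalA": [{"url": "http://ex.org/res/1/", "user": "u1", "rating": 4}],
    "portalB": [{"url": "http://ex.org/res/1", "user": "u9", "rating": 5}],
}
merged = aggregate(feeds)
print(len(merged["http://ex.org/res/1"]))  # 2 records joined under one URI
```

The federated recommendation layer would then consume this per-URI view instead of each portal's private rating store.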
31. challenges
• define common metadata schema(s)
• aggregate (e.g. harvest/crawl) social data
• transform each social data schema
• URI resolution
• scalability
• anonymised approach
• …