Issues and Considerations regarding Sharable Data Sets for Recommender Systems in TEL

Presentation given at the dataTEL Marketplace at the RecSysTEL workshop at the RecSys and EC-TEL conferences, Barcelona, Spain.


  2. Free the data (photo: Tom Raftery, http://www.flickr.com/photos/traftery/4773457853/sizes/l)
  3. Why? (photo: Tom Raftery, http://www.flickr.com/photos/traftery/4773457853/sizes/l)
  4. Because we will get new insights (photo: Tom Raftery, http://www.flickr.com/photos/traftery/4773457853/sizes/l)
  5. Issues and Considerations regarding Sharable Data Sets for Recommender Systems in TEL. 29.09.2010, RecSysTEL workshop at the RecSys and EC-TEL conferences, Barcelona, Spain. Hendrik Drachsler, #dataTEL, Centre for Learning Sciences and Technology @ Open University of the Netherlands
  6. What is dataTEL? dataTEL is a Theme Team funded by the STELLAR Network of Excellence. It addresses two STELLAR Grand Challenges: 1. Connecting Learners, 2. Contextualisation.
  7. Recommenders in TEL
  8. TEL recommenders are a bit like this...
  9. TEL recommenders are a bit like this... We need to select for each application an appropriate recommender system that fits its needs.
  10. But... "The performance results of different research efforts in recommender systems are hardly comparable." (Manouselis et al., 2010) (photo: Kaptain Kobold, http://www.flickr.com/photos/kaptainkobold/3203311346/)
  11. But... TEL recommender experiments lack transparency. They need to be repeatable in order to test validity, allow verification, and compare results. "The performance results of different research efforts in recommender systems are hardly comparable." (Manouselis et al., 2010) (photo: Kaptain Kobold, http://www.flickr.com/photos/kaptainkobold/3203311346/)
  12. How others compare their recommenders
  13. What kind of data sets can we expect in TEL? (graphic: J. Cross, Informal Learning, Pfeifer, 2006)
  14. The Long Tail (graphic: Wilkins, D., 2009; long-tail concept: Anderson, C., 2004)
  15. The Long Tail of Learning (graphic: Wilkins, D., 2009; long-tail concept: Anderson, C., 2004)
  16. The Long Tail of Learning: Formal Data vs. Informal Data (graphic: Wilkins, D., 2009; long-tail concept: Anderson, C., 2004)
  17-21. Guidelines for Data Set Aggregation (photo: Justin Marshall, Coded Ornament by rootoftwo, http://www.flickr.com/photos/rootoftwo/267285816): 1. Create a data set that realistically reflects the variables of the learning setting. 2. Use a sufficiently large set of user profiles. 3. Create data sets that are comparable to others. (A partly automated sanity check along these lines is sketched below.)
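Guidelines 2 and 3 can be partly automated before a data set is shared. The following is a minimal sketch, not part of the original deck, assuming a hypothetical tab-separated usage log with user_id, item_id, rating, and timestamp columns; the 1,000-user threshold for "sufficiently large" is an illustrative assumption, not a number from the slides.

```python
import csv

# Hypothetical shared schema for a TEL usage log (an assumption, not from the deck):
# user_id <TAB> item_id <TAB> rating <TAB> timestamp
REQUIRED_COLUMNS = {"user_id", "item_id", "rating", "timestamp"}
MIN_USERS = 1000  # illustrative threshold for guideline 2 ("sufficiently large")

def check_data_set(path: str) -> str:
    """Rough automated checks for guidelines 2 and 3 before sharing a data set."""
    users = set()
    with open(path, newline="") as f:
        reader = csv.DictReader(f, delimiter="\t")
        missing = REQUIRED_COLUMNS - set(reader.fieldnames or [])
        if missing:
            # Guideline 3: a common schema keeps data sets comparable to others.
            return f"FAIL: missing columns {sorted(missing)}"
        for row in reader:
            users.add(row["user_id"])
    if len(users) < MIN_USERS:
        # Guideline 2: enough user profiles for meaningful experiments.
        return f"FAIL: only {len(users)} users (< {MIN_USERS})"
    return f"OK: {len(users)} users, schema complete"

if __name__ == "__main__":
    print(check_data_set("usage_log.tsv"))
```

Guideline 1, realistically reflecting the variables of the learning setting, resists this kind of automation and still needs human judgement.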
  22. dataTEL::Objectives: Standardize research on recommender systems in TEL. Five core questions: 1. How can data sets be shared in accordance with privacy and legal protection rights? 2. How can a policy for using and sharing data sets be developed? 3. How can data sets be pre-processed to make them suitable for other researchers? 4. How can common evaluation criteria for TEL recommender systems be defined? 5. How can overview methods be developed to monitor the performance of TEL recommender systems on data sets?
  23-28. 1. Protection Rights: OVERSHARING. Were the founders of PleaseRobMe.com actually allowed to grab the data from the web and present it in that way? Are we allowed to use data from social handles and reuse it for research purposes? And beyond that, there are also company protection rights!
  29-35. 2. Policies to use Data
  36. 3. Pre-Process Data Sets. For informal data sets: 1. Collect data, 2. Process data, 3. Share data. For formal data sets from an LMS: 1. Data-storing scripts, 2. Anonymisation scripts. (A sketch of such an anonymisation script follows below.)
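The deck names the scripts but does not show them, so here is a minimal sketch of what an anonymisation step for formal LMS data could look like: raw user identifiers are replaced with salted one-way hashes, so records stay linkable across files without exposing identities. The column names, file names, and the environment-variable salt are assumptions for illustration.

```python
import csv
import hashlib
import os

# Secret salt kept out of the shared data set; without it, re-identifying
# users by hashing candidate IDs becomes much harder.
SALT = os.environ.get("DATATEL_SALT", "change-me")

def pseudonymise(user_id: str) -> str:
    """Map a raw user ID to a stable, salted one-way hash."""
    return hashlib.sha256((SALT + user_id).encode("utf-8")).hexdigest()[:16]

def anonymise_log(src: str, dst: str) -> None:
    """Copy an LMS event log, pseudonymising the user_id column."""
    with open(src, newline="") as fin, open(dst, "w", newline="") as fout:
        reader = csv.DictReader(fin)
        writer = csv.DictWriter(fout, fieldnames=reader.fieldnames)
        writer.writeheader()
        for row in reader:
            row["user_id"] = pseudonymise(row["user_id"])
            writer.writerow(row)

if __name__ == "__main__":
    anonymise_log("lms_events.csv", "lms_events_anonymised.csv")
```

Pseudonymised IDs alone do not make a data set anonymous: quasi-identifiers such as timestamps or free-text fields can still re-identify learners, which is exactly why the protection-rights questions above matter.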
  37-42. 4. Evaluation Criteria: a combined approach (Drachsler et al., 2008; Manouselis et al., 2010) together with the Kirkpatrick model. Technical criteria: 1. Accuracy, 2. Coverage, 3. Precision. Learning-related criteria: 1. Effectiveness of learning, 2. Efficiency of learning, 3. Drop-out rate, 4. Satisfaction. Kirkpatrick levels: 1. Reaction of learner, 2. Learning improved, 3. Behaviour, 4. Results. (Two of the technical metrics are sketched in code below.)
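To make the technical criteria concrete, here is a small sketch, not from the original slides, of how two of them are commonly computed for top-N recommendation lists: precision at k for a single user, and catalog coverage across all users. The users and items in the toy example are made up.

```python
def precision_at_k(recommended, relevant, k=10):
    """Fraction of the top-k recommended items that are actually relevant."""
    top_k = recommended[:k]
    if not top_k:
        return 0.0
    hits = sum(1 for item in top_k if item in relevant)
    return hits / len(top_k)

def catalog_coverage(all_recommendations, catalog):
    """Fraction of the item catalog that appears in at least one user's list."""
    recommended_items = set()
    for rec_list in all_recommendations.values():
        recommended_items.update(rec_list)
    return len(recommended_items & set(catalog)) / len(catalog)

# Toy example with made-up users and items:
recs = {"u1": ["a", "b", "c"], "u2": ["b", "d"]}
print(precision_at_k(recs["u1"], relevant={"a", "c"}, k=3))       # ~0.667
print(catalog_coverage(recs, catalog=["a", "b", "c", "d", "e"]))  # 0.8
```

The learning-related criteria (effectiveness, efficiency, drop-out rate, satisfaction) cannot be computed offline from a log alone; they require studies with learners, which is one reason the deck combines the metric families.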
  43. 5. Data Set Framework to Monitor Performance. Formal and informal data sets are monitored along the same dimensions: Data A: Algorithms A, B, C; Learner Models A, B; measured attributes A, B, C. Data B: Algorithms D, E; Learner Models C, E; measured attributes A, B, C. Data C: Algorithms B, D; Learner Models A, C; measured attributes A, B, C. (A minimal registry structure for this matrix is sketched below.)
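Read as a benchmark matrix, the framework pairs every data set with the algorithms, learner models, and measured attributes evaluated on it. Below is a minimal sketch of such a registry; the Data/Algorithm/Model/Attribute placeholder names come from the slide, while the formal/informal labels and the code structure itself are assumptions.

```python
from dataclasses import dataclass, field

@dataclass
class DataSetEntry:
    """One column of the monitoring framework: a data set and what ran on it."""
    name: str
    kind: str  # "formal" or "informal"; this split is an assumed reading of the slide
    algorithms: list = field(default_factory=list)
    learner_models: list = field(default_factory=list)
    measured_attributes: list = field(default_factory=list)

# Placeholder entries mirroring the slide; real entries would name concrete
# data sets, algorithms, learner models, and metrics.
FRAMEWORK = [
    DataSetEntry("Data A", "formal",
                 ["Algorithm A", "Algorithm B", "Algorithm C"],
                 ["Learner Model A", "Learner Model B"],
                 ["Attribute A", "Attribute B", "Attribute C"]),
    DataSetEntry("Data B", "informal",
                 ["Algorithm D", "Algorithm E"],
                 ["Learner Model C", "Learner Model E"],
                 ["Attribute A", "Attribute B", "Attribute C"]),
    DataSetEntry("Data C", "informal",
                 ["Algorithm B", "Algorithm D"],
                 ["Learner Model A", "Learner Model C"],
                 ["Attribute A", "Attribute B", "Attribute C"]),
]

for entry in FRAMEWORK:
    print(f"{entry.name} ({entry.kind}): {len(entry.algorithms)} algorithms, "
          f"{len(entry.learner_models)} learner models")
```

Because every entry records the same measured attributes, results stay comparable across data sets, which is the point of core question 5 on slide 22.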
  44. Join us for a coffee... http://www.teleurope.eu/pg/groups/9405/datatel/
  45. Upcoming event: dataTEL workshop, Data Sets for Technology Enhanced Learning. Date: March 30th to 31st, 2011. Location: ski resort La Clusaz, Massif des Aravis, French Alps. Funding: food and lodging for 3 nights for 10 selected participants. Submissions: to be considered for funding, please send extended abstracts via http://www.easychair.org/conferences/?conf=datatel2011 (deadline: October 25th, 2010). CfP: http://www.teleurope.eu/pg/pages/view/46082/
  46. Many thanks for your interest! These slides are available at http://www.slideshare.com/Drachsler. Email: hendrik.drachsler@ou.nl. Skype: celstec-hendrik.drachsler. Blog: http://www.drachsler.de. Twitter: http://twitter.com/HDrachsler
