Issues and Considerations regarding Sharable Data Sets for Recommender Systems in TEL

Presentation given at the dataTEL Marketplace at the RecSysTEL workshop at the RecSys and ECTEL conferences, Barcelona, Spain (29.09.2010).



  2. Free the data (photo by Tom Raftery, http://www.flickr.com/photos/traftery/4773457853/sizes/l)
  3. Why? (photo by Tom Raftery, http://www.flickr.com/photos/traftery/4773457853/sizes/l)
  4. Because we will get new insights (photo by Tom Raftery, http://www.flickr.com/photos/traftery/4773457853/sizes/l)
  5. Issues and Considerations regarding Sharable Data Sets for Recommender Systems in TEL. 29.09.2010, RecSysTEL workshop at the RecSys and ECTEL conferences, Barcelona, Spain. Hendrik Drachsler, #dataTEL, Centre for Learning Sciences and Technology @ Open University of the Netherlands
  6. What is dataTEL? dataTEL is a Theme Team funded by the STELLAR Network of Excellence. It addresses two STELLAR Grand Challenges: 1. Connecting Learners, 2. Contextualisation
  7. Recommenders in TEL
  8-9. TEL recommenders are a bit like this... We need to select, for each application, an appropriate recommender system that fits its needs.
  10-11. But... “The performance results of different research efforts in recommender systems are hardly comparable.” (Manouselis et al., 2010) TEL recommender experiments lack transparency. They need to be repeatable to test: validity, verification, and comparison of results. (photo by Kaptain Kobold, http://www.flickr.com/photos/kaptainkobold/3203311346/)
  12. How others compare their recommenders
  13. What kind of data sets can we expect in TEL? (graphic by J. Cross, Informal Learning, Pfeifer, 2006)
  14. The Long Tail (graphic: Wilkins, D., 2009; long tail concept: Anderson, C., 2004)
  15. The Long Tail of Learning (graphic: Wilkins, D., 2009; long tail concept: Anderson, C., 2004)
  16. The Long Tail of Learning: Formal Data vs. Informal Data (graphic: Wilkins, D., 2009; long tail concept: Anderson, C., 2004)
  17-21. Guidelines for Data Set Aggregation: 1. Create a data set that realistically reflects the variables of the learning setting. 2. Use a sufficiently large set of user profiles. 3. Create data sets that are comparable to others. (photo: Justin Marshall, Coded Ornament by rootoftwo, http://www.flickr.com/photos/rootoftwo/267285816)
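To make guideline 3 concrete, here is a minimal sketch of how a shareable TEL interaction data set could be laid out so that it stays comparable to widely used recommender-system data sets: one pseudonymous user-resource-rating-timestamp event per line, in tab-separated form. The column names and the file name interactions.tsv are illustrative assumptions, not a dataTEL specification.

```python
import csv

# Assumed column layout for a shareable TEL interaction data set, modelled on
# the user-item-rating-timestamp format common in recommender-system research.
COLUMNS = ["user_id", "resource_id", "rating", "timestamp"]

def write_interactions(events, path="interactions.tsv"):
    """Write pseudonymous (user, resource, rating, timestamp) events as TSV."""
    with open(path, "w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f, delimiter="\t")
        writer.writerow(COLUMNS)
        writer.writerows(events)

if __name__ == "__main__":
    # Two toy events: pseudonymous learner u42 rated resource r7 with 4, etc.
    write_interactions([
        ("u42", "r7", 4, 1285747200),
        ("u43", "r7", 5, 1285750800),
    ])
```

Keeping to such a flat, widely used layout is one way to let the same evaluation scripts run over formal and informal data sets alike.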
  22. dataTEL::Objectives: Standardize research on recommender systems in TEL. Five core questions: 1. How can data sets be shared according to privacy and legal protection rights? 2. How to develop a policy to use and share data sets? 3. How to pre-process data sets to make them suitable for other researchers? 4. How to define common evaluation criteria for TEL recommender systems? 5. How to develop overview methods to monitor the performance of TEL recommender systems on data sets?
  23-28. 1. Protection Rights: OVERSHARING. Were the founders of PleaseRobMe.com actually allowed to grab the data from the web and present it in that way? Are we allowed to take data from social media handles and reuse it for research purposes? And there is more: company protection rights!
  29-35. 2. Policies to use Data
  36. 3. Pre-Process Data Sets. For informal data sets: 1. Collect data, 2. Process data, 3. Share data. For formal data sets from an LMS: 1. Data storing scripts, 2. Anonymisation scripts.
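As an illustration of the anonymisation step, the sketch below replaces raw user identifiers with salted one-way hashes before a data set is shared. It is a minimal example of one possible approach, not the dataTEL scripts themselves; the user_id field, the file layout and the salt handling are assumptions carried over from the format sketch above.

```python
import csv
import hashlib

def pseudonymise(value: str, salt: str) -> str:
    """Replace an identifier with a salted SHA-256 pseudonym (one-way)."""
    return hashlib.sha256((salt + value).encode("utf-8")).hexdigest()[:16]

def anonymise_log(in_path: str, out_path: str, salt: str) -> None:
    """Copy an interaction log, pseudonymising the user_id column."""
    with open(in_path, newline="", encoding="utf-8") as src, \
         open(out_path, "w", newline="", encoding="utf-8") as dst:
        reader = csv.DictReader(src, delimiter="\t")
        writer = csv.DictWriter(dst, fieldnames=reader.fieldnames, delimiter="\t")
        writer.writeheader()
        for row in reader:
            row["user_id"] = pseudonymise(row["user_id"], salt)
            writer.writerow(row)

# The salt stays with the data owner and is never published, so pseudonyms
# cannot be reversed or linked across different releases of the data set.
# anonymise_log("interactions.tsv", "interactions_anon.tsv", salt="keep-me-private")
```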
  37-42. 4. Evaluation Criteria. Combined approach by Drachsler et al. (2008): 1. Accuracy, 2. Coverage, 3. Precision; plus 1. Effectiveness of learning, 2. Efficiency of learning, 3. Drop-out rate, 4. Satisfaction. Kirkpatrick model by Manouselis et al. (2010): 1. Reaction of learner, 2. Learning improved, 3. Behaviour, 4. Results.
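To illustrate the technical half of these criteria, the following sketch computes precision and catalogue coverage for top-N recommendation lists against held-out relevant items. The function names and data layout are assumptions made for this example; they are not taken from either cited framework, which also calls for the educational measures listed above.

```python
from typing import Dict, List, Set

def mean_precision(recommended: Dict[str, List[str]],
                   relevant: Dict[str, Set[str]]) -> float:
    """Mean fraction of recommended items per user that are actually relevant."""
    scores = []
    for user, items in recommended.items():
        if items:
            hits = sum(1 for item in items if item in relevant.get(user, set()))
            scores.append(hits / len(items))
    return sum(scores) / len(scores) if scores else 0.0

def catalogue_coverage(recommended: Dict[str, List[str]],
                       catalogue: Set[str]) -> float:
    """Fraction of the item catalogue that appears in any recommendation list."""
    shown = {item for items in recommended.values() for item in items}
    return len(shown & catalogue) / len(catalogue) if catalogue else 0.0

# Toy example: two learners, three learning resources in the catalogue.
recs = {"u42": ["r1", "r2"], "u43": ["r2", "r3"]}
truth = {"u42": {"r1"}, "u43": {"r2", "r3"}}
print(mean_precision(recs, truth))                    # 0.75
print(catalogue_coverage(recs, {"r1", "r2", "r3"}))   # 1.0
```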
  43. 5. Data Set Framework to Monitor Performance. Data sets (formal and informal): Data A, Data B, Data C.
      Data A: Algorithms A, B, C; Learner Models A, B; measured attributes A, B, C.
      Data B: Algorithms D, E; Learner Models C, E; measured attributes A, B, C.
      Data C: Algorithms B, D; Learner Models A, C; measured attributes A, B, C.
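One way to read this framework is as a registry that records, for every shared data set, which algorithms, learner models and measured attributes have been evaluated on it, so that new results can be compared against earlier ones. The sketch below is an assumed, minimal encoding of that idea; the class and field names are illustrative and not part of dataTEL.

```python
from dataclasses import dataclass, field
from typing import Dict, List

@dataclass
class DatasetEntry:
    """One shared data set and the experiments recorded against it."""
    name: str                                   # e.g. "Data A"
    kind: str                                   # "formal" or "informal"
    algorithms: List[str] = field(default_factory=list)
    learner_models: List[str] = field(default_factory=list)
    measured_attributes: List[str] = field(default_factory=list)
    results: Dict[str, float] = field(default_factory=dict)    # metric -> score

    def record(self, metric: str, score: float) -> None:
        """Store a benchmark result so later studies can compare against it."""
        self.results[metric] = score

# Example: register a formal data set and attach a precision score to it.
registry = {"Data A": DatasetEntry("Data A", "formal",
                                   algorithms=["Algorithm A", "Algorithm B"],
                                   learner_models=["Learner Model A"],
                                   measured_attributes=["Attribute A", "Attribute B"])}
registry["Data A"].record("precision", 0.75)
```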
  44. Join us for a coffee ... http://www.teleurope.eu/pg/groups/9405/datatel/
  45. Upcoming event ... Workshop: dataTEL - Data Sets for Technology Enhanced Learning. Date: March 30th to March 31st, 2011. Location: ski resort La Clusaz in the French Alps, Massif des Aravis. Funding: food and lodging for 3 nights for 10 selected participants. Submissions: to be considered for funding, please send extended abstracts to http://www.easychair.org/conferences/?conf=datatel2011. Deadline for submissions: October 25th, 2010. CfP: http://www.teleurope.eu/pg/pages/view/46082/
  46. Many thanks for your interest. These slides are available at: http://www.slideshare.com/Drachsler. Email: hendrik.drachsler@ou.nl. Skype: celstec-hendrik.drachsler. Blog: http://www.drachsler.de. Twitter: http://twitter.com/HDrachsler
