Potentials and Limitations of Educational Datasets

Lecture for Master students given at KMI podium, see podcast here http://stadium.open.ac.uk/podium/

Published in: Technology, Education
Transcript of "Potentials and Limitations of Educational Datasets"

  1. Potentials and Limitations of Educational Datasets. Hendrik Drachsler, Open University of the Netherlands
  2. Hendrik Drachsler
     • Assistant professor at the Centre for Learning Sciences and Technologies (CELSTEC)
     • Track record in TEL projects such as TENCompetence, SC4L, LTfLL, Handover, dataTEL
     • Main research focus:
       – Personalization of learning with information retrieval technologies, recommender systems and educational datasets
       – Visualization of educational data, data mash-up environments, supporting context-awareness by data mining
       – Social and ethical implications of data mining in education
     • Leader of the dataTEL Theme Team of the STELLAR network of excellence (join the SIG on TELeurope.eu)
     • Just recently: new alterEGO project granted by the Netherlands Laboratory for Lifelong Learning (on limitations of learning analytics in formal and informal learning)
  3. dataTEL: Potentials and Limitations of Educational Datasets. 24.07.2011, MUP/PLE lecture series, Knowledge Media Institute, Open University UK. Hendrik Drachsler (#dataTEL), Centre for Learning Sciences and Technology @ Open University of the Netherlands
  4. Goals of the lecture: (1) Motivation for dataTEL; (2) The dataTEL project; (3) Potentials of dataTEL; (4) Open issues of dataTEL
  5. TEL RecSys Research
  9. Survey on TEL Recommender
     Observation: Half of the systems (11/20) are still at the design or prototyping stage; only 8 systems were evaluated through trials with human users.
     Conclusion: Small-scale experiments with a few learners who rate some resources add little to the knowledge base on recommender systems and personalization in TEL.
     Manouselis, N., Drachsler, H., Vuorikari, R., Hummel, H. G. K., & Koper, R. (2011). Recommender Systems in Technology Enhanced Learning. In P. B. Kantor, F. Ricci, L. Rokach, & B. Shapira (Eds.), Recommender Systems Handbook (pp. 387-415). Berlin: Springer.
  11. The TEL recommender research is a bit like this... We need to design for each domain an appropriate recommender system that fits the goals, tasks, and particular constraints.
  13. But... “The performance results of different research efforts in TEL recommender systems are hardly comparable.” (Manouselis et al., 2010)
      The TEL recommender experiments lack transparency. They need to be repeatable to test:
      • Validity
      • Verification
      • Compare results
      Picture by Kaptain Kobold: http://www.flickr.com/photos/kaptainkobold/3203311346/
  15. How others compare their recommenders: Although the TEL domain stores plenty of data every day in e-learning environments (LMS, PLEs), there is a lack of shareable and publicly available datasets.
  16. Goals of the lecture: (1) Motivation for dataTEL; (2) The dataTEL project; (3) Potentials of dataTEL; (4) Open issues of dataTEL
  19. Who is dataTEL? dataTEL is a Theme Team funded by the STELLAR network of excellence.
      Members: Riina Vuorikari, Stephanie Lindstaedt, Katrien Verbert, Nikos Manouselis, Martin Wolpers, Hendrik Drachsler
      MAVSEL, CEN PT Social Data: Miguel-Angel Sicilia, Joris Klerkx
  20. dataTEL::Objectives
      Make the research on TEL RecSys more comparable by lowering the entrance barriers for other researchers and increasing the quality. The required benchmarks are therefore:
      1. A collection of publicly available datasets ranging from formal to non-formal learning settings
      2. An overview of the research results of certain RecSys technologies on different datasets
      3. A common approach to evaluate RecSys in the domain of TEL
  21. dataTEL::Objectives
      1. Collecting publicly available datasets
      2. Sharing policy to (re)use and share datasets
      3. Define dataset standards (documentation, pre-processing)
      4. Address privacy and legal protection rights
      5. Create evaluation criteria for TEL recommender systems
      6. Create a body of knowledge on personalization in TEL
  25. dataTEL::Collection
      • Collected data is very different with respect to amount of users and resources
      • Most of the data is very sparse
      • Privacy regulations harm data sharing
      • Mostly data from informal learning settings
      Drachsler, H., Bogers, T., Vuorikari, R., Verbert, K., Duval, E., Manouselis, N., Beham, G., Lindstaedt, S., Stern, H., Friedrich, M., & Wolpers, M. (2010). Issues and Considerations regarding Sharable Data Sets for Recommender Systems in Technology Enhanced Learning. Presentation at the 1st Workshop Recommender Systems in Technology Enhanced Learning (RecSysTEL) in conjunction with 5th European Conference on Technology Enhanced Learning (EC-TEL 2010): Sustaining TEL: From Innovation to Learning and Practice. September 28, 2010, Barcelona, Spain.
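
The sparsity mentioned on the Collection slide can be made concrete as the share of possible user-resource pairs for which no interaction was recorded. The following is only a minimal sketch of one common way to compute it; the `sparsity` function and the `(user, resource)` pair format are assumptions for illustration, not something taken from the dataTEL datasets.

```python
def sparsity(interactions):
    """Return the fraction of empty cells in the user x resource matrix.

    `interactions` is an iterable of (user, resource) pairs; this input
    format is an assumption made for this example.
    """
    pairs = set(interactions)                # ignore repeated interactions
    users = {user for user, _ in pairs}
    resources = {resource for _, resource in pairs}
    possible = len(users) * len(resources)   # size of the full matrix
    if possible == 0:
        return 0.0                           # empty dataset: nothing to measure
    return 1.0 - len(pairs) / possible


# Hypothetical toy log: 3 users, 3 resources, 4 observed pairs.
sample_log = [("u1", "a"), ("u1", "b"), ("u2", "a"), ("u3", "c")]
print(f"sparsity = {sparsity(sample_log):.2f}")
```

A value close to 1 means that almost all user-resource combinations are unobserved, which is the situation the slide describes for most of the collected data.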
  26. dataTEL::Collection
  32. dataTEL::Body of knowledge
      Outcomes:
      • Tanimoto similarity + item-based CF was the most accurate.
      • Implicit ratings such as download rates and bookmarks can be used successfully in TEL.
      Verbert, K., Drachsler, H., Manouselis, N., Wolpers, M., Vuorikari, R., Beham, G., Duval, E. (2011). Dataset-driven Research for Improving Recommender Systems for Learning. Learning Analytics & Knowledge: February 27-March 1, 2011, Banff, Alberta, Canada.
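
To illustrate the combination named in the outcomes, here is a minimal sketch of item-based collaborative filtering with Tanimoto similarity on implicit feedback (for example downloads or bookmarks). It is not the setup used by Verbert et al.; the function names, the dictionary input format and the scoring rule (summing similarities to the items a user already has) are assumptions chosen to keep the example short.

```python
from collections import defaultdict


def tanimoto(a, b):
    """Tanimoto (Jaccard) coefficient of two sets: |a & b| / |a | b|."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def recommend(interactions, user, top_n=3):
    """Rank unseen items for `user` by similarity to the items they used.

    `interactions` maps each user to the set of items they interacted
    with (implicit ratings). Both the format and the additive scoring
    are illustrative assumptions.
    """
    # Invert the data: for every item, the set of users who touched it.
    item_users = defaultdict(set)
    for u, items in interactions.items():
        for item in items:
            item_users[item].add(u)

    seen = interactions.get(user, set())
    scores = {}
    for candidate, users in item_users.items():
        if candidate in seen:
            continue
        score = sum(tanimoto(users, item_users[s]) for s in seen)
        if score > 0:
            scores[candidate] = score
    # Highest score first; ties broken alphabetically for stable output.
    return sorted(scores, key=lambda item: (-scores[item], item))[:top_n]


# Hypothetical bookmark data.
bookmarks = {
    "u1": {"a", "b"},
    "u2": {"a", "b", "c"},
    "u3": {"b", "c"},
    "u4": {"d"},
}
print(recommend(bookmarks, "u1"))
```

Because Tanimoto similarity only needs to know whether an interaction happened, it fits implicit ratings, where no explicit rating values are available.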
  33. Goals of the lecture: (1) Motivation for dataTEL; (2) The dataTEL project; (3) Potentials of dataTEL; (4) Open issues of dataTEL
  35. Potentials of Open Data. Example by Tim Berners-Lee: The year open data went worldwide, TED talk, February 2010
  39. Data = New Science Paradigm
      • A thousand years ago: science was empirical (describing natural phenomena)
      • Last few hundred years: theoretical branch (using models, generalizations)
      • Last few decades: computational branch (simulating complex phenomena)
      • Nowadays: data science (unifying theory, experiment, and simulation; data captured by instruments and processed by software; linked data)
  41. Promises of Open Data for TEL
      Unexploited potentials for TEL:
      • The evaluation of learning theories and learning technology from the data side
      • More transparent, mutually comparable, trusted and repeatable experiments that lead to evidence-driven knowledge
      • Development of new educational data tools / products that combine different data sources in data mashups
      • Gain new insights / new knowledge by combining so far unconnected resources / tools
  46. Data Products
      Educational Data Products:
      • Drop-out Analyzer
      • Group Formation Recommender
      • Question-Answering Tool
      • Awareness Tools
      W. Reinhardt, C. Mletzko, H. Drachsler, and P. Sloep. AWESOME: A widget-based dashboard for awareness-support in Research Networks. In Proceedings of the 2nd PLE Conference, Southampton, UK, 2011.
  47. Goals of the lecture: (1) Motivation for dataTEL; (2) The dataTEL project; (3) Potentials of dataTEL; (4) Open issues of dataTEL
  48. dataTEL::Open issues: (1) Privacy; (2) Prepare datasets; (3) Share datasets; (4) Body of knowledge
  53. Privacy: OVERSHARING
      • Were the founders of PleaseRobMe.com actually allowed to take the data from the web and present it in that way?
      • Are we allowed to use data from social services and reuse it for research purposes?
  57. Privacy
      1. Privacy as confidentiality: The right to be let alone (Warren and Brandeis, 1890)
      2. Privacy as control: The right of the individual to decide what information about herself should be communicated to others and under which circumstances.
      3. Privacy as practice: The right to intervene in the flows of existing data and the re-negotiation of boundaries with respect to collected data.
  61. Privacy solutions
      1. Privacy as confidentiality: Information services that minimize, secure or anonymize the collected information
      2. Privacy as control: Identity Management Systems (IDMS) with access control rules
      3. Privacy as practice: Timestamps on data, data degradation technologies
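
The "timestamp plus data degradation" idea can be pictured as a retention rule that reduces the detail of a record as it ages. The sketch below is one possible reading of that idea, not a description of an existing dataTEL tool; the record fields (`user`, `resource`, `timestamp`) and the 30-day and 365-day thresholds are hypothetical.

```python
from datetime import datetime, timedelta


def degrade(record, now,
            coarse_after=timedelta(days=30),
            delete_after=timedelta(days=365)):
    """Return a copy of `record` with detail reduced according to its age.

    Thresholds and field names are illustrative assumptions.
    """
    age = now - record["timestamp"]
    if age >= delete_after:
        return None                       # too old: drop the record entirely
    if age >= coarse_after:
        # Keep only what happened and roughly when: no user, month precision.
        month = record["timestamp"].replace(
            day=1, hour=0, minute=0, second=0, microsecond=0)
        return {"resource": record["resource"], "timestamp": month}
    return dict(record)                   # recent: keep full detail


now = datetime(2011, 7, 24)
event = {"user": "u1", "resource": "course-42",
         "timestamp": datetime(2011, 3, 15, 10, 30)}
print(degrade(event, now))
```

Running such a rule periodically over stored logs means that the most identifying details have the shortest lifetime, while coarse usage information remains available for research.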
  65. Prepare datasets
      1. Create a dataset that realistically reflects the variables of the learning setting.
      2. Use a sufficiently large set of user profiles.
      3. Create datasets that are comparable to others.
      Picture: Justin Marshall, Coded Ornament, by rootoftwo: http://www.flickr.com/photos/rootoftwo/267285816
  66. Prepare datasets
      For informal data sets:
      1. Collect data
      2. Process data
      3. Document data
      4. Share data
      For formal data sets from LMS:
      1. Data storing scripts
      2. Anonymisation scripts
      3. Document data
      4. Share data
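
As a rough illustration of what an anonymisation script for LMS exports might do, the sketch below replaces user identifiers with keyed hashes and drops directly identifying fields. Strictly speaking this is pseudonymisation: behaviour patterns in the remaining data can still allow re-identification, so it is a first step rather than a guarantee. The field names (`user_id`, `name`, `email`) and the function name are hypothetical.

```python
import hashlib
import hmac


def pseudonymise(records, secret_key, id_field="user_id",
                 drop_fields=("name", "email")):
    """Replace user ids with HMAC-SHA256 pseudonyms and drop named fields.

    `records` is a list of dicts, e.g. rows exported from an LMS log.
    All field names here are assumptions made for this example.
    """
    cleaned = []
    for record in records:
        row = {k: v for k, v in record.items() if k not in drop_fields}
        digest = hmac.new(secret_key,
                          str(record[id_field]).encode("utf-8"),
                          hashlib.sha256).hexdigest()
        row[id_field] = digest[:16]       # shortened pseudonym, still stable
        cleaned.append(row)
    return cleaned


# Hypothetical LMS log rows.
log = [
    {"user_id": 17, "name": "Ann", "email": "ann@example.org", "resource": "quiz-1"},
    {"user_id": 17, "name": "Ann", "email": "ann@example.org", "resource": "quiz-2"},
    {"user_id": 23, "name": "Bob", "email": "bob@example.org", "resource": "quiz-1"},
]
for row in pseudonymise(log, secret_key=b"keep-this-key-private"):
    print(row)
```

Using a keyed hash means the same learner always receives the same pseudonym, so usage patterns stay analysable, while the mapping cannot be recomputed by someone who lacks the key.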
  67. Prepare datasets
  68. Share/cite datasets
  69. Sharing policies
  73. Sharing policy guidelines: A brief guide on data licenses developed by SURF and the Centre for Intellectual Property Law (CIER), 2009, available at www.surffoundation.nl
  74. Body of knowledge
      Datasets (ordered from formal to informal):

                             Data A             Data B             Data C
      Algorithms:            Algorithm A        Algorithm D        Algorithm B
                             Algorithm B        Algorithm E        Algorithm D
                             Algorithm C
      Models:                Learner Model A    Learner Model C    Learner Model A
                             Learner Model B    Learner Model E    Learner Model C
      Measured attributes:   Attribute A        Attribute A        Attribute A
                             Attribute B        Attribute B        Attribute B
                             Attribute C        Attribute C        Attribute C
  75. Body of knowledge
  80. dataTEL::SIG http://www.teleurope.eu/pg/groups/9405/datatel/
      Objectives:
      • Representing dataTEL researchers to promote the release of open datasets from educational providers
      • Fostering the standardization of datasets to enable exchange and interoperability
      • Contributing to policies on ethical guidelines (privacy and legal protection rights)
      • Fostering a shared understanding of evaluation methods in TEL RecSys and Learning Analytics technologies
  82. Many thanks for your interest. Free the data. Picture by Tom Raftery: http://www.flickr.com/photos/traftery/4773457853/sizes/l
  83. Many thanks for your interest.
      These slides are available at: http://www.slideshare.com/Drachsler
      Email: hendrik.drachsler@ou.nl
      Skype: celstec-hendrik.drachsler
      Blogging at: http://www.drachsler.de
      Twittering at: http://twitter.com/HDrachsler
