DataMind: An e-learning platform for Data Analysis based on R. RBelgium meetup talk.

We're looking for people to give us feedback on the prototype, which contains a first "Introduction to R" tutorial, at http://beta.datamind.org.

Published in: Technology
  • DataMind is a university project at an early stage and in heavy development; we are looking forward to your feedback.
  • On point 2: we do the promotion and recommend good lessons to course takers. On point 3: what type of users?

    1. An e-learning platform for Data Analysis based on R. Jonathan Cornelissen, Dieter De Mesmaeker, Albert Jorissen, Martijn Theuwissen. 24/5/2013, RBelgium meetup, FEB, KU Leuven. Welcome!
    2. 1. Motivation: Why e-learning with and for R?
       2. Learner experience
       3. Technical overview
       4. Course creators experience on DataMind
       5. Submission Correctness Tests (examples)
       6. Questions and answers?
    3. Why e-learning with and for R? Need for scalable tools to learn R and Data Analysis…
    4. Because of the exponentially growing R user base: more than 2 million R users, growing at 40-60% yearly. Source: http://r4stats.com/articles/popularity/ and http://prezi.com/s1qrgfm9ko4i/the-r-ecosystem/
    5. That needs to learn the basics and the specifics of R
       Analysis of r-project.org (Source: analysis based on http://cran.r-project.org/report_cran.html)
       •  Number of downloads per month for:
          •  Introduction to R pdfs: 140,000
          •  Summary pdfs: 50,000
          •  Some of the "top" packages (reliability/stability of the numbers below?):
             Package manual       Downloads
             kernlab.pdf          349,780
             party.pdf            167,396
             igraph.pdf            59,969
             VennDiagram.pdf       30,889
             mclust.pdf            19,347
             KnitR.pdf             10,697
             twitteR.pdf            7,507
             randomForest.pdf       6,824
             ggplot2                5,924
             raster.pdf             5,326
       Analysis of Google keywords (Source: analysis based on http://adwords.google.com/select/keywordtoolexternal)
             Keyword                          Competition   Global Monthly Searches
             r tutorial                       0             6600
             introduction to r                0             1600
             online statistics course         0.98          1600
             ggplot2 tutorial                 0             880
             statistics course                0.85          880
             an introduction to r             0.01          880
             r book                           0.06          590
             learning statistics              0.38          590
             r tutorials                      0             590
             r introduction                   0.01          480
             statistics courses               0.84          480
             statistics introduction          0.1           480
             online statistics courses        0.99          320
             r course                         0.04          260
             r training                       0.17          260
             free online statistics course    0.56          260
             statistics training              0.62          210
             online statistics class          0.98          170
             statistics class online          0.98          140
             data analysis tutorial           0.5           110
       Compare to: SAS tutorial: 4400; Eviews tutorial: 390; Stata tutorial: 1900; Matlab tutorial: 22200; Hadoop tutorial: 12100
    6. Because of the exponentially growing functionality: 6,275 R packages at all major repositories, 4,315 of which were at CRAN, across a broad spectrum of domains: financial engineering, biostatistics, data mining, … Source: http://r4stats.com/articles/popularity/
    7. Why e-learning with and for R?
    8. Why e-learning with and for R?
       Learners (students, professionals, researchers, employees):
       •  Great books, tutorials, … on R
       •  But coding is learned by doing
       •  No online learning interface for R
       •  Documentation made by experts for experts, not for beginners or intermediate users
    9. Why e-learning with and for R?
       Learners (students, professionals, researchers, employees):
       •  Great books, tutorials, … on R
       •  But coding is learned by doing
       •  No online learning interface for R
       •  Documentation made by experts for experts, not for beginners or intermediate users
       Teachers (data analysis professors, consultants, researchers, book authors):
       •  Often give the same or similar feedback to students in exercise sessions
       •  Manually correct assignments
       •  Static content
       •  Hard to get feedback
    10. Two pillars of learning experience on DataMind:
       •  Interactive training: learning by doing
       •  Gamification: in a compelling way
    11. Benefits for students of learning R online
       1. Everything in one place: assignments, sample code, R console, …
       2. Lowering the barrier: start right away with R, no installation, version problems, … since R runs in the background on our servers
       3. Automated correction and feedback through Submission Correctness Tests (SCT)
       4. More fun through gamification of the learning process
    12. LIVE DEMO: Surf to http://beta.datamind.org
    13. Exercises versus Challenges
       Exercises:
       1. Read exercise description
       2. Read instructions
       3. Type code to solve the exercise
       4. Get personalized feedback on the correctness of your solution
       Challenges:
       1. Read challenge
       2. Type code to solve the challenge
       3. Get result on certain metric
       4. Get ranked on the leaderboard
       5. Possibility to improve your code
       6. Learn from others' solutions
       For example:
       •  Forecast R usage in next month. Metric = accuracy of forecast
       •  Find most efficient way to calculate certain parameter of a model. Metric = time to compute
       •  …
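The challenge flow on the slide above (get a result on a metric, then get ranked) can be illustrated with a small R sketch. Everything here is hypothetical: the learner names, the forecasts, the realised value, and the choice of absolute error as the accuracy metric are made up for illustration and do not come from DataMind.

```r
# Hypothetical submissions: each learner's forecast of next month's R usage
submissions <- data.frame(
  learner  = c("ann", "bob", "cas"),
  forecast = c(105, 98, 120),
  stringsAsFactors = FALSE
)

# Made-up realised value against which the forecasts are scored
actual <- 100

# Metric (an assumption): absolute forecast error, lower is better
submissions$abs_error <- abs(submissions$forecast - actual)

# Leaderboard: sort by the metric and attach a rank (ties share the best rank)
leaderboard <- submissions[order(submissions$abs_error), ]
leaderboard$rank <- rank(leaderboard$abs_error, ties.method = "min")

print(leaderboard)
```

A time-to-compute metric could be ranked the same way by replacing `abs_error` with a measured run time.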
    14. Technical overview: DataMind IT architecture
    15. DataMind leverages state-of-the-art open-source frameworks in the cloud
       •  Scaling
       •  Automated
       •  Affordable
       R: open-source statistical language
    16. DataMind leverages state-of-the-art open-source frameworks in the cloud
       •  Scalable
       •  Plug & Play
       •  Easy
       Rserve
       Ruby on Rails: high-productivity web application framework
       Node.js: platform for real-time scalable network applications
       R: open-source statistical language
    17. DataMind leverages state-of-the-art open-source frameworks in the cloud
       WebSockets, AJAX requests, RESTful API
       Rserve
       Ruby on Rails: high-productivity web application framework
       Node.js: platform for real-time scalable network applications
       R: open-source statistical language
       Angular.js: MVC JavaScript framework for single-page applications, maintained by Google
    18. Rserve: Communication with R
       •  Package of Simon Urbanek
       •  Manages sessions and workspaces
       •  Binary communication
       •  Emulate console with capture.output()
       •  Detect incomplete statements with parse()
       •  Catch and print errors
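The last three bullets of the Rserve slide (emulating a console with capture.output(), detecting incomplete statements with parse(), and catching errors) can be sketched in plain R. This is a minimal illustration, not DataMind's actual server code: `run_like_console` is a hypothetical helper, and matching on the parser's English error message is an assumption that may not hold in other locales.

```r
# Hypothetical helper: evaluate a string of R code the way a console would
run_like_console <- function(code, envir = new.env()) {
  # parse() fails on incomplete input; inspect the message to tell that case apart
  exprs <- tryCatch(parse(text = code), error = function(e) e)
  if (inherits(exprs, "error")) {
    msg <- conditionMessage(exprs)
    incomplete <- grepl("unexpected (end of input|INCOMPLETE_STRING)", msg)
    return(list(status = if (incomplete) "incomplete" else "syntax_error",
                output = character(0), error = msg))
  }

  err <- NULL
  # capture.output() collects everything that would be printed to the console
  out <- capture.output({
    tryCatch(
      for (expr in exprs) {
        res <- withVisible(eval(expr, envir))
        if (res$visible) print(res$value)  # auto-print, as the console does
      },
      error = function(e) err <<- conditionMessage(e)  # catch run-time errors
    )
    invisible(NULL)
  })

  list(status = if (is.null(err)) "ok" else "error", output = out, error = err)
}

ok         <- run_like_console("x <- 1:3\nx")
incomplete <- run_like_console("x <- c(1,")
failed     <- run_like_console("stop('boom')")
```

A front end could use the "incomplete" status to show a continuation prompt instead of an error, which is presumably the purpose of the parse() check on the slide.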
    19. RAppArmor: Security
       •  Evaluation of external code → huge security risk
       •  Solution:
          •  Limited access to OS
          •  RAppArmor
             •  Package of Jeroen Ooms
             •  R interface to OS security
             •  Limit CPU, memory, spawned processes
    20. Course creators experience on DataMind
    21. Benefits for course creation
       1. Save time!
          1. Automated correction of student exercises
          2. Efficient way to get feedback from course takers
          3. Scalable distribution of course content
       2. Visibility for your package / courses
       3. Insights in your course
       4. Per student tracking
          1. Number of attempts per exercise
          2. Use of "hint" and "solution"
          3. Time to complete per exercise
       5. Possibility to use courses/exercises from other creators
    22. How to create courses. We want your feedback! 1. Write the assignment
    23. How to create courses. We want your feedback! 2. Provide instructions to student
    24. How to create courses. We want your feedback! 3. Provide sample code to help student getting started
    25. How to create courses. We want your feedback! 4. Pre-exercise code is run in the background to pre-load a dataset, graphs, etc.
    26. How to create courses. We want your feedback! 5. Provide sample solution
    27. How to create courses. We want your feedback! 6. Write Submission Correctness Test, written in R, that checks the input of the student and returns feedback
    28. Submission Correctness Tests (examples)
    29. Submission Correctness Tests (SCT)
       A Submission Correctness Test checks the input from a student and returns (i) whether the student's input was correct and (ii) feedback to the student.
       •  These tests are written in R
       •  Should be easy for a course creator -> we started developing an R package (the DataMind package) to help course creators write simple tests*
       *https://github.com/jonathancornelissen/DM
       "Mistakes are not errors but partially correct solutions with underlying logic."
    30. Simple Submission Correctness Tests (SCT)
       1. Assignment to student: x should be 5
       2. Student types:
          x <- 4
       3. Submission Correctness Test:
          if (x == 5) {
            DM.result <- list(TRUE, "Well done, you genius!")
          } else {
            DM.result <- list(FALSE, "Please assign 5 to x")
          }
       4. Output to student: "Please assign 5 to x"
    31. Simple Submission Correctness Tests (SCT)
       1. Assignment to student: x should be 5
       2. Student types:
          x <- 5
       3. Submission Correctness Test:
          if (x == 5) {
            DM.result <- list(TRUE, "Well done, you genius!")
          } else {
            DM.result <- list(FALSE, "Please assign 5 to x")
          }
       4. Output to student: "Well done, you genius!"
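The two simple SCT slides can be reproduced end to end in R. This is only a sketch under assumptions: `check_submission` is a hypothetical helper (not part of DataMind or the DM package) that runs the student's code in a fresh workspace and then evaluates the SCT in that same workspace, relying on the `DM.result` convention from the slides.

```r
# Hypothetical helper: run student code, then the SCT, in one fresh workspace
check_submission <- function(student_code, sct_code) {
  workspace <- new.env()
  eval(parse(text = student_code), envir = workspace)  # student's submission
  eval(parse(text = sct_code), envir = workspace)      # SCT sees the student's variables
  get("DM.result", envir = workspace, inherits = FALSE)
}

# The SCT from the slides, stored as text so it can be run per submission
sct <- '
if (x == 5) {
  DM.result <- list(TRUE, "Well done, you genius!")
} else {
  DM.result <- list(FALSE, "Please assign 5 to x")
}'

wrong <- check_submission("x <- 4", sct)
right <- check_submission("x <- 5", sct)

print(wrong[[2]])
print(right[[2]])
```

Note that this SCT would itself raise an error if the student never defined `x`; a more defensive test would check for the variable first.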
    32. Automated exercise correction with SCT
       INPUT
       •  Everything in the student's workspace
       •  DM.user.code: all code written by student
       •  DM.console.output: everything printed to user console
       •  DM.errors: errors generated when running student's code
       Assignment to the student: Print a matrix with 3 rows containing the numbers 1 up to 9
       If the student does this correctly, then DM.console.output contains:
               [,1] [,2] [,3]
          [1,]    1    2    3
          [2,]    4    5    6
          [3,]    7    8    9
    33. Automated exercise correction with SCT
       INPUT
       •  Everything in the student's workspace
       •  DM.user.code: all code written by student
       •  DM.console.output: everything printed to user console
       •  DM.errors: errors generated when running student's code
       Submission Correctness Test written by course creator (potentially using DM package):
          DM.result <- DM.outputContains("matrix(1:9, byrow=TRUE, nrow=3)")
       Assignment to the student: Print a matrix with 3 rows containing the numbers 1 up to 9
       If the student does this correctly, then DM.console.output contains:
               [,1] [,2] [,3]
          [1,]    1    2    3
          [2,]    4    5    6
          [3,]    7    8    9
    34. Automated exercise correction with SCT
       INPUT
       •  Everything in the student's workspace
       •  DM.user.code: all code written by student
       •  DM.console.output: everything printed to user console
       •  DM.errors: errors generated when running student's code
       Submission Correctness Test written by course creator (potentially using DM package):
          DM.result <- DM.outputContains("matrix(1:9, byrow=TRUE, nrow=3)")
       OUTPUT
       •  Assigned to variable DM.result
       •  List with two elements
          1. TRUE / FALSE
          2. Message to provide to student with feedback
       DM.result is shown to student
       Assignment to the student: Print a matrix with 3 rows containing the numbers 1 up to 9
       If the student does this correctly, then DM.console.output contains:
               [,1] [,2] [,3]
          [1,]    1    2    3
          [2,]    4    5    6
          [3,]    7    8    9
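To make the input/output contract above concrete, here is a hedged sketch of how an output-based check could work. It is not the DM package's implementation of `DM.outputContains`, which the slides do not show; `output_contains` is a hypothetical stand-in that takes the console output as an explicit argument, and the feedback messages are made up.

```r
# Hypothetical stand-in for an output-based SCT helper
output_contains <- function(console_output, expected_code) {
  # Print the result of the reference code and capture what a console would show
  expected <- capture.output(print(eval(parse(text = expected_code))))

  # Look for the expected block as one contiguous piece of the student's output
  found <- grepl(paste(expected, collapse = "\n"),
                 paste(console_output, collapse = "\n"),
                 fixed = TRUE)

  if (found) {
    list(TRUE, "Correct, the matrix is printed as requested.")   # made-up message
  } else {
    list(FALSE, "The expected matrix was not found in your output.")
  }
}

# A student who solves the assignment differently still prints the same matrix
DM.console.output <- capture.output(print(t(matrix(1:9, nrow = 3))))
DM.result <- output_contains(DM.console.output,
                             "matrix(1:9, byrow = TRUE, nrow = 3)")

# A student who fills the matrix by column prints something else
wrong_output <- capture.output(print(matrix(1:9, nrow = 3)))
wrong_result <- output_contains(wrong_output,
                                "matrix(1:9, byrow = TRUE, nrow = 3)")
```

Comparing printed output rather than source code means any solution that prints the right matrix passes, whichever functions the student used.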
    35. SCTs enable a wide variety of options
       •  Has the student estimated a certain model correctly?
       •  Generated a transformed time series that fulfills certain conditions?
       •  Generated a certain type of graph?
       •  Forecasted a metric of interest within certain bounds?
       •  …
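As an illustration of the first option (has the student estimated a certain model correctly?), a test could compare the student's fitted coefficients with a reference fit. This sketch is an assumption throughout: the exercise, the variable name `fit`, the use of R's built-in `cars` data set, the helper `model_sct`, and the feedback messages are invented for the example and do not come from DataMind.

```r
# Hypothetical exercise: "Regress dist on speed using the cars data; store the model in fit"
workspace <- new.env()
eval(parse(text = "fit <- lm(dist ~ speed, data = cars)"), envir = workspace)

# Hypothetical SCT: returns the two-element result list described in the slides
model_sct <- function(ws) {
  if (!exists("fit", envir = ws, inherits = FALSE)) {
    return(list(FALSE, "Please store your model in a variable called fit"))
  }
  reference <- lm(dist ~ speed, data = cars)
  # Coefficient names and values must both match the reference model
  same <- inherits(ws$fit, "lm") &&
    isTRUE(all.equal(coef(ws$fit), coef(reference)))
  if (same) {
    list(TRUE, "Well done, the model is estimated correctly")
  } else {
    list(FALSE, "Check your formula: dist should be explained by speed")
  }
}

DM.result <- model_sct(workspace)

# A student who swaps response and predictor gets corrective feedback
swapped <- new.env()
eval(parse(text = "fit <- lm(speed ~ dist, data = cars)"), envir = swapped)
swapped_result <- model_sct(swapped)
```

The forecast-within-bounds option could follow the same pattern, replacing the coefficient comparison with a range check on the student's value.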
    36. Want to help us to build a community for learning and teaching R online? Contact us!
       Jonathan Cornelissen: Jonathan@datamind.org
       Dieter De Mesmaeker: Dieter@datamind.org
       Albert Jorissen: Albert@datamind.org
       Martijn Theuwissen: Martijn@datamind.org
    37. Q&A: Questions and Answers
    38. Survey on R and education to verify interest of community
       •  Filled out by 286 academics, professionals and students from around the globe.
       •  Majority of respondents interested in free interactive courses
       •  Most package authors willing to create free interactive tutorials
       •  Full data set of the survey and discussion of results at www.datamind.org/survey