
These slides present preliminary results from applying machine learning techniques to the analysis of Educational Robotics activities. An experiment was conducted with 197 secondary school students from Italy, in which the Lego Mindstorms EV3 programming blocks were updated to record log files containing the coding sequences designed by the students (working in teams) while solving a preliminary Robotics exercise. We used four machine learning techniques (logistic regression, support vector machine, K-nearest neighbors and random forests) to predict the students’ performance, comparing a supervised approach (using twelve indicators extracted from the log files as input for the algorithms) and a mixed approach (applying a k-means algorithm to calculate the machine learning features). The results highlighted that SVM with the mixed approach outperformed the other techniques, and that three learning styles predominantly emerged from the data mining analysis.

- 1. Analysis of Educational Robotics activities using a machine learning approach L. Cesaretti*,**, L. Screpanti*, D. Scaradozzi*,***, E. Mangina**** * Department of Information Engineering, Università Politecnica delle Marche, Ancona, Italy ** Talent srl, Osimo, Italy *** LSIS- umr CNRS 6168, Laboratoire des Sciences de l'Information et des Systèmes Equipe I&M (ESIL) **** School of Computer Science, University College Dublin, Dublin, Ireland
- 2. Why this research project? In the Educational Robotics (ER) field, researchers have identified a lack of quantitative analysis of how robotics can improve skills and increase learning achievements in students (Benitti, 2012; Alimisis, 2013). Alimisis, D. (2013). Educational robotics: Open questions and new challenges. Themes in Science and Technology Education, 6(1), 63-71. Benitti, F. B. V. (2012). Exploring the educational potential of robotics in schools: A systematic review. Computers & Education, 58(3), 978-988.
- 3. Machine learning and Data mining are everywhere! “Made for me”: that's a prediction “of what Netflix thinks you may enjoy watching, based on your own unique tastes”.
- 4. Real time monitoring of students’ activity (Registered Patent: n. 102018000009636, «APPARATO DI MONITORAGGIO DI OPERAZIONI DI ASSEMBLAGGIO E PROGRAMMAZIONE»)
- 5. Case Study: Research Questions 1. identification of different patterns in the students’ problem-solving trajectories; 2. accurate prediction of the students’ teams’ final performance; and 3. correlation of the discovered patterns of students’ problem-solving with the evaluation given by the educators. Applying data mining and machine learning methods to data collected from educational environments makes it possible to predict and classify students’ behaviours and to discover latent structural regularities in large educational datasets. Berland, M., Baker, R. S., & Blikstein, P. (2014). Educational data mining and learning analytics: Applications to constructionist research. Technology, Knowledge and Learning, 19(1-2), 205-220.
- 6. Case Study: Participants and Procedure Students from seven Italian lower and higher secondary schools, located in the Emilia Romagna and Marche regions. The total number of students involved in this study is 197. The experimentation was carried out from March 2018 to March 2019.
- 7. STUDENT’S ACTIVITY (Think – Make – Improve approach)
- 8. Case Study: the Introductory Exercise Program the robot so that it covers a given distance (1 m), trying to be as precise as possible. The students’ teams involved in the research project had to take some constraints into account: • the time within which they had to design and test their solution (15 - 20 minutes); • the teams could test the programming sequence as many times as they wanted; • they were allowed to use measuring instruments only to measure some of the robot’s parameters (for example the radius of the wheel). EVALUATION • if the error was < 4 cm, the educator considered the challenge completed; • if the error was >= 4 cm, the educator considered the challenge not completed.
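The “mathematical” strategy that the wheel-radius constraint hints at, together with the educators’ pass/fail rule, can be sketched in a few lines of Python. This is only an illustration: the 2.8 cm wheel radius, the function names and the reading of “error” as the absolute difference from 1 m are assumptions, not details given in the slides.

```python
import math

TARGET_CM = 100.0    # distance required by the exercise (1 m)
THRESHOLD_CM = 4.0   # error limit applied by the educators


def rotation_degrees(distance_cm: float, wheel_radius_cm: float) -> float:
    """Wheel rotation (in degrees) needed to cover distance_cm without slipping."""
    circumference_cm = 2 * math.pi * wheel_radius_cm
    return 360.0 * distance_cm / circumference_cm


def challenge_completed(measured_cm: float) -> bool:
    """Slide rule: completed only if the error is below 4 cm.

    Assumption: the error is the absolute difference from the 1 m target.
    """
    return abs(measured_cm - TARGET_CM) < THRESHOLD_CM


# 2.8 cm is an assumed wheel radius, measured by the team in the real activity.
degrees = rotation_degrees(TARGET_CM, wheel_radius_cm=2.8)
print(f"Rotate the wheels by about {degrees:.0f} degrees")
```

A team that measures the wheel and computes the rotation this way needs few trials; a tinkering team would instead adjust the motor parameter test after test.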
- 9. Case Study: how to represent the participants' activities in the robot programming activity Students’ teams designed 1113 programming sequences to solve the introductory Exercise. Each programming test carried out by a students’ team can be represented as a vector composed of these 12 elements: ● Motors: the n° of Motor blocks in the sequence; ● Loops: the n° of Loop blocks in the sequence; ● Conditionals: the n° of Conditional and Sensors blocks in the sequence; ● Others: the n° of blocks in the sequence belonging to categories other than Motors, Loops and Conditionals; ● Added: the n° of blocks added, compared to the previous sequence; ● Deleted: the n° of blocks deleted, compared to the previous sequence; ● Changed: the n° of blocks changed, compared to the previous sequence; ● Equal: the n° of unchanged blocks, compared to the previous sequence; ● Delta Motors: amount of change in Motor block parameters (first, second or third parameter), compared to the previous sequence (calculated only for blocks of the “Changed” category); ● Delta Loops: amount of change in Loop block parameters, compared to the previous sequence; ● Delta Conditionals: amount of change in Conditional block parameters, compared to the previous sequence; ● Delta Others: amount of change in Other block parameters, compared to the previous sequence.
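As a rough illustration of the 12-element representation, the sketch below builds the vector for one programming sequence from the sequence before it. The `Block` class, the `trial_vector` function and, above all, the position-by-position comparison of the two sequences are assumptions made for this example; the slides do not describe how the logging system aligns consecutive sequences or how it measures parameter change.

```python
from dataclasses import dataclass

CATEGORIES = ("Motors", "Loops", "Conditionals", "Others")


@dataclass(frozen=True)
class Block:
    """Hypothetical record for one programming block."""
    category: str        # one of CATEGORIES
    params: tuple = ()   # numeric block parameters


def trial_vector(current, previous):
    """Return the 12 indicators for `current`, compared with `previous`.

    Assumption: blocks are compared position by position, and the amount of
    change is the sum of absolute parameter differences.
    """
    counts = [sum(b.category == c for b in current) for c in CATEGORIES]
    added = max(len(current) - len(previous), 0)
    deleted = max(len(previous) - len(current), 0)
    changed = equal = 0
    deltas = dict.fromkeys(CATEGORIES, 0.0)
    for new, old in zip(current, previous):
        if new == old:
            equal += 1
            continue
        changed += 1
        if new.category == old.category:
            deltas[new.category] += sum(
                abs(a - b) for a, b in zip(new.params, old.params)
            )
    return counts + [added, deleted, changed, equal] + [deltas[c] for c in CATEGORIES]


first = [Block("Motors", (50, 3.0))]
second = [Block("Motors", (50, 5.5)), Block("Others")]
vector = trial_vector(second, first)
print(vector)
```

Here the second test keeps one Motor block with a modified parameter and appends one block from another category, so the Changed, Added and Delta Motors entries are the ones that move.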
- 10. Case Study: Results SUPERVISED APPROACH The feature matrix was created by calculating the mean value and the standard deviation of each indicator presented in the previous section, taking into account all the trials performed by a students’ team to solve an exercise; the authors then compared the performance of four machine learning algorithms in predicting the students’ teams’ final result: • Logistic Regression • Support Vector Machine • k-nearest neighbors • Random Forest classifier. MIXED APPROACH Characterized by the benefits of both supervised and unsupervised methods: a k-means algorithm was applied to determine the clusters into which the programming sequences could be divided. Following the clustering, the percentage of each cluster in the programming activity of each students’ group was calculated, and these new features were then used to create a feature matrix as input for the four supervised algorithms cited above.
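The difference between the two feature matrices can be made concrete with a small scikit-learn sketch. Everything here is illustrative: the data are random stand-ins for the log files, the labels are made up, the variable names are hypothetical, and the choice of eight clusters simply mirrors the eight clusters shown in the clustering chart later in the deck.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.svm import SVC

rng = np.random.default_rng(0)

# Synthetic stand-in for the logs: 20 teams, each with 3 to 9 trials of 12 indicators.
trials = {team: rng.random((rng.integers(3, 10), 12)) for team in range(20)}
labels = np.array([team % 2 for team in range(20)])  # 1 = completed (made-up labels)

# Supervised approach: mean and standard deviation of each indicator per team.
X_supervised = np.array(
    [np.concatenate([t.mean(axis=0), t.std(axis=0)]) for t in trials.values()]
)

# Mixed approach: cluster all trials, then describe each team by the share
# of its trials that falls in each cluster.
n_clusters = 8  # assumption, taken from the number of clusters in the chart
all_trials = np.vstack(list(trials.values()))
kmeans = KMeans(n_clusters=n_clusters, n_init=10, random_state=0).fit(all_trials)


def cluster_shares(team_trials):
    """Fraction of a team's trials assigned to each k-means cluster."""
    assigned = kmeans.predict(team_trials)
    return np.bincount(assigned, minlength=n_clusters) / len(assigned)


X_mixed = np.array([cluster_shares(t) for t in trials.values()])

# Either matrix can feed any of the four classifiers; an SVM is shown here.
model = SVC().fit(X_mixed, labels)
print(X_supervised.shape, X_mixed.shape)
```

In this sketch the supervised matrix has 24 columns (12 means and 12 standard deviations), while the mixed matrix has one column per cluster and each row sums to 1.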
- 11. Case Study: Results − Accuracy − Mean Precision (the average of the precision in predicting students’ positive performance and negative performance) − Mean Recall (the average of the recall in predicting students’ positive performance and negative performance) − Mean F1-Score (the average of the F1-score in predicting students’ positive performance and negative performance) To obtain these parameters, a repeated 10-fold cross-validation was performed: the 10-fold validation was repeated multiple times, and the average value and standard deviation of the four parameters above were calculated.
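One way to obtain these four figures is scikit-learn's repeated cross-validation with macro-averaged scorers, which take the unweighted mean of the per-class precision, recall and F1. The sketch below runs on synthetic data; the stratified splitter, the five repetitions and the dictionary keys are assumptions, since the slides only say that the 10-fold validation was repeated multiple times.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import RepeatedStratifiedKFold, cross_validate
from sklearn.svm import SVC

# Synthetic binary problem standing in for the teams' feature matrix.
X, y = make_classification(n_samples=100, n_features=8, random_state=0)

# 10 folds repeated 5 times (the number of repetitions is an assumption).
cv = RepeatedStratifiedKFold(n_splits=10, n_repeats=5, random_state=0)

# "macro" scorers average the per-class value over the two classes.
scoring = {
    "accuracy": "accuracy",
    "mean_precision": "precision_macro",
    "mean_recall": "recall_macro",
    "mean_f1": "f1_macro",
}
scores = cross_validate(SVC(), X, y, cv=cv, scoring=scoring)

# Average value and standard deviation over all 50 folds.
summary = {
    name: (scores[f"test_{name}"].mean(), scores[f"test_{name}"].std())
    for name in scoring
}
for name, (mean, std) in summary.items():
    print(f"{name}: {mean:.3f} +/- {std:.3f}")
```

Swapping `SVC()` for another estimator gives comparable numbers for the other three classifiers under the same splits, because the splitter uses a fixed `random_state`.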
- 12. Case Study: Clustering Results
- 13. Case Study: Problem solving patterns [Chart: number of programming sequences per cluster (Cluster 1 to Cluster 8) and number of trials, for different teams’ behaviours: Mathematical / Planning; Tinkering (with prevalent refining behaviour); Tinkering (with significantly high changes).] Pearson correlation coefficient (PCC) between the features extracted with the k-means algorithm and the final results obtained by the students’ teams: two features show a statistically significant negative correlation: • number of trials (PCC = -0.48, p-value < 0.001); • percentage of sequences in the cluster named “HIGH MOTORS PARAMETERS CHANGE” (PCC = -0.39, p-value < 0.01). High values of these features (typical of the “Tinkering with significantly high changes” teams) indicate a higher probability of a negative performance.
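A correlation of this kind, with its p-value, can be computed with `scipy.stats.pearsonr`. The example below uses invented data in which teams that fail tend to need more trials; the 0/1 encoding of the final result and all the numbers are assumptions for illustration and do not reproduce the values reported on the slide.

```python
import numpy as np
from scipy.stats import pearsonr

rng = np.random.default_rng(0)

# Made-up outcome for 60 teams: 1 = challenge completed, 0 = not completed.
completed = rng.integers(0, 2, size=60)

# Made-up feature: unsuccessful teams need more trials on average.
n_trials = 20 - 8 * completed + rng.normal(0, 4, size=60)

# pearsonr returns the coefficient and the two-sided p-value.
r, p_value = pearsonr(n_trials, completed)
print(f"PCC = {r:.2f}, p-value = {p_value:.3g}")
```

A negative coefficient here means the same thing as on the slide: the higher the feature, the less likely the positive outcome.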
- 14. Case Study: Mathematical behaviour FINAL ERROR = 1 CM
- 15. Case Study: Tinkering behaviour (with refining) FINAL ERROR = 1.5 CM
- 16. Case Study: Tinkering behaviour (with large changes) [Chart: Delta Motors (0 to 50) against test number (1 to 29).] FINAL ERROR = 50 CM
- 17. Conclusions • Compare the performance of the machine learning techniques already applied (SVM, Logistic Regression, KNN, Random Forest) with an MLP Neural Network → Larger Dataset • Another improvement for the future development of this study will be time tracking in the log files generated by the system. • The authors also intend to use recurrent neural networks, in particular long short-term memory autoencoders (a structure specifically designed to support sequences of input data), in order to translate the programming sequences created by students into fixed-length vectors (a compressed representation of the input data) while maintaining a high level of information content. • Another planned development is the update of the current system design with a personalised e-learning system: an educational recommender system could give real-time feedback to teachers and students involved in Educational Robotics activities, or propose personalised learning paths to learners.
- 18. Thank you! Questions? Contacts: Lorenzo Cesaretti l.cesaretti@pm.univpm.it David Scaradozzi d.scaradozzi@univpm.it Laura Screpanti l.screpanti@pm.univpm.it