Miguel Santamaría Lancho, Mauro Hernández, Ángeles Sánchez-Elvira, José María Luzón Encabo, Guillermo de Jorge-Botana
UNED, Spain
Using semantic technologies for giving a formative
assessment and supporting scoring in large courses
and MOOCs: first experiences at UNED (2015-2017)
Our goal was to improve formative assessment in online courses by giving personalised feedback
Economic History Teachers Team (Faculty of Economics, Department of Economic History and Applied Economics): Miguel Santamaría, Mauro Hernández
G-Rubric software developers (Faculty of Psychology, Department of Developmental and Educational Psychology): José M. Luzón, Guillermo de Jorge
G-Rubric user (Department of Personality): Ángeles Sánchez-Elvira
Summary
1. Our challenge: How semantic technologies
could help us to:
• give personalised feedback on open-ended questions
• support our tutors to score TMAs (tutor-marked assignments) in a more reliable way
2. What G-Rubric is and how it works
3. Analysis of our experiences giving automatic
formative feedback on open-ended questions
4. Proposal about how G-Rubric could cope with
problems related to manual grading
5. Results and conclusions
How to give personalised feedback on
open-ended activities
FEEDBACK IS THE KEY FACTOR FOR
• Personalising learning
• Fostering performance improvement
• Increasing motivation
01/11/2017 msantamaria@cee.uned.es
CHARACTERISTICS OF EXPECTED FEEDBACK
What kind of feedback do our students expect?
• Quick
• Iterative
• They love learning by trial and error
Only technology can provide this kind of feedback
Technology-based feedback offers limited solutions
In the classroom
• “Clickers” (Socrative, Kahoot)
In e-learning platforms
• Quizzes
• Adaptive quizzes
Quizzes have severe limitations for assessing learning outcomes in the field of economic history
LEARNING OUTCOMES
• Knowledge about Economic History
• Soft skills:
  • Analysis
  • Critical thinking
ASSESSMENT ACTIVITIES
• Multiple choice questions
• Open-ended short questions about concepts, historical processes, etc.
• Written commentaries on texts, maps, graphs and statistical data
Our challenge was how to give:
• quick and iterative feedback
• for open-ended questions
• in a sustainable way
• by using technologies
WHAT G-RUBRIC IS AND HOW IT WORKS
To implement G-Rubric in a subject we need to follow 3 steps
1st step: Build a specialized linguistic corpus and a semantic space (corpus: 6 Economic History textbooks)
2nd step: Develop activities based on short open-ended questions
3rd step: Deliver the activities to our students through a web interface
[Diagram labels: Corpus, Semantic space, Canon answer, In-built rubric space, Students, Web interface, Answer, Feedback]
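The three steps above can be illustrated with a toy sketch. The slides do not describe the algorithm behind the semantic space, so this example makes a deliberately simple assumption: each text is a vector of word counts, and a student answer is compared with the canon answer by cosine similarity. The function names (`tokenize`, `cosine_similarity`) and the sample texts are hypothetical. A semantic space built from a textbook corpus would capture far more than literal word overlap.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def cosine_similarity(text_a, text_b):
    """Cosine of the angle between the word-count vectors of two texts."""
    counts_a = Counter(tokenize(text_a))
    counts_b = Counter(tokenize(text_b))
    dot = sum(counts_a[w] * counts_b[w] for w in counts_a)
    norm_a = math.sqrt(sum(c * c for c in counts_a.values()))
    norm_b = math.sqrt(sum(c * c for c in counts_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty text has no direction to compare
    return dot / (norm_a * norm_b)


# Hypothetical canon answer and two student attempts (illustration only).
canon = "mercantilism policies aimed at trade balance surpluses and precious metals"
weak_answer = "it was an economic idea"
better_answer = "mercantilism sought trade balance surpluses and precious metals"

print(round(cosine_similarity(canon, weak_answer), 2))
print(round(cosine_similarity(canon, better_answer), 2))
```

In this simplified picture, the second attempt scores higher only because it shares more words with the canon answer.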
Example of a G-Rubric open-ended activity
Question
Mercantilism: policies and objectives.
Canon answer (or golden text)
“Mercantilism is a set of ideas and policies deployed in early modern Europe
(16th, 17th and 18th centuries) aimed at strengthening the State through
economic power, and especially focused on trade-balance surpluses and
accumulation of precious metals (bullionism).
There are several types of policies, notably: a) those focused on obtaining trade-balance
surpluses (tariff protectionism, prohibition on exporting gold, silver or
raw materials, privileged trading companies, navigation acts, colonial
monopolies); b) promotion of manufactures (import tariffs or prohibitions, laws
against luxury, royal manufactures); c) other policies: favoring the birth rate,
limiting or fixing domestic prices.
They are often associated with the name of Colbert in France, or with the English and
Dutch East India companies (VOC).”
Conceptual axes
Definition: mercantilism, ideas, practices, state, economy, monarchy, strengthen, reinforce,
increase, trade balance, favorable, bullionism, precious metals, gold, silver, privileges.
Trade policies: trade, protectionism, tariffs, prohibition, exports, imports, privileged
companies, navigation acts, colonies, monopoly, fleet, merchant, surplus
Manufacturing policies: manufactures, factories, royal, luxury, import substitution
Context: Europe, England, France, Holland, Colbert, XVI, XVII, XVIII, modern, VOC, East Indies,
West Indies.
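As a rough illustration of how conceptual axes could drive per-axis feedback, the sketch below scores an answer on each axis as the share of that axis's keywords it mentions. This is an assumption made for illustration: the slides do not say how G-Rubric computes axis positions. The keyword lists are abridged from the axes above, and `ACCEPTANCE_THRESHOLD`, `axis_scores` and `axes_to_improve` are hypothetical names.

```python
import re

# Abridged keyword lists taken from the conceptual axes above.
AXES = {
    "Definition": ["mercantilism", "ideas", "state", "trade balance", "precious metals"],
    "Trade policies": ["protectionism", "tariffs", "monopoly", "colonies", "surplus"],
    "Manufacturing policies": ["manufactures", "factories", "luxury"],
    "Context": ["europe", "france", "england", "colbert", "voc"],
}

ACCEPTANCE_THRESHOLD = 0.4  # hypothetical cut-off, not a G-Rubric parameter


def axis_scores(answer):
    """Return, for each axis, the share of its keywords found in the answer."""
    text = " ".join(re.findall(r"[a-z]+", answer.lower()))
    padded = f" {text} "  # padding lets us match whole words and phrases
    scores = {}
    for axis, keywords in AXES.items():
        hits = sum(1 for kw in keywords if f" {kw} " in padded)
        scores[axis] = hits / len(keywords)
    return scores


def axes_to_improve(answer):
    """List the axes whose score falls below the acceptance threshold."""
    return [a for a, s in axis_scores(answer).items() if s < ACCEPTANCE_THRESHOLD]


first_attempt = (
    "Mercantilism is a set of ideas and policies deployed in early modern "
    "Europe aimed at strengthening the State through economic power."
)
print(axis_scores(first_attempt))
print(axes_to_improve(first_attempt))
```

An answer that only defines mercantilism would, under this scheme, be pointed toward the trade, manufacturing and context axes.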
An example to understand how G-Rubric works
G-Rubric web interface
The student selects an activity
1. Mercantilism
2. Triangular Trade
3. Coal and the Industrial Revolution
4. Gerschenkron
5. Second Industrial Revolution
6. Consequences of the First World War
7. Bretton Woods
Selected activity: 1. Mercantilism
The student enters an answer
“Mercantilism is a set of ideas and policies deployed in early modern Europe (16th, 17th
and 18th centuries) aimed at strengthening the State through economic power, and
especially focused on trade-balance surpluses and accumulation of precious metals
(bullionism).”
After submitting an answer, the student receives feedback consisting of:
• Content grade: to what extent the answer is correct
• Style grade: grammatical accuracy
• Graphical feedback: the position of the answer on each conceptual axis (Definition, Trade, Manufact., Context) relative to the acceptance area
After checking the feedback, the student improves the answer by adding new information
“Mercantilism is a set of ideas and policies deployed in early modern Europe (16th, 17th
and 18th centuries) aimed at strengthening the State through economic power, and
especially focused on trade-balance surpluses and accumulation of precious metals
(bullionism).
Among mercantilist policies, some stand out, i.e. those focused on attaining trade-balance
surpluses through tariff protection, prohibition of exports of gold, silver and raw
materials, creation of chartered trading companies, navigation acts and commercial
monopolies.”
New feedback is provided
• The content grade goes up
• The answer gets closer to the acceptance area on each conceptual axis
EXPERIENCES CARRIED OUT
IN 2015 AND 2016
Providing personalized formative assessment
Experiences using G-Rubric in 2015 and 2016
• The trials carried out were focused on providing formative assessment
• Our goal was to promote deep learning through iterative feedback
• G-Rubric offers two main advantages regarding formative assessment:
  • It allows as many attempts as lecturers set
  • It gives the students immediate, rich feedback
• All trials were conducted with first-year Business Administration degree students
Two experiences (2015 and 2016): goals
2015: 132 volunteers. 2016: 120 volunteers.
OUR QUESTIONS
• Can G-Rubric give accurate feedback?
• Does the feedback lead to better subsequent answers?
• Does rich feedback increase the time devoted to the activity?
STUDENTS' OPINIONS ABOUT
• The impact on their motivation
• The usefulness for preparing the final exam
• Their level of agreement with the grades received
The enriched graphical feedback increases:
• The number of trials performed by the students
• The amount of time devoted to the task
Content grade improvement
[Chart: average percentage score on the first and last attempt for Activities 1 to 7]
The average percentage score increases between the first and the last attempt
We could verify that students who used the feedback were able to improve their answers
Students’ agreement with the grades received
The level of agreement was higher in the last trial
• First trial: 47% very much or totally agree
• Last trial: 70% very much or totally agree
G-Rubric had a positive impact on students’ motivation
Totally or very much: 65%
Totally or very much: 60%
Usefulness and positive value
• 80% of students considered G-Rubric totally or very much useful for exam preparation
• More than 80% of students considered this experience very much or totally positive
BEYOND FORMATIVE ASSESSMENT: HOW
SEMANTIC TECHNOLOGIES CAN HELP
TUTORS TO MARK TMAs
Are humans reliable when marking open-ended questions?
Manual grading has at least two problems:
• Inter-examiner variability, depending on who marked the task
• Intra-examiner reliability, depending on when the same tutor marked the task
Students view manual grading of open-ended questions as subjective
➢ In contrast, automated test assessment is perceived as more objective
Accidental double grading (2012 & 2013)
Two members of the academic team, independently and unknowingly, graded
the same exams.
• The differential averaged 1.5 points out of 8
• Final grades differed substantially: in 37.5% of cases the student would not obtain a passing grade from one of the two examiners
Figure 5. Differential in grades for doubly-assessed exams (June 2012)*
[Chart: grade differential in points (axis from -1.5 to 4) for exams 1 to 24; series: "Essay + short questions grade differential" and "Essay grade differential"]
*Refers to 24 Economic History final exams from the Barcelona-CUXAM Regional Center (June 2012)
Accidental double grading (2013)
Figure 6. Differential in grades for doubly-assessed exams (June 2013)*
[Chart: essay + short questions differential (Grade 2 - Grade 1) in points (axis from -1 to 4); x-axis: exam number]
*Refers to 76 Economic History final exams from the Valencia-Alzira Regional Center (June 2013)
• We found the same pattern
• The differential averaged 0.9 points out of 8
• Final grades differed substantially: in 21% of cases the student would not obtain a passing grade from one of the two examiners
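The two summary figures quoted for each year can be computed from paired grades along the following lines. This is a minimal sketch under stated assumptions: the grades are invented, the pass mark of 4 out of 8 is hypothetical, and the second statistic is read as "the share of exams whose pass/fail outcome depends on the examiner", which is an interpretation of the slides, not a confirmed definition.

```python
def grading_discrepancy(grades_a, grades_b, pass_mark):
    """Summarise disagreement between two examiners who graded the same exams.

    Returns the mean absolute differential and the share of exams whose
    pass/fail outcome depends on which examiner graded them.
    """
    if len(grades_a) != len(grades_b) or not grades_a:
        raise ValueError("need two non-empty grade lists of equal length")
    pairs = list(zip(grades_a, grades_b))
    diffs = [abs(a - b) for a, b in pairs]
    flips = sum((a >= pass_mark) != (b >= pass_mark) for a, b in pairs)
    return sum(diffs) / len(diffs), flips / len(diffs)


# Hypothetical grades out of 8 for five exams (not the UNED data).
examiner_1 = [3.0, 5.5, 4.5, 6.0, 2.0]
examiner_2 = [4.5, 5.0, 3.5, 7.0, 2.5]

mean_diff, flip_share = grading_discrepancy(examiner_1, examiner_2, pass_mark=4.0)
print(mean_diff, flip_share)
```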
Correlations between grades assigned by examiners
                   2012   2013
n                  20     76
Global grade       0.82   0.88
Short questions    0.85   0.87
Text commentary    0.70   0.67
Despite these differences between examiners we found:
• A high correlation on the global score and short questions
• Lower correlation on text commentary grades
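For readers who want to reproduce this kind of inter-examiner figure, a Pearson correlation between two examiners' grades can be computed as sketched below. The function name `pearson_r` and the sample grades are hypothetical. The slides do not state which software produced the coefficients in the table.

```python
import math


def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length samples."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two samples of equal length with at least 2 values")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when a sample has no variance")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical grades given by two examiners to the same five exams.
examiner_1 = [3.0, 5.5, 4.5, 6.0, 2.0]
examiner_2 = [4.5, 5.0, 3.5, 7.0, 2.5]
print(round(pearson_r(examiner_1, examiner_2), 2))
```

A high coefficient only says the two examiners rank exams similarly. It does not rule out a systematic gap between their grades, which is why the mean differential is reported separately.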
Comparing how tutors and G-Rubric mark TMAs
• G-Rubric could cope simultaneously with both problems:
  • Inter-examiner variability
  • Intra-examiner reliability
• A fragment of "The Wealth of Nations", by Adam Smith, was selected for the students to comment on.
• A rubric was built to minimise inter-examiner variability.
• A G-Rubric object, similar to those described above, was designed and its
axes were aligned with the rubric used by tutors to mark the students'
assignments
• The tutors graded these assignments using the rubric
• The teaching team used G-Rubric to grade the students' TMAs again
• 252 TMAs were double-graded to compare G-Rubric's and tutors' marks
Our first step has been to compare how tutors and G-Rubric grade TMAs
What have we found comparing grades given by tutors and G-Rubric?
1. No significant difference between means.
Main descriptives of tutors' and G-Rubric marks (N=252)
                  M      SD     Min    Max
Tutors' marks     5.95   1.45   1.55   8.54
G-Rubric marks    5.92   1.61   2.13   9.20
An independent samples t test yielded no significant differences between the means of
tutors' and G-Rubric marks, t(251), p=.720, ns.
2. Pearson correlation between G-Rubric's and tutors' marks
yielded a large effect size (.549**).
** The correlation is significant at the 0.01 level (two-tailed).
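A comparison of means on double-graded assignments can be sketched as follows. The reported degrees of freedom, t(251) with N = 252, equal n - 1, which is what a test on paired differences gives, so the sketch assumes a paired design. That reading is an inference from the numbers, not something the slides state. The function name `paired_t` and the sample marks are hypothetical.

```python
import math


def paired_t(x, y):
    """Paired-samples t statistic and its degrees of freedom (n - 1)."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two samples of equal length with at least 2 values")
    diffs = [a - b for a, b in zip(x, y)]
    n = len(diffs)
    mean_d = sum(diffs) / n
    var_d = sum((d - mean_d) ** 2 for d in diffs) / (n - 1)  # sample variance
    if var_d == 0:
        raise ValueError("t is undefined when all differences are identical")
    return mean_d / math.sqrt(var_d / n), n - 1


# Hypothetical marks for four assignments graded by a tutor and by software.
tutor_marks = [5.0, 6.0, 7.0, 8.0]
software_marks = [4.0, 6.0, 6.0, 7.0]
t_stat, df = paired_t(tutor_marks, software_marks)
print(t_stat, df)
```

The p-value would then be read from a t distribution with `df` degrees of freedom. That step is left out to keep the sketch free of external libraries.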
Grade distributions: analysis of frequencies
Percentage of grades in each points range
Points range   Tutors' grades   G-Rubric's grades
0 to 1         0                0
1 to 2         0.79             0
2 to 3         4.37             4.76
3 to 4         6.75             9.92
4 to 5         6.35             15.08
5 to 6         30.56            17.06
6 to 7         28.57            22.62
7 to 8         14.68            21.83
8 to 9         7.94             7.94
9 to 10        0                0.79
3. G-Rubric’s marks were more homogeneously distributed, whereas
the tutors’ marks were more concentrated in the ranges
between 5 and 7 points
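A frequency analysis of this kind amounts to counting grades per one-point range and converting the counts to percentages. The sketch below assumes that each range includes its lower bound and excludes its upper bound, with a grade of exactly 10 placed in the last range. The slides do not state the binning rule, and `percent_per_range` and the sample grades are hypothetical.

```python
def percent_per_range(grades, max_grade=10):
    """Percentage of grades falling in each one-point range [k, k + 1)."""
    if not grades:
        raise ValueError("need at least one grade")
    counts = [0] * max_grade
    for g in grades:
        if not 0 <= g <= max_grade:
            raise ValueError(f"grade {g} is outside 0..{max_grade}")
        index = min(int(g), max_grade - 1)  # the top grade goes in the last range
        counts[index] += 1
    return [100 * c / len(grades) for c in counts]


# Hypothetical grades, only to show the shape of the output.
sample = [5.2, 5.9, 6.1, 9.9]
for low, pct in enumerate(percent_per_range(sample)):
    print(f"{low} to {low + 1}: {pct:.2f}%")
```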
Analysis of the homogeneity of G-Rubric and tutors’ marks
Kruskal-Wallis analyses for the evaluation of mark homogeneity between the 37 tutoring groups
              Tutor mark   G-Rubric mark   Mark difference
Chi-square    69.14        47.21           74.49
df            36           36              36
p             .001         .100            .000
4. Tutors’ marks showed significant inter-group variability,
as did the mark difference.
In contrast, G-Rubric marks did not differ significantly between
these same tutorial groups, which indicates a higher level of homogeneity.
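The Kruskal-Wallis statistic reported in the table can be sketched in a few lines: all marks are ranked together, and H measures how far each group's rank sum departs from what chance would give. The version below is a simplified sketch that omits the correction for ties. The group data are hypothetical, and `kruskal_wallis_h` is an illustrative name, not the tool the authors used. With 37 tutoring groups the degrees of freedom are 37 - 1 = 36, as in the table.

```python
def kruskal_wallis_h(groups):
    """Kruskal-Wallis H statistic (no tie correction) and degrees of freedom.

    Every group must contain at least one value.
    """
    pooled = sorted((value, gi) for gi, group in enumerate(groups) for value in group)
    n_total = len(pooled)
    rank_sums = [0.0] * len(groups)
    i = 0
    while i < n_total:
        j = i
        # Find the run of tied values starting at position i.
        while j + 1 < n_total and pooled[j + 1][0] == pooled[i][0]:
            j += 1
        average_rank = (i + j) / 2 + 1  # ranks are 1-based; ties share the mean rank
        for k in range(i, j + 1):
            rank_sums[pooled[k][1]] += average_rank
        i = j + 1
    h = 12 / (n_total * (n_total + 1)) * sum(
        r * r / len(g) for r, g in zip(rank_sums, groups)
    ) - 3 * (n_total + 1)
    return h, len(groups) - 1


# Hypothetical marks from three tutoring groups.
tutoring_groups = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
print(kruskal_wallis_h(tutoring_groups))
```

H is compared against a chi-square distribution with the returned degrees of freedom, which is why the table labels the statistic as chi-square.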
CONCLUSIONS
Main conclusions
• Automated-assessment software such as G-Rubric is currently
mature enough to be used with students.
• The kind of feedback offered was useful to improve the students’
performance
• Results in terms of students’ satisfaction are also encouraging.
• For teachers, the time and effort required is affordable.
• A remarkable correlation and no significant difference
between means were found between tutors’ and G-Rubric marks.
• Tutors’ scores showed significant inter-group variability
• In contrast, G-Rubric’s marks did not differ significantly
between these same tutorial groups, which indicates a
higher level of homogeneity
Regarding how G-Rubric could support grading
Our proposal:
Students’ essays will be graded first using G-Rubric;
afterwards, tutors will grade them again to validate or modify the
grades given.
Download page
http://www.elsemantico.es/gallito20/download-eng.html
References
Cascón, L., & Antonio, J. (1989). Comprensión y memoria de textos expositivos: diferencias entre sujetos expertos y novatos. Recuperado a partir de
https://repositorio.uam.es/handle/10486/4362
Forsman, S. (1985). Writing to learn means learning to think. Roots in the Sawdust, 162–174.
Hernández, M., & Santamaría Lancho, M. (s. f.). G-Rubric: una aplicación para corrección automática de preguntas abiertas. Primer balance de su utilización. G-Rubric:
an application for automatic assessment of free-text questions: first outcome analysis. Recuperado a partir de http://www.xiiedhe.unican.es/wp-
content/uploads/2016/04/hernandezsantamaria.pdf
Jorge Botana, G. (2010). La técnica del análisis de la Semántica Latente (LSA/LSI) como modelo informático de la comprensión del texto y el discurso una aproximación
distribuida al análisis semántico. Universidad Autónoma de Madrid. Recuperado a partir de https://dialnet.unirioja.es/servlet/tesis?codigo=27624
Jorge-Botana, G., Leon, J. A., Olmos, R., & Escudero, I. (2010). Latent semantic analysis parameters for essay evaluation using small-scale corpora*. Journal of
Quantitative Linguistics, 17(1), 1–29.
Jorge-Botana, G., León, J. A., Olmos, R., & Hassan-Montero, Y. (2010). Visualizing polysemy using LSA and the predication algorithm. Journal of the American Society
for Information Science and Technology, 61(8), 1706–1724.
Jorge-Botana, G., Olmos, R., & Barroso, A. (2012). The Construction-Integration framework: a means to diminish bias in LSA-based call routing. International Journal
of Speech Technology, 15(2), 151–164.
Jorge-Botana, G., Olmos, R., & Barroso, A. (2013). Gallito 2.0: A natural language processing tool to support research on discourse. En Proceedings of the 13th Annual
Meeting of the Society for Text and Discourse. Recuperado a partir de http://elsemantico.es/Documentos/Gallito2_Valencia_new.pdf
Jorge-Botana, G., Olmos, R., & León, J. A. (2009). Using latent semantic analysis and the predication algorithm to improve extraction of meanings from a diagnostic
corpus. The Spanish journal of psychology, 12(02), 424–440.
Julià, J. M. (1999). Aprendizaje a través de la escritura. Actas de las V Jornadas de Enseñanza Universitaria de Informática, Jenui, 99, 205–210.
Olmos, R., Jorge-Botana, G., León, J. A., & Escudero, I. (2014). Transforming selected concepts into dimensions in latent semantic analysis. Discourse Processes, 51(5-
6), 494–510.
Olmos, R., León, J. A., Escudero, I., & Jorge-Botana, G. (2009). Análisis del tamaño y especificidad de los corpus en la evaluación de resúmenes mediante el LSA: Un
análisis comparativo entre LSA y jueces expertos. Revista signos, 42(69), 71–81.
Olmos, R., León, J. A., Escudero, I., & Jorge-Botana, G. (2011). Using latent semantic analysis to grade brief summaries: some proposals. International Journal of
Continuing Engineering Education and Life Long Learning, 21(2-3), 192–209.
Olmos, R., León, J. A., Jorge-Botana, G., & Escudero, I. (2009). New algorithms assessing short summaries in expository texts using latent semantic analysis. Behavior
Research Methods, 41(3), 944–950.
Parker, R. P., & Goodkin, V. (1987). The Consequences of Writing: Enhancing Learning in the Disciplines. ERIC. Recuperado a partir de http://eric.ed.gov/?id=ED272928
Roscoe, R. D., Allen, L. K., Weston, J. L., Crossley, S. A., & McNamara, D. S. (2014a). The Writing Pal intelligent tutoring system: Usability testing and development.
Computers and Composition, 34, 39–59.
Roscoe, R. D., Allen, L. K., Weston, J. L., Crossley, S. A., & McNamara, D. S. (2014b). The Writing Pal intelligent tutoring system: Usability testing and development.
Computers and Composition, 34, 39–59.
Roscoe, R. D., Brandon, R. D., Snow, E. L., & McNamara, D. S. (2013). Game-based writing strategy practice with the Writing Pal. Exploring technology for writing and
writing instruction, 1–20
Thank you for your
attention

More Related Content

More from UNED

El e learning el valor añadido de la formación online (14 de noviembre 2006)
El e learning el valor añadido de la formación online (14 de noviembre 2006)El e learning el valor añadido de la formación online (14 de noviembre 2006)
El e learning el valor añadido de la formación online (14 de noviembre 2006)
UNED
 
Comercio medieval de larga distancia
Comercio medieval de larga distanciaComercio medieval de larga distancia
Comercio medieval de larga distancia
UNED
 
Innovaciones agrarias en la Alta Edad Media
Innovaciones agrarias en la Alta Edad MediaInnovaciones agrarias en la Alta Edad Media
Innovaciones agrarias en la Alta Edad Media
UNED
 
UNx comunidad iberoamericana de emprendimiento: sus or´gienes
UNx comunidad iberoamericana de emprendimiento: sus or´gienesUNx comunidad iberoamericana de emprendimiento: sus or´gienes
UNx comunidad iberoamericana de emprendimiento: sus or´gienes
UNED
 
UNx más allá de los MOOCs
UNx más allá de los MOOCsUNx más allá de los MOOCs
UNx más allá de los MOOCs
UNED
 
La formación te lleva al futuro que elijas
La formación te lleva al futuro que elijasLa formación te lleva al futuro que elijas
La formación te lleva al futuro que elijas
UNED
 

More from UNED (13)

El e learning el valor añadido de la formación online (14 de noviembre 2006)
El e learning el valor añadido de la formación online (14 de noviembre 2006)El e learning el valor añadido de la formación online (14 de noviembre 2006)
El e learning el valor añadido de la formación online (14 de noviembre 2006)
 
Un experiencia de flip teaching en la UNED. XI Encuentro de Didáctica de la H...
Un experiencia de flip teaching en la UNED. XI Encuentro de Didáctica de la H...Un experiencia de flip teaching en la UNED. XI Encuentro de Didáctica de la H...
Un experiencia de flip teaching en la UNED. XI Encuentro de Didáctica de la H...
 
MOOCs y SPOC (Small Private ONline Course) en la formación del profesorado. V...
MOOCs y SPOC (Small Private ONline Course) en la formación del profesorado. V...MOOCs y SPOC (Small Private ONline Course) en la formación del profesorado. V...
MOOCs y SPOC (Small Private ONline Course) en la formación del profesorado. V...
 
Impacto social de las TICs. Presentación en Virtual Educa Lima 2014
Impacto social de las TICs. Presentación en Virtual Educa Lima 2014Impacto social de las TICs. Presentación en Virtual Educa Lima 2014
Impacto social de las TICs. Presentación en Virtual Educa Lima 2014
 
Del Aula al infinito y más alla: ¿salto con Red?
Del Aula al infinito y más alla: ¿salto con Red?Del Aula al infinito y más alla: ¿salto con Red?
Del Aula al infinito y más alla: ¿salto con Red?
 
Principios de diseño de cursos en línea
Principios de diseño de cursos en líneaPrincipios de diseño de cursos en línea
Principios de diseño de cursos en línea
 
Comercio medieval de larga distancia
Comercio medieval de larga distanciaComercio medieval de larga distancia
Comercio medieval de larga distancia
 
Innovaciones agrarias en la Alta Edad Media
Innovaciones agrarias en la Alta Edad MediaInnovaciones agrarias en la Alta Edad Media
Innovaciones agrarias en la Alta Edad Media
 
Evolución económica durante la Edad Media
Evolución económica durante la Edad MediaEvolución económica durante la Edad Media
Evolución económica durante la Edad Media
 
Prueba
PruebaPrueba
Prueba
 
UNx comunidad iberoamericana de emprendimiento: sus or´gienes
UNx comunidad iberoamericana de emprendimiento: sus or´gienesUNx comunidad iberoamericana de emprendimiento: sus or´gienes
UNx comunidad iberoamericana de emprendimiento: sus or´gienes
 
UNx más allá de los MOOCs
UNx más allá de los MOOCsUNx más allá de los MOOCs
UNx más allá de los MOOCs
 
La formación te lleva al futuro que elijas
La formación te lleva al futuro que elijasLa formación te lleva al futuro que elijas
La formación te lleva al futuro que elijas
 

Recently uploaded

Russian Escort Service in Delhi 11k Hotel Foreigner Russian Call Girls in Delhi
Russian Escort Service in Delhi 11k Hotel Foreigner Russian Call Girls in DelhiRussian Escort Service in Delhi 11k Hotel Foreigner Russian Call Girls in Delhi
Russian Escort Service in Delhi 11k Hotel Foreigner Russian Call Girls in Delhi
kauryashika82
 
Seal of Good Local Governance (SGLG) 2024Final.pptx
Seal of Good Local Governance (SGLG) 2024Final.pptxSeal of Good Local Governance (SGLG) 2024Final.pptx
Seal of Good Local Governance (SGLG) 2024Final.pptx
negromaestrong
 

Recently uploaded (20)

Russian Escort Service in Delhi 11k Hotel Foreigner Russian Call Girls in Delhi
Russian Escort Service in Delhi 11k Hotel Foreigner Russian Call Girls in DelhiRussian Escort Service in Delhi 11k Hotel Foreigner Russian Call Girls in Delhi
Russian Escort Service in Delhi 11k Hotel Foreigner Russian Call Girls in Delhi
 
Seal of Good Local Governance (SGLG) 2024Final.pptx
Seal of Good Local Governance (SGLG) 2024Final.pptxSeal of Good Local Governance (SGLG) 2024Final.pptx
Seal of Good Local Governance (SGLG) 2024Final.pptx
 
Mixin Classes in Odoo 17 How to Extend Models Using Mixin Classes
Mixin Classes in Odoo 17  How to Extend Models Using Mixin ClassesMixin Classes in Odoo 17  How to Extend Models Using Mixin Classes
Mixin Classes in Odoo 17 How to Extend Models Using Mixin Classes
 
On National Teacher Day, meet the 2024-25 Kenan Fellows
On National Teacher Day, meet the 2024-25 Kenan FellowsOn National Teacher Day, meet the 2024-25 Kenan Fellows
On National Teacher Day, meet the 2024-25 Kenan Fellows
 
Food Chain and Food Web (Ecosystem) EVS, B. Pharmacy 1st Year, Sem-II
Food Chain and Food Web (Ecosystem) EVS, B. Pharmacy 1st Year, Sem-IIFood Chain and Food Web (Ecosystem) EVS, B. Pharmacy 1st Year, Sem-II
Food Chain and Food Web (Ecosystem) EVS, B. Pharmacy 1st Year, Sem-II
 
This PowerPoint helps students to consider the concept of infinity.
This PowerPoint helps students to consider the concept of infinity.This PowerPoint helps students to consider the concept of infinity.
This PowerPoint helps students to consider the concept of infinity.
 
PROCESS RECORDING FORMAT.docx
PROCESS      RECORDING        FORMAT.docxPROCESS      RECORDING        FORMAT.docx
PROCESS RECORDING FORMAT.docx
 
Presentation by Andreas Schleicher Tackling the School Absenteeism Crisis 30 ...
Presentation by Andreas Schleicher Tackling the School Absenteeism Crisis 30 ...Presentation by Andreas Schleicher Tackling the School Absenteeism Crisis 30 ...
Presentation by Andreas Schleicher Tackling the School Absenteeism Crisis 30 ...
 
Role Of Transgenic Animal In Target Validation-1.pptx
Role Of Transgenic Animal In Target Validation-1.pptxRole Of Transgenic Animal In Target Validation-1.pptx
Role Of Transgenic Animal In Target Validation-1.pptx
 
Energy Resources. ( B. Pharmacy, 1st Year, Sem-II) Natural Resources
Energy Resources. ( B. Pharmacy, 1st Year, Sem-II) Natural ResourcesEnergy Resources. ( B. Pharmacy, 1st Year, Sem-II) Natural Resources
Energy Resources. ( B. Pharmacy, 1st Year, Sem-II) Natural Resources
 
Micro-Scholarship, What it is, How can it help me.pdf
Micro-Scholarship, What it is, How can it help me.pdfMicro-Scholarship, What it is, How can it help me.pdf
Micro-Scholarship, What it is, How can it help me.pdf
 
Z Score,T Score, Percential Rank and Box Plot Graph
Z Score,T Score, Percential Rank and Box Plot GraphZ Score,T Score, Percential Rank and Box Plot Graph
Z Score,T Score, Percential Rank and Box Plot Graph
 
Introduction to Nonprofit Accounting: The Basics
Introduction to Nonprofit Accounting: The BasicsIntroduction to Nonprofit Accounting: The Basics
Introduction to Nonprofit Accounting: The Basics
 
Unit-IV; Professional Sales Representative (PSR).pptx
Unit-IV; Professional Sales Representative (PSR).pptxUnit-IV; Professional Sales Representative (PSR).pptx
Unit-IV; Professional Sales Representative (PSR).pptx
 
How to Give a Domain for a Field in Odoo 17
How to Give a Domain for a Field in Odoo 17How to Give a Domain for a Field in Odoo 17
How to Give a Domain for a Field in Odoo 17
 
Advanced Views - Calendar View in Odoo 17
Advanced Views - Calendar View in Odoo 17Advanced Views - Calendar View in Odoo 17
Advanced Views - Calendar View in Odoo 17
 
Measures of Central Tendency: Mean, Median and Mode
Measures of Central Tendency: Mean, Median and ModeMeasures of Central Tendency: Mean, Median and Mode
Measures of Central Tendency: Mean, Median and Mode
 
2024-NATIONAL-LEARNING-CAMP-AND-OTHER.pptx
2024-NATIONAL-LEARNING-CAMP-AND-OTHER.pptx2024-NATIONAL-LEARNING-CAMP-AND-OTHER.pptx
2024-NATIONAL-LEARNING-CAMP-AND-OTHER.pptx
 
Unit-V; Pricing (Pharma Marketing Management).pptx
Unit-V; Pricing (Pharma Marketing Management).pptxUnit-V; Pricing (Pharma Marketing Management).pptx
Unit-V; Pricing (Pharma Marketing Management).pptx
 
ComPTIA Overview | Comptia Security+ Book SY0-701
ComPTIA Overview | Comptia Security+ Book SY0-701ComPTIA Overview | Comptia Security+ Book SY0-701
ComPTIA Overview | Comptia Security+ Book SY0-701
 

Using semantic technologies for giving a formative assessment and supporting scoring in large courses and MOOCs: first experiences at UNED (2015-2017)

  • 1. Miguel Santamaría Lancho, Mauro Hernández, ,Angeles Sánchez-Elvira, José María Luzón Encabo, Guillermo de Jorge- Botana, UNED, Spain Using semantic technologies for giving a formative assessment and supporting scoring in large courses and MOOCs: first experiences at UNED (2015-2017)
  • 2. Department of Economic History and Applied Economics Department of Developmental and Educational Psychology Economic History Teachers Team G-Rubric software developers FACULTY OF ECONOMICS FACULTY OF PSYCHOLOGY Miguel Santamaria José M. Luzón Guillermo de JorgeMauro Hernández Our goal was to improve formative assessment in online courses giving personalised feedback Department of Personality Ángeles Sánchez-Elvira G-Rubric user
  • 3. Summary 1. Our challenge: How semantic technologies could help us to: • give personalised feedback on open-ended questions • support our tutors to score TMAs in a more reliable way 2. What G-Rubrics is and how it works? 3. Analysis of our experiences giving automatic formative feedback on open-ended questions 4. Proposal about how G-Rubric could cope with problems related to manual grading 5. Results and conclusions
  • 4. How to give personalised feedback on open-ended activities
  • 5. •Personalising learning • Fostering performance improvement • Increasing motivation 01/11/2017 msantamaria@cee.uned.es 5 FEEDBACK IS THE KEY FACTOR FOR
  • 6. Wich is the kind of feedback that our students expect? •Quick •Iterative • They love learning by trial and error 01/11/2017 msantamaria@cee.uned.es 6 CHARACTERISTICS OF EXPECTED FEEDBACK Only technology can provide this kind of feedback
  • 7. Feedback based on technologies offers limited solutions At classroom • “clickers” • (Socrative, Kahoot) 01/11/2017 msantamaria@cee.uned.es 7 In e-learning platforms • Quizzes • Adaptive quizzes
  • 8. Quizzes have severe limitations to assess learning outcomes on economic history field Our challenge was how to give: • quick and iterative feedback • for open-ended questions • in a sustainable way • by using technologies • Knowledge about Economic History • Soft skills: • Analysis • Critical thinking • Multiple choice questions • Open-ended short questions about concepts, historical processes, etc • Writing comments of texts, maps, graphs, statistical data LEARNING OUTCOMES ASSESSMENT ACTIVITIES
  • 9. WHAT G-RUBRIC IS AND HOW IT WORKS
  • 10. 2nd step 3rd step 1st step To build up a specialized linguistic corpus and a Semantic Space 6 Economic History textbooks Semantic SpaceCorpus Activities based on short open-ended questions should be developed To deliver the activities to our students we use a web interface Students Web interface IN-built rubric space To implement G-Rubric into a subject we need to follow 3 steps Answer Feedback Canon answer
  • 11. Example of a G-Rubric open-ended activity Question Canon answer Or Golden text Conceptual axes Mercantilism: policies and objectives. “Mercantilism is a set of ideas and policies deployed in early modern Europe (16th, 17th and 18th centuries) aimed at strengthening the State through economic power, and specially focused on trade-balance surpluses and accumulation of precious metals (bullionism). The are several types of policies, emphasizing: a) those focused on obtaining trade balance surpluses (tariff protectionism, prohibition on exporting gold or silver or raw materials, privileged trading companies, shipping records, colonial monopolies); B) promotion of manufactures (import tariffs or prohibitions, laws against luxury, real manufactures); C) other policies: favoring the birth rate, limitation or rate of interior prices. They are often associated with the names of Colbert in France, or the English or Dutch companies of India (VOC). Definition : mercantilism, ideas, practices, state, economy, monarchy, strengthen, reinforce, increase, trade balance, favorable, bullonism, precious metals, gold, silver, privileges. Trade policies: trade, protectionism, tariffs, prohibition, exports, imports, privileged companies, records of navigation, colonies, monopoly, fleet, merchant, surplus Manufacturing policies: manufactures, factories, real, luxury, import substitution Context: Europe, England, France, Holland, Colbert, XVI, XVII, XVIII, modern, VOC, East Indies, West Indies.
  • 12. An example to understand how G-Rubric works 01/11/2017 msantamaria@cee.uned.es 12 G-Rubric web interface
  • 13. The student selects an activity 01/11/2017 msantamaria@cee.uned.es 13 1.-Mercantilism 2.- Triangular Trade 3.- Coal and Ind. Rev. 4.- Gerschenkron 5.- Second Industrial Revolution 6.- Consequences of IWW 7.- Bretton Woods 1.- Mercantilism
  • 14. The student introduces the answer 01/11/2017 msantamaria@cee.uned.es 14 “Mercantilism is a set of ideas and policies deployed in early modern Europe (16th, 17th and 18th centuries) aimed at strengthening the State through economic power, and specially focused on trade-balance surpluses and accumulation of precious metals (bullionism).
  • 15. After submitting an answeer the students receive feedback consisting of 01/11/2017 msantamaria@cee.uned.es 15 “Mercantilism is a set of ideas and policies deployed in early modern Europe (16th, 17th and 18th centuries) aimed at strengthening the State through economic power, and specially focused on trade-balance surpluses and accumulation of precious metals (bulionism). Content grade Graphical feedback Style grade Acceptance area Definition Trade Manufact Context Grammatical accuracy to what extent the answer is correct.
  • 16. After checking th feedback 01/11/2017 msantamaria@cee.uned.es 16 The student improves their answer by adding new information “Mercantilism is a set of ideas and policies deployed in early modern Europe (16th, 17th and 18th centuries) aimed at strengthening the State through economic power, and specially focused on trade-balance surpluses and accumulation of precious metals (bulionism). Amongst mercantilist polices, some outstand, i.e. those focused on attaining surpluses in trade balance through tariff protection, prohibition of exports of gold, silver and raw materials, creation of chartered trade companies, navigation acts and commercial monopolies”.
  • 17. New feedback is provided: the content grade goes up and the answer gets closer to the acceptance area on each conceptual axis
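The iterative loop in slides 14 to 17 (answer, per-axis feedback, improved answer) can be illustrated with a toy sketch. The slides do not describe G-Rubric's semantic model, so the code below substitutes a plain keyword-overlap measure; the `AXES` lists (abridged from slide 11), the function names and the two sample answers are all assumptions made for illustration, not G-Rubric's actual method.

```python
import re

# Hypothetical keyword lists, abridged from the conceptual axes on slide 11.
AXES = {
    "Definition": {"mercantilism", "ideas", "state", "gold", "silver"},
    "Trade": {"trade", "tariffs", "exports", "monopoly", "surplus"},
    "Manufacturing": {"manufactures", "factories", "luxury"},
    "Context": {"europe", "france", "colbert", "modern"},
}


def tokenize(text):
    """Lowercase the text and return the set of alphabetic words it contains."""
    return set(re.findall(r"[a-z]+", text.lower()))


def axis_coverage(answer):
    """Return, per axis, the fraction of that axis's keywords found in the answer."""
    words = tokenize(answer)
    return {axis: len(words & keys) / len(keys) for axis, keys in AXES.items()}


# Two hypothetical attempts: the second adds information on trade policies.
first = "Mercantilism is a set of ideas deployed in early modern Europe to strengthen the State."
second = first + " Policies sought a trade surplus through tariffs and a ban on exports of gold and silver."

before = axis_coverage(first)
after = axis_coverage(second)
for axis in AXES:
    print(f"{axis}: {before[axis]:.2f} -> {after[axis]:.2f}")
```

In this toy version the second attempt raises coverage on the Definition and Trade axes and leaves the other two unchanged, which mirrors the kind of per-axis movement the graphical feedback is meant to show.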
  • 18. EXPERIENCES CARRIED OUT IN 2015 AND 2016. Providing personalized formative assessment
  • 19. Experiences using G-Rubric in 2015 and 2016 • The trials carried out focused on providing formative assessment • Our goal was to promote deep learning through iterative feedback • G-Rubric offers two main advantages for formative assessment: • It allows as many attempts as lecturers set • It gives students immediate, rich feedback • All trials were conducted with first-year Business Administration degree students
  • 20. Two experiences (2015 and 2016): goals. 2015: 132 volunteers; 2016: 120 volunteers. OUR QUESTIONS • Could G-Rubric give accurate feedback? • Could the feedback lead to improvement in subsequent answers? • Could rich feedback increase the time devoted to the activity? STUDENTS' OPINIONS ABOUT • The impact on their motivation • Its usefulness for preparing the final exam • Their level of agreement with the grades received. The enriched graphical feedback increases: • The number of attempts made by the students • The amount of time devoted to the task
  • 21. Content grade improvement: the average percentage score increases between the first and last attempt [Figure: one chart per activity, Activity 1 to Activity 7]. We could verify that students were able to use the feedback to improve their answers
  • 22. Students’ agreement with the grades received: the level of agreement was higher in the last trial. First trial: 47% very much or totally agree. Last trial: 70% very much or totally agree
  • 23. G-Rubric had a positive impact on students’ motivation Totally or very much: 65% Totally or very much: 60%
  • 24. Usefulness and positive value: 80% of students considered G-Rubric totally or very much useful for exam preparation. More than 80% of students considered this experience very much or totally positive
  • 25. BEYOND FORMATIVE ASSESSMENT: HOW SEMANTIC TECHNOLOGIES CAN HELP TUTORS TO MARK TMAs
  • 26. Are humans reliable at marking open-ended questions? Manual grading has at least two problems: • Inter-examiner variability, depending on who marked the task • Intra-examiner reliability, depending on when the same tutor marked the task. Students view manual grading of open-ended questions as subjective ➢ In contrast, automated test assessment is perceived as more objective
  • 27. Accidental double grading (2012 & 2013). Two members of the academic team, independently and unknowingly, graded the same exams. • The differential averaged 1.5 points out of 8 • Final grades differed substantially: 37.5% would not obtain a passing grade. Figure 5. Differential in grades for doubly-assessed exams (June 2012)*: essay grade differential and essay + short questions grade differential, by exam (1 to 24). *Refers to 24 Economic History final exams from the Barcelona-CUXAM Regional Center (June 2012)
  • 28. Accidental double grading (2013). We found the same pattern: • The differential averaged 0.9 points out of 8 • Final grades differed substantially: 21% would not obtain a passing grade. Figure 6. Differential in grades for doubly-assessed exams (June 2013)*: essay + short questions differential (Grade 2 minus Grade 1), by exam. *Refers to 76 Econ. History final exams from the Valencia-Alzira Regional Center (June 2013)
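A minimal sketch of how a double-grading comparison like the one on slides 27 and 28 could be summarised. The slides do not say whether the reported average differential is signed or absolute, nor what the pass mark was, so the absolute difference, the `pass_mark` parameter, the pass/fail disagreement measure and the sample grades are all assumptions introduced here for illustration.

```python
def double_grading_summary(grades_1, grades_2, pass_mark):
    """Summarise disagreement between two examiners who graded the same exams.

    Returns the mean absolute grade difference and the share of exams on which
    the two examiners disagree about pass/fail (a measure chosen for this sketch).
    """
    if len(grades_1) != len(grades_2) or not grades_1:
        raise ValueError("both examiners must have graded the same non-empty set of exams")
    pairs = list(zip(grades_1, grades_2))
    mean_abs_diff = sum(abs(g2 - g1) for g1, g2 in pairs) / len(pairs)
    # An exam counts as a disagreement when only one examiner gives a passing grade.
    flips = sum((g1 >= pass_mark) != (g2 >= pass_mark) for g1, g2 in pairs)
    return {
        "mean_abs_diff": mean_abs_diff,
        "pass_fail_disagreement": flips / len(pairs),
    }


# Hypothetical grades out of 8 for four exams, with a hypothetical pass mark of 4.
examiner_1 = [3.0, 4.5, 5.0, 6.5]
examiner_2 = [4.5, 3.5, 7.0, 6.0]
summary = double_grading_summary(examiner_1, examiner_2, pass_mark=4.0)
print(summary)
```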
  • 29. Correlations between grades assigned by examiners
                         2012    2013
      n                  20      76
      Global grade       0.82    0.88
      Short questions    0.85    0.87
      Text commentary    0.70    0.67
    Despite these differences between examiners we found: • A high correlation on the global score and short questions • A lower correlation on text commentary grades
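For readers who want to reproduce this kind of inter-examiner correlation on their own data, here is a short sketch of the Pearson coefficient in plain Python. The slide does not name the coefficient used for this table, so Pearson is an assumption here; the function name and the sample grades are also hypothetical.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation between two equally long lists of grades."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two lists of the same length with at least two grades")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Sums of cross-products and squared deviations from the means.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined when one examiner gave constant grades")
    return sxy / sqrt(sxx * syy)


# Hypothetical grades from two examiners for the same five exams.
examiner_1 = [1, 2, 3, 4, 5]
examiner_2 = [2, 4, 5, 4, 5]
r = pearson_r(examiner_1, examiner_2)
print(f"r = {r:.3f}")
```

A high correlation only says that the two examiners rank the exams similarly; as slides 27 and 28 show, it can coexist with sizeable differences in the grades themselves.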
  • 30. Comparing how tutors and G-Rubric mark TMAs. Our first step has been to compare how tutors and G-Rubric grade TMAs • G-Rubric could cope simultaneously with both problems: • Inter-examiner variability • Intra-examiner reliability • A fragment of "The Wealth of Nations", by Adam Smith, was selected for students to comment on • A rubric was built to minimise inter-examiner variability • A G-Rubric object, similar to those described above, was designed, and its axes were aligned with the rubric used by tutors to mark the students' assignments • The tutors graded these assignments using the rubric • The teaching team then used G-Rubric to grade the students' TMAs again • 252 TMAs were double-graded to compare G-Rubric's and tutors' marks
  • 31. What have we found comparing grades given by tutors and G-Rubric?
    Main descriptives of tutors' and G-Rubric marks (N = 252)
                        M       SD      Min     Max
      Tutors' marks     5.95    1.45    1.55    8.54
      G-Rubric marks    5.92    1.61    2.13    9.20
    1. No significant difference between means: an independent samples t test yielded no significant differences between the means of tutors' and G-Rubric marks, t(251), p = .720, ns.
    2. The Pearson correlation between G-Rubric's and tutors' marks yielded a large effect size (.549**). **The correlation is significant at the 0.01 level (two-tailed).
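A note on the reported degrees of freedom, with a sketch. With 252 double-graded TMAs, a paired-samples t test has N - 1 = 251 degrees of freedom, whereas an independent-samples test on two groups of 252 would have 502; the slide names an independent samples test but reports t(251), and the slides do not let us tell which was actually run. The code below is a paired-samples sketch only, using hypothetical marks and a hypothetical function name, and it stops at the t statistic because the p-value needs a t distribution that the Python standard library does not provide.

```python
from math import sqrt
from statistics import mean, stdev


def paired_t(marks_a, marks_b):
    """Return (t, df) for a paired-samples t test on two lists of marks."""
    if len(marks_a) != len(marks_b) or len(marks_a) < 2:
        raise ValueError("need two lists of the same length with at least two marks")
    diffs = [a - b for a, b in zip(marks_a, marks_b)]
    sd = stdev(diffs)  # sample standard deviation of the differences
    if sd == 0:
        raise ValueError("t is undefined when all differences are identical")
    n = len(diffs)
    return mean(diffs) / (sd / sqrt(n)), n - 1


# Hypothetical marks for four assignments graded by a tutor and by the software.
tutor_marks = [5.0, 6.0, 7.0, 4.0]
software_marks = [5.5, 5.5, 7.5, 4.5]
t, df = paired_t(tutor_marks, software_marks)
print(f"t({df}) = {t:.2f}")
```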
  • 32. Grade distributions: analysis of frequencies [Figure: bar chart of the percentage of grades in each one-point range, from 0 to 1 up to 9 to 10, for tutors' grades and G-Rubric's grades. Data labels (%), in extraction order: 0.79, 4.37, 6.75, 6.35, 30.56, 28.57, 14.68, 7.94, 4.76, 9.92, 15.08, 17.06, 22.62, 21.83, 7.94, 0.79] 3. G-Rubric’s marks were more evenly distributed, whereas the tutors’ marks were concentrated in the ranges between 5 and 7 points
  • 33. Analysis of the homogeneity of G-Rubric and tutors’ marks
    Kruskal-Wallis analyses of the homogeneity of marks across the 37 tutoring groups
                      Tutor mark    G-Rubric mark    Mark difference
      Chi-square      69.14         47.21            74.49
      df              36            36               36
      p               .001          .100             .000
    4. Tutors’ marks, and the mark difference, showed significant inter-group variability. By contrast, G-Rubric marks did not differ significantly between these same tutorial groups, indicating a higher level of homogeneity.
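The Kruskal-Wallis test compares groups by pooling all marks, ranking them, and checking whether the rank sums differ between groups more than chance would allow. With k groups, the H statistic is approximately chi-square distributed with k - 1 degrees of freedom under the null hypothesis, which is why 37 tutoring groups give df = 36; at that df the 5% critical value is approximately 51.0, so 69.14 and 74.49 exceed it and 47.21 does not. The sketch below computes H with the standard tie correction; the function name and the sample groups are hypothetical.

```python
def kruskal_wallis_h(groups):
    """Return (H, df) for a list of groups of marks, with the standard tie correction."""
    if len(groups) < 2 or any(len(g) == 0 for g in groups):
        raise ValueError("need at least two non-empty groups")
    pooled = sorted(v for g in groups for v in g)
    n = len(pooled)

    # Assign each distinct value the average of the ranks it occupies.
    rank_of = {}
    tie_term = 0
    i = 0
    while i < n:
        j = i
        while j < n and pooled[j] == pooled[i]:
            j += 1
        rank_of[pooled[i]] = (i + 1 + j) / 2  # mean of ranks i+1 .. j
        ties = j - i
        tie_term += ties**3 - ties
        i = j

    correction = 1 - tie_term / (n**3 - n)
    if correction == 0:
        raise ValueError("H is undefined when all marks are identical")

    rank_sum_term = sum(sum(rank_of[v] for v in g) ** 2 / len(g) for g in groups)
    h = 12 / (n * (n + 1)) * rank_sum_term - 3 * (n + 1)
    return h / correction, len(groups) - 1


# Hypothetical marks for three tutoring groups with no overlap at all.
h, df = kruskal_wallis_h([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
print(f"H = {h:.2f}, df = {df}")
```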
  • 35. Main conclusions • Automated-assessment software such as G-Rubric is currently mature enough to be used with students. • The kind of feedback offered was useful to improve the students’ performance • Results in terms of students’ satisfaction are also encouraging. • For teachers, the time and effort required is affordable.
  • 36. Regarding how G-Rubric could support grading • A remarkable correlation and no significant difference between the means were found • Tutors’ scores presented significant inter-group variability • By contrast, G-Rubric’s marks did not differ significantly between these same tutorial groups, indicating a higher level of homogeneity. Our proposal: students’ essays will be graded first using G-Rubric; afterwards, tutors will grade them again to validate or modify the grades given.
  • 39. Thank you for your attention