These slides present an automatic system for evaluating the Bachelor's and Master's theses of Computer Science students. To fulfill this task, we used text complexity measures along with other factors. Text complexity has mainly been used to predict the grade level to which a specific reading passage or text should be assigned; it has also been used to evaluate students' writing in language classes. We decided to apply text complexity measures to the evaluation of students' graduation theses. The main challenges of this task are selecting the features that best reflect a student's performance in a specific domain and identifying the optimal classifier for predicting the student's score. First, we investigated four sets of text complexity measures (lexical, syntactic, semantic, and character measures), cohesion metrics, and several features related to thesis organization and to the references and bibliography. Second, we computed the correlations between the proposed features and excluded the highly inter-correlated ones. We then used several classifiers to predict the students' grades and compared their performance. Finally, we tested our work on a corpus of Bachelor's and Master's theses written in English by students of the Computer Science Department of the University Politehnica of Bucharest (English was chosen because of the wide availability of open-source natural language processing tools). We evaluated the quality of the application using the Pearson correlation between our results and the grades assigned to the theses by the evaluation committee.
2. Overview
• Introduction
• Motivation
• Previous work
• System architecture
• Dataset
• Results
• Conclusions
26.02.19 K-Teams @ eLSE 2014 – Bucharest, Romania 2
3. Introduction
• Using natural language processing (NLP) and
machine learning for automated analysis of
written texts (essays, books, theses) in
e-learning
• Essay grading
• Text complexity
• Assessment of conversations
• Authorship identification
4. Motivation
• Are features used in essay grading and/or text
complexity assessment suitable for automatic
grading of BSc and MSc diploma theses in
computer science?
• Which is the most accurate classifier for
grading theses?
• What problems are encountered?
5. Previous work
• Textual complexity features computed on distinct
levels:
– Character measures
– Lexical measures
– Syntactic measures
– Semantic measures
– Coherence measures
• Text complexity measures can help in grading
students' essays
• Assessing the text complexity can also provide a good
indicator for assigning reading passages to students in
different grade levels (predicting the correct grade
level of each reading passage)
7. Features
• Lexical Features – lexical measures based on sentences and words
– sentence length
– word length
– vocabulary richness
– hapax legomena (the number of words occurring only once)
– functional words
– frequent words, frequent word n-grams, frequent acronyms
– number of constituent paragraphs
• Character Features
– character n-grams
– punctuation marks count
– letter count
– ratio of upper case to lower case characters
– ratio of digits to alphabetical characters
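The lexical measures above can be sketched in a few lines. The following is a minimal illustration assuming simple regex tokenization (the actual system would use a proper NLP toolkit); the function name and the feature definitions chosen here (e.g. type/token ratio for vocabulary richness) are ours:

```python
import re
from collections import Counter

def lexical_features(text: str) -> dict:
    """Toy versions of a few lexical measures from this slide."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text.lower())
    counts = Counter(words)
    return {
        "avg_sentence_length": len(words) / max(len(sentences), 1),
        "avg_word_length": sum(map(len, words)) / max(len(words), 1),
        # vocabulary richness as the type/token ratio
        "vocabulary_richness": len(counts) / max(len(words), 1),
        # hapax legomena: words that occur exactly once
        "hapax_count": sum(1 for c in counts.values() if c == 1),
    }

feats = lexical_features("The cat sat. The cat ran away.")
```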
26.02.19 K-Teams @ eLSE 2014 – Bucharest, Romania 7
8. Features
• WordNet Features:
– depth of proper nouns mentioned in the text
– average length of the hypernym path for nouns, for verbs, and
for nouns and verbs together
• Syntactic Features:
– frequent POS tags, frequent n-grams of POS tags
– named entities
– properties of the syntactic parse tree (average branching factor,
average height) of each sentence
• Cohesion Features:
– noun overlap, argument overlap, stem overlap, content word overlap
– noun phrase density
– personal pronoun incidence scores
– polysemy for words
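To make the hypernym-path feature concrete, here is a self-contained sketch over a toy taxonomy. In the actual system WordNet supplies the hypernym links; the mini-taxonomy and the averaging over a word list below are invented purely for illustration:

```python
# Toy hypernym links (child -> parent); WordNet plays this role in practice.
TOY_HYPERNYMS = {
    "dog": "canine", "canine": "mammal", "mammal": "animal",
    "animal": "entity", "run": "move", "move": "act",
}

def hypernym_path_length(word: str) -> int:
    """Number of hypernym links from the word up to the taxonomy root."""
    length = 0
    while word in TOY_HYPERNYMS:
        word = TOY_HYPERNYMS[word]
        length += 1
    return length

def avg_path_length(words) -> float:
    """Average hypernym depth over a list of words (the slide's feature)."""
    return sum(hypernym_path_length(w) for w in words) / len(words)
```

Deeper (more specific) concepts yield longer paths, so texts using more specific vocabulary score higher on this feature.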
9. Dataset
• BSc and MSc diploma theses from the Department of
Computer Science within University Politehnica of
Bucharest
• 361 BSc + 202 MSc = 563 theses written in English
during the last 4 years
• After removing duplicates and theses without a
student name (or whose name was not detected
automatically), our dataset comprised 437 instances
• Matching student data from theses with student data
from the grade database (approximate string matching
using student name + thesis title + year of
graduation)
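The approximate matching step could look roughly like the following, using the standard library's difflib. The similarity threshold, the way name and title are combined, and the illustrative records are our assumptions, not the authors' exact recipe:

```python
from difflib import SequenceMatcher

def match_score(thesis_rec: dict, grade_rec: dict) -> float:
    """Similarity of (student name + thesis title), gated on graduation year."""
    if thesis_rec["year"] != grade_rec["year"]:
        return 0.0
    a = f'{thesis_rec["name"]} {thesis_rec["title"]}'.lower()
    b = f'{grade_rec["name"]} {grade_rec["title"]}'.lower()
    return SequenceMatcher(None, a, b).ratio()

def best_match(thesis_rec, grade_db, threshold=0.8):
    """Return the best-matching grade-database record, or None."""
    best = max(grade_db, key=lambda g: match_score(thesis_rec, g))
    return best if match_score(thesis_rec, best) >= threshold else None

# Illustrative records (invented names and titles):
thesis = {"name": "Ion Popescu", "title": "Automatic Essay Grading", "year": 2013}
grade_db = [
    {"name": "Ion Popescu", "title": "Automatic essay grading", "year": 2013, "grade": 9.5},
    {"name": "Maria Ionescu", "title": "A Compiler Front End", "year": 2013, "grade": 10.0},
]
match = best_match(thesis, grade_db)
```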
10. Dataset
• Distribution of grades is very unbalanced
• Dataset is also affected by some human errors
/ outliers (grades below 5)
11. Results
• Several classifiers have been trained:
– k-NN (with k=10)
– Neural network (NN)
– Support vector machine (SVM)
– Random Forest (RF)
• Used 3-fold cross-validation, keeping 2/3 of
the data for training and 1/3 for testing
• Performance assessed using:
– Mean squared error (MSE)
– Pearson correlation (with p-values)
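The two evaluation metrics can be written out in plain Python (in practice a library such as scikit-learn or SciPy would be used); the predictions and grades below are illustrative numbers only:

```python
import math

def mse(pred, true):
    """Mean squared error between predicted and true grades."""
    return sum((p - t) ** 2 for p, t in zip(pred, true)) / len(true)

def pearson(pred, true):
    """Pearson correlation coefficient between predictions and grades."""
    n = len(true)
    mp, mt = sum(pred) / n, sum(true) / n
    cov = sum((p - mp) * (t - mt) for p, t in zip(pred, true))
    sp = math.sqrt(sum((p - mp) ** 2 for p in pred))
    st = math.sqrt(sum((t - mt) ** 2 for t in true))
    return cov / (sp * st)

pred, true = [9, 10, 8, 9], [9, 10, 9, 8]
```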
12. Results
Method                  MSE      p-value   Correlation
SVM  (classification)   0.447    0.068      0.151
k-NN (classification)   random   0.987     -0.001
NN   (regression)       random   0.312     -0.040
RF   (classification)   0.368    0          0.388
13. Results
• The Random Forest classifier had the best results
(MSE=0.368, r=0.388, p-value=0)
• SVM had poorer results; k-NN and NN (regression)
did not achieve any useful results
14. Conclusions
• Linguistic textual complexity features provide
low accuracy for thesis grading on our dataset
• Three main reasons:
– Dataset: most of the grades assigned by the
evaluation committee ranged from 9 to 10
• Usually only the best Romanian students write their
graduation theses in English
– Task: the difficulty of finding the best features for
assessing the scientific content of a thesis
– Grading process: the methodology used by the
evaluation committee when grading a thesis, which
does not always judge only the quality of the thesis
but also takes into account the student's GPA
15. Improvements
• Feature selection and post-processing
• Retrain the classifiers using a subset of features
with the strongest prediction power
• Find other measures that can evaluate the
scientific content of the thesis
• Semantic features that could capture the level of
knowledge
• The system should also predict the main field of a
given thesis and evaluate the thesis in the
context of that specific field
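One concrete form of the feature selection mentioned above is the correlation-based filtering described earlier (dropping one of each pair of highly inter-correlated features). The following is a sketch under assumptions of ours: the 0.9 cutoff and the greedy keep-first order are illustrative choices:

```python
import math

def pearson(x, y):
    """Pearson correlation between two feature columns."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy) if sx and sy else 0.0

def drop_correlated(features: dict, cutoff: float = 0.9):
    """Greedily keep features whose |r| with every kept feature < cutoff."""
    kept = []
    for name, values in features.items():
        if all(abs(pearson(values, features[k])) < cutoff for k in kept):
            kept.append(name)
    return kept
```

For example, if feature "b" is a rescaled copy of feature "a", the filter keeps "a" and drops "b" while retaining an uncorrelated feature "c".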