Automatic Generation of Multiple Choice Questions using Surface-based Semanti... CSCJournals
Multiple Choice Questions (MCQs) are a popular large-scale assessment tool. MCQs make it much easier for test-takers to take tests and for examiners to interpret their results; however, they are very expensive to compile manually, and they often need to be produced on a large scale and within short iterative cycles. We examine the problem of automated MCQ generation with the help of unsupervised Relation Extraction, a technique used in a number of related Natural Language Processing problems. Unsupervised Relation Extraction aims to identify the most important named entities and terminology in a document and then recognize semantic relations between them, without any prior knowledge as to the semantic types of the relations or their specific linguistic realization. We investigated a number of relation extraction patterns and tested a number of assumptions about the linguistic expression of semantic relations between named entities. Our findings indicate that an optimized configuration of our MCQ generation system is capable of achieving high precision rates, which are much more important than recall in the automatic generation of MCQs. Its enhancement with linguistic knowledge further helps to produce significantly better patterns. We furthermore carried out a user-centric evaluation of the system, where subject domain experts from the biomedical domain evaluated automatically generated MCQ items in terms of readability, usefulness of semantic relations, relevance, acceptability of questions and distractors, and overall MCQ usability. The results of this evaluation make it possible for us to draw conclusions about the utility of the approach in practical e-Learning applications.
Question Answering System using machine learning approach Garima Nanda
This compact presentation shows how a machine learning approach based on classification techniques can be used for effective and efficient interaction.
The purpose of this research proposal is to identify organizational principles for the development of online learning curriculum in higher education. This study will address the following research questions: Can educational psychology learning theories (such as cognitive load theory) be used to inform usability-testing methods? Can usability-testing methods be used to discover basic principles of online learning curricular organization? Are there basic principles of online learning curricular organization that can improve the efficiency, effectiveness, and user satisfaction of learning in online environments? While there are many theoretical directions one could take to examine the interface of instructional design and technology, this research proposal will use the lens of the cognitive load theory. This study will use the cognitive walkthrough method as established by usability testing standards. Cognitive walkthroughs use an explicitly detailed procedure to simulate a user’s problem solving process at each step through the dialogue, checking if the simulated user’s goals and memory content can be assumed to lead to the next correct action. Participants will be asked to complete a series of tasks in an online learning environment formulated to compare different methods of organization. This study has the potential to make significant contributions to the field of educational psychology and online education by providing substantive empirical data that sheds light on potential principles that improve the effectiveness, efficiency, and user satisfaction of Web-based education.
Question Answering (QA) is a subfield of Natural Language Processing (NLP) and computer science
focused on building systems that automatically answer questions from humans in natural language. This
survey summarizes the history and current state of the field and is intended as an introductory overview of
QA systems. After discussing QA history, this paper summarizes the different approaches to the
architecture of QA systems -- whether they are closed or open-domain and whether they are text-based,
knowledge-based, or hybrid systems. Lastly, some common datasets in this field are introduced and
different evaluation metrics are discussed.
Learning Analytics: Seeking new insights from educational data Andrew Deacon
CPUT Fundani TWT - 22 May 2014
Analytics is a buzzword that encompasses the analysis and visualisation of big data. Current interest results from the growing access to data and the many software tools now available to analyse this data in Higher Education, through platforms such as Learning Management Systems. This seminar provides an overview of current applications and uses of learning analytics and how it can help institutions of learning better support their learners. The illustrative examples look at institutional and social media data that together provide rich insights into institutional, teaching and learning issues. A few simple ways to perform such analytics in a context of Higher Education will be introduced.
Advances in Learning Analytics and Educational Data Mining MehrnooshV
This presentation is about the state of the art of Learning Analytics and Educational Data Mining. It is presented by Mehrnoosh Vahdat as the introductory tutorial of the Special Session 'Advances in Learning Analytics and Educational Data Mining' at the ESANN 2015 conference.
User Control in AIED (Artificial Intelligence in Education) Peter Brusilovsky
Slides of my intro to the "Meet the Expert" session at AIED 2021. This is a subset of slides from a longer presentation on user control in AI, extended with many specific examples from the AIED area.
The first requirement for an online mathematics homework engine is to encourage students to practice and reinforce their mathematics skills in ways that are as good as or better than traditional paper homework. The use of the computer and the internet should not limit the kind or quality of the mathematics that we teach and if possible it should expand it.
Now that much of the homework practice takes place online we have the potential of a new and much better window into how students learn mathematics, but we must continue to ensure that students are studying the mathematics we want them to learn and not just mathematics that is easily gradable. Several of the open source mathematics engines that do this well are represented at this conference.
The WeBWorK mathematics rendering engine started twenty years ago as a standalone application. Since then, homework questions contributed by many mathematicians to the OpenProblemLibrary (OPL) have created a collection of over 30,000 Creative Commons licensed problems, primarily directed toward calculus but ranging from basic algebra through matrix linear algebra.
I’ll present one of the adaptations of WeBWorK which allows it to render mathematics questions for a standard Moodle quiz in much the same way that STACK functions. Both STACK and WeBWorK vastly increase Moodle’s ability to handle mathematics. Using the Moodle quiz format will make the OPL available to many more educators and allows utilization of Moodle’s facility at collecting student data.
If there is time I’ll show a second adaptation which allows WeBWorK to serve as an assignment type within Moodle. These same mechanisms allow active WeBWorK questions to be embedded in other learning management systems, in interactive textbooks and even HTML pages. This capability fits well with an emerging trend to use smaller, more specialized, inter-operating components for online education.
Asking Clarifying Questions in Open-Domain Information-Seeking Conversations Mohammad Aliannejadi
Users often fail to formulate their complex information needs in a single query. As a consequence, they may need to scan multiple result pages or reformulate their queries, which may be a frustrating experience.
Alternatively, systems can improve user satisfaction by proactively asking questions of the users to clarify their information needs. Asking clarifying questions is especially important in conversational systems since they can only return a limited number of (often only one) result(s).
In this paper, we formulate the task of asking clarifying questions in open-domain information-seeking conversational systems. To this end, we propose an offline evaluation methodology for the task and collect a dataset, called Qulac, through crowdsourcing. Our dataset is built on top of the TREC Web Track 2009-2012 data and consists of over 10K question-answer pairs for 198 TREC topics with 762 facets.
Our experiments on an oracle model demonstrate that asking only one good question leads to over 170% retrieval performance improvement in terms of P@1, which clearly demonstrates the potential impact of the task. We further propose a retrieval framework consisting of three components: question retrieval, question selection, and document retrieval. In particular, our question selection model takes into account the original query and previous question-answer interactions while selecting the next question. Our model significantly outperforms competitive baselines. To foster research in this area, we have made Qulac publicly available.
This is the slide deck for the presentation that was given with Kate Lawrence (VP User Experience EBSCO), Courtney McDonald (Indiana University), and Esther Onega (University of Virginia) at the 2014 Charleston Conference on Thursday Nov 6, 2014.
Influence of Timeline and Named-entity Components on User Engagement Roi Blanco
Nowadays, successful applications are those which contain features that captivate and engage users. Using an interactive news retrieval system as a use case, in this paper we study the effect of timeline and named-entity components on user engagement. This is in contrast with previous studies, where the importance of these components was studied from a retrieval effectiveness point of view. Our experimental results show significant improvements in user engagement when named-entity and timeline components were installed. Further, we investigate whether we can predict user-centred metrics through users' interaction with the system. Results show that we can successfully learn a model that predicts all dimensions of user engagement and whether users will like the system or not. These findings might steer systems that apply a more personalised user experience, tailored to the user's preferences.
Machine Learning Assisted Citation Screening for Systematic Reviews Anjani Dhrangadhariya
Evidence-based practice is highly dependent upon up-to-date systematic reviews (SRs) for decision making. However, conducting and updating systematic reviews, especially the citation screening for identification of relevant studies, requires much human work and is therefore expensive. Automating citation screening using machine learning (ML) based approaches can reduce cost and labor. Machine learning has been applied to automate citation screening, but not for SRs with very narrow research questions. This paper reports the results and observations of ongoing research that aims to automate citation screening for SRs with narrow research questions using machine learning. The research also sheds light on the effect of class imbalance and class overlap on the performance of ML classifiers when applied to SRs with narrow research questions.
Improving neural question generation using answer separation NAVER Engineering
Neural question generation (NQG) is the task of generating a question from a given passage with deep neural networks. Previous NQG models suffer from a problem that a significant proportion of the generated questions include words in the question target, resulting in the generation of unintended questions. In this paper, we propose answer-separated seq2seq, which better utilizes the information from both the passage and the target answer. By replacing the target answer in the original passage with a special token, our model learns to identify which interrogative word should be used. We also propose a new module termed keyword-net, which helps the model better capture the key information in the target answer and generate an appropriate question. Experimental results demonstrate that our answer separation method significantly reduces the number of improper questions which include answers. Consequently, our model significantly outperforms previous state-of-the-art NQG models.
Post-it Up: Qualitative Data Analysis of a Test Fest Sarah Joy Arnold
Presentation at Southeastern Library Assessment Conference 2017 in Atlanta, GA.
This session will outline how we planned and executed five simultaneous usability tests and what we learned from using this method. We'll also discuss how we approached analyzing the large amount of qualitative data that was gathered during testing via affinity diagrams and lots of post-it notes. The focus of this session is on our methodologies, though we'll briefly look at the results of each test.
[DSC Europe 22] Machine learning algorithms as tools for student success pred... DataScienceConferenc1
The goal of higher education institutions is to provide quality education to students. Predicting academic success and early intervention to help at-risk students is an important task for this purpose. This talk explores the possibilities of applying machine learning in developing predictive models of academic performance. What factors lead to success at university? Are there differences between students of different generations? Answers are given by applying machine learning algorithms to a data set of 400 students of three generations of IT studies. The results show differences between students with regard to student responsibility and regularity of class attendance and great potential of applying machine learning in developing predictive models.
Developing an electronic rubric to assess leadership behavior Dhanya G
Developing an electronic rubric to assess leadership behavior of secondary school students
Key Terms: Leadership behavior, Constructivist classroom, Electronic rubric
2. Outline
• Multiple Choice Questions
• Motivation
• System Architecture
• Unsupervised IE
• Surface‐based Approach
• Dependency‐based Approach
• Use of Web as a corpus
• Automatic Generation of Questions
• Automatic Generation of Distractors
• Extrinsic Evaluation
• Comparison
• Main Contributions
8. Named Entity Recognition (NER)
GENIA NER is used to recognise the following 5 main Named Entities:

Entity Type   Precision   Recall   F-score
Protein       65.82       81.41    72.79
DNA           65.64       66.76    66.20
RNA           60.45       68.64    64.29
Cell Line     56.12       59.60    57.81
Cell Type     78.51       70.54    74.31
Overall       67.45       75.78    71.37
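The F-scores reported for GENIA NER are consistent with the standard balanced F-measure, the harmonic mean of precision and recall. The sketch below recomputes two of them as a sanity check; the function name `f_score` is a hypothetical helper, and the assumption that the slide reports the balanced F1 variant is inferred from the numbers, not stated on the slide.

```python
def f_score(precision, recall):
    """Balanced F-measure: harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)

# Values copied from the slide (assumed to be percentages).
protein = f_score(65.82, 81.41)
overall = f_score(67.45, 75.78)

print(f"Protein: {protein:.2f}")
print(f"Overall: {overall:.2f}")
```

Both values agree with the reported F-scores to two decimal places.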
11. Patterns Building
• Between one and three content words are extracted between two named
entities
• Why?
• The idea behind this selection process is that:
• If there is no content word between two NEs, there is most likely
no relation between them
• Conversely, if two NEs are far apart, they are also unlikely to be
related
• Lemmatised words are used
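The selection rule above can be sketched in a few lines. This is a minimal illustration, not the system's implementation: the token representation (lemma and part-of-speech pairs), the tag names, the function `build_pattern`, and the choice of nouns, verbs, adjectives and adverbs as "content words" are all assumptions made for the example.

```python
# Assumed definition of content words; the slides do not list the PoS classes.
CONTENT_POS = {"NOUN", "VERB", "ADJ", "ADV"}

def build_pattern(tokens, ne1_index, ne2_index, min_words=1, max_words=3):
    """Return the lemmatised content words between two named entities,
    or None when there are fewer than min_words or more than max_words.

    tokens: list of (lemma, pos) pairs; each named entity is one token.
    """
    between = tokens[ne1_index + 1 : ne2_index]
    content = [lemma for lemma, pos in between if pos in CONTENT_POS]
    if min_words <= len(content) <= max_words:
        return content
    return None

# "<DNA> inducible promoter </DNA> containing <DNA> cis elements </DNA>"
tokens = [("inducible promoter", "NE"), ("contain", "VERB"), ("cis elements", "NE")]
print(build_pattern(tokens, 0, 2))            # one content word: kept
print(build_pattern(tokens[:1] + tokens[2:], 0, 1))  # adjacent NEs: rejected
```

Keeping the bounds as parameters makes it easy to test other window sizes without changing the filter itself.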
19. Patterns Ranking
• The patterns are ranked using the following ranking methods:
• Information-theoretic concepts: Information Gain, Information Gain Ratio,
Mutual Information, Normalised Mutual Information
• Statistical tests: Log‐likelihood, Chi‐Square
• Meta‐ranking
• tf‐idf
• The patterns, along with the scores obtained using the above
ranking methods, are stored in the database
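To make two of the ranking methods concrete, the sketch below scores a single pattern from a 2x2 contingency table. The slides do not say how the counts are obtained, so the setup is an assumption: pattern frequency in a domain corpus is compared against a general reference corpus. The function names are hypothetical, and pointwise mutual information is used as one common variant of Mutual Information, which may differ from the formulation used in the system.

```python
import math

def chi_square(a, b, c, d):
    """Pearson chi-square for a 2x2 table.
    a: pattern count in the domain corpus   b: pattern count in the reference corpus
    c: other patterns in the domain corpus  d: other patterns in the reference corpus
    """
    n = a + b + c + d
    return n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))

def pointwise_mutual_information(a, b, c, d):
    """log2 of observed over expected co-occurrence of pattern and domain corpus."""
    n = a + b + c + d
    return math.log2(a * n / ((a + b) * (a + c)))

# Made-up counts for illustration only.
counts = (30, 10, 70, 90)
print(chi_square(*counts))
print(pointwise_mutual_information(*counts))
```

Higher scores on either measure indicate that the pattern is more strongly associated with the domain corpus than chance would predict.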
28. Use of Web as a corpus
• The Web corpus is not homogeneous
• The Web corpus is not similar to the GENIA corpus
• The Web corpus is not similar to the GENIA EVENT corpus
• One possible reason is that GENIA is a very
narrow‐domain corpus, and it is hard to collect relevant
topical documents automatically
• Using the Web as a corpus still cannot ensure the same
level of topic relevance as that achieved in manually compiled
corpora
32. Automatic Question Generation
• Step 2: The part of the extracted sentence that contains the template together
with its slot fillers is marked with <QP> and </QP> tags, as shown below:
• Thus, the <DNA> gamma 3 ECS </DNA> is an <QP> <DNA> inducible promoter
</DNA> containing <DNA> cis elements </DNA> </QP> that critically mediate
<protein> CD40L </protein> and IL‐4‐triggered transcriptional activation of the
<DNA> human C gamma 3 gene </DNA>.
• Step 3: We extract the semantic tags and actual names from the
extracted sentence using the Machinese parser (Tapanainen and Järvinen,
1997). After parsing, the extracted semantic pattern is transformed into the
following two types of questions (active voice and passive voice):
• Which DNA contains cis elements?
• Which DNA is contained by inducible promoter?
• For the various forms of extracted patterns, we develop a set of rules
based on the semantic classes (Named Entities) and part‐of‐speech (PoS)
information present in a pattern.
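Steps 2 and 3 can be illustrated with a small sketch that turns the <QP> span of the example sentence into the two questions shown. It is only an approximation under stated assumptions: a regular expression stands in for the Machinese parser, the names `NE_PAIR`, `VERB_FORMS` and `questions_from_qp` are hypothetical, and the verb-form lookup is a one-entry stand-in for the rule set based on semantic classes and PoS information.

```python
import re

# Two tagged named entities with the linking word(s) between them.
NE_PAIR = re.compile(
    r"<(\w+)>\s*(.+?)\s*</\1>"   # first entity: semantic tag and name
    r"\s*(.+?)\s*"               # linking word(s)
    r"<(\w+)>\s*(.+?)\s*</\4>"   # second entity: semantic tag and name
)

# Hypothetical lookup: surface form -> (3rd person singular, past participle).
VERB_FORMS = {"containing": ("contains", "contained")}

def questions_from_qp(sentence):
    """Build an active and a passive question from the <QP> span."""
    qp = re.search(r"<QP>(.*?)</QP>", sentence, re.S).group(1)
    type1, name1, link, type2, name2 = NE_PAIR.search(qp).groups()
    present, participle = VERB_FORMS[link]
    active = f"Which {type1} {present} {name2}?"
    passive = f"Which {type2} is {participle} by {name1}?"
    return active, passive

SENTENCE = (
    "Thus, the <DNA> gamma 3 ECS </DNA> is an <QP> <DNA> inducible promoter "
    "</DNA> containing <DNA> cis elements </DNA> </QP> that critically mediate "
    "<protein> CD40L </protein> and IL-4-triggered transcriptional activation "
    "of the <DNA> human C gamma 3 gene </DNA>."
)

for question in questions_from_qp(SENTENCE):
    print(question)
```

The answer slot is replaced by "Which" plus the semantic tag of the entity being asked about, which is why both questions start with "Which DNA".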
33. Automatic Question Generation
• [V/encode] (subj[DNA] + obj[PROTEIN])
• This pattern is matched with the following sentence, which
contains its instantiation:
• This structural similarity suggests that the pAT 133 gene encodes
a transcription factor with a specific biological function.
• Our dependency‐based patterns always include a main verb, so
in order to automatically generate questions:
• We traverse the whole dependency tree of the extracted sentence and
• Extract all of the words that depend on the main verb in the
dependency parse of the sentence.
• That part of the sentence is then transformed into a question by
selecting the subtree of the parse bounded by the two named entities
present in the dependency pattern.
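The traversal described above can be sketched on the example sentence. The head indices below are a hand-written, hypothetical parse (no parser is run), and the entity spans and the function names `subtree` and `span_between` are likewise assumptions made for the illustration.

```python
from collections import defaultdict

TOKENS = ("This structural similarity suggests that the pAT 133 gene encodes "
          "a transcription factor with a specific biological function").split()

# HEADS[i] is the index of the head of token i (None for the root).
# Hand-assigned for illustration; a real parser may analyse it differently.
HEADS = [2, 2, 3, None, 9, 8, 8, 8, 9, 3, 12, 12, 9, 12, 17, 17, 17, 13]

def subtree(heads, root):
    """Indices of root and every token that depends on it, directly or not."""
    children = defaultdict(list)
    for index, head in enumerate(heads):
        if head is not None:
            children[head].append(index)
    seen, stack = set(), [root]
    while stack:
        node = stack.pop()
        seen.add(node)
        stack.extend(children[node])
    return seen

def span_between(tokens, heads, verb, ne1, ne2):
    """Words under the main verb, bounded by the two entity spans (inclusive)."""
    keep = subtree(heads, verb)
    return " ".join(tokens[i] for i in range(ne1[0], ne2[1] + 1) if i in keep)

# Assumed spans: "pAT 133 gene" as DNA (6-8), "transcription factor" as PROTEIN (11-12).
print(span_between(TOKENS, HEADS, verb=9, ne1=(6, 8), ne2=(11, 12)))
```

The extracted span is the part of the sentence that a question template would then rewrite, for instance by replacing one of the two entities with "Which" plus its semantic class.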
51. References
• PhD Thesis Online:
• http://clg.wlv.ac.uk/papers/afzal‐thesis.pdf
• Journal Papers:
• Afzal N. and Mitkov R. (2014). Automatic Generation of Multiple Choice Questions
using Dependency‐based Semantic Relations. Soft Computing. Volume 18, Issue 7, pp.
1269‐1281 (Impact Factor 2013: 1.304) DOI: 10.1007/s00500‐013‐1141‐4
• Afzal N. and Farzindar A. (2013). Unsupervised Relation Extraction from a Corpus
Automatically Collected from the Web from Biomedical Domain. International Journal
of Computational Linguistics and Natural Language Processing (IJCLNLP), Vol. 2 Issue 4
pp. 315‐324.
• Conference Papers:
• Afzal N., Mitkov R. and Farzindar A. (2011). Unsupervised Relation Extraction using
Dependency Trees for Automatic Generation of Multiple‐Choice Questions. In
C. Butz and P. Lingras (Eds.): Canadian AI 2011, LNAI 6657, pp. 32‐43.
Springer, Heidelberg.
• Afzal N. and Pekar V. (2009). Unsupervised Relation Extraction for Automatic
Generation of Multiple‐Choice Questions. In Proceedings of RANLP'2009 14‐16
September, 2009. Borovets, Bulgaria.