Ph d sem_1@iitm

Ontology-based Multiple-Choice Question Generation
Vinu E.V (CS12D019)
PhD Scholar, IITM
Guide: P Sreenivasa Kumar

Overview
● Importance of Question Generation (QG)
● Why do we use an ontology for QG?
● Related works
● QG from Ontologies
● Heuristics for selecting most-relevant questions
● Difficulty-scores of MCQs
● Difficulty-level of Question-set
● Experiments & Empirical study
● Conclusion

Importance of Question Generation
● Assessing a learner.
● An automated system which asks questions and
enriches its knowledge base (KB).
● Assessing a KB itself – finding out which queries a system
can answer.

● Today's focus: Ontology-based Knowledge Assessment System
✗ Generate MCQ-sets of specific sizes and difficulty-levels.

Multiple-Choice Question
Components of an MCQ:
Stem (S). Statement that introduces a problem to a learner.
Choice set. Set of options corresponding to S. It can be further divided into two sets:
Keys. Set of correct options.
Distractors. Set of incorrect options.
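
The components above map onto a small record type. The following is a minimal sketch, not the system's actual data model: the class name `MCQ`, its methods, and the distractor values are all hypothetical and chosen only for illustration.

```python
import random
from dataclasses import dataclass


@dataclass
class MCQ:
    """An MCQ as described on the slide: a stem plus a choice set."""
    stem: str          # statement that introduces the problem
    keys: set          # correct options
    distractors: set   # incorrect options

    def choice_set(self, seed=0):
        """Return keys and distractors together, in a shuffled order."""
        options = sorted(self.keys | self.distractors)
        random.Random(seed).shuffle(options)
        return options

    def is_correct(self, option):
        return option in self.keys


# Illustrative values only; the distractors are invented for this sketch.
q = MCQ(
    stem="Choose a Movie which is shot at Scotland.",
    keys={"braveheart"},
    distractors={"argo", "titanic", "avatar"},
)
```

Keeping keys and distractors as separate sets makes it straightforward to check an answer without remembering where the key landed after shuffling.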

Previous Works
● [1970-83] Generate open-ended questions from text
✗ WH-Questions
E.g., A boy rode the horse. => Who rode the horse?
✗ Filling prepositions
E.g., There were hundreds of people at the park.
=> There were hundreds of people ___ the park.
● [1983-91] Generate MCQs from text
✗ Distractors based on lexical features
● [2003-15] Ontology-based MCQ generation

Why Ontologies?
[Figure: components of an ontology]
Class hierarchy (rooted at Thing): Movie, Documentary Film, Black & White Film, Fantasy Film, Indian Doc Film, New Zealand Doc Film, ...
Property hierarchy (rooted at TopProperty): actedIn, madeAt, producedBy, shotAt, ...
Class definitions
Class assertion: Oscar_Movie(braveheart), ...
Role assertion: hasReceived(braveheart, oscar_award_12), ...
Reasoner: FaCT++, HermiT, ...

Why Ontologies?
TBox: Class hierarchy, Class definitions
RBox*: Property hierarchy
ABox: Class assertions, Role assertions
SWRL Rules: antecedent ⇒ consequent
SPARQL: to query and retrieve graph patterns
*Role inclusion axioms

Ontology-based MCQs
(Related work)
● [FLAIRS2011] SWRL rule-based MCQs
✗ Rule: Movie(?x) ∧ hasReceived(?x, ?y) ∧ AcademyAward(?y) ⇒ OscarMovie(?x)
✗ Verbalization: An Oscar Movie is a Movie that has received an AcademyAward.
✗ Question: What is a movie that has received an AcademyAward?
✗ Fill in the blank: An Oscar movie is a movie that has received _________.
● [OWLED2012] Similarity-based MCQs
✗ Analogy questions of the form “A is to B as __ is to __?”
Stem: Movie : OscarMovie
Choice set: a) Award : OscarAward
b) City : Country
c) Director : Producer
d) Book : Movie

Ontology-based MCQs
(Related work)
● [AIME2013, ICALT2011, FLAIRS2008] Template-based MCQ Generation
✗ Only selected generic templates were studied.
Choose an example of [ClassName].
[Instance] belongs to which class?
What is the domain of [PropertyName]?
✗ No systematic study of possible generic templates is present.
✗ Domain-specific entities were used in the templates.
What is the show time of the [Movie]?
What is the name of the movie which is shown in [Cinema]?

Proposed generic approach
● Studied various (sentence forming) patterns in an ABox
graph.

QG from ABox
Choose a Movie which is shot at Scotland.

QG from ABox
Choose a Movie which is based on “the great escape”.

QG from ABox
Choose the director of a movie which is shot at Scotland.

Generic Patterns in an ABox
● Patterns in an ABox can be expressed as combinations of
Object Property (O), Datatype Property (D) and Class
names (C).
[Figure legend: symbols for the reference instance, object properties (with directions), datatype property, instance, and literal value]

Generic Patterns in an ABox
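# Assumes PREFIX owl: <http://www.w3.org/2002/07/owl#>

# One object property around the reference instance ?x
SELECT ?x ?O ?i
WHERE { ?x ?O ?i .
        ?O a owl:ObjectProperty . }

# Two object properties sharing the reference instance ?x
SELECT ?x ?O1 ?i1 ?O2 ?i2
WHERE { ?x ?O1 ?i1 . ?x ?O2 ?i2 .
        ?O1 a owl:ObjectProperty .
        ?O2 a owl:ObjectProperty . }

# Two object properties and one datatype property
SELECT ?x ?O1 ?i1 ?O2 ?i2 ?D ?v
WHERE { ?x ?O1 ?i1 . ?x ?O2 ?i2 .
        ?x ?D ?v .
        ?O1 a owl:ObjectProperty .
        ?O2 a owl:ObjectProperty .
        ?D a owl:DatatypeProperty . }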
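
The joins expressed by these queries can also be sketched over an in-memory list of triples. This is only an illustration under assumptions: the toy facts, the `OBJECT_PROPERTIES` set and the function names are hypothetical, and unlike SPARQL the sketch reports each unordered pair of edges once and never pairs an edge with itself.

```python
from itertools import combinations

# Toy ABox as (subject, property, object) triples; facts are illustrative.
TRIPLES = [
    ("braveheart", "isShotAt", "scotland"),
    ("braveheart", "isDirectedBy", "mel_gibson"),
    ("argo", "isBasedOn", "the_great_escape"),
    ("argo", "hasReleaseDate", "April 12, 2011"),  # datatype property
]
OBJECT_PROPERTIES = {"isShotAt", "isDirectedBy", "isBasedOn"}


def size1_patterns(triples):
    """Matches of ?x ?O ?i where ?O is an object property."""
    return [(x, o, i) for x, o, i in triples if o in OBJECT_PROPERTIES]


def size2_patterns(triples):
    """Matches of ?x ?O1 ?i1 . ?x ?O2 ?i2 with a shared reference instance."""
    edges = size1_patterns(triples)
    return [
        (x1, o1, i1, o2, i2)
        for (x1, o1, i1), (x2, o2, i2) in combinations(edges, 2)
        if x1 == x2
    ]
```

Each returned tuple carries everything needed to verbalize a stem, for example a question about a movie that is shot at some place and directed by some person.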

Patterns in an ABox
● 4 size-1 patterns
● 10 size-2 patterns
● 26 size-3 patterns

Identifying Essential Patterns
● Identified basic set of 40 predicate combinations.
● Analyze features of real-world questions.
✗ Mooney Natural Language Learning Data, University of Texas.
✗ They provide Knowledge base in OWL and English questions.
✗ The English questions were gathered from “real” people using a Web
interface.
● Map pattern-based questions to at least one of the 40
predicate combinations.
● Interestingly, most of the predicate combinations
were not mapped to by any real-world
pattern-based question.
✗ Out of 40 combinations, only 13 were found to be necessary!

13 Essential Patterns and
19 Question-templates

Distractor Generation
● Choice set: contains Key and Distractors
✗ (Good) Distractors = Possible Answers – Actual Answers
✗ Possible Answers can be found using the Potential Set of the tuple w.r.t. the position of its key variable.
Potential set w.r.t. a key that is the subject of O1 and O2 = Domain(O1) ∩ Domain(O2)
Potential set w.r.t. a key that is the object of O1 and the subject of O2 = Range(O1) ∩ Domain(O2)
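
The distractor rule can be sketched in a few lines. This assumes that the potential set is the intersection of the answer spaces (domains or ranges) of the properties in the tuple; the instance names and the helper names `potential_set` and `distractors` are hypothetical.

```python
def potential_set(*answer_spaces):
    """Intersect the domains/ranges that constrain the key position."""
    return set.intersection(*(set(space) for space in answer_spaces))


def distractors(possible_answers, actual_answers):
    """(Good) distractors = possible answers minus actual answers."""
    return set(possible_answers) - set(actual_answers)


# Illustrative data: instances in the domain of each property.
movies = {"braveheart", "argo", "titanic", "avatar"}
domain = {
    "isShotAt": movies,
    "isDirectedBy": movies | {"hamlet_stage_play"},
}

possible = potential_set(domain["isShotAt"], domain["isDirectedBy"])
wrong_options = distractors(possible, {"braveheart"})
```

Because the distractors come from the same potential set as the key, they are plausible answers of the right type rather than arbitrary instances.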

Non-pattern-based stems
● Cannot be generated using generic patterns.
● Aggregation-based stems*: perform an aggregation operation over tuples generated using generic patterns. Examples:
✗ Choose the Movie with the highest number of academy awards.
✗ Choose the Movie with the lowest number of academy awards.
*Applicable only to generic patterns which contain a datatype property.

● Choose the State with the (highest/longest/oldest/...) population. (Key: Arizona)
● Choose the State with the (lowest/shortest/youngest/...) population. (Key: Connecticut)
● Choose the River with the (highest/longest/oldest/...) length. (Key: Neosho)
● Choose the River with the (lowest/shortest/youngest/...) length. (Key: Ouachita)
● Explicit Semantic Analysis (ESA) relatedness score is used to
determine the qualifying adjective.
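
An aggregation-based stem can be sketched as a max/min over the values of a datatype property. This is an assumption-laden illustration: the function name, the instance names and the counts are hypothetical, and the adjective is passed in by hand instead of being chosen with an ESA relatedness score.

```python
def aggregation_stem(class_name, property_label, values, adjective, highest=True):
    """Build an aggregation-based stem and return it with its key.

    values maps each instance to its datatype-property value.
    """
    pick = max if highest else min
    key = pick(values, key=values.get)
    stem = f"Choose the {class_name} with the {adjective} {property_label}."
    return stem, key


# Hypothetical award counts, for illustration only.
awards = {"movie_a": 5, "movie_b": 1, "movie_c": 3}

stem_hi, key_hi = aggregation_stem("Movie", "number of academy awards", awards, "highest")
stem_lo, key_lo = aggregation_stem("Movie", "number of academy awards", awards, "lowest", highest=False)
```

The remaining instances of the class (here the other two movies) are natural distractors for such a stem.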

Heuristics for Selecting Questions

Tuple selection Heuristics
● A good question set is unbiased and covers the
required knowledge boundaries of the
domain.
● Heuristics which mimic the question selection
process of human experts.
● We propose three heuristics:
✗ Property-based screening
✗ Concept-based screening
✗ Similarity-based screening

Property-based Screening
P1 = {isDirectedBy, isProducedBy}
P2 = {isDirectedBy, isBasedOn}
● Assign a triviality-score to each property sequence.
● Property-sequence triviality-score (PSTS):
avoids routine questions that are less likely to be chosen by a
human to conduct a test.

Property-based Screening
[Figure: class Movie with the properties isDirectedBy, isProducedBy and isBasedOn]
● P1 = {isDirectedBy, isProducedBy}
● P2 = {isDirectedBy, isBasedOn}
PSTS(P1) >> PSTS(P2)
PSTS(P) <= Tp (Tp: max. triviality-score threshold)

Concept-based screening
[Figure: ABox graph with the instances argo, braveheart, scotland, edinburgh, glasgow, oscar-academy-12 and the-great-escape, the classes Movie, SovereignState and Book, and the properties isShotAt, hasCapital, largestCity, wonAward and isBasedOn]
Generate tuples:
<wonAward, oscar-academy-12, argo, isBasedOn, the-great-escape>
<largestCity, glasgow, scotland, hasCapital, edinburgh>
Select only those tuples/questions that are relevant to the domain.

● Approach: Select the tuple only if the reference-instance
of the tuple belongs to a relevant class.
Classes: Horror_Movies, Oscar_Movies, City, SovereignState, Movies, State, Awards, Capital, Person, Director, Producer, Human
Top 5 relevant classes: Horror_Movies, Oscar_Movies, Movies, Awards, Director
Top 3 relevant classes: Oscar_Movies, Movies, Awards
Top n relevant classes: Movies, Awards, ....
● To automate the relevant class selection:
Key Concept Extraction (KCE) API [ISWC2008]
✗ Topological measures such as density and coverage.
✗ Statistical and lexical measures such as popularity and cognitive criteria (natural categories).
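
Concept-based screening then reduces to a filter on the class of the reference instance. In this sketch the relevant classes are given by hand (the slides obtain them from the KCE API), the reference instance is assumed to sit at index 2 of each tuple, and the `types` mapping and function name are hypothetical.

```python
def concept_screen(tuples, types, relevant_classes):
    """Keep a tuple only if its reference instance belongs to a relevant class."""
    return [
        t for t in tuples
        if types.get(t[2], set()) & relevant_classes
    ]


tuples = [
    ("wonAward", "oscar-academy-12", "argo", "isBasedOn", "the-great-escape"),
    ("largestCity", "glasgow", "scotland", "hasCapital", "edinburgh"),
]
# Assumed class memberships of the reference instances.
types = {"argo": {"Movies"}, "scotland": {"SovereignState"}}
top3 = {"Oscar_Movies", "Movies", "Awards"}

kept = concept_screen(tuples, types, top3)
```

With the top-3 classes, only the tuple about argo survives; the tuple about scotland is dropped because SovereignState is not among the relevant classes.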

Similarity-based screening
● Tuple-set is not small enough.
● Contains semantically similar tuples.
● Group the tuples based on their semantic-similarities.
Similarity(.) is a symmetric function which determines the similarity of
two tuples w.r.t. their reference-instances.
Similarity(t1, t2) = |Ins(P(t1)) ∩ Ins(P(t2))| / |Ins(P(t1)) ∪ Ins(P(t2))|
                   + (#Semantically similar triples in t1 and t2) / Max(#Triples in t1, #Triples in t2)
P(t): Properties in the tuple t
Ins(P(t)): Instances which satisfy the properties in P(t)
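
The similarity measure on this slide is the sum of two ratios, which can be sketched as follows. How semantically similar triples are detected is not specified here, so the sketch takes that count as an input; the function name and the example values are hypothetical.

```python
def similarity(ins1, ins2, similar_triples, n_triples1, n_triples2):
    """Sum of (a) the overlap ratio of the instance sets Ins(P(t1)), Ins(P(t2))
    and (b) the share of semantically similar triples."""
    union = ins1 | ins2
    overlap = len(ins1 & ins2) / len(union) if union else 0.0
    triple_share = similar_triples / max(n_triples1, n_triples2)
    return overlap + triple_share


# Hypothetical instance sets satisfying the properties of two tuples.
ins_t1 = {"argo", "braveheart", "titanic"}
ins_t2 = {"argo", "braveheart"}

score = similarity(ins_t1, ins_t2, similar_triples=1, n_triples1=2, n_triples2=2)
```

Both terms are unchanged when the two tuples are swapped, which is what makes the function symmetric.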

Similarity-based screening
2 locally similar groups, based on a threshold c
Select representative tuples based on tuple popularity.
✗ The popularity of a tuple is defined as the sum of the popularities of its instances.
✗ The popularity of an instance x is defined in terms of its connectivity from
other instances that belong to a concept to which x does not belong.

Assigning Stem Difficulty-score
● How does the property combination of a stem make it difficult to
answer?
● Empirical study: increasing the answer space of the
properties in a stem has an effect on its difficulty value.
Choose a President who was born on Feb 12th 1809.
Choose an American President who was born on Feb 12th 1809.
Being a more generic concept (unary predicate) than AmericanPresident,
the concept President, when used in a stem, makes the question more difficult to
answer.

Assigning Stem Difficulty-score
● Practical approach to make a stem difficult.
✗ Incorporate a predicate p1 which is true for a large number of individuals, along with a predicate p2
which holds for comparatively few instances.
✗ p1 may lead the learner away from the correct answer, while p2 may direct her to the correct
answer.
● Find property sequences with low PSTS.
✗ Low PSTS does not always imply a high difficulty-score.
✗ Condition-1: ∃ p1, p2 ∈ P such that |Inst(p1)| >> |Inst(p2)|
Inst(p): individuals satisfying the property p
✗ Difficulty-score of a tuple t is defined as:
Difficulty(t) = 1 / e^PSTS(P_t)

Property sequence               PSTS    Difficulty-score
{isDirectedBy, isProducedBy}    high    low
{isDirectedBy, isBasedOn}       low     high
{isBasedOn, wonAward}           low     high, if Condition-1 is satisfied
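
The difficulty-score and Condition-1 can be sketched as below. The PSTS values are assumed to be given (their computation is not shown on these slides), and the `ratio` used to interpret ">>" is a hypothetical choice made only for this illustration.

```python
import math


def difficulty(psts):
    """Difficulty(t) = 1 / e^PSTS(P_t): lower triviality means higher difficulty."""
    return 1.0 / math.exp(psts)


def satisfies_condition1(inst_counts, ratio=10.0):
    """True if some property has far more satisfying individuals than another.

    inst_counts maps each property p in P to |Inst(p)|;
    ratio is an assumed reading of '>>'.
    """
    counts = list(inst_counts.values())
    return len(counts) >= 2 and max(counts) >= ratio * min(counts)


# Hypothetical PSTS values for two property sequences.
d_trivial = difficulty(2.0)   # e.g. {isDirectedBy, isProducedBy}
d_rare = difficulty(0.5)      # e.g. {isDirectedBy, isBasedOn}
```

Since the exponential is always positive, the score stays in (0, 1] for non-negative PSTS values and decreases as the property sequence becomes more routine.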

Difficulty-level of Question-set

● Difficulty-level of a question-set depends on the
difficulty-scores of its questions.
● G = (V, E)
V = {t | t ∈ T}, where T is the set of heuristically selected
tuples.
E = {(t1, t2) | t1, t2 ∈ V, Similarity(t1, t2) ≥ mgs}
mgs: minimum global-similarity threshold

[Figure: similarity graph G over the tuples t1 to t15; an edge connects two similar tuples]
Ensure coverage:
✗ Include all isolated vertices.
✗ Select one from each group of similar vertices.
Ensure unbiasedness:
✗ No two similar vertices should be included.
Together: the “Maximal Independent Set property”.
Difficulty-level of the question-set depends on the difficulty-scores of {t1, t13, t14, t15}

Controlling the Difficulty-level of
Question-set
[Figure sequence: the question-set is built step by step on the similarity graph. The tuples t1 t2 t3 t6 t7 t8 t9 t10 t11 t12 t13 t14 t15 are arranged along a Min–Max axis.]
Q-set = {t4, t5}
Q-set = {t4, t5, t15} (t1 is then dropped from the graph)
Q-set = {t4, t5, t14, t15}
Q-set = {t4, t5, t14, t15, t13} (t12 is then dropped from the graph)
Q-set = {t4, t5, t14, t15, t13, t11} (t9 is then dropped from the graph)
Q-set = {t4, t5, t14, t15, t13, t11, t10}
Q-set = {t4, t5, t14, t15, t13, t11, t10, t8}
Q-set = {t4, t5, t14, t15, t13, t11, t10, t8, t7} (t6 is then dropped from the graph)
Q-set = {t4, t5, t14, t15, t13, t11, t10, t8, t7, t3}
Q-set = {t4, t5, t14, t15, t13, t11, t10, t8, t7, t3, t2}
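
One way to read the step-by-step construction above is as a greedy maximal-independent-set selection ordered by difficulty: take all isolated vertices, then repeatedly pick the remaining vertex at the Max end and discard its neighbours. The sketch below follows that reading. The edge list and the difficulty values are hypothetical, chosen only so that the run reproduces the Q-set sequence shown on the slides; they are not the actual graph.

```python
def select_question_set(vertices, edges, difficulty, prefer_hard=True):
    """Greedy maximal independent set, visiting vertices in difficulty order."""
    neighbours = {v: set() for v in vertices}
    for a, b in edges:
        neighbours[a].add(b)
        neighbours[b].add(a)

    # Coverage: every isolated vertex is included.
    chosen = [v for v in vertices if not neighbours[v]]
    remaining = {v for v in vertices if neighbours[v]}

    # Unbiasedness: once a vertex is chosen, its similar vertices are dropped.
    order = sorted(remaining, key=lambda v: difficulty[v], reverse=prefer_hard)
    for v in order:
        if v in remaining:
            chosen.append(v)
            remaining -= neighbours[v] | {v}
    return chosen


vertices = [f"t{i}" for i in range(1, 16)]
# Hypothetical similarity edges (assumption, not taken from the figure).
edges = [("t15", "t1"), ("t14", "t1"), ("t2", "t1"), ("t13", "t12"),
         ("t11", "t9"), ("t10", "t9"), ("t8", "t9"), ("t7", "t6"), ("t3", "t6")]
# Hypothetical difficulty-scores following the Min-Max axis order.
axis = ["t1", "t2", "t3", "t6", "t7", "t8", "t9",
        "t10", "t11", "t12", "t13", "t14", "t15"]
difficulty = {v: (i + 1) / len(axis) for i, v in enumerate(axis)}
difficulty.update({"t4": 0.5, "t5": 0.5})

q_set = select_question_set(vertices, edges, difficulty)
```

Passing `prefer_hard=False` walks the axis from the Min end instead, which would yield an easier question-set from the same graph.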

● Experiment-1: How effectively do our heuristics
help in generating question-sets which are
close to those prepared by domain experts?
● Experiment-2: What is the correlation of the
predicted difficulty-levels of the stems with their
actual difficulty-levels?

Experiment-1
● Objective: Evaluate how close the question-sets
generated by our approach (AG-Sets) are to the
benchmark question-sets (BM-Sets).
● Datasets used: DSA ontology, GEO ontology,
MAH ontology. (available online)
● BM-Sets: Prepared by experts of the domain (either
involved in the development of the ontologies or have
a detailed understanding of the knowledge formalized
in the ontology.)
Set-A (#25), Set-B (#50) and Set-C (#75) (available online)

Experiment-1
● Precision & Recall measures are used to compare AG-Sets
and BM-Sets.
● For each of the tuples in the AG-Sets, we found the most
matching tuple in the BM-Sets, thereby establishing a
mapping between the sets.
● Results:
✗ The Precision values were in the range: 72-93%
✗ The Recall values were in acceptable range ( ≈ 50%).
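
For orientation, the standard set-based definitions of precision and recall can be sketched as follows. This is a simplification: the slides map each generated tuple to its most matching benchmark tuple, whereas the sketch assumes exact equality between questions, and the question identifiers are hypothetical.

```python
def precision_recall(generated, benchmark):
    """Precision and recall of a generated set against a benchmark set."""
    matched = generated & benchmark
    precision = len(matched) / len(generated) if generated else 0.0
    recall = len(matched) / len(benchmark) if benchmark else 0.0
    return precision, recall


# Hypothetical question identifiers.
ag_set = {"q1", "q2", "q3", "q4"}
bm_set = {"q1", "q2", "q3", "q5", "q6", "q7"}

precision, recall = precision_recall(ag_set, bm_set)
```

Precision is the share of generated questions that also appear in the benchmark set, while recall is the share of benchmark questions that the generator managed to produce.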

Experiment-2
● Objective: Find the actual difficulty-level, and
compare it with the predicted difficulty-level.
● Estimation of actual difficulty-level
✗ Use Item Response Theory (IRT) principles
✗ IRT is a Psychometric method for assessing the .
✗ Probability = e^(proficiency − difficulty) / (1 + e^(proficiency − difficulty))
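
The formula above is a logistic function of the gap between proficiency and difficulty. A minimal sketch, with a hypothetical function name and illustrative inputs:

```python
import math


def probability_correct(proficiency, difficulty):
    """Probability = e^(proficiency - difficulty) / (1 + e^(proficiency - difficulty))."""
    z = proficiency - difficulty
    return math.exp(z) / (1.0 + math.exp(z))


p_equal = probability_correct(0.0, 0.0)    # proficiency matches difficulty
p_strong = probability_correct(2.0, 0.0)   # proficient learner, easy item
p_weak = probability_correct(0.0, 2.0)     # same gap in the other direction
```

When proficiency equals difficulty the probability is exactly one half, and equal gaps in opposite directions give probabilities that sum to one.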

Conclusion
● A method to generate MCQs for educational assessment-tests
from formal ontologies.
● A set of heuristics were employed to select only those questions
which are most appropriate for conducting a domain related test.
● A method to determine the difficulty-level of a question-stem and
an algorithm to control the difficulty of a question-set were also
proposed.
● The effectiveness of the suggested question selection heuristics is
studied by comparing the resulting questions with those questions
which were prepared by domain experts.
● The difficulty-scores of questions computed by the proposed
system are highly correlated with their actual difficulty-scores
determined with the help of IRT applied to data from classroom
experiments.

Publications based on this work
● [FLAIRS2015] E. V. Vinu and P. S. Kumar. Improving large-scale
assessment tests by ontology based approach. In Proceedings of
the Twenty-Eighth International FLAIRS 2015, Hollywood, Florida, May
18-20, 2015, page 457, 2015.
● [SWJ2015] E. V. Vinu and P. S. Kumar. Automated Generation of
Assessment Tests from Domain Ontologies. In Semantic Web –
Interoperability, Usability, Applicability, an IOS Press Journal. Accepted April
20, 2016.

References
● [OWLED2012] T. Alsubait, B. Parsia, and U. Sattler. Mining ontologies for
analogy questions: A similarity-based approach. Volume 849 of CEUR
Workshop Proceedings. OWLED, 2012.
● [FLAIRS2011] Zoumpatianos, K.; Papasalouros, A.; and Kotis, K.
Automated transformation of swrl rules into multiple-choice questions. In
Murray, R. C., and McCarthy, P. M., eds., FLAIRS Conference. AAAI
Press, 2011.
● [AIME2013] A. B. Abacha, M. D. Silveira, and C. Pruski. Medical ontology
validation through question answering. In AIME, pages 196–205, 2013.
● [ICALT2011] M. M. Al-Yahya. Ontoque: A question generation engine
for educational assesment based on domain ontologies. In ICALT, pages
393–395. IEEE Computer Society, 2011.
● [ISWC2008] Wu, G.; Li, J.; Feng, L.; and Wang, K. 2008. Identifying
potentially important concepts and relations in an ontology. In
International Semantic Web Conference, 33–49.

Experiment-1
● Three parameters help in controlling the final question count:
✗ Tp (max. triviality-score threshold)
✗ I (number of important concepts)
✗ mls (min. local-similarity score threshold for a locally-similar group)
✗ mgs (min. global-similarity threshold)
● Appropriate values for each of these parameters are calculated in
a sequential manner.
✗ Tp is fixed, to limit the number of common property patterns in the result.
✗ Then, I is determined to select only those questions which are related to the most
important domain concepts.
✗ After that, the parameters mls and mgs are fixed to avoid questions which are
semantically similar.

Challenges involved
[SWJ-2016]

Identifying Essential Patterns
● Identified basic set of 40 predicate combinations.
● Analyzed features of real-world (factual) questions
✗ 1748 FQs gathered from three different domains
✗ Mooney's Repository, University of Texas
✗ 570 (Pattern-based) + 729 (Non-pattern-based) + 449 (Invalid/redundant)
● Map the 570 pattern-based questions to at least one of the 40
predicate combinations.
● Most of the predicate combinations were not
mapped to by any real-world pattern-based question.
✗ Out of 40 combinations, only 13 were found to be necessary!

● Potential set: set of instances or literal values
which may possibly satisfy the properties in a
tuple (or a pattern).
Potential set w.r.t. braveheart = Domain(isShotAt) ∩ Domain(isDirectedBy)
= Movie ∩ Movie
Potential set w.r.t. a key that is the subject of O1 and O2 = Domain(O1) ∩ Domain(O2)
Potential set w.r.t. a key that is the object of O2 = Range(O2)
Potential set w.r.t. a key that is the subject of D and O2 and an instance of C = Domain(D) ∩ Domain(O2) ∩ C

● Choose the State with the highest population. (Key: Arizona)
● Choose the River with the longest length. (Key: Ouachita)
● Choose the State with the lowest population. (Key: Connecticut)
● Choose the River with the shortest length. (Key: Neosho)

● Choice set: contains Key and Distractors
✗ (Good) Distractors = Possible Answers – Actual Answers
t = <“April 12, 2011”, argo, hasReleaseDate, Movie, isBasedOn, the-great-escape>
Distractors(t) = PotentialSet(P(t)) – Key
P(t) = {hasReleaseDate, isBasedOn}
Key = {argo}

[Figure: ABox graph with the instances argo, braveheart, scotland, edinburgh, glasgow, oscar-academy-12 and the-great-escape, the classes Movie, SovereignState and Book, and the properties isShotAt, hasCapital, largestCity, wonAward and isBasedOn]
Generate tuples:
<wonAward, oscar-academy-12, argo, isBasedOn, the-great-escape>
Choose a movie which won oscar-academy-12 award, and is based on “the great escape”.
<largestCity, glasgow, scotland, hasCapital, edinburgh>
Choose a Sovereign State which has capital Edinburgh and has largest city Glasgow.
