2. Goals of the project
• Our system lets users input an English sentence
without tags around possible errors; it then
automatically detects and corrects preposition errors.
3. Functions of the project
• Automatically detect and correct preposition errors
• Replacement preposition correction (RT)
• price for the tickets → price of the tickets
• Unwanted preposition correction (UT)
• discuss about the issue → discuss the issue
• Missing preposition correction (MT)
• listen music → listen to music
• Give example sentences for each correction
5. Methodology
• Extract Patterns From VOA Corpus
• For each sentence, generate n-grams containing a
preposition (n = 3, 4, 5).
• Keep the n-grams that start and end with a content
word or a preposition and contain no symbols.
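The extraction and filtering steps above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the input is assumed to be an already tagged sentence of `(word, tag)` pairs, and the tag names (`N`, `V`, `ADJ`, `ADV`, `PREP`) and the function name `extract_ngrams` are assumptions, since the slides do not list the exact tag set.

```python
import re

# Assumed tag sets; the slides do not specify the exact tags used.
CONTENT_TAGS = {"N", "V", "ADJ", "ADV"}
PREP_TAG = "PREP"


def extract_ngrams(tagged_sentence, sizes=(3, 4, 5)):
    """Return the n-grams (lists of (word, tag)) that pass the three filters."""
    kept = []
    for n in sizes:
        for start in range(len(tagged_sentence) - n + 1):
            gram = tagged_sentence[start:start + n]
            tags = [tag for _, tag in gram]
            # Filter 1: the n-gram must contain a preposition.
            if PREP_TAG not in tags:
                continue
            # Filter 2: it must start and end with a content word or a preposition.
            if not all(t in CONTENT_TAGS or t == PREP_TAG for t in (tags[0], tags[-1])):
                continue
            # Filter 3: no symbols, taken here to mean tokens with no letter or digit.
            if any(not re.search(r"[A-Za-z0-9]", word) for word, _ in gram):
                continue
            kept.append(gram)
    return kept
```

For a tagged sentence such as "We listen to the music .", this keeps "to the music" and "listen to the music" and drops every n-gram that starts with the pronoun or ends with the determiner or the full stop.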
6. Methodology
• Transform the n-grams into part-of-speech n-grams
and group them by part-of-speech n-gram. The most
frequent part-of-speech n-grams become the patterns.
• Ex. N PREP DT N, V PREP DT N, V PREP ADJ N, …
• For RT and UT, a pattern is (pattern, index of the
preposition)
• Ex. (V PREP ADJ N, 1)
• For MT, a pattern is (pattern, set of indices where a
preposition should be inserted)
• Ex. (V ADJ N, set([1, 2]))
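One way the pattern-building step might look is sketched below. The frequency cutoff `min_count` and the function name `build_patterns` are hypothetical. The sketch also assumes that MT patterns are derived by deleting the preposition from a frequent pattern and recording the position it occupied; the slides show the resulting format but do not state how MT patterns are obtained.

```python
from collections import Counter, defaultdict


def build_patterns(tagged_ngrams, min_count=2):
    """Group n-grams by their tag sequence and derive RT/UT and MT patterns."""
    counts = Counter(tuple(tag for _, tag in gram) for gram in tagged_ngrams)
    frequent = [pattern for pattern, count in counts.items() if count >= min_count]

    rt_ut = []              # list of (pattern, index of preposition)
    mt = defaultdict(set)   # pattern without the preposition -> insertion indices
    for pattern in frequent:
        for i, tag in enumerate(pattern):
            if tag != "PREP":
                continue
            rt_ut.append((pattern, i))
            # Assumption: removing the preposition gives the shape of a phrase
            # that is missing it, and i is where it should be re-inserted.
            mt[pattern[:i] + pattern[i + 1:]].add(i)
    return rt_ut, dict(mt)
```

With frequent patterns `V PREP ADJ N` and `V ADJ PREP N`, this yields the RT/UT entries `(V PREP ADJ N, 1)` and `(V ADJ PREP N, 2)` and the single MT entry `(V ADJ N, {1, 2})`, which has the same shape as the example on the slide.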
7. Methodology
• Preposition Error Detection
• When a user inputs a sentence, our system finds the
n-grams that probably contain a preposition error by
matching the patterns against n-grams with the same
part-of-speech sequence.
• If the n-grams overlap, give priority to the longer
n-gram.
• Ex. “We have discussed about the issue a lot of times.”
• ('have', 'V'), ('discussed', 'V'), ('about', 'PREP'), ('the', 'DT'), ('issue', 'N')
• ('discussed', 'V'), ('about', 'PREP'), ('the', 'DT'), ('issue', 'N')
• ('have', 'V'), ('discussed', 'V'), ('about', 'PREP')
• ('about', 'PREP'), ('the', 'DT'), ('issue', 'N')
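The matching and overlap rule can be illustrated with a short sketch. It assumes the sentence is already tagged and that patterns are plain tuples of tags; `find_candidates` is a hypothetical name, and ties between equally long matches are resolved here by taking the leftmost one, which the slides do not specify.

```python
def find_candidates(tagged_sentence, patterns):
    """Return non-overlapping (start, end) spans whose tags match a pattern.

    Longer matches are considered first, so they win when spans overlap.
    """
    tags = [tag for _, tag in tagged_sentence]
    pattern_set = {tuple(p) for p in patterns}
    matches = []
    # Scan pattern lengths from longest to shortest.
    for n in sorted({len(p) for p in pattern_set}, reverse=True):
        for start in range(len(tags) - n + 1):
            if tuple(tags[start:start + n]) in pattern_set:
                matches.append((start, start + n))
    chosen = []
    for start, end in matches:
        # Keep a span only if it does not overlap any span already chosen.
        if all(end <= s or start >= e for s, e in chosen):
            chosen.append((start, end))
    return sorted(chosen)
```

For the example sentence, all four listed n-grams match a pattern, but only the five-token span "have discussed about the issue" is kept because the three shorter ones overlap it.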
8. Methodology
• Automatic Preposition Error Correction
• Transform the n-gram into a Linggle query to get
possible corrections
• Query flow: send the query to the Linggle API; if there is
no result, generalize the query and send it again.
• Output when:
1. There are results.
2. There is still no result after generalizing the query
two times.
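The query-and-generalize loop can be sketched as below. Neither the Linggle client nor the generalization rule is shown in the slides, so both are passed in as hypothetical callables: `search(query)` is assumed to return a list of results and `generalize(query)` a broader query. The limit of two generalizations comes from the slide.

```python
def query_with_generalization(query, search, generalize, max_generalizations=2):
    """Query, and on an empty result generalize and retry up to the limit.

    `search` and `generalize` are hypothetical callables supplied by the caller.
    Returns the first non-empty result list, or [] if every attempt is empty.
    """
    attempts = 0
    while True:
        results = search(query)
        if results:
            return results
        if attempts == max_generalizations:
            # Still nothing after the allowed number of generalizations.
            return []
        query = generalize(query)
        attempts += 1
```

Keeping the search and generalization functions as parameters lets the loop be tested with stubs, without calling the real service.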
9. Methodology
• Generating example sentences for each correction
• Use the corrected phrase returned by Linggle
• Ex. [('know', 'V'), ('about', 'PREP'), ('the', 'DT'), ('weather', 'N')]
• Convert the phrase into n-grams made up of
content words such as nouns, verbs, adjectives, …
• Ex. ['know', 'about'], ['the', 'weather'], ['know', 'about', 'the'],
['about', 'the', 'weather'], ['know', 'about', 'the', 'weather']
• Match those n-grams against each sentence in the corpus
and calculate a score for each sentence
• Display the highest-scoring sentences as example
evidence for each correction
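A rough sketch of the sentence-ranking idea follows. The slides do not give the scoring formula or the exact rule for choosing n-grams, so two simplifications are assumed here: every contiguous n-gram of length two or more is used, and a sentence earns the length of each n-gram it contains. All function names are hypothetical, and whitespace tokenization stands in for a real tokenizer.

```python
def contiguous_ngrams(words, min_len=2):
    """All contiguous n-grams of the phrase with at least min_len words."""
    return [tuple(words[i:i + n])
            for n in range(min_len, len(words) + 1)
            for i in range(len(words) - n + 1)]


def score_sentence(sentence_tokens, ngrams):
    """Assumed score: sum of the lengths of the n-grams found in the sentence."""
    tokens = [t.lower() for t in sentence_tokens]
    score = 0
    for gram in ngrams:
        n = len(gram)
        if any(tuple(tokens[i:i + n]) == gram for i in range(len(tokens) - n + 1)):
            score += n
    return score


def rank_examples(corpus, phrase_words, top_k=3):
    """Return up to top_k corpus sentences, highest score first."""
    ngrams = contiguous_ngrams([w.lower() for w in phrase_words])
    scored = [(score_sentence(s.split(), ngrams), s) for s in corpus]
    scored = [(sc, s) for sc, s in scored if sc > 0]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [s for _, s in scored[:top_k]]
```

Weighting by n-gram length means a sentence containing the whole corrected phrase ranks above one that only shares a fragment of it.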
11. System Architecture
• The project is built with modern web technologies and a
client-server architecture.
• Client part
• React (a JavaScript UI library maintained by Facebook)
• Server part
• Node runs as the web server, and all business logic is
implemented in Python (using Flask to build a RESTful API)