Error Detection and Feedback with OT-LFG for Computer-assisted Language Learning

HU, Yuxiu (Harbin Institute of Technology Shenzhen Graduate School, China)
BODOMO, Adams (The University of Hong Kong)

http://citers2013.cite.hku.hk/en/paper_603.htm


  • As stated in the abstract, this study attempts to present and explore the properties of OT-LFG that may be used for building an online grammar check system; we focus on linguistic analysis with OT-LFG with a view to its possible application in an online grammar check system.
  • How OT-LFG works
  • Clement, Gerdes & Marlet (2011) think that one of the reasons is that linguistics, the indispensable theoretical basis of any grammar check system, has been focusing on its own development as a science of language, although some scholars in sociolinguistics, psycholinguistics, and foreign language teaching have shown indirect interest in the development of error grammars.
  • OT was first proposed by Prince & Smolensky (1993) for phonological studies; it has since been applied in computational linguistics, to the benefit of both research areas.
  • So OT explains language variability with language universality
  • Languages are universal in terms of the assumption of a constraint set common to all the languages in the world, but they vary because of the different rankings of these universal constraints in different languages. This universal constraint set is abbreviated as CON in the schematic representation above. Different rankings of these constraints result in different languages, such as languages X and Y. Therefore, learning a language is checking whether the pairing of form and meaning observed is predicted as optimal by the learner’s own system (Kuhn, 2003). Given an input, an infinite set of output candidates is generated and then evaluated by a set of hierarchically ranked constraints; finally, the optimal candidate is selected (Kager, 1999). The ranking of constraints is language-specific, but the constraints themselves are universal, although universals do not play the same role in every language (Archangeli, 1997).
  • An output is “optimal” when it incurs the least serious violations in the constraint hierarchy of a language. For a given input (meaning), the grammar generates and then evaluates an infinite set of output candidates and selects the optimal candidate. The higher-ranked the constraint a candidate violates, the greater the cost to harmony. Candidate b violates C1, which is ranked highest; this puts candidate b out of the competition in the first round, while the other candidates all satisfy C1. Even though candidate b violates only C1 and satisfies the other two constraints, it is still out, because of C1’s highest-ranked position. In the second round, only candidates a and d satisfy C2, which is ranked highest among the remaining constraints; the tie between a and d is then broken by C3. At last candidate d wins with respect to the constraints concerned and their hierarchical ranking. This is the basic working process assumed by OT.
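The filtering process described in this note can be sketched in a few lines of Python. The candidates a–d and the constraints C1 >> C2 >> C3 follow the schematic tableau discussed above, but the violation profiles are illustrative assumptions, not data from the study.

```python
# Minimal sketch of OT's EVAL step for the schematic tableau above.
# Candidates a-d and constraints C1 >> C2 >> C3 are illustrative.

def eval_ot(candidates, ranking):
    """Filter candidates constraint by constraint, from the
    highest-ranked constraint down, keeping only those with the
    fewest violations at each step."""
    survivors = list(candidates)
    for constraint in ranking:
        fewest = min(v.get(constraint, 0) for _, v in survivors)
        survivors = [(form, v) for form, v in survivors
                     if v.get(constraint, 0) == fewest]
        if len(survivors) == 1:
            break  # a single survivor is the optimal candidate
    return survivors[0][0]

ranking = ["C1", "C2", "C3"]
candidates = [
    ("a", {"C3": 1}),  # survives C1 and C2, loses the tie on C3
    ("b", {"C1": 1}),  # fatal: violates the highest-ranked constraint
    ("c", {"C2": 1}),  # eliminated in the second round
    ("d", {}),         # violates nothing: the optimal candidate
]
print(eval_ot(candidates, ranking))  # d
```

Note that candidate b is eliminated even though it violates only one constraint: rank, not violation count, decides the competition.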
  • “The diversity and infinity of candidates is a source of worry to many people when they first encounter OT.” (McCarthy, 2008:17) Tesar & Smolensky (1998a) suggest the use of winner/loser pairs: the correct ranking must make the grammatical structure more harmonic than its ungrammatical competitor, so given a suitable set of such pairs, we can find a ranking that makes the winner more harmonic than its corresponding loser in each pair. How, then, do we select informative winner/loser pairs for investigating learning problems? The current study used the selection idea of Error-Driven Constraint Demotion to choose informative winner/loser pairs. Here we give a brief introduction to how Error-Driven Constraint Demotion selects winner/loser pairs.
  • Error-Driven Constraint Demotion (Tesar, 1998) incorporates the basic principle of Constraint Demotion into a procedure for learning a grammar from informative winner/loser pairs; the procedure for choosing such pairs is described very clearly in Tesar and Smolensky (1998a). When learning a grammar, given an input I, the learner computes his/her own parse p’ for I, the parse he/she takes to be optimal with respect to his/her current constraint ranking. If this parse p’ differs from the target parse p, learning occurs; otherwise it does not. If the target parse p is the same as the learner’s parse p’, the current constraint ranking already selects p as the optimal candidate, so no reranking is needed: no error occurs and no learning is involved. If, on the contrary, the learner’s parse p’ differs from the target parse p, then p is not optimal according to the learner’s current ranking, and after receiving the positive target parse p, the learner has to rerank or modify his/her constraint hierarchy to make the target parse p optimal. Therefore, it is natural to conclude that an informative winner/loser pair is composed of a target parse p and a learner’s wrong parse p’ for the same input I. In this study, the errors made by the subjects are losers and the correct forms of the corresponding English usage are winners. Different losers may share the same winner. They are all listed and analyzed as pairs of loser and winner, and each pair is a mini candidate set, classified under Huebner (1983)’s taxonomy and the types of error identified in the subjects’ compositions.
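The demotion step itself can be sketched as follows. This is a deliberate simplification, flattening Tesar & Smolensky’s stratified hierarchies into an ordered list; the constraint names and violation counts in the example come from the article analysis in this study, but the function is our illustration, not the authors’ implementation.

```python
# Sketch of the demotion step in Error-Driven Constraint Demotion,
# simplified to a flat ranking (Tesar & Smolensky use stratified
# hierarchies).  Given a winner/loser pair, every constraint that
# prefers the loser and is ranked at or above the highest-ranked
# constraint preferring the winner is demoted to just below it.

def demote(ranking, winner_viols, loser_viols):
    prefers_winner = [c for c in ranking
                      if loser_viols.get(c, 0) > winner_viols.get(c, 0)]
    prefers_loser = [c for c in ranking
                     if winner_viols.get(c, 0) > loser_viols.get(c, 0)]
    if not prefers_winner or not prefers_loser:
        return list(ranking)  # pair is uninformative: no reranking
    # highest-ranked constraint that prefers the winner
    pivot = min(ranking.index(c) for c in prefers_winner)
    demoted = [c for c in prefers_loser if ranking.index(c) <= pivot]
    kept = [c for c in ranking if c not in demoted]
    at = kept.index(ranking[pivot]) + 1
    return kept[:at] + demoted + kept[at:]

# A Mandarin-like learner ranking *FunctN >> FDR meets the pair
# winner "the horse" (violates *FunctN) / loser "horse" (violates FDR):
print(demote(["*FunctN", "FDR"],
             {"*FunctN": 1}, {"FDR": 1}))  # ['FDR', '*FunctN']
```

After one informative pair, the learner’s ranking already matches the English target FDR >> *FunctN for this candidate set.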
  • In this study we examined the use of English articles to illustrate and explore the EVAL function of OT-LFG, not only because of the powerful explanatory ability of OT for the interface problem of English article use, but also because the multifaceted analysis that English article use requires is very representative of natural language processing. Data were collected through an article diagnostic test and a Chinese-to-English translation task. The collected data were classified by Huebner (1983)’s semantic contexts of noun phrase reference. Based on this classification, the data were analyzed within the framework of OT-LFG with the help of Error-Driven Constraint Demotion, a principle of OT learnability.
  • [+HK] is defined as the state of knowledge known to the hearer, this state being assumed by the speaker, while [+SR] is defined as the state of a particular referent in a speaker’s mind, the hearer’s state of the referent not being involved.
    [+definite, +specific] A: Where’s your mother? B: She is meeting the principal of my brother’s elementary school. He is a very nice man. He is talking to my mother about my brother’s grade.
    [+definite, -specific] A: It’s already 4pm. Why isn’t your little brother home from school? B: He just called and told me that he got in trouble! He is talking to the principal of his school! I don’t know who that is. I hope my brother comes home soon.
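As a toy illustration of how these binary features can drive article choice, the lookup below maps each (SR, HK) context type to a typical English article for a singular count noun. The mapping is our simplified assumption for this sketch; real English usage admits more than one article per context type.

```python
# Toy lookup from Huebner-style (SR, HK) feature pairs to a typical
# English article for a singular count noun.  This mapping is a
# deliberate simplification for illustration only.

ARTICLE_BY_CONTEXT = {
    ("+SR", "+HK"): "the",  # referential definite: "the principal"
    ("+SR", "-HK"): "a",    # referential indefinite, first mention
    ("-SR", "+HK"): "the",  # generic: "The horse has four legs."
    ("-SR", "-HK"): "a",    # non-referential: attributive/non-specific
}

print(ARTICLE_BY_CONTEXT[("-SR", "+HK")])  # the
```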
  • The semantic context is generics; Omission is one type of English article error detected in the data. In this case, a conflict between the constraints FDR and *FunctN is involved. Since there must be a functional projection before a singular referent in the semantic context [-SR, +HK] in English, and “the horse” is more harmonic than “horse” in this case in English, the conflict is resolved in the target language, English, by ranking FDR higher than *FunctN. In the native language, Mandarin, no functional projection is needed even before a bare singular referent in this context, so the conflict is resolved by ranking *FunctN higher than FDR. This ranking difference is illustrated with two tableaux in which the inputs are the same (Tableau 1 and Tableau 2).
  • Languages differ in the ranking of universal constraints
  • F-structure displays the predicate and its grammatical functions, such as SUBJECT and OBJECT, and features, such as NUMBER and PREDICATE, in attribute-value matrices. C-structure shows constituency relations by organizing constituents according to X-bar theory.
  • As illustrated in Tableau 1 and Tableau 2, the inputs are identical, but the same meaning is represented by two different optimal forms in the two languages, depending on the ranking of the two constraints concerned. The target language, English, requires that semantic interpretations in the input be represented syntactically, while Mandarin requires no syntactic representation in this case. In order to make the correct target form the winner, FDR must be ranked higher than *FunctN, so we obtain FDR >> *FunctN from this candidate set for correct English expressions of singular generics.
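The two tableaux can be replayed with a small sketch: the same candidate set evaluated under the English ranking (FDR >> *FunctN) and the Mandarin ranking (*FunctN >> FDR). The violation counts follow the discussion above; the evaluation function is our simplification of EVAL.

```python
# The same input (singular generic, [-SR, +HK]) and candidate set,
# evaluated under the English and the Mandarin rankings of FDR and
# *FunctN, as in Tableaux 1 and 2.

def optimal(candidates, ranking):
    survivors = list(candidates)
    for constraint in ranking:
        fewest = min(v.get(constraint, 0) for _, v in survivors)
        survivors = [(f, v) for f, v in survivors
                     if v.get(constraint, 0) == fewest]
    return survivors[0][0]

candidates = [
    ("The horse has four legs.", {"*FunctN": 1}),  # determiner = one functional layer
    ("Horse has four legs.",     {"FDR": 1}),      # bare NP leaves the referent unparsed
]

print(optimal(candidates, ["FDR", "*FunctN"]))  # English: The horse has four legs.
print(optimal(candidates, ["*FunctN", "FDR"]))  # Mandarin: Horse has four legs.
```

The same input and the same candidates yield different winners purely because the two constraints are ranked differently, which is exactly the cross-linguistic variation OT predicts.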
  • After analyzing the candidate sets in all semantic contexts, we can sum up the constraint ranking results for English. In order to map correct article forms in English to meanings, and to make sure that correct noun forms in terms of morphological number marking are used, the constraints must be ranked as in the ranking shown. However, because of the influence of the constraint ranking in Mandarin, Mandarin-speaking learners may rank *FunctN higher than FDR and *PLMorph higher than FPL, as these constraints are ranked in their native language, which results in omission errors.
  • Therefore, it is possible to apply the EVAL process of OT-LFG as the theoretical basis and practical guidance for building an online grammar check system.

    1. ERROR DETECTION AND FEEDBACK WITH OT-LFG FOR COMPUTER-ASSISTED LANGUAGE LEARNING
       Yuxiu Hu & Adams Bodomo
       May 10, 2013
    2. Outline
       • Introduction
       • OT-LFG
       • A case study
    3. Introduction
       • In the age of IT, marked by an increasing use of computers in language learning, an effective grammar check system is highly expected.
       • Grammar checking is one of the most common natural language processing technologies used by the general public, yet little research has been done on it (Clement, Gerdes & Marlet, 2011).
    4. Introduction
       • Optimality Theory (OT) has been applied in computational phonology, and it has been shown that the basic assumptions and methods of OT can be formalized and implemented to the benefit of both research areas (Kager, 1999; Kuhn, 2003; Ma, 2001; Ma & Wang, 2008).
    5. OT-LFG
       • OT-LFG is the combination of Optimality Theory (OT) and Lexical Functional Grammar (LFG).
       • LFG is used as the representational basis of OT-LFG. The basic idea of OT is that variation across languages is a result of competition among a set of universal and violable constraints.
    6. Lam (2004:4)
    7. Figure 1: Schematic representation of OT basic assumptions (Kager, 1999:8)
    8. Mini candidate sets
       • “Finding the right candidates to study may be the hardest but also useful practical skill in doing OT” (McCarthy, 2002:34)
       • “OT is inherently comparative; The grammaticality of a structural description is determined not in isolation, but with respect to competing candidates.” (Tesar & Smolensky, 1998: 238)
    9. Error-driven Constraint Demotion
       • Input → Output: target parse p vs. the learner’s own parse p’
       • Therefore, it is natural to conclude that an informative winner/loser pair would be composed of a target parse p and a learner’s wrong parse p’ of the same input I.
    10. The Case Study
    11. Features in OT-LFG inputs
       • The input in OT-LFG is a meaning, and candidates are forms/syntactic structures (Fikkert & Hoop, 2009). The features in the input that will be used in this study are listed and explained below:
         - “REF-NUM” is used for the semantic referent’s number specification.
         - “REF-SR” refers to the specificity status of the semantic referent.
         - “REF-HK” is used to specify whether the listener has knowledge of the semantic referent, as assumed by the speaker.
       • There are two senses of Number in this study. In the input, REF-NUM refers to semantic number, while the number annotation in the output is a syntactic feature.
    12. Illustration
       1. [-SR, +HK]
          i. Omission
             a. The horse has four legs.
             b. *Horse has four legs.
       • This is a mini candidate set.
    13. Constraints
       • *FunctN: Avoid functional structure in the nominal domain (de Swart & Zwarts, 2008). Each functional projection in the noun phrase incurs a violation of this constraint. If this constraint is ranked higher than faithfulness constraints that require expression of meanings in functional layers above NP, then there would be no determiners. Therefore, in Mandarin this constraint should outrank the faithfulness constraints that are involved. Contrary to Mandarin, the faithfulness constraints should be ranked high in English.
    14. • FDR: Parse a discourse referent by means of a functional layer above NP (de Swart & Zwarts, 2008). This faithfulness constraint requires that a discourse referent be parsed by means of a functional layer above an NP. This faithfulness constraint and the constraint *FunctN conflict in deciding the existence of functional layers above NPs.
    15. Constraints ranking in English
       • (↑Num)=SG – FunctN >> *Def/[-Fam] >> FDR >> *FunctN, FPL >> *PLMorph
    16. Conclusion
       • OT-LFG can provide a successful and clear account of linguistic issues that concern interface analyses, like the interface analysis of syntax and semantics in the acquisition of English articles.
       • OT-LFG could provide a theoretical basis for investigating language learning through different rankings of the same universal constraints.
    17. • Therefore, it is possible to apply the EVAL process of OT-LFG as the theoretical basis and practical guidance for building an online grammar check system.
    18. The End. Thank you!
    19. References
       • Clement, L., Gerdes, K. & Marlet, R. 2011. A Grammar Correction Algorithm: Deep Parsing and Minimal Corrections for a Grammar Checker. LNAI 5591, pp. 47-63. Springer-Verlag Berlin Heidelberg.
       • de Swart, H. & Zwarts, J. 2008. Article Use Across Languages: An OT Typology. Proceedings of SuB12, Oslo: ILOS 2008, 628-644.
       • Fikkert, P. & Hoop, H. D. 2009. Language Acquisition in Optimality Theory. Linguistics, 47(2), 311-357.
       • Kager, R. 1999. Optimality Theory. Cambridge, New York: Cambridge University Press.
       • Kuhn, J. 2003. Optimality-Theoretic Syntax: A Declarative Approach. Stanford: CSLI Publications, Centre for the Study of Language and Information.
       • Lam, O. 2004. Aspects of the Cantonese Verb Phrase: Order and Rank. Unpublished MPhil Dissertation, The University of Hong Kong, Hong Kong.
       • Ma, Q. W. 2008. Optimality Theory. Shanghai Education Press.
       • Ma, Q. W. & Wang, H. M. 2008. The Future and Development of Optimality Theory. Contemporary Linguistics, 3: 237-245.
       • McCarthy, J. J. 2002. A Thematic Guide to Optimality Theory. Cambridge: Cambridge University Press.
       • Tesar, B. & Smolensky, P. 1998. Learnability in Optimality Theory. Linguistic Inquiry, 29(2), 229-268.
