
Robust Text Correction for Grammar and Fluency (15th STAIR Lab Artificial Intelligence Seminar)



  1. 1. Robust Text Correction for Grammar and Fluency Keisuke Sakaguchi 1
  2. 2. Natural Language Processing Algorithm Algorithm Algorithm Algorithm Algorithm … POS Parse Sentiment Paraphrase Translation … 2
  3. 3. Natural Language Processing Algorithm Algorithm Algorithm Algorithm Algorithm … POS Parse Sentiment Paraphrase Translation … 3
  4. 4. 4
  5. 5. 5
  6. 6. Noisy Text is Everywhere 6
  7. 7. Human brain vs. Computers 7
  8. 8. Outline Robust Text Correction for Grammar and Fluency 1. Character-level 2. Word-level 3. Sentence (phrase)-level 8
  9. 9. Outline Robust Text Correction for Grammar and Fluency 1. Character-level 2. Word-level 3. Sentence (phrase)-level 9
  10. 10. 1. Character-level robust processing Robsut Wrod Reocginiton via semi-Character Recurrent Neural Network. (AAAI 2017) Keisuke Sakaguchi, Kevin Duh, Matt Post, Benjamin Van Durme 10
  11. 11. Aoccdrnig to a rscheearch at Cmabrigde Uinervtisy, it deosn’t mttaer in waht oredr the ltteers in a wrod are, the olny iprmoetnt tihng is taht the frist and lsat ltteer be at the rghit pclae. The rset can be a toatl mses and you can sitll raed it wouthit porbelm. Tihs is bcuseae the huamn mnid deos not raed ervey lteter by istlef, but the wrod as a wlohe. (Cambridge University Effect: Davis 2003) 11
  12. 12. 12
  13. 13. Aoccdrnig to a rscheearch at Cmabrigde Uinervtisy, it deosn’t mttaer in waht oredr the ltteers in a wrod are, the olny iprmoetnt tihng is taht the frist and lsat ltteer be at the rghit pclae. The rset can be a toatl mses and you can sitll raed it wouthit porbelm. Tihs is bcuseae the huamn mnid deos not raed ervey lteter by istlef, but the wrod as a wlohe. (Cambridge University Effect: Davis 2003) Human (brain) is good at dealing with noisy input robustly. 13
  14. 14. Aoccdrnig to a rscheearch at Cmabrigde Uinervtisy, it deosn’t mttaer in waht oredr the ltteers in a wrod are, the olny iprmoetnt tihng is taht the frist and lsat ltteer be at the rghit pclae. The rset can be a toatl mses and you can sitll raed it wouthit porbelm. Tihs is bcuseae the huamn mnid deos not raed ervey lteter by istlef, but the wrod as a wlohe. (Cambridge University Effect: Davis 2003) Question: Can we build a computational model which replicates this robust mechanism? Human (brain) is good at dealing with noisy input robustly. 14
  15. 15. Masked priming study Swap (Forster et al. 1987) gadren-GARDEN Shuffle (Perea and Lupker. 2004) caniso-CASINO Delete (Humphreys et al. 1990) blck-BLACK Insert (Van Assche and Grainger. 2006) juastice-JUSTICE 15
  16. 16. Aoccdrnig to a rscheearch at Cmabrigde Uinervtisy, it deosn’t mttaer in waht oredr the ltteers in a wrod are, the olny iprmoetnt tihng is taht the frist and lsat ltteer be at the rghit pclae. The rset can be a toatl mses and you can sitll raed it wouthit porbelm. Tihs is bcuseae the huamn mnid deos not raed ervey lteter by istlef, but the wrod as a wlohe. (Cambridge University Effect: Davis 2003) Question: Can we build a computational model which replicates this robust mechanism? Human (brain) is good at dealing with noisy input robustly. 16
  17. 17. semi-Character RNN (scRNN) Simple RNN except … x_n = [b_n; i_n; e_n] e.g., “University” is represented as b_n = {u = 1}, e_n = {y = 1}, i_n = {e = 1, i = 2, n = 1, r = 1, s = 1, t = 1, v = 1} 17
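The semi-character representation above can be sketched in a few lines of Python. This is my own illustration, not the authors' released code, and it assumes a lowercase a–z alphabet:

```python
import string

ALPHABET = string.ascii_lowercase  # assumption: case-insensitive a-z only

def char_onehot(c):
    """One-hot vector for a single character."""
    v = [0] * len(ALPHABET)
    v[ALPHABET.index(c.lower())] = 1
    return v

def char_bag(chars):
    """Bag-of-characters count vector (character order is discarded)."""
    v = [0] * len(ALPHABET)
    for c in chars:
        v[ALPHABET.index(c.lower())] += 1
    return v

def semi_char_vector(word):
    """x_n = [b_n; i_n; e_n]: one-hot first char, bag of internal
    chars, one-hot last char, concatenated."""
    return (char_onehot(word[0])      # b_n
            + char_bag(word[1:-1])    # i_n
            + char_onehot(word[-1]))  # e_n

# "University": b = {u:1}, i = {e:1, i:2, n:1, r:1, s:1, t:1, v:1}, e = {y:1}
vec = semi_char_vector("University")
```

Because the internal characters enter only as a bag, any internal jumbling of a word maps to exactly the same input vector, which is what makes the model robust to that noise type by construction.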
  18. 18. Exp1: Spelling Correction Corpus: Penn TreeBank, (10k vocabulary) Parameters: - hidden unit size: 650 - mini-batch size 20 Comparison: - Enchant - 2 commercial products - char aware neural LM (Kim et al., 2016, AAAI) 18
  19. 19. Exp1: Spelling Correction Three conditions in test time - Jumble (Cambridge → Cmbarigde) - Delete (Cambridge → Camridge) - Insert (Cambridge → Cambpridge) Results (accuracy): Jumble Delete Insert CharCNN 16.2 19.8 35.5 Enchant 57.6 35.4 89.6 Commercial A 54.8 60.2 93.5 Commercial B 54.3 71.7 73.5 scRNN 99.4 85.6 97.0 19
  20. 20. Exp1: Spelling Correction Three conditions in test time - Jumble (Cambridge → Cmbarigde) - Delete (Cambridge → Camridge) - Insert (Cambridge → Cambpridge) Results (accuracy): Jumble Delete Insert CharCNN 16.2 19.8 35.5 Enchant 57.6 35.4 89.6 Commercial A 54.8 60.2 93.5 Commercial B 54.3 71.7 73.5 scRNN 99.4 85.6 97.0 place → pace miss, mass, mess → mss 20
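The three test-time corruptions can be generated with small helpers like the following (function names are mine; this is a sketch of the conditions, not the experiment code). Note that Jumble, Delete, and Insert all leave the first and last characters in place, matching the masked-priming conditions:

```python
import random

LOWER = "abcdefghijklmnopqrstuvwxyz"

def jumble(word, rng):
    """Shuffle internal characters; first and last characters stay fixed."""
    if len(word) <= 3:
        return word
    mid = list(word[1:-1])
    rng.shuffle(mid)
    return word[0] + "".join(mid) + word[-1]

def delete(word, rng):
    """Drop one internal character."""
    if len(word) <= 3:
        return word
    i = rng.randrange(1, len(word) - 1)
    return word[:i] + word[i + 1:]

def insert(word, rng):
    """Insert a random letter at an internal position."""
    if len(word) <= 2:
        return word
    i = rng.randrange(1, len(word))
    return word[:i] + rng.choice(LOWER) + word[i:]

rng = random.Random(0)
noisy = jumble("Cambridge", rng)  # internal letters shuffled
```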
  21. 21. Exp2: Comparison with eye tracking 21
  22. 22. Eye tracking study 22
  23. 23. Eye tracking studyCondition Example #fixation Regression (%) Avg.Fixation (ms) N The boy could not solve the problem so he asked for help. INT The boy cuold not slove the probelm so he aksed for help. END The boy coudl not solev the problme so he askde for help. BEG The boy oculd not oslve the rpoblem so he saked for help. Rayner et al. (2006) 23
  24. 24. Eye tracking studyCondition Example #fixation Regression (%) Avg.Fixation (ms) N The boy could not solve the problem so he asked for help. 10.4 15.0 236 INT The boy cuold not slove the probelm so he aksed for help. 11.4* 17.6* 244* END The boy coudl not solev the problme so he askde for help. 12.6† 17.5* 246† BEG The boy oculd not oslve the rpoblem so he saked for help. 13.0‡ 21.5† 259‡ Rayner et al. (2006) p<0.01 respectively 24
  25. 25. Eye tracking studyCondition Example #fixation Regression (%) Avg.Fixation (ms) N The boy could not solve the problem so he asked for help. 10.4 15.0 236 INT The boy cuold not slove the probelm so he aksed for help. 11.4* 17.6* 244* END The boy coudl not solev the problme so he askde for help. 12.6† 17.5* 246† BEG The boy oculd not oslve the rpoblem so he saked for help. 13.0‡ 21.5† 259‡ Rayner et al. (2006) Reading difficulty: N < INT ≤ END < BEG p<0.01 respectively 25
  26. 26. Exp2: Comparison with eye tracking Reading difficulty (human) : N < INT ≤ END < BEG Trained and tested with 4 conditions: INT: same as the exp.1 BEG: last char is fixed END: first char is fixed ALL: bag of characters 26
  27. 27. Exp2: Comparison with eye tracking Reading difficulty (human) : N < INT ≤ END < BEG Condition Example accuracy INT As a relust , the lnik beewetn the fureuts and sctok mretkas rpiped arapt . 98.96 END As a rtelus , the lkni betwene the feturus and soctk msatrek rpepid atarp . 98.68* BEG As a lesurt , the lnik bweteen the utufers and tocsk makrtes pipred arpat . 98.12† ALL As a strule , the lnik eewtneb the eftusur and okcst msretak ipdepr prtaa . 96.79‡ *: p = 0.07, †,‡: p<0.01 respectively 27
  28. 28. Exp2: Comparison with eye tracking Reading difficulty (human) : N < INT ≤ END < BEG Condition Example accuracy INT As a relust , the lnik beewetn the fureuts and sctok mretkas rpiped arapt . 98.96 END As a rtelus , the lkni betwene the feturus and soctk msatrek rpepid atarp . 98.68* BEG As a lesurt , the lnik bweteen the utufers and tocsk makrtes pipred arpat . 98.12† ALL As a strule , the lnik eewtneb the eftusur and okcst msretak ipdepr prtaa . 96.79‡ Reading difficulty (scRNN) : INT ≤ END < BEG < ALL *: p = 0.07, †,‡: p<0.01 respectively 28
  29. 29. Summary so far … 1. Huamn mnid deos not raed ervey lteter by istlef, but the wrod as a wlohe. 2. scRNN recognizes noisy words robustly. 3. There is a similarity between scRNN and human word recognition mechanism. Forward Mask (500 milliseconds) GARDEN gadren ######## Prime (60 milliseconds) Target 29
  30. 30. Outline Robust Text Correction for Grammar and Fluency 1. Character-level 2. Word-level 3. Sentence (phrase)-level 30
  31. 31. 2. Word-level robust processing Error-repair Dependency Parsing for Ungrammatical Texts (ACL 2017) Keisuke Sakaguchi, Matt Post, Benjamin Van Durme 31
  32. 32. Dependency Parsing Text à Tree (with labels) Economic news had little effect on financial markets . 32
  33. 33. Background & Motivation I look in forward hear from you. I look forward to hearing from you. Error correction ↓ Parsing Pipeline Error-repair parsing Joint training 33
  34. 34. Error-repair Dependency Parsing 1. Non-directional Easy-first parsing (Goldberg and Elhadad, 2010) 2. Three new actions to repair errors 34
  35. 35. Non-directional Easy-first Parsing a brown fox jumped with joy a brown joywith joy fox a brown 35
  36. 36. Non-directional Easy-first Parsing a brown fox jumped with joy a brown joywith joy fox a brown Pending List 36
  37. 37. Non-directional Easy-first Parsing a brown fox jumped with joy a brown joywith joy fox a brown ATTACHRIGHT(𝑖) ATTACHLEFT(𝑖) Iteratively take actions until a complete tree is built. 37
  38. 38. Non-directional Easy-first Parsing a brown fox jumped with joy a brown joywith joy fox a brown 38
  39. 39. Non-directional Easy-first Parsing ATTACHRIGHT a brown fox jumped with joy a brown joywith joy fox a brown 39
  40. 40. Non-directional Easy-first Parsing a a fox jumped with joy a brown joywith joy fox a brown 40
  41. 41. Non-directional Easy-first Parsing ATTACHRIGHT a a fox jumped with joy a brown joywith joy fox a brown 41
  42. 42. Non-directional Easy-first Parsing a brown fox jumped with joy a brown joywith joy fox a brown 42
  43. 43. Non-directional Easy-first Parsing ATTACHLEFT a brown fox jumped with joy a brown joywith joy fox a brown 43
  44. 44. Non-directional Easy-first Parsing a brown fox jumped with joy a brown joywith joy fox a brown 44
  45. 45. Non-directional Easy-first Parsing ATTACHLEFT a brown fox jumped with joy a brown joywith joy fox a brown 45
  46. 46. Non-directional Easy-first Parsing a brown fox jumped with joy a brown joywith joy fox a brown 46
  47. 47. Non-directional Easy-first Parsing ATTACHRIGHT a brown fox jumped with joy a brown joywith joy fox a brown 47
  48. 48. Non-directional Easy-first Parsing a brown fox jumped with joy a brown joywith joy fox a brown 48
  49. 49. Non-directional Easy-first Parsing a brown fox jumped with joy a brown joywith joy fox a brown root 49
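The walkthrough above can be condensed into a toy greedy loop: score ATTACHLEFT/ATTACHRIGHT between every adjacent pair in the pending list, apply the best-scoring action, and remove the attached child. This is my own simplification (a trained model supplies the scores from features; here they are hand-picked so the example sentence parses as on the slides):

```python
def easy_first_parse(tokens, score):
    """Greedy non-directional parsing over a pending list.
    score(action, left, right) -> float for the two actions between
    adjacent pending items; 'attach_right' makes the left item a child
    of the right one, 'attach_left' the reverse.  Toy version: tokens
    are assumed unique (a real parser tracks indices, not strings)."""
    pending = list(tokens)
    arcs = []  # (head, child) pairs
    while len(pending) > 1:
        action, i = max(
            ((a, j) for j in range(len(pending) - 1)
             for a in ("attach_right", "attach_left")),
            key=lambda ai: score(ai[0], pending[ai[1]], pending[ai[1] + 1]),
        )
        if action == "attach_right":
            head, child = pending[i + 1], pending[i]
        else:
            head, child = pending[i], pending[i + 1]
        arcs.append((head, child))
        pending.remove(child)  # attached child leaves the pending list
    return arcs, pending[0]   # arcs plus the remaining token = root

# Hand-crafted scores yielding a plausible parse of the example sentence.
PRIORITIES = {
    ("attach_right", "brown", "fox"): 5,
    ("attach_right", "a", "fox"): 4,
    ("attach_left", "with", "joy"): 3,
    ("attach_left", "jumped", "with"): 2,
    ("attach_right", "fox", "jumped"): 1,
}

def toy_score(action, left, right):
    return PRIORITIES.get((action, left, right), -1)

arcs, root = easy_first_parse("a brown fox jumped with joy".split(), toy_score)
```

The "easy-first" idea is visible in the priorities: confident local attachments (determiner and adjective to noun) are resolved before the harder, more global ones (subject and PP to the verb).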
  50. 50. Three new actions to repair errors SUBSTITUTE (𝑤%) replaces a token to another (grammatically more probable) token DELETE (𝑤%) removes an unnecessary token INSERT (𝑤%) inserts a new token at an index i. 50
  51. 51. Three new actions to repair errors I look in forward xhearx from you I youyou 51
  52. 52. I look in forward xhearx from you I youyou Three new actions to repair errors ATTACHRIGHT ATTACHLEFT 52
  53. 53. I look in forward xhearx from you I youyou Three new actions to repair errors SUBSTITUTE / DELETE / INSERT 53
  54. 54. ATTACHRIGHT I look in forward xhearx from you I youyou Three new actions to repair errors 54
  55. 55. I look in forward xhearx from you I youyou Three new actions to repair errors 55
  56. 56. ATTACHLEFT I look in forward xhearx from you I youyou Three new actions to repair errors 56
  57. 57. Three new actions to repair errors I look in forward xhearx from you I youyou 57
  58. 58. Three new actions to repair errors SUBSTITUTE I look in forward xhearx from you I youyou 58
  59. 59. Three new actions to repair errors I look in forward hearing from you I youyou 59
  60. 60. Three new actions to repair errors DELETE I look in forward hearing from you I youyou 60
  61. 61. Three new actions to repair errors I look forward hearing from from you I youyou 61
  62. 62. Three new actions to repair errors INSERT I look forward hearing from from you I youyou 62
  63. 63. Three new actions to repair errors I look forward to hearing from you I youyou 63
  64. 64. Three new actions to repair errors ATTACHLEFT I look forward to hearing from you I youyou 64
  65. 65. Three new actions to repair errors I look look to hearing from you I youyouI forward 65
  66. 66. We are ready to parse noisy texts … ? Wait!! The new actions may cause infinite loops. SUB → SUB → SUB → … INS → DEL → INS → DEL → ... 66
  67. 67. We are ready to parse noisy texts … ? Wait!! The new actions may cause infinite loops. SUB → SUB → SUB → … INS → DEL → INS → DEL → ... Heuristic constraints to avoid infinite loops 1. Limiting the number of new action operations 2. Substituted token cannot be substituted again 67
  68. 68. Training the parser Model learns which action to take at each time step. structured perceptron + learning with exploration (Goldberg and Nivre, 2013) features: basic linguistic features (Goldberg and Elhadad 2010) 68
  69. 69. Training the parser How to know which action is good (i.e., oracle, valid)? ATTACHLEFT & ATTACHRIGHT (Goldberg and Elhadad, 2010) 1. proposed edge is in the gold parse and 2. the child (to be attached) already has all its children SUBSTITUTE, DELETE, & INSERT 3. proposed action decreases the (word) edit distance to the gold (grammatical) sentence. 69
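Criterion 3 above can be made concrete: a repair action is a valid oracle move iff it strictly decreases the word-level edit distance to the gold sentence. A minimal sketch (my own illustration) using the slide's running example:

```python
def edit_distance(a, b):
    """Word-level Levenshtein distance via dynamic programming."""
    m, n = len(a), len(b)
    d = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(m + 1):
        d[i][0] = i
    for j in range(n + 1):
        d[0][j] = j
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,        # delete
                          d[i][j - 1] + 1,        # insert
                          d[i - 1][j - 1] + cost) # substitute / match
    return d[m][n]

def repair_is_valid(before, after, gold):
    """A SUBSTITUTE/DELETE/INSERT is a valid oracle action iff it
    strictly decreases the distance to the gold sentence."""
    return edit_distance(after, gold) < edit_distance(before, gold)

gold = "I look forward to hearing from you".split()
noisy = "I look in forward hear from you".split()
# DELETE("in") moves the sentence strictly closer to the gold:
step1 = "I look forward hear from you".split()
```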
  70. 70. Experiment 1 (simulated data) Dependency parsing on noisy Penn Treebank Errors injected similarly to Foster and Andersen (2009) 5 most frequent grammatical errors (CoNLL13) • Determiner (substitution, deletion, insertion) • Preposition (substitution, deletion, insertion) • Noun number (singular vs. plural) • Verb form (tense and aspect) • Subject verb agreement Eval: UAS by SParseval (Roark et al., 2006, Favre et al., 2010) Baseline: pipeline approach (error correction → parsing) 70
  71. 71. Result (Dependency: UAS) 71
  72. 72. Experiment 2 (real data) Grammaticality improvement on real ESL corpus Treebank of Learner English (Berzak et al., 2016) Grammaticality score (Heilman et al., 2014) Regression model with linguistic features 1 (incomprehensible) ~ 4 (perfect) 72
  73. 73. Result (Grammaticality on learner corpus) * * 73
  74. 74. Summary so far Error-repair Dependency Parsing 1. Non-directional Easy-first Parsing 2. Three new actions to repair errors Experimental results 1. more robust against grammatical errors 2. improves grammaticality I look in forward xhearx from you I youyou 74
  75. 75. Outline Robust Text Correction for Grammar and Fluency 1. Character-level 2. Word-level 3. Sentence (phrase)-level 75
  76. 76. 3. Sentence-level robust processing 3.3. Building a GEC model Grammatical Error Correction with Neural Reinforcement Learning (IJCNLP 2017) Keisuke Sakaguchi, Matt Post, Benjamin Van Durme 76
  77. 77. Grammatical Error Correction (GEC) Ungrammatical sentence Grammatical & Fluent sentence GEC algorithms 77
  78. 78. Grammatical Error Correction (GEC) Ungrammatical sentence Grammatical & Fluent sentence o Rule based model o Classifiers o Phrase-based MT o Neural MT 78
  79. 79. Grammatical Error Correction (GEC) Ungrammatical sentence Grammatical & Fluent sentence o Rule based model o Classifiers o Phrase-based MT o Neural MT 79
  80. 80. Neural MT for GEC (Encoder-decoder with attention) x_1 x_2 ・・・ x_{S-1} x_S Encoder 80
  81. 81. Neural MT for GEC (Encoder-decoder with attention) x_1 x_2 ・・・ x_{S-1} x_S NULL y_1 Encoder Decoder 81
  82. 82. Neural MT for GEC (Encoder-decoder with attention) x_1 x_2 ・・・ x_{S-1} x_S + NULL y_1 y_2 Encoder Decoder 82
  83. 83. Neural MT for GEC (Encoder-decoder with attention) x_1 x_2 ・・・ x_{S-1} x_S + NULL y_1 y_2 ・・・ y_{T-1} y_T Encoder Decoder 83
  84. 84. Neural MT for GEC (Encoder-decoder with attention) Training objective: Maximum Likelihood Estimation log p(y_1) ・・・ log p(y_{T-1}) log p(y_T) (gold labels) NULL Decoder 84
  85. 85. Two Drawbacks in MLE #1 Word level optimization (not sentence-level) log p(y_1) ・・・ log p(y_{T-1}) log p(y_T) (gold labels) NULL Decoder 85
  86. 86. Two Drawbacks in MLE #2 Exposure Bias (gold in training, argmax in test) (gold labels) NULL Predicted word (might be erroneous) is fed during test time. y'_1 = y_1, y'_2 vs. y_2, ・・・, y'_{T-1} vs. y_{T-1}, y'_T vs. y_T Decoder 86
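The exposure-bias contrast can be made explicit with two tiny loops: under MLE the decoder is conditioned on the gold prefix, while at test time it is conditioned on its own predictions, so one early mistake changes every later context. A toy illustration (the bigram probabilities are mine, purely for demonstration):

```python
import math

def mle_loss(step_log_prob, gold):
    """Teacher forcing: each step is conditioned on the *gold* prefix."""
    return -sum(step_log_prob(gold[:t], y) for t, y in enumerate(gold))

def greedy_decode(step_log_prob, vocab, max_len=10):
    """Test time: each step is conditioned on previous *predictions*."""
    out = []
    for _ in range(max_len):
        y = max(vocab, key=lambda w: step_log_prob(out, w))
        if y == "</s>":
            break
        out.append(y)
    return out

# Toy bigram "decoder" standing in for the neural model.
BIGRAM = {("<s>", "i"): 0.9, ("<s>", "you"): 0.1,
          ("i", "look"): 0.8, ("i", "</s>"): 0.2,
          ("look", "</s>"): 1.0, ("you", "</s>"): 1.0}
VOCAB = ["i", "you", "look", "</s>"]

def step_log_prob(prefix, y):
    prev = prefix[-1] if prefix else "<s>"
    return math.log(BIGRAM.get((prev, y), 1e-9))
```

The MLE loss sums per-word log-probabilities (drawback #1: word-level, not sentence-level), and the training loop never sees the decoder's own, possibly erroneous, outputs (drawback #2).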
  87. 87. Sentence level (direct) optimization Decode a sentence and compute the score Decoder 87
  88. 88. Sentence level (direct) optimization . . . . . . Maximize the expected reward (metric score) Decoder 88
  89. 89. REINFORCE (Williams, 1992) Maximize the expected reward (metric score): ∇_θ E[R] ≈ α (R(ŷ) − b) ∇_θ log p(ŷ | x), with learning rate α and (arbitrary) baseline b 89
  90. 90. REINFORCE (Williams, 1992) Maximize the expected reward (metric score) Learning Rate Relevance to Minimum Risk Training in NMT: Learning rate α in REINFORCE corresponds to the smoothing parameter in MRT. See the appendix. 90
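A minimal numeric sketch of the REINFORCE estimator used here (my own toy, not the paper's implementation): sample outputs, score each with a sentence-level reward, subtract a baseline, and weight the log-likelihood gradient by the centered reward. The "model" is a one-parameter policy so the whole update fits in a few lines:

```python
import math
import random

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def reinforce_step(theta, rng, n_samples=20, baseline=0.5, lr=0.1):
    """One REINFORCE update on a one-parameter toy policy that emits
    output 1 (reward 1) with prob sigmoid(theta), else 0 (reward 0).
    d/dtheta log p(y) is (1 - p) for y = 1 and -p for y = 0."""
    p = sigmoid(theta)
    grad = 0.0
    for _ in range(n_samples):
        y = 1 if rng.random() < p else 0
        reward = float(y)                  # stand-in for the GLEU score
        dlogp = (1 - p) if y == 1 else -p
        grad += (reward - baseline) * dlogp
    return theta + lr * grad / n_samples   # gradient ascent on E[R]

rng = random.Random(0)
theta = 0.0
for _ in range(200):
    theta = reinforce_step(theta, rng)
# After training, sigmoid(theta) is well above 0.5: the policy has
# learned to prefer the high-reward output.
```

In the paper's setting the samples are decoded correction candidates, the reward is the sentence-level GLEU score, and the gradient flows through the full encoder-decoder; the estimator itself has exactly this shape.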
  91. 91. Experiment Data: Training: Cambridge Learner Corpus (FCE) NUCLE Corpus Lang8 Corpus Dev & Test: JFLEG Corpus Model (hyper-)parameters: Embedding: 512, Hidden: 1000, Dropout: 0.2, (for NRL) Sample size: 20, warm start: after 600k updates in MLE Metric (= score, reward): GLEU (Napoles et al., 2015) 91
  92. 92. Results (GLEU on JFLEG; systems: SRC, CAMB14, NUS, AMU, CAMB16, MLE, NRL, Human) SRC 40.5 92
  93. 93. Results SRC 40.5 PBMT 46.0~51.4 93
  94. 94. Results SRC 40.5 PBMT 46.0~51.4 NMT (MLE) 52.0~52.7 94
  95. 95. Results SRC 40.5 PBMT 46.0~51.4 NMT (MLE) 52.0~52.7 NMT (NRL) 53.9 95
  96. 96. Results SRC 40.5 PBMT 46.0~51.4 NMT (MLE) 52.0~52.7 NMT (NRL) 53.9 Human 62.3 96
  97. 97. Summary so far… Grammatical Error Correction with NRL ✓ Sentence-level objective. ✓ Direct optimization toward the metric. ✓ NRL > Maximum Likelihood Estimation 97
  98. 98. Conclusions Robust Text Correction for Grammar and Fluency 1. Character-level 2. Word-level 3. Sentence (phrase)-level I look in forward xhearx from you I youyou Fluency 98
  99. 99. Thnaks for yuor atentoin!! 99
