
Grammarly AI-NLP Club #4 - Understanding and assessing language with neural network models - Marek Rei

Speaker: Marek Rei, Senior Research Associate, University of Cambridge

Summary: The number of people learning English around the world is currently estimated at 1.5 billion and is predicted to exceed 1.9 billion by 2020. The increasing need to communicate beyond borders has created a large unmet demand for qualified language teachers across the globe. Computational models for error detection and essay scoring can alleviate this issue by giving millions of people access to affordable learning resources. Successful systems for automated language teaching will need to analyse language at various levels of granularity and provide useful feedback to individual students. In this talk, we will explore some of the latest approaches to written language assessment, using neural architectures for composing the meaning of a sentence or text, and also discuss potential future directions in the field.

Published in: Technology


  1. Understanding and Assessing Language with Neural Network Models - Marek Rei
  2. Automated Language Assessment
     The number of people learning English around the world is currently estimated at 1.5 billion and is predicted to exceed 1.9 billion by 2020.
     Advantages for students:
     • Immediate grades and feedback
     • Enables self-assessment and self-tutoring
     • Constant availability as an online tool
     Advantages for teachers/examiners:
     • Reduced teacher/examiner workload
     • Can focus on more interesting or difficult content
     • Cost-effective approach to assessment
  3. Automated Language Assessment
     Example learner letter (errors preserved):
     "Dear Mrs Brown, I am writing you because my class want to give a surprise birthday party for your husband Mr Brown. We need your help for the details. First of all could you let us know if the date of June 16th is all right with his timetable program. We have organised to do the party between three to six o'clock in afternoon in College Canteen, about food we organised a buffet, but could you also help us with the music which he prefer, if prefer something especialy. We have invite the student, the teachers and the Principal of school but we appreciate if you are coming. At last would you tell us which is the best present for him a compact disk or a book . We want say thanks again for your help and you must be sure that your opinion it would be valuable to us. I am looking forward to receiving your answer and don't forget that it is a surprice birthday party. Yours faithfuly, Tom"
     Evaluation:
     • Detect any writing errors
     • Calculate a holistic writing score
     • Predict language proficiency score (IELTS, FCE)
     • Detailed analytic scores (e.g., coherence, topic relevance)
     Guidance:
     • Show detailed progress reports
     • Provide corrections for errors
     • Suggest areas to focus on
     • Generate suitable exercises
  4. Talk Overview
     01 Error Detection: identifying the locations of grammatical errors
     02 Error Correction: providing an edited version of an incorrect sentence
     03 Essay Scoring: estimating a language proficiency score based on the full text
     04 Applications and Future Directions: how do we make this useful and where do we go next
  5. DTAL + Engineering + Cambridge English
  6. Error Detection
  7. Error Detection in Learner Writing
     Example: I want to thak you for preparing such a nice evening .
  8. Error Types in Learner Writing
  9. Error Detection in Learner Writing
     • Spelling error (8.6%): I want to thak you for preparing such a nice evening .
     • Missing punctuation (7.4%): I know how to cook some things like potatoes .
     • Incorrect punctuation (7.1%): If you have time , why don’t you meet up .
     • Incorrect preposition (6.3%): I’m looking forward to seeing you and good luck to your project .
     • Verb tense error (6.0%): My friend eats two ice creams yesterday .
  10. Error Detection in Learner Writing
      • Word order error (2.8%): We can invite also people who are not members .
      • Verb agreement error (1.6%): The main material that have been used is dark green glass .
      • Spelling error produces a valid word (1.5%): I thing you should better save your money .
      • Incorrectly reproduced idiom (0.5%): And at last but not the least , Captain Davidson showed him ...
      • Complex error (0.5%): Specially the old castle Wawel's great .
  11. Automated Error Detection
      1. Experts hand-annotate a large dataset of learner essays, marking the location of each error.
      2. We create algorithms that look at all these examples and discover regularities through machine learning.
      3. We apply the resulting models to new data, where they provide predictions.
  12. Deep Learning and Neural Networks
      • Highly-connected networks of parameters
      • Randomly initialised, but optimised for a specific task during training
      • Automatically discover features that are useful for the task
      • Each layer is a function of the previous layer
      • Have achieved state-of-the-art results on nearly all language processing tasks
  13. Neural Error Detection
      • Composing words into context-specific representations.
      • Predicting a probability distribution over all the possible labels for each word.
      Marek Rei and Helen Yannakoudakis (2016). Compositional Sequence Labeling Models for Error Detection in Learner Writing. ACL 2016.
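The per-token prediction step on this slide can be illustrated with a tiny sketch: each token's context-specific representation is mapped to scores for the possible labels, and a softmax turns those scores into a probability distribution. The two-label setup and the per-token scores below are illustrative, not the model's actual outputs.

```python
import math

def softmax(scores):
    """Map a list of raw scores to a probability distribution."""
    exps = [math.exp(s) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

# Toy per-token scores over labels [correct, incorrect] for two tokens
# of "I want to thak you ..." (numbers are made up for illustration).
p_thak = softmax([0.5, 2.0])  # misspelled token: high "incorrect" score
p_you = softmax([2.0, 0.5])   # correct token: high "correct" score
print(p_thak[1] > 0.5, p_you[1] < 0.5)  # → True True
```

In the real model the scores come from a bidirectional LSTM over the whole sentence, so each token's distribution depends on its context, not just the word itself.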
  14. Neural Error Detection
      First Certificate in English dataset (FCE, Yannakoudakis et al. (2011)):
      • 1,141 manually annotated essays, containing 450K words
      • Written by learners during language examinations
      • In response to prompts eliciting free-text answers
      • Publicly available dataset
      Evaluating error detection using F0.5:
      System    FCE     CoNLL14-1   CoNLL14-2
      BiLSTM    41.10   16.40       23.90
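The F0.5 metric used in these evaluations weights precision twice as heavily as recall, on the reasoning that flagging correct text as an error is more harmful to a learner than missing an error. A minimal sketch of the computation:

```python
def f_beta(precision, recall, beta=0.5):
    """F-beta score; beta < 1 weights precision more heavily than recall."""
    if precision == 0 and recall == 0:
        return 0.0
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)

# A detector with decent precision but low recall still scores reasonably:
print(round(f_beta(0.6, 0.2), 4))  # → 0.4286
```

With beta = 1 this reduces to the familiar F1; error detection work standardised on F0.5 for the reason above.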
  15. Additional Training Data
      System        FCE     CoNLL14-1   CoNLL14-2
      Public FCE    41.10   16.40       23.90
      Private CLC   64.30   34.30       44.00
      More data = better performance.
      We can generate artificial data as additional training examples for error detection.
      Idea 1: Randomly generate errors in correct text
  16. Pattern-based Error Generation
      Idea 2: Extract known error patterns and insert them into correct text
      We went shop on Saturday ↔ We went shopping on Saturday
      Pattern: VVD shop_VV0 II => VVD shopping_VVG II
      Applied to new text: I was shopping on Monday → I was shop on Monday
      Marek Rei, Mariano Felice, Zheng Yuan and Ted Briscoe (2017). Artificial Error Generation with Machine Translation and Syntactic Patterns. BEA 2017.
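A toy version of this idea: error patterns extracted from annotated learner data are keyed on a POS-tag context; when the context matches in correct text, the word is replaced by its erroneous form. The tag set is CLAWS-style as on the slide, but the pattern storage and helper names here are illustrative (and the pattern is applied in the generation direction, correct → incorrect).

```python
# Pattern from the slide, reversed for generation:
# (left tag, word_tag, right tag) -> erroneous replacement word
ERROR_PATTERNS = {
    ("VVD", "shopping_VVG", "II"): "shop",
}

def inject_errors(tagged_sentence):
    """tagged_sentence: list of (word, pos) pairs; returns corrupted words."""
    words = [w for w, _ in tagged_sentence]
    tags = [t for _, t in tagged_sentence]
    out = words[:]
    for i in range(1, len(words) - 1):
        key = (tags[i - 1], f"{words[i]}_{tags[i]}", tags[i + 1])
        if key in ERROR_PATTERNS:
            out[i] = ERROR_PATTERNS[key]
    return out

sent = [("We", "PPIS2"), ("went", "VVD"), ("shopping", "VVG"),
        ("on", "II"), ("Saturday", "NPD1")]
print(" ".join(inject_errors(sent)))  # → We went shop on Saturday
```

In practice many thousands of such patterns are extracted, and they fire on any correct text whose POS context matches, yielding large amounts of realistic artificial training data.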
  17. Translation-based Error Generation
      Idea 3: Train a machine translation model to translate from correct to incorrect text
      Normally we translate between languages (e.g. English to French); here we translate English to faulty English, so off-the-shelf machine translation tools can be used.
      ORIG: We are a well-mixed class with equal numbers of boys and girls, all about 20 years old.
      PAT: We are a well-mixed class with equal numbers of boys an girls, all about 20 year old.
      MT: We are a well-mixed class with equals numbers of boys and girls, all about 20 years old.
      Marek Rei, Mariano Felice, Zheng Yuan and Ted Briscoe (2017). Artificial Error Generation with Machine Translation and Syntactic Patterns. BEA 2017.
  18. Artificial Error Generation
      Training on 450K words of annotated data and 4.5M words of automatically generated data.
      System    FCE     CoNLL14-1   CoNLL14-2
      BiLSTM    41.10   16.40       23.90
      +PAT      47.81   19.47       28.49
      +MT       48.37   19.73       28.39
      +PAT+MT   49.11   21.87       30.13
  19. Error Correction
  20. Error Correction
      Error detection identifies incorrect words; error correction modifies a sentence to remove errors.
      We can formulate correction as a machine translation problem: translate from incorrect English to correct English, returning the highest-scoring possible translation.
      Input: We can invite also people who are not members .
      Output: We can also invite people who are not members .
  21. Statistical Machine Translation
      • Text is separated into multi-word units (phrases)
      • Phrase alignments and translation tables are learned from parallel datasets
      • Language models are used to ensure reasonable output
  22. Neural Machine Translation
      • The encoder learns to process the source sentence and produce an informative vector representation
      • The decoder learns to generate a sentence in a different language based on that vector
      Bahdanau et al. (2014); figure by Stephen Merity.
  23. Handling Unknown Words
      Neural models have a limited fixed vocabulary and represent other words as OOV tokens.
      Input: I aren’t seen Albert since last summer .
      Output: I haven’t seen OOV since last summer .
      Solution:
      1) Align the words between the input and output text
      2) Translate OOV words in a post-processing step
      Zheng Yuan and Ted Briscoe (2016). Grammatical error correction using neural machine translation. NAACL 2016.
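The post-processing step can be sketched as follows: given word alignments between input and output, each OOV token in the output is replaced by the source word it aligns to, optionally after a lexicon lookup. The function and the alignment format below are illustrative, not the paper's actual code.

```python
def restore_oov(source, output, alignment, lexicon=None):
    """Replace OOV output tokens with their aligned source words.

    source, output: lists of tokens.
    alignment: dict mapping output position -> source position.
    lexicon: optional dict of known corrections for aligned source words.
    """
    lexicon = lexicon or {}
    restored = []
    for i, token in enumerate(output):
        if token == "OOV" and i in alignment:
            src_word = source[alignment[i]]
            restored.append(lexicon.get(src_word, src_word))  # copy through
        else:
            restored.append(token)
    return restored

src = "I aren't seen Albert since last summer .".split()
out = "I haven't seen OOV since last summer .".split()
print(" ".join(restore_oov(src, out, {3: 3})))
# → I haven't seen Albert since last summer .
```

Copying the source word through is a sensible default for error correction, since OOV tokens are usually names or rare words that should stay unchanged.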
  24. Neural Machine Translation
      System      FCE     CoNLL14
      SMT         52.90   37.33
      NMT+align   53.49   39.90
  25. N-best List
      Original sentence: There are some informations you have asked me about.
      SMT output:
      1st There are some information you have asked me about.
      2nd There is some information you have asked me about.
      3rd There are some information you asked me about.
      4th There are some information you have asked me.
      5th There are some information you have asked me for.
  26. Scoring Candidates
      The correction system may not know how to fix an error and therefore leaves it uncorrected.
      How can we use the detection model to fix this problem and assign a better score to each “translation”?
      Detection labels: The(+) theatre(+) restaurant(+) was(+) closed(+) for(+) unknown(-) reason(-)
  27. Scoring Candidates
      Detection probabilities of each token being correct:
      The(1.0) theatre(1.0) restaurant(1.0) was(0.9) closed(1.0) for(1.0) unknown(0.3) reason(0.1)
      1. Sentence correctness score: calculated from the probability of each of its tokens being correct.
      2. Correction recall score: select the translation that has modified the maximum number of words marked as incorrect by the detection model.
      3. Correction agreement score: the ratio of corrections that agree with the detection model to those that disagree.
      Helen Yannakoudakis, Marek Rei, Øistein E. Andersen and Zheng Yuan (2017). Neural Sequence-Labelling Models for Grammatical Error Correction. EMNLP 2017.
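The first of these scores can be sketched concretely: each n-best candidate is rescored by the detection model's per-token correctness probabilities, and the candidate with the best sentence-level score is chosen. Using the mean token probability is a simplification for illustration; the paper combines several scores, and the probabilities below are made up.

```python
def sentence_correctness(token_probs):
    """Mean probability of the tokens being correct (illustrative score)."""
    return sum(token_probs) / len(token_probs)

def rerank(candidates):
    """candidates: list of (sentence, token_probs); best-scoring first."""
    return sorted(candidates, key=lambda c: sentence_correctness(c[1]),
                  reverse=True)

nbest = [
    ("There are some information you have asked me about .",
     [0.9, 0.3, 0.8, 0.9, 0.9, 0.9, 0.9, 0.9, 0.9, 1.0]),
    ("There is some information you have asked me about .",
     [0.9, 0.9, 0.9, 0.9, 0.9, 0.9, 0.9, 0.9, 0.9, 1.0]),
]
print(rerank(nbest)[0][0])  # the candidate the detector finds most correct
```

Here the detector's low probability on "are" (subject-verb agreement with "information") pushes the fully corrected candidate to the top, even though the translation model originally ranked it second.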
  28. Neural Machine Translation
      System           FCE     CoNLL14
      SMT              52.90   37.33
      NMT+align        53.49   39.90
      Detect+correct   55.60   42.44
  29. Error Correction Results
      Original sentence: I work with children an the Computer help my Jop bat affeted to
      MT output: I work with children and the Computer help my Jop bat affeted to
      MT+detection output: I work with children and the computer helps my Jop bat affeted to
  30. Error Correction Results
      Original sentence: It takes 25 minutes that is convenient to us
      MT output: It takes 25 minutes that is convenient for us
      MT+detection output: It takes 25 minutes , which is convenient for us
  31. Error Correction Results
      Original sentence: I hope that our friend Richard Brown doesn’t have any serious willness
      MT output: I hope that our friend Richard Brown doesn’t have any serious willness
      MT+detection output: I hope that our friend Richard Brown doesn’t have any serious willingness
  32. Essay Scoring
  33. Essay Scoring
      Automatically assign a language proficiency score based on a freeform short essay.
  34. Feature-based Essay Scoring
      Extract a number of features:
      • Word sequences (unigrams, bigrams, trigrams)
      • Part-of-speech tags
      • Grammatical constructions
      • Complexity measures
      • Semantic similarity between sentences
      • Estimated error count
      Helen Yannakoudakis, Ted Briscoe and Ben Medlock (2011). A New Dataset and Method for Automatically Grading ESOL Texts. ACL 2011.
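The word-sequence features can be sketched as simple n-gram counts that a downstream regression or ranking model consumes. The feature-name format below is illustrative, not the paper's actual representation.

```python
from collections import Counter

def ngram_features(tokens, max_n=3):
    """Count unigram, bigram and trigram features for one essay."""
    feats = Counter()
    for n in range(1, max_n + 1):
        for i in range(len(tokens) - n + 1):
            feats["NGRAM=" + " ".join(tokens[i:i + n])] += 1
    return feats

feats = ngram_features("i want to thak you".split())
print(feats["NGRAM=to thak"])  # → 1
```

Sparse features like these are informative because sequences containing errors ("to thak") are rare in high-scoring essays, so a trained model learns to associate them with lower scores.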
  35. Feature-based Essay Scoring
      Features             Spearman (⍴) %
      Word sequences       59.8
      + POS tags           68.7
      + Syntax structure   72.2
      + Error rate         78.5
      Human-human          79.2
  36. Neural Essay Scoring
      • Bi-directional LSTM
      • Convolutional Network
  37. Score-specific Word Embeddings
      Optimising word embeddings to:
      1) differentiate between correct and randomly corrupted sequences
      2) predict the score of the essay where the current word sequence came from
      Then use these embeddings in a neural network for essay scoring.
      Dimitrios Alikaniotis, Helen Yannakoudakis and Marek Rei (2016). Automatic Text Scoring Using Neural Networks. ACL 2016.
  38. Score-specific Word Embeddings
      Evaluating score-specific word embeddings on the ASAP dataset: 13K marked essays (150-550 words each). Using a two-layer bi-directional LSTM for essay scoring.
      Pre-training   Spearman (⍴) %   RMSE
      None           68               7.31
      word2vec       79               3.2
      SSWE           91               2.4
  39. Error-specific Word Embeddings
      • Taking advantage of the available error annotation in the training data.
      • Optimising embeddings to detect real errors, as opposed to randomly corrupted sequences.
      • The network predicts the quality of each word sequence, based on the number of errors it contains.
      Youmna Farag, Marek Rei and Ted Briscoe (2017). An Error-Oriented Approach to Word Embedding Pre-Training. BEA 2017.
  40. Error-specific Word Embeddings
      Evaluating error-specific word embeddings on the FCE dataset. Using the convolutional network for essay scoring.
      Pre-training   Spearman (⍴) %   RMSE
      word2vec       56.7             4.9
      GloVe          51.8             5.2
      SSWE           58.3             4.9
      ESWE           63.7             4.5
  41. Future Directions
  42. Future Directions
      • Personalisation: generating exercises that are designed for a specific user
      • Automated tutoring: active teaching from an automated dialogue system
      • Speech: evaluating and providing feedback for spoken answers
  43. Future Directions
      • Specialised systems: supervised models targeting specific error types
      • Multi-task learning: taking better advantage of other tasks and datasets
      • Multi-modal topics: students writing about images or videos
  44. Summary
      01 Error detection: neural sequence labelling architecture; artificial data generation
      02 Error correction: neural machine translation; reranking with detection
      03 Essay scoring: feature-based model; neural essay scoring; score-specific word embeddings
  45. Thank you! Any questions?
