Ijcnlp17 sakaguchi

These slides present neural reinforcement learning (NRL) for grammatical error correction (GEC). The model is an encoder-decoder with attention. Such models are typically trained with maximum likelihood estimation (MLE), which has two drawbacks: it optimizes at the word level rather than the sentence level, and it suffers from exposure bias, because the decoder is fed gold tokens during training but its own predictions at test time. The slides propose reinforcement learning to directly maximize the expected sentence-level reward given by an evaluation metric. In the experiment, the NRL model reaches a GLEU score of 53.9, compared with 52.0~52.7 for models trained with MLE.

Grammatical Error Correction with
Neural Reinforcement Learning
IJCNLP 2017
Keisuke Sakaguchi, Matt Post, Benjamin Van Durme

Grammatical Error Correction (GEC)
Ungrammatical sentence → GEC algorithms → Grammatical & fluent sentence
GEC algorithms:
o Rule-based model
o Classifiers
o Phrase-based MT
o Neural MT

Neural MT for GEC (Encoder-decoder with attention)
[Figure: the encoder reads the source tokens x1, x2, ..., xS-1, xS. The decoder starts from NULL and emits y1, y2, ..., yT-1, yT one token at a time, attending to a combination (+) of the encoder states at each step.]
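To make the "+" in the figure concrete, here is a minimal sketch of an attention step: the decoder scores each encoder state, normalizes the scores with a softmax, and takes the weighted sum as its context. Dot-product scoring and the name `attention_context` are assumptions made for illustration; the slides do not say which scoring function the model uses.

```python
import math

def attention_context(decoder_state, encoder_states):
    """Return (weights, context) for one decoding step.

    Scoring is a plain dot product, which is an assumption of this sketch.
    """
    scores = [sum(d * e for d, e in zip(decoder_state, h)) for h in encoder_states]
    # Softmax over the scores (shifted by the max for numerical stability).
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    norm = sum(exps)
    weights = [e / norm for e in exps]
    # Context vector: attention-weighted sum of the encoder states.
    dim = len(encoder_states[0])
    context = [sum(w * h[i] for w, h in zip(weights, encoder_states)) for i in range(dim)]
    return weights, context

# Toy encoder states for a three-token source (hypothetical values).
encoder_states = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
# A zero decoder state scores every position equally, so the weights are uniform.
weights, context = attention_context([0.0, 0.0], encoder_states)
```

With a trained model the weights would be peaked on the source positions most relevant to the word being generated.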

Neural MT for GEC (Encoder-decoder with attention)
Training objective: Maximum Likelihood Estimation
[Figure: the decoder starts from NULL and is fed the gold label at each step; training maximizes $\log p(y_1)$, $\log p(y_2)$, ..., $\log p(y_{T-1})$, $\log p(y_T)$.]
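As a minimal sketch of this objective (not the authors' implementation), the snippet below sums the log-probabilities that a model assigns to the gold tokens, one term per decoding step. The function name `sentence_log_likelihood` and the toy distributions are illustrative assumptions.

```python
import math

def sentence_log_likelihood(step_distributions, gold_tokens):
    """Sum of log p(y_t) over the gold tokens, one distribution per step."""
    total = 0.0
    for dist, gold in zip(step_distributions, gold_tokens):
        total += math.log(dist[gold])
    return total

# Toy per-step output distributions (hypothetical values).
dists = [
    {"I": 0.9, "me": 0.1},
    {"like": 0.8, "likes": 0.2},
    {"fish": 0.5, "fishes": 0.5},
]
gold = ["I", "like", "fish"]

ll = sentence_log_likelihood(dists, gold)
mle_loss = -ll  # MLE training minimizes the negative log-likelihood
```

Each term depends only on the gold word at that position, which is why the objective is described as word-level.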

Two Drawbacks in MLE
#1 Word-level optimization (not sentence-level)
[Figure: the loss is a sum of per-word terms $\log p(y_1)$, $\log p(y_2)$, ..., $\log p(y_{T-1})$, $\log p(y_T)$, each computed against the gold label.]

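Two Drawbacks in MLE
#2 Exposure bias (gold in training, argmax in test)
At test time, the predicted word (which might be erroneous) is fed to the next step.
[Figure: decoder unrolled from NULL, showing the predictions y'1, y'2, ..., y'T-1, y'T alongside the gold labels y1, y2, ..., yT-1, yT, with y'1 = y1.]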
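The mismatch can be illustrated with a toy decoder. In this sketch the "model" is just a lookup table from the previous token to the next prediction (a hypothetical stand-in, not a neural decoder); the only difference between the two runs is whether the gold token or the model's own prediction is fed back.

```python
def decode(next_token, length, gold=None):
    """Greedy decoding from NULL.

    If `gold` is given, the gold token is fed back at each step (training
    condition); otherwise the model's own prediction is fed back (test condition).
    """
    prev, output = "NULL", []
    for t in range(length):
        pred = next_token(prev)
        output.append(pred)
        prev = gold[t] if gold is not None else pred
    return output

# Hypothetical toy model that makes one mistake: after "a" it predicts "x", not "b".
table = {"NULL": "a", "a": "x", "b": "c", "c": "d", "x": "x"}
gold = ["a", "b", "c", "d"]

teacher_forced = decode(table.get, 4, gold)  # ['a', 'x', 'c', 'd']
free_running = decode(table.get, 4)          # ['a', 'x', 'x', 'x']
```

When gold tokens are fed back the single mistake stays local, but when predictions are fed back the same mistake derails every later step, a condition the model never saw during MLE training.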

Reinforcement Learning
Sentence-level (direct) optimization
Maximize the expected reward (metric score)
[Figure: decoder diagram.]
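One common way to write "the expected reward" is shown below. The notation is assumed for illustration and is not taken from the slides: $\theta$ are the model parameters, $x$ is the source sentence, $y$ is the reference, $\hat{y}$ is a hypothesis sampled from the decoder, and $r$ is the sentence-level metric score.

```latex
J(\theta)
  = \mathbb{E}_{\hat{y} \sim p_\theta(\cdot \mid x)} \left[ r(\hat{y}, y) \right]
  = \sum_{\hat{y}} p_\theta(\hat{y} \mid x) \, r(\hat{y}, y)
```

The sum runs over all possible output sentences, so in practice it has to be approximated with a finite number of samples.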

REINFORCE (Williams, 1992)
Maximize the expected reward (metric score)
[Equation with labeled terms: learning rate, (arbitrary) baseline.]
Relevance to Minimum Risk Training (MRT) in NMT:
The learning rate in REINFORCE corresponds to the smoothing parameter in MRT. See the appendix.
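The sketch below shows the general shape of a REINFORCE update with a baseline, under strong simplifying assumptions: the policy is a softmax over a fixed list of candidate outputs rather than a neural decoder, the rewards are made-up numbers, and the baseline is the current expected reward (one possible choice among many). All names are hypothetical; only the sample size of 20 comes from the slides.

```python
import math
import random

def softmax(logits):
    top = max(logits)
    exps = [math.exp(v - top) for v in logits]
    norm = sum(exps)
    return [e / norm for e in exps]

def reinforce_gradient(logits, rewards, num_samples, baseline, rng):
    """Monte Carlo estimate of the gradient of the expected reward w.r.t. the logits."""
    probs = softmax(logits)
    grad = [0.0] * len(logits)
    for _ in range(num_samples):
        k = rng.choices(range(len(probs)), weights=probs)[0]  # sample a candidate
        advantage = rewards[k] - baseline
        for j in range(len(logits)):
            # d log p(k) / d logit_j = 1[j == k] - p_j for a softmax policy
            indicator = 1.0 if j == k else 0.0
            grad[j] += advantage * (indicator - probs[j]) / num_samples
    return grad

def train(rewards, steps=200, learning_rate=0.5, num_samples=20, seed=0):
    rng = random.Random(seed)
    logits = [0.0] * len(rewards)
    for _ in range(steps):
        probs = softmax(logits)
        baseline = sum(p * r for p, r in zip(probs, rewards))  # assumed baseline choice
        grad = reinforce_gradient(logits, rewards, num_samples, baseline, rng)
        logits = [v + learning_rate * g for v, g in zip(logits, grad)]  # gradient ascent
    return softmax(logits)

rewards = [0.40, 0.52, 0.54]  # hypothetical sentence-level metric scores
final_probs = train(rewards)
```

Subtracting the baseline leaves the expected gradient unchanged but reduces its variance; samples that score above the baseline are made more likely, and samples below it less likely.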

GLEU (Napoles et al., 2015)
Penalize n-grams that match between the source and the hypothesis but do not appear in the reference
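The penalty idea can be sketched with a single n-gram order. This is a deliberately simplified illustration and not the official GLEU formula or scorer: it counts hypothesis n-grams that match the reference, subtracts those that were copied from the source but are absent from the reference, and normalizes by the hypothesis length. The function names are hypothetical.

```python
from collections import Counter

def ngrams(tokens, n):
    """Multiset of n-grams in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

def penalized_precision(source, hypothesis, reference, n=1):
    """Simplified GLEU-style n-gram precision with a source-copy penalty."""
    src, hyp, ref = (ngrams(s.split(), n) for s in (source, hypothesis, reference))
    total = sum(hyp.values())
    if total == 0:
        return 0.0
    # Hypothesis n-grams that also appear in the reference (clipped counts).
    matched = sum(min(count, ref[gram]) for gram, count in hyp.items())
    # Hypothesis n-grams copied from the source that the reference does not contain.
    penalty = sum(min(count, src[gram]) for gram, count in hyp.items() if gram not in ref)
    return max(0.0, matched - penalty) / total

source = "the lots of fish"
reference = "a lot of fish"

corrected = penalized_precision(source, "a lot of fish", reference)     # 1.0
unchanged = penalized_precision(source, "the lots of fish", reference)  # 0.0
```

Without the penalty, leaving the source unchanged would still earn a unigram precision of 0.5 here; with it, the uncorrected words "the" and "lots" cancel that credit.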

Experiment
Data:
  Training: Cambridge Learner Corpus (FCE), NUCLE Corpus, Lang8 Corpus
  Dev & Test: JFLEG Corpus
Model (hyper-)parameters:
  Embedding: 512, Hidden: 1000, Dropout: 0.2
  (for NRL) Sample size: 20, warm start: after 600k updates in MLE
Metric (= score, reward):
  GLEU
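For reference, the settings above can be collected in a small configuration object. The field names are hypothetical, and the `training_objective` helper reflects one reading of "warm start": the model is first trained with MLE, and NRL training takes over after 600k updates.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class ExperimentConfig:
    # Values are from the slide; field names are illustrative.
    embedding_size: int = 512
    hidden_size: int = 1000
    dropout: float = 0.2
    nrl_sample_size: int = 20
    mle_warm_start_updates: int = 600_000

def training_objective(update_step, config=ExperimentConfig()):
    """Return the objective assumed to be active at a 0-indexed update step."""
    return "MLE" if update_step < config.mle_warm_start_updates else "NRL"

early = training_objective(0)        # "MLE"
late = training_objective(600_000)   # "NRL"
```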

Results
[Figure: bar chart of GLEU scores (axis from 40 to 65) for SRC, CAMB14, NUS, AMU, CAMB16, MLE, NRL, and Human.]
SRC: 40.5
PBMT: 46.0~51.4
NMT (MLE): 52.0~52.7
NMT (NRL): 53.9
Human: 62.3

Summary
Grammatical Error Correction with NRL
✓ Sentence-level objective.
✓ Direct optimization toward the metric.
✓ NRL > Maximum Likelihood Estimation

Example Outputs
SRC   Fish firming uses the lots of special products such as fish meal .
REF   Fish firming uses a lot of special products such as fish meal .
PBMT  Fish firming uses a lot of special products such as fish meal .
MLE   Fish contains a lot of special products such as fish meals .
NRL   Fish shops use lots of special products such as fish meal .

SRC   but found that successful people use the people money and use there idea for a way to success .
REF   But it was found that successful people use other people 's money and use their ideas as a way to success .
PBMT  but found that successful people use the money and use these ideas for a way to success .
MLE   But found that successful people use the people money and use it for a way to success .
NRL   But found that successful people use the people 's money and use their idea for a way to success .
