20190116

陈宙斯
Abstract
• Beam Search & Greedy Strategy

• Improvement vs. Cost

• Actor → decoder.hidden_state[t]

• Train: from the K outputs of beam search, take the argmax under a target quality
metric like BLEU (pseudo-parallel corpus / base model);

• No reinforcement learning; universal across architectures;

• Experiments: 3 corpora & 3 architectures [Q↑ & S↑]
Intro
• Seq-seq: conditioned left-to-right

• Search space: infinite in principle, exponential in seq_len

• Greedy vs. beam: 2+ BLEU | 3+ ROUGE

• Related: termination criterion & search function

• Train: ordinary BP on a model-specific corpus

• Corpus: generated by running the un-augmented model on the training set with large-beam
beam search, and selecting outputs from the resulting k-best list which score highly on our
target metric. 

• Evaluation:

• RNN-based (Luong et al., 2015), ConvS2S (Gehring et al., 2017) and Transformer (Vaswani et
al., 2017)

• IWSLT16 De-En, WMT15 Fi-En and WMT14 De-En
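To put a number on the "exponential with seq_len" point above, here is a small arithmetic sketch. The function name and the vocabulary sizes are illustrative assumptions, not figures from the paper.

```python
def num_candidate_sequences(vocab_size, max_len):
    """Count all token sequences of length 1..max_len over a vocabulary.

    Illustrative only: a real decoder also stops at an end-of-sequence
    token, but the growth rate is the same.
    """
    return sum(vocab_size ** n for n in range(1, max_len + 1))


# Even a tiny vocabulary explodes quickly with length.
for length in (1, 2, 5, 10):
    print(length, num_candidate_sequences(10, length))
```

Exhaustive search is therefore out of reach for realistic vocabularies, which is why greedy and beam search are used as approximations.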
Background
• 2.1 NMT

• 2.2 Decoding

• Greedy (1); Beam (K)

• Noisy parallel approximate decoding (NPAD; Cho, 2016)

Noise → decoder.hidden_state[t]

[idea] Stay active, even random! (better study at home cafe)

• Trainable Greedy Decoding (Gu et al., 2017)

FFNN RL actor → decoder.hidden_state[t] (at the lab after all!)

approximate the maximum-a-posteriori → BLEU [unstable, unfortunately]
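A minimal sketch of the Greedy (1) vs. Beam (K) contrast listed above. The toy probability table and all function names are hypothetical stand-ins for a real NMT decoder; greedy decoding is treated here as beam search with K = 1.

```python
import math

# Hypothetical toy "model": next-token probabilities given the prefix.
# A real NMT decoder would compute these from its hidden state.
TOY_MODEL = {
    (): {"a": 0.6, "b": 0.4},
    ("a",): {"x": 0.55, "y": 0.45},
    ("b",): {"x": 0.9, "y": 0.1},
}


def next_log_probs(prefix):
    """Return log-probabilities of each next token for a given prefix."""
    return {tok: math.log(p) for tok, p in TOY_MODEL[tuple(prefix)].items()}


def beam_search(k, length=2):
    """Keep the k best partial hypotheses (by summed log-prob) at each step."""
    beams = [([], 0.0)]
    for _ in range(length):
        candidates = [
            (prefix + [tok], score + lp)
            for prefix, score in beams
            for tok, lp in next_log_probs(prefix).items()
        ]
        candidates.sort(key=lambda c: c[1], reverse=True)
        beams = candidates[:k]
    return beams


def greedy(length=2):
    """Greedy decoding is beam search with a beam of size 1."""
    return beam_search(1, length)[0]


print("greedy:", greedy())
print("beam-2:", beam_search(2)[0])
```

In this toy table, greedy commits to "a" at the first step and ends with a sequence of probability 0.33, while a beam of 2 keeps "b" alive and finds the sequence with probability 0.36.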
Method
• I/O: a_t = actor(h_t, e_t, s_{t-1}) ← dec/att/(state)

• Form of actor: ff(5), ff2(6), GRU(7), gated ff(gate(8)) …
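A rough sketch of the actor I/O described above, in pure Python. This is one plausible feed-forward form, assumed for illustration: the tanh layer, the additive update, the random initialization, and every name here are assumptions rather than the paper's exact equations.

```python
import math
import random


def make_ff_actor(hidden_dim, context_dim, seed=0):
    """Build a one-layer feed-forward actor: a_t = tanh(W [h_t; e_t] + b).

    Weights are random here; in training they would be learned while the
    base model stays fixed.
    """
    rng = random.Random(seed)
    in_dim = hidden_dim + context_dim
    W = [[rng.uniform(-0.1, 0.1) for _ in range(in_dim)]
         for _ in range(hidden_dim)]
    b = [0.0] * hidden_dim

    def actor(h_t, e_t):
        x = list(h_t) + list(e_t)  # concatenate decoder state and attention context
        return [math.tanh(sum(w * v for w, v in zip(row, x)) + bias)
                for row, bias in zip(W, b)]

    return actor


def apply_action(h_t, a_t):
    """Assumed update rule: add the action to the decoder hidden state."""
    return [h + a for h, a in zip(h_t, a_t)]


actor = make_ff_actor(hidden_dim=3, context_dim=2)
h_t, e_t = [0.1, 0.2, 0.3], [0.4, 0.5]
a_t = actor(h_t, e_t)
print(apply_action(h_t, a_t))
```

The recurrent and gated variants listed above would replace the body of `actor` while keeping the same inputs and output shape.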
Training
• Pseudo-parallel corpus generated by a base model:

• High model likelihood (not highest it)

• High-quality translations (not highest xt)

• Gen: K beams (highest internal scores) → argmax of external score

• Train the actor with pseudo-D and fixed base model
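The generation step above can be sketched as follows. The unigram-F1 metric is a simple stand-in for a sentence-level score such as BLEU, and `beam_search_fn` is a hypothetical callable that returns the base model's K-best list; neither comes from the paper's code.

```python
from collections import Counter


def unigram_f1(hypothesis, reference):
    """Stand-in external metric: unigram F1 between two whitespace-tokenized strings."""
    hyp, ref = hypothesis.split(), reference.split()
    overlap = sum((Counter(hyp) & Counter(ref)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(hyp)
    recall = overlap / len(ref)
    return 2 * precision * recall / (precision + recall)


def select_pseudo_target(k_best, reference, metric=unigram_f1):
    """From the K-best list, pick the hypothesis with the highest external score."""
    return max(k_best, key=lambda hyp: metric(hyp, reference))


def build_pseudo_corpus(sources, references, beam_search_fn, k=35):
    """Pair each source with the best-scoring beam output of the base model.

    beam_search_fn(source, k) is assumed to return a list of k hypothesis strings.
    """
    corpus = []
    for src, ref in zip(sources, references):
        k_best = beam_search_fn(src, k)
        corpus.append((src, select_pseudo_target(k_best, ref)))
    return corpus


k_best = ["the cat sat", "a cat is sitting on mat", "the cat sat on the mat"]
print(select_pseudo_target(k_best, "the cat sat on the mat"))
```

Because every candidate comes from the base model's own beam, the selected target has high model likelihood and also scores well on the external metric.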
Experiments
• 4.1 Settings

• IWSLT16, tst2013 (validation) and tst2014 (test)

• WMT15, newstest2013 (validation) and newstest2015 (test)

• WMT14, newstest2013 (validation) and newstest2014 (test)

• + BPE

• Evaluation

• tokenized and cased BLEU (primary)

• METEOR and TER via multeval, with tokenized and case-insensitive scoring

• Base models are trained from scratch, except for ConvS2S WMT14 En-De
translation (trained model as well as training data) provided by Gehring et al. (2017).
RNN: OpenNMT’s default, Luong

rnn, emb = [500, 500; 600, 300]

ConvS2S: IWSLT16 and WMT 

Transformer: Gu et al. (2018)

(Pseudo-D beam k = 35)
Overall
Effectiveness & Efficiency
recover missing tokens
optimize word order
also “correct prepositions”
both
Two Questions
• Two factors: actor & pseudo-D, which one matters?

• Is the silver standard a better choice than the gold one?

(Pseudo-D seems much more kind to the little driver/actor)
Impact
Likelihood (conditioned LM) | Magnitude of Action Vector
L2 norms over the course of training on the
IWSLT16 De-En validation set with Transformer.

This suggests that the action adjusts the
decoder's hidden state slightly, rather than
overwriting it, enabling it to find a sequence that is
not highly scored but corresponds to a high
value of the target metric. (more confident)
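As a small illustration of the measurement discussed above, the magnitude of an action vector can be compared with that of the hidden state it modifies. The helper names and the example vectors are made up for illustration, not taken from the paper's data.

```python
import math


def l2_norm(vec):
    """Euclidean (L2) norm of a vector given as a list of floats."""
    return math.sqrt(sum(x * x for x in vec))


def relative_action_magnitude(h_t, a_t):
    """Ratio ||a_t|| / ||h_t||; a small value means the action only nudges the state."""
    return l2_norm(a_t) / l2_norm(h_t)


h_t = [3.0, 4.0]   # made-up hidden state
a_t = [0.3, 0.4]   # made-up action vector
print(relative_action_magnitude(h_t, a_t))
```

A ratio well below 1 is consistent with the reading that the actor adjusts the hidden state rather than overwriting it.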
Actor & Data
Yes, silver data is the best.
Even bronze is better than gold!
Metric Domain?
Thoughts
• Modification of network:

• Elements & Structure (organs of body)

• A little guy on the shoulder of a blind giant (Transformer).

• Contextual Parameter Generator: language embedding

• The power of data seems much greater than that of
elaborate network design. A little actor can make the
giant more flexible.
ありがとうございました
Danke schön
Спасибо
谢谢
3Q
!
m(_ _)m
