Ch. 7: Optimization
M2 Yuichiro Sawai
15/07/09

Overview
•  MT decoding
•  Need to find w that assigns higher scores to better translations (e, d)
•  Better translations = translations with lower error
f: source sentence, e: target sentence, d: derivation
w: weight vector, h(・): feature function
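
The decoding equation on this slide did not survive extraction. Given the symbols defined above, it is presumably the usual linear-model search, (ê, d̂) = argmax over (e, d) of wᵀh(f, e, d). The following is a minimal Python sketch under that assumption; the candidate list, the feature values, and the names `score` and `decode` are illustrative and not taken from the slides.

```python
def score(w, h):
    """Linear model score w^T h for one candidate's feature vector h."""
    return sum(wi * hi for wi, hi in zip(w, h))


def decode(w, candidates):
    """Return the candidate translation with the highest score under w.

    `candidates` maps a translation string to its feature vector. A real
    decoder searches over derivations instead of enumerating a short list.
    """
    return max(candidates, key=lambda e: score(w, candidates[e]))


# Hypothetical 2-dimensional feature vectors for three candidates.
candidates = {
    "I saw a black cat": (1.0, 0.5),
    "I saw black cat": (0.8, 0.9),
    "see red dog": (0.2, 0.1),
}
best = decode((1.0, 0.0), candidates)
```

Changing w changes which candidate wins, which is why the rest of the chapter is about choosing w.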
  
Loss Minimization
•  Given parallel corpus (F, E), find w that minimizes loss function l(・)
•  e.g., l(F, E; w) = 1 - BLEU(E, decode_w(F))
•  λ is a regularization constant that weights the regularization term, to avoid overfitting
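
The objective on this slide was an image and is missing from the extracted text. From the bullets it presumably has the following shape, where Ω(w) is a placeholder for the regularization term; its exact form (for example an L2 norm) is not stated in the surviving text.

```latex
\hat{w} = \operatorname*{argmin}_{w} \; \ell(F, E; w) + \lambda \, \Omega(w)
```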

Problems to Consider
1.  Search space is vast
    •  impossible to consider all candidates
    •  correct translation is rarely possible
2.  Approximation of error function
    •  Error metrics (e.g. BLEU) are not differentiable
    •  Split corpus-level metrics into sentence level
3.  How to calculate argmin wᵀh

Batch Learning
•  Given parallel corpus (F, E), initialize w and iteratively
    1.  decode whole corpus F with current w, and get k-best lists C
    2.  optimize w
    3.  loop until convergence
•  vs. online learning
    •  optimize w per sentence
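
The three steps above can be sketched as a loop. This is only an outline: `decode_kbest` and `optimize` are hypothetical callables supplied by the caller, and merging k-best lists across iterations is a common practice assumed here, not something the slide states.

```python
def batch_learn(F, E, w, decode_kbest, optimize, max_iters=20, tol=1e-6):
    """Batch learning: alternate k-best decoding and re-optimizing w.

    decode_kbest(f, w) -> list of candidates for one source sentence
    optimize(C, E, w)  -> new weight vector given the k-best lists C
    Both callables are placeholders (assumptions), not a real decoder API.
    """
    C = [[] for _ in F]  # accumulated k-best list per sentence
    for _ in range(max_iters):
        # 1. decode the whole corpus with the current w, merge k-best lists
        for i, f in enumerate(F):
            for cand in decode_kbest(f, w):
                if cand not in C[i]:
                    C[i].append(cand)
        # 2. optimize w on the accumulated candidates
        new_w = optimize(C, E, w)
        # 3. loop until w stops changing
        if max(abs(a - b) for a, b in zip(new_w, w)) < tol:
            return new_w
        w = new_w
    return w
```

Online learning would instead call the update inside the per-sentence loop, after each sentence is decoded.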

Minimum Error Rate Training (MERT)
•  Given error function error(E, Ê), directly minimize it
    •  E: reference translations, Ê: system translations
    •  e.g. error(E, Ê) = 1 - BLEU(E, Ê)
•  In other words, find w that minimizes error(E, decode_w(F))
•  Since error(・) is not differentiable w.r.t. w, gradient-based methods are not applicable
•  Instead, use Powell’s method
    •  gradients not required
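
A small sketch of why the error is not differentiable in w: the error of the 1-best candidate is constant until the ranking flips, then jumps. The two candidates and their error values are made up for illustration.

```python
def one_best_error(w, candidates):
    """Error of the highest-scoring candidate under w.

    candidates: list of (feature_vector, error) pairs.
    """
    best = max(candidates, key=lambda c: sum(wi * hi for wi, hi in zip(w, c[0])))
    return best[1]


# Hypothetical candidates: (features, error)
candidates = [((1.0, 0.0), 0.1), ((0.0, 1.0), 0.9)]

# The error is a step function of w: flat on each side of w[0] == w[1].
low = one_best_error((0.51, 0.49), candidates)
high = one_best_error((0.49, 0.51), candidates)
```

Because the function is flat almost everywhere, its gradient carries no information, which motivates a derivative-free search.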

Powell’s Method
•  Iteratively, fix a direction, and find optimal w in that direction
•  Applicable when gradients are not available
[Figure: successive points w0, w1, w2, w3 plotted on axes x1 and x2]
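
The idea of searching one direction at a time can be sketched as follows. Note that this is a simplified coordinate-direction search with a fixed set of trial step sizes, not full Powell's method (which also updates its set of directions); the function names and the step grid are assumptions for illustration.

```python
def line_search(f, w, direction, steps):
    """Try w + g * direction for each step size g; return the best point."""
    best_w, best_val = w, f(w)
    for g in steps:
        cand = [wi + g * di for wi, di in zip(w, direction)]
        val = f(cand)
        if val < best_val:
            best_w, best_val = cand, val
    return best_w, best_val


def direction_search(f, w0, steps, max_sweeps=50):
    """Minimize f without gradients by cycling through axis directions."""
    w, val = list(w0), f(w0)
    n = len(w)
    for _ in range(max_sweeps):
        improved = False
        for m in range(n):
            # one-hot direction: only the m-th weight changes
            direction = [1.0 if i == m else 0.0 for i in range(n)]
            w_new, val_new = line_search(f, w, direction, steps)
            if val_new < val:
                w, val, improved = w_new, val_new, True
        if not improved:
            break
    return w, val
```

In MERT the grid of trial steps is replaced by an exact line search over the k-best lists.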

Optimization in One Direction
•  1-best translation parameterized by scalar γ
b_m: one-hot vector with mth dim = 1
[Figure: the score wᵀh + γ·h_m of each candidate c1, c2, c3, c4 is a line in γ, with intercept wᵀh and slope h_m. Candidates with highest score are selected (envelope). Below it, the error as a function of γ is constant on the envelope segments c1, c3, c4]
e.g.)
f = 黒い 猫 を 見た
e = I saw a black cat
c1 = I saw black cat
c2 = saw a black cat
…
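
The envelope construction can be sketched in a few lines. Each candidate is a line with a slope and an intercept, and the routine returns the γ intervals on which each candidate is the 1-best. The four lines below are invented so that c2 never reaches the top, mirroring the figure; they are not the values from the slide.

```python
def upper_envelope(lines):
    """Upper envelope of lines given as (slope, intercept, label).

    Returns [(gamma_start, label), ...]: `label` has the highest score
    from gamma_start up to the next entry's gamma_start.
    """
    lines = sorted(lines, key=lambda l: (l[0], l[1]))
    dedup = []
    for s, a, lab in lines:
        if dedup and dedup[-1][0] == s:
            dedup[-1] = (s, a, lab)  # equal slopes: keep the higher intercept
        else:
            dedup.append((s, a, lab))
    hull = []  # entries: (gamma_start, slope, intercept, label)
    for s, a, lab in dedup:
        start = float("-inf")
        while hull:
            x0, s0, a0, _ = hull[-1]
            x = (a0 - a) / (s - s0)  # gamma where the new line overtakes
            if x <= x0:
                hull.pop()  # the previous line is never on top
            else:
                start = x
                break
        hull.append((start, s, a, lab))
    return [(x, lab) for x, _, _, lab in hull]


# Hypothetical candidates: (slope h_m, intercept w^T h, label)
lines = [(-1.0, 0.0, "c1"), (-0.5, 0.0, "c2"), (0.0, 1.0, "c3"), (1.0, 0.0, "c4")]
segments = upper_envelope(lines)
```

Looking up each segment's candidate error then gives a piecewise-constant error curve in γ, so only the segment boundaries need to be examined.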

Corpus-level Error
•  Sentence-level losses are summed to get corpus-level error
[Figure: the sentence-level envelopes and sentence-level errors of sentence 1 and sentence 2 are added to give the multi-sentence error as a function of γ; γ* marks its minimum]
Find γ that minimizes overall error!
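
Summing the per-sentence error curves and picking γ* might look like the sketch below. It assumes each sentence's curve is already available as a sorted list of (gamma_start, error) segments; the segment values are hypothetical.

```python
import bisect


def error_at(segments, gamma):
    """Error of one sentence at gamma; segments start at -inf and are sorted."""
    starts = [s for s, _ in segments]
    i = bisect.bisect_right(starts, gamma) - 1
    return segments[i][1]


def best_gamma(per_sentence_segments):
    """Probe one gamma inside every interval of the summed error curve."""
    points = sorted({s for segs in per_sentence_segments
                     for s, _ in segs if s != float("-inf")})
    if not points:
        return 0.0, sum(error_at(segs, 0.0) for segs in per_sentence_segments)
    probes = [points[0] - 1.0]
    probes += [(a + b) / 2 for a, b in zip(points, points[1:])]
    probes.append(points[-1] + 1.0)
    totals = [(sum(error_at(segs, g) for segs in per_sentence_segments), g)
              for g in probes]
    total, gamma = min(totals)
    return gamma, total


# Hypothetical error curves for two sentences.
sentence1 = [(float("-inf"), 0.2), (-1.0, 0.5), (1.0, 0.1)]
sentence2 = [(float("-inf"), 0.6), (0.0, 0.3)]
gamma_star, total_error = best_gamma([sentence1, sentence2])
```

With a corpus-level metric such as BLEU, the per-sentence quantities being summed would be sufficient statistics rather than errors, but the merging of breakpoints works the same way.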

Problems of Powell’s Method
•  Sensitive to initialization of w
•  Not suitable for high-dimensional feature vectors

Softmax Loss
•  Translation probability: softmax over the candidate scores wᵀh
•  Loss is negative log-likelihood of oracle translations
    where oracle translations are the candidates with the lowest error
•  Gradient-based methods (e.g. L-BFGS) are applicable
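
The equations on this slide are missing from the extracted text. A minimal sketch, assuming the probability is a softmax over candidate scores and the loss is the negative log of the probability mass on the oracle candidates (summing over several oracles is one common choice, assumed here):

```python
import math


def softmax_probs(scores):
    """Softmax over candidate scores w^T h (shifted for numerical stability)."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    z = sum(exps)
    return [x / z for x in exps]


def softmax_loss(scores, oracle_indices):
    """Negative log-likelihood of the oracle candidates (assumed form)."""
    probs = softmax_probs(scores)
    return -math.log(sum(probs[i] for i in oracle_indices))
```

The loss shrinks as the oracle's score rises relative to the other candidates, and it is smooth in w, which is what makes L-BFGS usable.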

Max Margin Loss
•  Make sure distances between correct translations and incorrect translations are large (for all oracle and non-oracle pairs …)
•  For example: penalize when diff in error is greater than diff in score
•  Optimization methods for SVM are applicable (e.g. SMO)
f: 黒い猫を見た, e (correct): I saw a black cat
                              error   score (= wᵀh)
e* (oracle)  I saw black cat  0.1     0.4
e  (system)  see red dog      0.9     0.3
Large difference in error but small difference in score: bad!
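
The rule "penalize when diff in error is greater than diff in score" can be written as a hinge. The slide's own formula is not in the extracted text, so the exact form below is an assumption; the numbers are the ones from the table.

```python
def margin_loss(err_oracle, score_oracle, err_other, score_other):
    """Hinge-style penalty (assumed form) for one oracle/non-oracle pair."""
    diff_error = err_other - err_oracle      # how much worse the other one is
    diff_score = score_oracle - score_other  # how much the model prefers the oracle
    return max(0.0, diff_error - diff_score)


# Values from the table: oracle (error 0.1, score 0.4), system (error 0.9, score 0.3)
loss = margin_loss(0.1, 0.4, 0.9, 0.3)
```

Here the error gap is 0.8 but the score gap is only 0.1, so the pair is penalized by about 0.7.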

Pairwise Ranking Optimization (PRO)
•  Parameter estimation as ranking problem
•  Classifier learns w to rank candidates by error
•  Generate training examples from pairs of candidates
    •  positive example: h(cand1) - h(cand2) = (-4, 6)
    •  negative example: h(cand3) - h(cand1) = (3, -7)
    •  wᵀ{h(cand1) - h(cand2)} > 0 ⇔ wᵀh(cand1) > wᵀh(cand2)
•  Off-the-shelf linear binary classifiers can be used
f: 黒い猫を見た, e (correct): I saw a black cat
                          error   h         score (= wᵀh)
e (cand1)  I see black cat  0.3     (-1, 2)   ???
e (cand2)  see black dog    0.7     (3, -4)   ???
e (cand3)  see red dog      0.9     (2, -5)   ???
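
Generating the training examples from the table can be sketched as below. Every pair is enumerated here for clarity; PRO as usually described samples pairs instead, so treat the exhaustive loop as a simplification.

```python
from itertools import combinations


def pro_examples(cands):
    """Turn candidates (name, error, h) into (feature_diff, label) examples.

    label +1: the first candidate of the difference has lower error.
    label -1: the mirrored difference.
    """
    examples = []
    for (_, e1, h1), (_, e2, h2) in combinations(cands, 2):
        if e1 == e2:
            continue  # equal error: no ranking signal
        diff = tuple(a - b for a, b in zip(h1, h2))
        label = 1 if e1 < e2 else -1
        examples.append((diff, label))
        examples.append((tuple(-d for d in diff), -label))
    return examples


# Candidates from the table: (name, error, h)
cands = [
    ("cand1", 0.3, (-1, 2)),
    ("cand2", 0.7, (3, -4)),
    ("cand3", 0.9, (2, -5)),
]
examples = pro_examples(cands)
```

The result contains the slide's positive example ((-4, 6), +1) and negative example ((3, -7), -1). For instance, w = (0, 1) classifies all six examples correctly, which corresponds to ranking cand1 above cand2 above cand3.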

Minimum Bayes Risk
•  Minimize expected loss, where the candidate distribution is scaled by γ
    •  γ = 0: all candidates are equally likely
    •  γ = 1: softmax
    •  γ→∞: highest scoring candidate with probability 1 (MERT)
•  Differentiable and considers many candidates <e,d>
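
The role of γ can be checked numerically. The sketch assumes the candidate probability is proportional to exp(γ·wᵀh), which is consistent with the three cases listed above but is not spelled out in the extracted text; the scores and errors are illustrative.

```python
import math


def scaled_probs(scores, gamma):
    """Candidate distribution proportional to exp(gamma * score) (assumed form)."""
    m = max(scores)
    exps = [math.exp(gamma * (s - m)) for s in scores]
    z = sum(exps)
    return [x / z for x in exps]


def expected_loss(scores, errors, gamma):
    """Expected error of the candidates under the scaled distribution."""
    return sum(p * e for p, e in zip(scaled_probs(scores, gamma), errors))


scores = [0.4, 0.3]   # hypothetical w^T h values
errors = [0.1, 0.9]   # hypothetical sentence-level errors
uniform = expected_loss(scores, errors, 0.0)    # average error of all candidates
sharp = expected_loss(scores, errors, 1000.0)   # close to the 1-best error
```

At γ = 0 the expected loss is the plain average of the errors; as γ grows it approaches the error of the highest-scoring candidate, which is the quantity MERT minimizes.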

Sentence-level BLEU
•  Sentence-level error functions are needed for optimization
•  BLEU is corpus-level metric
    •  4-gram precision is often 0 on sentence level
    •  deviates from human judgments
•  Sentence-level error
    •  Linear BLEU
    •  (Expected BLEU)
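
The zero 4-gram precision problem is easy to reproduce with the example sentences used earlier in the slides. The helper names are illustrative, and whitespace tokenization is an assumption.

```python
from collections import Counter


def ngrams(tokens, n):
    """Multiset of n-grams in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def ngram_precision(candidate, reference, n):
    """Clipped n-gram precision of one candidate against one reference."""
    cand = ngrams(candidate.split(), n)
    ref = ngrams(reference.split(), n)
    total = sum(cand.values())
    if total == 0:
        return 0.0
    matched = sum(min(c, ref[g]) for g, c in cand.items())
    return matched / total


reference = "I saw a black cat"
candidate = "I saw black cat"
p1 = ngram_precision(candidate, reference, 1)
p2 = ngram_precision(candidate, reference, 2)
p4 = ngram_precision(candidate, reference, 4)
```

The candidate matches every unigram and two of its three bigrams, yet its single 4-gram does not occur in the reference. Since BLEU takes a geometric mean of the n-gram precisions, unsmoothed sentence-level BLEU is 0 for this quite reasonable translation.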

Linear BLEU
•  Linear approximation of change in BLEU
    c: sum of sentence lengths
    m_n: # matched n-grams
•  Add one sentence: (c, m_n) -> (c', m_n')
•  Linear BLEU error of candidate e
[Figure: log BLEU at (c, m_n) and at (c', m'_n), with the difference Δ; annotation: # matched n-grams in e]

[Book Reading] 機械翻訳 (Machine Translation) - Section 7 No.1

  • 1. Ch. 7: Optimization. M2 Yuichiro Sawai, 15/07/09
  • 2. Overview
      • MT decoding: (ê, d̂) = argmax_{(e, d)} w^T h(f, e, d)
      • Need to find w that assigns higher scores to better translations (e, d)
          • Better translations = translations with lower error
      • Notation: f: source sentence, e: target sentence, d: derivation, w: weight vector, h(・): feature function
  • 3. Loss Minimization
      • Given parallel corpus (F, E), find w that minimizes loss function l(・)
          • e.g., l(F, E; w) = 1 – BLEU(E, decode_w(F))
          • λ is a regularization constant that weights the regularization term and helps avoid overfitting
  • 4. Problems to Consider
      1. Search space is vast
          • impossible to consider all candidates
          • correct translation is rarely possible
      2. Approximation of error function
          • Error metrics (e.g. BLEU) are not differentiable
          • Split corpus-level metrics into sentence level
      3. How to calculate argmin w^T h
  • 5. Batch Learning
      • Given parallel corpus (F, E), initialize w and iteratively
          1. decode whole corpus F with current w, and get k-best lists C
          2. optimize w
          3. loop until convergence
      • vs. online learning
          • optimize w per sentence
  • 6. Minimum Error Rate Training (MERT)
      • Given error function error(E, Ê), directly minimize it
          • E: reference translations, Ê: system translations
          • e.g. error(E, Ê) = 1 – BLEU(E, Ê)
      • In other words, choose the w whose system translations Ê have the lowest error
      • Since error(・) is not differentiable w.r.t. w, gradient-based methods are not applicable
      • Instead, use Powell's method
          • gradients not required
  • 7. Powell's Method
      • Iteratively, fix a direction, and find optimal w in that direction
      • Applicable when gradients are not available
      • [Figure: successive iterates w0, w1, w2, w3 plotted against axes x1 and x2]
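  • 8. Optimization in One Direction
      • 1-best translation parameterized by scalar γ
      • b_m: one-hot vector with m-th dim = 1
      • Along direction b_m, the score of each candidate is a line in γ: (w + γ b_m)^T h = w^T h + γ h_m, with intercept w^T h and slope h_m
      • Candidates with highest score are selected (the upper envelope of the lines)
      • e.g.) f = 黒い 猫 を 見た, e = I saw a black cat, c1 = I saw black cat, c2 = saw a black cat, …
      • [Figure: score lines for candidates c1 to c4 plotted against γ with their upper envelope; below, error plotted against γ over the segments where c1, c3 and c4 are selected]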
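The envelope construction on this slide can be sketched as follows. This is a minimal illustration, not the presenter's code: each candidate is assumed to be given as an (intercept, slope) pair, and the four example lines are invented numbers chosen so that c2 is never selected, as in the figure.

```python
import math

def upper_envelope(lines):
    """lines[i] = (intercept, slope) of candidate i's score as a function of gamma.
    Returns [(start_gamma, index), ...]: candidate `index` has the highest
    score from start_gamma up to the start of the next entry."""
    # Process lines by increasing slope; the flattest line wins as gamma -> -inf.
    order = sorted(range(len(lines)), key=lambda i: (lines[i][1], lines[i][0]))
    hull = []
    for i in order:
        a, b = lines[i]
        start = -math.inf
        while hull:
            s, j = hull[-1]
            aj, bj = lines[j]
            if bj == b:
                hull.pop()  # same slope, lower or equal intercept: never on top
                continue
            x = (aj - a) / (b - bj)  # gamma at which line i overtakes line j
            if x <= s:
                hull.pop()  # line j is overtaken before it ever becomes the best
                continue
            start = x
            break
        hull.append((start, i))
    return hull

# Hypothetical (intercept, slope) pairs for c1..c4.
lines = [(2.0, -1.0), (0.0, -0.5), (1.0, 0.0), (-1.0, 1.0)]
envelope = upper_envelope(lines)
print(envelope)
```

With these numbers the envelope switches from c1 to c3 at γ = 1 and from c3 to c4 at γ = 2, so the error along this direction only needs to be evaluated once per segment.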
  • 9. Corpus-level Error
      • Sentence-level losses are summed to get corpus-level error
      • Find γ that minimizes overall error!
      • [Figure: sentence-level envelopes and sentence-level errors for sentence 1 and sentence 2 are added to give the multi-sentence error, whose minimum is at γ*]
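A rough sketch of the corpus-level search for γ*, under simplifying assumptions: each candidate is a hypothetical (intercept, slope, error) triple, sentence-level errors are simply summed as the slide states, and the breakpoints are taken from all pairwise line intersections (a superset of the true envelope breakpoints, which is slower but easier to read).

```python
def one_best_error(cands, gamma):
    # Error of the candidate with the highest score at this gamma.
    return max(cands, key=lambda c: c[0] + gamma * c[1])[2]

def line_search(corpus):
    """corpus: list of sentences, each a list of (intercept, slope, error).
    Returns (total_error, gamma) for the best probe point."""
    points = set()
    for cands in corpus:
        for i, (a1, b1, _) in enumerate(cands):
            for a2, b2, _ in cands[i + 1:]:
                if b1 != b2:
                    points.add((a2 - a1) / (b1 - b2))  # where the two lines cross
    points = sorted(points)
    if not points:
        return sum(one_best_error(c, 0.0) for c in corpus), 0.0
    # The 1-best choice is constant between breakpoints, so probe one gamma per interval.
    probes = [points[0] - 1.0]
    probes += [(left + right) / 2 for left, right in zip(points, points[1:])]
    probes.append(points[-1] + 1.0)
    return min((sum(one_best_error(c, g) for c in corpus), g) for g in probes)

# Invented numbers for two sentences.
corpus = [
    [(2.0, -1.0, 0.6), (1.0, 0.0, 0.2), (-1.0, 1.0, 0.5)],
    [(0.0, -1.0, 0.7), (0.0, 1.0, 0.1)],
]
best_error, best_gamma = line_search(corpus)
print(best_error, best_gamma)
```

Here the summed error is lowest on the interval 1 < γ < 2, and the sketch reports the first probe point inside it.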
  • 10. Problems of Powell's Method
      • Sensitive to initialization of w
      • Not suitable for high-dimensional feature vectors
  • 11. Softmax Loss
      • Translation probability: p(e, d | f) ∝ exp(w^T h(f, e, d))
      • Loss is negative log-likelihood of oracle translations, where oracle translations are the candidates with the lowest error
      • Gradient-based methods (e.g. L-BFGS) are applicable
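As an illustration of why gradient-based methods apply here, the sketch below computes a softmax loss and its gradient over a small k-best list. It assumes the usual form (negative log of the probability mass on the oracle candidates, normalized over the k-best list); the feature vectors and the function name are made up for the example.

```python
import math

def softmax_loss_and_grad(w, feats, oracle_ids):
    """feats[j] is the feature vector of candidate j; oracle_ids is a set of
    indices of oracle candidates. Returns (loss, gradient w.r.t. w)."""
    scores = [sum(wi * hi for wi, hi in zip(w, h)) for h in feats]
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    z = sum(exps)
    probs = [e / z for e in exps]
    p_oracle = sum(probs[i] for i in oracle_ids)
    loss = -math.log(p_oracle)
    # Gradient = expected features under the model
    #            minus expected features under the model restricted to oracles.
    grad = [0.0] * len(w)
    for j, h in enumerate(feats):
        coef = probs[j] - (probs[j] / p_oracle if j in oracle_ids else 0.0)
        for k, hk in enumerate(h):
            grad[k] += coef * hk
    return loss, grad

loss, grad = softmax_loss_and_grad([0.0, 0.0], [[1.0, 0.0], [0.0, 1.0]], {0})
print(loss, grad)
```

The loss and gradient returned here are what an optimizer such as L-BFGS would consume at each step.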
  • 12. Max Margin Loss
      • Make sure distances between correct translations and incorrect translations are large
          • for all oracle and non-oracle pairs …
          • penalize when diff in error is greater than diff in score
      • For example:
          f: 黒い猫を見た, e (correct): I saw a black cat

          candidate      translation        error   score (= w^T h)
          e* (oracle)    I saw black cat    0.1     0.4
          e  (system)    see red dog        0.9     0.3

          The difference in error is large, but the difference in score is small: bad!
      • Optimization methods for SVM are applicable (e.g. SMO)
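The slide's formula is not preserved in the transcript, so the following is only one plausible hinge-style form that matches the annotation "penalize when diff in error is greater than diff in score". It uses the error and score values from the example on the slide; the dictionary layout and function name are assumptions.

```python
def margin_loss(oracle, other):
    """Hinge-style penalty for one (oracle, non-oracle) pair.
    Positive when the gap in error exceeds the gap in score."""
    err_diff = other["error"] - oracle["error"]
    score_diff = oracle["score"] - other["score"]
    return max(0.0, err_diff - score_diff)

oracle = {"error": 0.1, "score": 0.4}   # e*: I saw black cat
system = {"error": 0.9, "score": 0.3}   # e:  see red dog
loss = margin_loss(oracle, system)
print(loss)
```

The error gap is 0.8 while the score gap is only 0.1, so the pair is penalized by about 0.7; the penalty would vanish once the score gap reaches the error gap.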
  • 13. Pairwise Ranking Optimization (PRO)
      • Parameter estimation as ranking problem
      • Classifier learns w to rank candidates by error
          f: 黒い猫を見た, e (correct): I saw a black cat

          candidate   translation        error   h          score (= w^T h)
          cand1       I see black cat    0.3     (-1, 2)    ???
          cand2       see black dog      0.7     (3, -4)    ???
          cand3       see red dog        0.9     (2, -5)    ???
      • Generate training examples from pairs of candidates
          • positive example: h(cand1) – h(cand2) = (-4, 6)
          • negative example: h(cand3) – h(cand1) = (3, -7)
          • w^T{h(cand1) – h(cand2)} > 0 ⇔ w^T h(cand1) > w^T h(cand2)
      • Off-the-shelf linear binary classifiers can be used
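The pair construction can be tried out on the three candidates from the slide. This is a simplified sketch: it enumerates every pair instead of sampling them, and it uses a plain perceptron as a stand-in for the off-the-shelf linear classifier, which is an assumption made for the example.

```python
from itertools import combinations

# (name, error, feature vector h) taken from the slide's example.
cands = [("cand1", 0.3, (-1, 2)), ("cand2", 0.7, (3, -4)), ("cand3", 0.9, (2, -5))]

def make_examples(cands):
    examples = []
    for (_, err_a, h_a), (_, err_b, h_b) in combinations(cands, 2):
        if err_a == err_b:
            continue
        better, worse = (h_a, h_b) if err_a < err_b else (h_b, h_a)
        diff = tuple(x - y for x, y in zip(better, worse))
        examples.append((diff, 1))                       # better minus worse: positive
        examples.append((tuple(-x for x in diff), -1))   # worse minus better: negative
    return examples

def train_perceptron(examples, dim, epochs=10):
    w = [0] * dim
    for _ in range(epochs):
        for x, y in examples:
            if y * sum(wi * xi for wi, xi in zip(w, x)) <= 0:  # misclassified
                w = [wi + y * xi for wi, xi in zip(w, x)]
    return w

examples = make_examples(cands)
w = train_perceptron(examples, dim=2)
ranking = sorted(cands, key=lambda c: -sum(wi * hi for wi, hi in zip(w, c[2])))
print(w, [name for name, _, _ in ranking])
```

The learned w scores the candidates in the same order as their errors, which is the ranking property the slide describes.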
  • 14. Minimum Bayes Risk
      • Minimize expected loss, where the candidate distribution is scaled by γ: p(e, d | f) ∝ exp(γ w^T h(f, e, d))
          • γ = 0: all candidates are equally likely
          • γ = 1: softmax
          • γ→∞: highest scoring candidate with probability 1 (MERT)
      • Differentiable and considers many candidates <e,d>
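The effect of the scaling factor γ can be checked numerically. The sketch below assumes the candidate distribution is a softmax over γ times the model score, which is consistent with the three cases listed on the slide; the scores and errors are borrowed from the Max Margin example, and the function names are hypothetical.

```python
import math

def scaled_probs(scores, gamma):
    scaled = [gamma * s for s in scores]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(v - m) for v in scaled]
    z = sum(exps)
    return [e / z for e in exps]

def expected_loss(scores, errors, gamma):
    probs = scaled_probs(scores, gamma)
    return sum(p * err for p, err in zip(probs, errors))

scores = [0.4, 0.3]   # model scores w^T h
errors = [0.1, 0.9]   # sentence-level errors
for gamma in (0.0, 1.0, 1000.0):
    print(gamma, expected_loss(scores, errors, gamma))
```

At γ = 0 the expected loss is the plain average of the errors, and for very large γ it approaches the error of the highest-scoring candidate, which is the quantity MERT minimizes.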
  • 15. Sentence-level BLEU
      • Sentence-level error functions are needed for optimization
      • BLEU is corpus-level metric
          • 4-gram precision is often 0 on sentence level
          • varies from human judgments
      • Sentence-level error
          • Linear BLEU
          • (Expected BLEU)
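The zero 4-gram precision problem is easy to reproduce with the example sentences used earlier in the deck. The sketch computes clipped n-gram precision for a single candidate against a single reference; whitespace tokenization and the helper names are assumptions made for the illustration.

```python
from collections import Counter

def ngrams(tokens, n):
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def clipped_precision(cand, ref, n):
    """Fraction of the candidate's n-grams that also occur in the reference,
    counting each n-gram at most as often as it appears in the reference."""
    cand_counts = Counter(ngrams(cand, n))
    ref_counts = Counter(ngrams(ref, n))
    total = sum(cand_counts.values())
    if total == 0:
        return 0.0
    matched = sum(min(c, ref_counts[g]) for g, c in cand_counts.items())
    return matched / total

ref = "I saw a black cat".split()
cand = "I saw black cat".split()
precisions = [clipped_precision(cand, ref, n) for n in range(1, 5)]
print(precisions)
```

Even though the candidate is close to the reference, its 3-gram and 4-gram precisions are both 0, so the geometric mean inside BLEU collapses to 0 for this sentence.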
  • 16. Linear BLEU
      • Linear approximation of change in BLEU
          • c: sum of sentence lengths
          • m_n: # matched n-grams
      • Add one sentence: (c, m_n) -> (c', m_n')
      • Linear BLEU error of candidate e
      • [Figure: log BLEU changing by Δ between (c, m_n) and (c', m_n'); annotation: # matched n-grams in e]