MT Study SCFG
Akiva Miura
NAIST MT Study Group - Synchronous Context-Free Grammar
1.
Tree-Based Machine Translation
Synchronous Context-Free Grammar
Introduced by Akiva Miura, AHC-Lab, 2015/06/18
MT Study Group
2015 © Akiva Miura, AHC-Lab, IS, NAIST
2.
Contents
6.2 Synchronous Context-Free Grammar
6.2.1 Characteristics
6.2.2 Training
6.2.3 Syntactic Labels
6.2.4 Features
6.2.5 Decoding
3.
SCFG
Synchronous Context-Free Grammar (SCFG):
• a bilingual extension of CFG
• can be applied to machine translation by parsing the source-language side (transduction)
4.
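Formalism
An SCFG is defined as a tuple $G = \langle N, \Sigma, \Delta, R, A \rangle$, where:
• $N$: set of non-terminal symbols
• $\Sigma$: set of source-language terminal symbols
• $\Delta$: set of target-language terminal symbols
• $R$: set of synchronous rewrite rules
• $A \in N$: start symbol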
5.
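Rewrite Rules
Each rule has the form $X \to \langle \alpha, \beta, \varphi \rangle \in R$, where:
• $X \in N$ is the left-hand-side non-terminal
• $\alpha \in (N \cup \Sigma)^*$ is the source-language side
• $\beta \in (N \cup \Delta)^*$ is the target-language side
• $\varphi$ is a one-to-one correspondence between the non-terminals in $\alpha$ and those in $\beta$
A rule $X \to \langle \alpha, \beta, \varphi \rangle$ is usually abbreviated as $X \to \langle \alpha, \beta \rangle$, with $\varphi$ shown by shared indices on the linked non-terminals.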
6.
Rules Example
Example of rewrite rules:
S → <NP1 が VP2, NP1 VP2>
VP → <NP1 を V2, V2 NP1>
VP → <PP1 V2, V2 PP1>
VP → <NP1 V2, V2 NP1>
PP → <NP1 の P2, P2 NP1>
NP → <NP1 の NP2, NP2 of NP1>
V → <開けた, opened> | <座った, sat>
P → <上に, on>
NP → <犬, the dog> | <ドア, the door> | <本, the book> | <上に, the upper>
7.
Derivation Example
Example of derivation:
<S1, S1>
⇒ <NP2 が VP3, NP2 VP3>
⇒ <犬 が VP3, the dog VP3>
⇒ <犬 が NP4 を V5, the dog V5 NP4>
⇒ <犬 が ドア を V5, the dog V5 the door>
⇒ <犬 が ドア を 開けた, the dog opened the door>
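The derivation above can be replayed mechanically. The following is a minimal sketch, not the author's implementation: the tuple encoding of non-terminals, the rule names (`S_RULE`, `NP_DOG`, ...) and the `rewrite` helper are all assumptions made for illustration. Each step rewrites one linked non-terminal on the source and target sides at the same time.

```python
# Non-terminals are (label, link_index) tuples; terminals are plain strings.
# Rule = (left-hand side, source side, target side). Names are illustrative.
S_RULE = ("S", [("NP", 1), "が", ("VP", 2)], [("NP", 1), ("VP", 2)])
VP_RULE = ("VP", [("NP", 1), "を", ("V", 2)], [("V", 2), ("NP", 1)])
NP_DOG = ("NP", ["犬"], ["the dog"])
NP_DOOR = ("NP", ["ドア"], ["the door"])
V_OPEN = ("V", ["開けた"], ["opened"])


def is_nt(item):
    return isinstance(item, tuple)


def rewrite(pair, link, rule):
    """Rewrite the non-terminal carrying `link` on both sides simultaneously."""
    lhs, rule_src, rule_tgt = rule
    src, tgt = pair
    # New link indices are numbered above every index currently in use.
    base = max(item[1] for item in src if is_nt(item))

    def expand(side, rhs):
        out = []
        for item in side:
            if is_nt(item) and item[1] == link:
                if item[0] != lhs:
                    raise ValueError("rule label does not match the non-terminal")
                out.extend((x[0], base + x[1]) if is_nt(x) else x for x in rhs)
            else:
                out.append(item)
        return out

    return expand(src, rule_src), expand(tgt, rule_tgt)


# Replay the five steps of the derivation, starting from <S1, S1>.
pair = ([("S", 1)], [("S", 1)])
steps = [(1, S_RULE), (2, NP_DOG), (3, VP_RULE), (4, NP_DOOR), (5, V_OPEN)]
history = [pair]
for link, rule in steps:
    pair = rewrite(pair, link, rule)
    history.append(pair)

print(" ".join(pair[0]), "|", " ".join(pair[1]))
```

Because both sides are rewritten by the same rule through the same link, the reordering (verb-final Japanese to verb-medial English) falls out of the rule's target side rather than from a separate reordering model.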
8.
Parse Tree Example
Example of derivation trees (figure), written here in bracket notation:
Source: (S1 (NP2 犬) が (VP3 (NP4 ドア) を (V5 開けた)))
Target: (S1 (NP2 the dog) (VP3 (V5 opened) (NP4 the door)))
9.
Contents
6.2 Synchronous Context-Free Grammar
6.2.1 Characteristics
6.2.2 Training
6.2.3 Syntactic Labels
6.2.4 Features
6.2.5 Decoding
10.
Normal Form
• SCFG has almost the same characteristics as CFG, but it does not have a normal form
Explanation:
• rank: the number of non-terminals on the right-hand side of a rule
• binarization: conversion of rules with rank ≥ 3 into rules with rank ≤ 2
Any CFG can be converted to Chomsky Normal Form, but an SCFG in general cannot.
11.
Binarization of Rank-3 Rules
• Any rank-3 SCFG rule can be binarized, e.g.
X → <A1 B2 C3, C3 B2 A1>
becomes, by introducing a new non-terminal X':
X → <X'1 C2, C2 X'1>
X' → <A1 B2, B2 A1>
12.
Binarization of Rank-4 Rules
• Not all rank-4 SCFG rules can be binarized, e.g.
X → <A1 B2 C3 D4, C3 A1 D4 B2>
X → <A1 B2 C3 D4, B2 D4 A1 C3>
(Figure: alignment diagrams of the two rules.)
These are called "inside-out" alignments.
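Whether a rule can be binarized depends only on the permutation that its target side applies to the source-side non-terminals. The sketch below is one possible check, using a shift-reduce scan that merges adjacent stack entries whenever they cover a contiguous range of source indices; the function name `is_binarizable` and the list encoding of the permutation are assumptions for illustration.

```python
def is_binarizable(perm):
    """perm lists the source indices in target-side order,
    e.g. <A1 B2 C3, C3 B2 A1> is [3, 2, 1]."""
    stack = []  # each entry is a (low, high) range of source indices
    for p in perm:
        stack.append((p, p))
        # Reduce while the top two entries together form a contiguous range.
        while len(stack) >= 2:
            (lo1, hi1), (lo2, hi2) = stack[-2], stack[-1]
            if hi1 + 1 == lo2 or hi2 + 1 == lo1:
                stack[-2:] = [(min(lo1, lo2), max(hi1, hi2))]
            else:
                break
    # Everything collapsed into one range: the rule can be binarized.
    return len(stack) == 1


print(is_binarizable([3, 2, 1]))     # rank-3 example
print(is_binarizable([3, 1, 4, 2]))  # first inside-out example
print(is_binarizable([2, 4, 1, 3]))  # second inside-out example
```

Each successful merge corresponds to introducing one new non-terminal such as X' in the rank-3 example; in the two inside-out permutations no pair of neighbours on the target side is contiguous on the source side, so no merge is ever possible.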
13.
Relation of Grammar Ranks
• r-CFG is the set of languages generated by CFGs whose rules have rank at most r
• Any r-CFG can be converted to an equivalent 2-CFG
⇒ 1-CFG ⊊ 2-CFG = 3-CFG = 4-CFG = … = r-CFG
• r-SCFG is the set of language pairs generated by SCFGs whose rules have rank at most r
• Any 3-SCFG can be converted to an equivalent 2-SCFG
• An r-SCFG with r ≥ 4 cannot always be binarized
⇒ 1-SCFG ⊊ 2-SCFG = 3-SCFG ⊊ 4-SCFG ⊊ … ⊊ r-SCFG
14.
Contents
6.2 Synchronous Context-Free Grammar
6.2.1 Characteristics
6.2.2 Training
6.2.3 Syntactic Labels
6.2.4 Features
6.2.5 Decoding
15.
Training
Automatic training of synchronous rules (figure: word-alignment matrices for each step):
• Word alignment: 彼1 は2 近い3 うち4 に5 国会6 を7 解散8 する9 ↔ he1 will2 dissolve3 the4 diet5 in6 the7 near8 future9
• Phrase extraction: 近い3 うち4 に5 国会6 を7 解散8 する9 ↔ dissolve3 the4 diet5 in6 the7 near8 future9
• Synchronous rule extraction: X1 に5 X2 解散8 する9 ↔ dissolve3 X2 in6 the7 X1
16.
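Rule Extraction
These rules are extracted hierarchically, and are therefore called "hierarchical phrases/rules" (Hiero).
Given the set $\Phi$ of extracted phrase pairs:
1. $R \leftarrow \emptyset$
2. For each phrase pair $\langle \bar{f}, \bar{e} \rangle \in \Phi$: $R \leftarrow R \cup \{ X \to \langle \bar{f}, \bar{e} \rangle \}$
3. For each rule $X \to \langle \alpha, \beta \rangle \in R$ and each phrase pair $\langle \bar{f}, \bar{e} \rangle \in \Phi$ such that $\alpha = \alpha_1 \bar{f} \alpha_2$ and $\beta = \beta_1 \bar{e} \beta_2$: $R \leftarrow R \cup \{ X \to \langle \alpha_1 X_k \alpha_2, \beta_1 X_k \beta_2 \rangle \}$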
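The hierarchical step can be illustrated on the alignment example from the Training slide. This is a minimal sketch under simplifying assumptions: phrases are plain token lists, the sub-phrase pairs are given by hand rather than read off an alignment matrix, and the helper names `find_sub` and `abstract` are hypothetical.

```python
def find_sub(seq, sub):
    """Return the start index of the first occurrence of sub in seq, or -1."""
    n = len(sub)
    for i in range(len(seq) - n + 1):
        if seq[i:i + n] == sub:
            return i
    return -1


def abstract(src, tgt, sub_src, sub_tgt, label):
    """Replace an aligned sub-phrase pair by the same non-terminal on both sides."""
    i, j = find_sub(src, sub_src), find_sub(tgt, sub_tgt)
    if i < 0 or j < 0:
        return None  # the sub-phrase pair does not occur in this rule
    new_src = src[:i] + [label] + src[i + len(sub_src):]
    new_tgt = tgt[:j] + [label] + tgt[j + len(sub_tgt):]
    return new_src, new_tgt


# Initial phrase pair from the Training slide.
src = ["近い", "うち", "に", "国会", "を", "解散", "する"]
tgt = ["dissolve", "the", "diet", "in", "the", "near", "future"]

# Two smaller phrase pairs contained in it (assumed to be in Phi as well).
src, tgt = abstract(src, tgt, ["近い", "うち"], ["near", "future"], "X1")
src, tgt = abstract(src, tgt, ["国会", "を"], ["the", "diet"], "X2")

print(" ".join(src), "|", " ".join(tgt))
```

Applying the substitution twice yields a rank-2 rule; a full extractor would enumerate every admissible sub-phrase pair instead of two hand-picked ones, which is why the resulting grammar grows so quickly.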
17.
Rule Restriction
• The hierarchical rule extraction method is exhaustive, so the trained grammar becomes oversized and very ambiguous!
⇒ the rules need to be limited:
• minimal phrase pairs for the same alignment
• span length limitation (e.g. 2 ≤ length ≤ 10)
• rule length limitation (e.g. length ≤ 5)
• rank of rules (rank ≤ 2)
• prohibition of contiguous non-terminals (X1 X2)
• at least 1 word alignment must be included
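Several of these restrictions can be checked on the source side of a single rule. The following sketch is an illustration only: the function name `keep_rule`, the `X<digits>` token convention for non-terminals and the default thresholds are assumptions, and "at least one word alignment" is approximated by "at least one terminal word", since no alignment data is modelled here.

```python
import re

NONTERMINAL = re.compile(r"X\d+")  # assumed naming convention: X1, X2, ...


def keep_rule(src, max_len=5, max_rank=2):
    """Return True if the source side of a rule passes the restrictions."""
    is_nt = [bool(NONTERMINAL.fullmatch(tok)) for tok in src]
    if len(src) > max_len:              # rule length limitation
        return False
    if sum(is_nt) > max_rank:           # rank limitation
        return False
    if any(a and b for a, b in zip(is_nt, is_nt[1:])):
        return False                    # contiguous non-terminals
    if all(is_nt):                      # proxy for "at least 1 word alignment"
        return False
    return True


print(keep_rule(["X1", "に", "X2", "解散", "する"]))
print(keep_rule(["X1", "X2", "解散"]))
```

The span length and minimal-phrase restrictions apply to the phrase pairs before abstraction, so they would be enforced earlier, during phrase extraction.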
18.
Glue Rules
• Because of the span length limitation, the grammar may be unable to cover long sentences.
⇒ introduce heuristic initial synchronous rules called "glue rules":
S → <S1 X2, S1 X2>
S → <X1, X1>
• For long-distance reordering (such as En↔Ja), we can also introduce:
S → <S1 X2, X2 S1>
19.
Contents
6.2 Synchronous Context-Free Grammar
6.2.1 Characteristics
6.2.2 Training
6.2.3 Syntactic Labels
6.2.4 Features
6.2.5 Decoding
20.
Syntactic Labels
• Standard Hiero rules use only 2 non-terminals: S and X
• still very ambiguous (may be slow and inaccurate)
⇒ introduce syntactic labels from the parse tree
(Figure: the alignment matrix for 近い3 うち4 に5 国会6 を7 解散8 する9 ↔ dissolve3 the4 diet5 in6 the7 near8 future9, with extracted spans labeled NP, PP, NP, VP, IN+DT, VP/PP, VP, VB.)
21.
Contents
6.2 Synchronous Context-Free Grammar
6.2.1 Characteristics
6.2.2 Training
6.2.3 Syntactic Labels
6.2.4 Features
6.2.5 Decoding
22.
Features
• Decoding with SCFG also uses a log-linear model, and the features are almost the same as in PBMT
• If phrase pairs include non-terminals, the count of a phrase is not 1 per occurrence, but is normalized by the number of matched rules
• Additional penalties, for a derivation $d$:
• rule count penalty: $h_{\text{rule}}(d) = -|d|$
• glue rule count penalty: $h_{\text{glue}}(d) = -\left|\{ r \mid r \in d \wedge r \in R_{\text{glue}} \}\right|$
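One way to read the count normalization is that each extracted phrase-pair occurrence contributes a total count of 1, shared equally among all rules derived from it. The sketch below follows that reading, which is an assumption about the intended scheme; the function name `fractional_counts` and the rule identifiers are hypothetical.

```python
from collections import defaultdict


def fractional_counts(rules_per_occurrence):
    """rules_per_occurrence: for each phrase-pair occurrence, the list of
    rules extracted from it. Each occurrence distributes a count of 1."""
    counts = defaultdict(float)
    for rules in rules_per_occurrence:
        for rule in rules:
            counts[rule] += 1.0 / len(rules)
    return dict(counts)


# Two occurrences: the first yields 2 rules, the second yields 4.
counts = fractional_counts([
    ["r1", "r2"],
    ["r1", "r2", "r3", "r4"],
])
print(counts)
```

Relative frequencies computed from these fractional counts can then play the role that phrase translation probabilities play in PBMT.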
23.
Contents
6.2 Synchronous Context-Free Grammar
6.2.1 Characteristics
6.2.2 Training
6.2.3 Syntactic Labels
6.2.4 Features
6.2.5 Decoding
24.
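Decoding
• SCFG decoding searches for the Viterbi derivation, scored by a linear combination of the features:
$$\hat{e} = \operatorname*{argmax}_{e} \sum_{d \in D(G)} \frac{\exp\left(\omega^{\top} h(d)\right)}{\sum_{d'} \exp\left(\omega^{\top} h(d')\right)} \approx \operatorname*{argmax}_{e,\; d \in D(G)} \omega^{\top} h(d)$$
where the sum ranges over the derivations $d \in D(G)$ of the source sentence that yield $e$, $h(d)$ is the feature vector of $d$, and $\omega$ is the weight vector.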
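The Viterbi approximation amounts to scoring each candidate derivation with a weighted feature sum and keeping the best one. The sketch below shows this on two hand-made candidates; the feature set (a log rule probability plus the rule count and glue rule count penalties), the weights, the rule names and all probabilities are made-up values for illustration.

```python
import math

GLUE_RULES = {"glue"}  # hypothetical identifiers of glue rules


def features(derivation):
    """derivation: list of (rule_name, rule_probability) pairs."""
    return {
        "log_prob": sum(math.log(p) for _, p in derivation),
        "rule_penalty": -len(derivation),
        "glue_penalty": -sum(1 for name, _ in derivation if name in GLUE_RULES),
    }


def score(derivation, weights):
    """Linear combination of feature values and weights."""
    h = features(derivation)
    return sum(weights[name] * value for name, value in h.items())


weights = {"log_prob": 1.0, "rule_penalty": 0.1, "glue_penalty": 1.0}
d1 = [("glue", 1.0), ("r1", 0.5)]   # uses a glue rule
d2 = [("r2", 0.25)]                 # single, less probable rule
best = max([d1, d2], key=lambda d: score(d, weights))
print(best, score(best, weights))
```

With these weights the glue penalty outweighs the higher rule probability of the first candidate, so the second derivation is selected. A real decoder does not enumerate derivations like this; it searches the translation forest with dynamic programming and pruning.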
25.
Translation Forest
• Example of decoding (figure): source-language side syntax parsing, and the target-language side translation forest built over the same nodes
Source words with spans: 犬0,1 が1,2 本2,3 の3,4 上に4,5 座った5,6
Source-side nodes: NP0,1, NP2,3, NP4,5, P4,5, V5,6, PP2,5, NP2,5, VP2,6, S0,6
Target-side leaves: the dog (NP0,1), sat (V5,6), the upper (NP4,5), on (P4,5), the book (NP2,3), with "of" inserted at NP2,5
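The forest in this example can be reproduced by parsing the source sentence with the source sides of the example rules and assembling the target sides bottom-up. The following is a minimal sketch that enumerates every translation without scoring or pruning; the tuple encoding of rules and the helper names `translate` and `match` are assumptions for illustration, and the matcher assumes the grammar contains no unary non-terminal rules.

```python
from functools import lru_cache

# (lhs, source side, target side); non-terminals are (label, link) tuples.
RULES = [
    ("S", (("NP", 1), "が", ("VP", 2)), (("NP", 1), ("VP", 2))),
    ("VP", (("NP", 1), "を", ("V", 2)), (("V", 2), ("NP", 1))),
    ("VP", (("PP", 1), ("V", 2)), (("V", 2), ("PP", 1))),
    ("VP", (("NP", 1), ("V", 2)), (("V", 2), ("NP", 1))),
    ("PP", (("NP", 1), "の", ("P", 2)), (("P", 2), ("NP", 1))),
    ("NP", (("NP", 1), "の", ("NP", 2)), (("NP", 2), "of", ("NP", 1))),
    ("V", ("座った",), ("sat",)),
    ("P", ("上に",), ("on",)),
    ("NP", ("犬",), ("the dog",)),
    ("NP", ("本",), ("the book",)),
    ("NP", ("上に",), ("the upper",)),
]
WORDS = "犬 が 本 の 上に 座った".split()


def is_nt(item):
    return isinstance(item, tuple)


@lru_cache(maxsize=None)
def translate(label, i, j):
    """All target strings for non-terminal `label` over source span [i, j)."""
    results = set()
    for lhs, src, tgt in RULES:
        if lhs != label:
            continue
        for binding in match(src, i, j):
            results.add(" ".join(binding[x[1]] if is_nt(x) else x for x in tgt))
    return frozenset(results)


def match(pattern, i, j):
    """Yield {link: translation} bindings for pattern covering WORDS[i:j]."""
    if not pattern:
        if i == j:
            yield {}
        return
    head, rest = pattern[0], pattern[1:]
    if is_nt(head):
        # Leave at least one word for every remaining pattern item.
        for k in range(i + 1, j - len(rest) + 1):
            for sub in translate(head[0], i, k):
                for binding in match(rest, k, j):
                    yield {head[1]: sub, **binding}
    elif i < j and WORDS[i] == head:
        yield from match(rest, i + 1, j)


for sentence in sorted(translate("S", 0, len(WORDS))):
    print(sentence)
```

The ambiguity of 上に (either P "on" or NP "the upper") gives two competing subtrees over the span 2..5, which is exactly the PP2,5 versus NP2,5 choice shown in the forest; a scoring model is what selects between them.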
26.
End Slide