Integrating Case Frame into Japanese to Chinese Hierarchical Phrase-based Translation Model
Jin’an Xu, Jiangming Liu, YuFeng Chen, Yujie Zhang, Fang Ming, Shaotong Li
Natural Language Processing group, Beijing Jiaotong University
Outline
• Motivation
• Case Frame
• Method
• Experiment
• Conclusion
Motivation
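• Limits of the hierarchical phrase-based (HPB) model
•  Rule length is limited (10)
•  At most 2 variables per rule
•  Glue rules (spans beyond these limits are concatenated monotonically)
•  Long-distance reordering is therefore hard to capture
• Linguistic features of Japanese
•  Subject-object-verb (SOV) structure
•  Auxiliary words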
Verb Case Frame
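•  Deep verb case frame: a verb together with its attributes (Subject, Object, Time, Location, Tool, ...)
•  Japanese-specific explicit case frame: case is marked by case auxiliaries
•  Example (case role of each chunk in brackets):
香港 特別 行政区 から の 学生 を [Object]  空港 まで [Goal]  車 で [Tool]  送って 行き ます [Verb]
(gloss: "take the students from the Hong Kong Special Administrative Region to the airport by car")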
Verb Case Frame
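•  Deep verb case frames in a pair of parallel sentences in the two languages
Japanese: 私 達 は [Agent]  今日 [Time]  香港 特別 行政区 から の 学生 を [Object]  空港 まで [Goal]  車 で [Tool]  送って 行き ます [Verb]
Chinese: 今天 [Time]  我们 [Agent]  开车 [Tool]  送 [Verb]  来自 香港 特别 行政区的学生 [Object]  去 飞机场 [Goal]
(gloss: "Today we will take the students from the Hong Kong Special Administrative Region to the airport by car")
•  Both verbs carry the same deep case frame (Agent Time Object Goal Tool):
送って 行き ます: [+ ATOGT]
送: [+ ATOGT]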
Verb Case Frame
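•  Deep case frame to shallow case frame for Japanese
私 達 は [Agent]  今日 [Time]  香港 特別 行政区 から の 学生 を [Object]  空港 まで [Goal]  車 で [Tool]  送って 行き ます [Verb]
•  Deep case to shallow case, written as case label (surface marker):
Agent   ガ (は)
Time    時間 (NULL)
Object  ヲ (を)
Goal    マデ (まで)
Tool    デ (で)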
Method
• Case frame rule extraction
•  Obtain case frame rules from parallel sentences with word alignments
• Transforming to SCFG
•  Transform case frame rules into Hiero rules of a synchronous context-free grammar (SCFG)
Method
• Case Frame Rule extraction
•  Examples of case frame rules
•  Lexical rule
(ガ) 私 達 は , 我们
•  Reordering rule
(x1:ガ) (x2:時間) (x3:ヲ) (x4:マデ) (x5:デ) 送って 行き ます(NULL) , x2 x1 x5 送 x3 x4
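To make the reordering rule concrete, the following minimal Python sketch fills its target side with translated case chunks. The function name `apply_reordering_rule` and the dictionary layout are illustrative assumptions, not part of the authors' system; the fillers are the Chinese phrases from the parallel sentence pair shown on the Verb Case Frame slide.

```python
# Target side of the reordering rule from the slide: x2 x1 x5 送 x3 x4
TARGET_SIDE = ["x2", "x1", "x5", "送", "x3", "x4"]


def apply_reordering_rule(target_side, fillers):
    """Replace each variable with its translated filler (hypothetical helper).

    Tokens that are not variables (here the verb 送) are copied unchanged.
    """
    return " ".join(fillers.get(token, token) for token in target_side)


# Translations of the five case chunks, taken from the example sentence pair.
fillers = {
    "x1": "我们",                      # ガ   (Agent)
    "x2": "今天",                      # 時間 (Time)
    "x3": "来自 香港 特别 行政区的学生",  # ヲ   (Object)
    "x4": "去 飞机场",                  # マデ (Goal)
    "x5": "开车",                      # デ   (Tool)
}

translation = apply_reordering_rule(TARGET_SIDE, fillers)
print(translation)
```

The result follows the Chinese order Time, Agent, Tool, Verb, Object, Goal, whereas the Japanese source places the verb last.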
Method
• Transforming to SCFG
•  Examples
[Figure: (a) example of phrase rule transformation; (b) example of reordering rule transformation. The rules are labeled o1, o6 (case frame form) and r1, r6 (Hiero form).]
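One possible reading of this transformation can be sketched in Python as follows. The sketch assumes that case labels are simply dropped and every variable becomes the generic nonterminal X, which is consistent with the note in the Experiment section that variables are without distinction during decoding. The function `to_hiero_rule`, the output notation, and the handling of "(NULL)" are assumptions for illustration; the authors' actual procedure may differ, and the sketch does not enforce the two-variable limit of standard Hiero rules.

```python
import re

# Matches a case-labelled variable such as "(x1:ガ)" or "(x2: 時間)".
CASE_VAR = re.compile(r"\(x(\d+):[^)]*\)")


def to_hiero_rule(source_side, target_side):
    """Rewrite a case frame rule as a Hiero-style rule string (hypothetical).

    Assumptions: case labels are dropped, each variable keeps its index,
    and the "(NULL)" annotation on the verb is discarded.
    """
    src = CASE_VAR.sub(lambda m: "X" + m.group(1), source_side)
    src = " ".join(src.replace("(NULL)", "").split())
    tgt = re.sub(r"\bx(\d+)\b", r"X\1", target_side)
    return f"X -> <{src}, {tgt}>"


rule = to_hiero_rule(
    "(x1:ガ) (x2: 時間) (x3:ヲ) (x4:マデ) (x5: デ) 送って 行き ます(NULL)",
    "x2 x1 x5 送 x3 x4",
)
print(rule)
```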
Experiment
•  Experiment data
•  CWMT 2011 Japanese-Chinese Corpus (sentence pairs)
•  Training data: 280 thousand
•  Development data: 500
•  Testing data: 900
•  ASPEC-JC Corpus (sentence pairs)
•  Training data: 680 thousand
•  Development data: 2090
•  Testing data: 1800
Experiment
•  System
•  exp1: Strong hierarchical phrase-based system (baseline)
•  exp2: exp1 with case frame rules
•  exp3: exp1 with manual case frame rules
•  Variables in rules are not distinguished from one another during decoding
Experiment
•  Experiment result
System  Description         CWMT                      ASPEC
                            # of rules     BLEU (%)   # of rules       BLEU (%)
exp1    baseline            14.7M          28.05      215.67M          27.17
exp2    exp1+cfr            14.7M+0.71M    28.36      215.67M+7.21M    27.48
exp3    exp1+cfr (manual)   /              28.65      /                27.63
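The size of the improvements can be read off the table with a few lines of Python. This is only an arithmetic sketch over the reported scores; the dictionary layout and the function name `gains_over_baseline` are assumptions for illustration.

```python
# BLEU (%) scores copied from the results table.
bleu = {
    "CWMT": {"exp1": 28.05, "exp2": 28.36, "exp3": 28.65},
    "ASPEC": {"exp1": 27.17, "exp2": 27.48, "exp3": 27.63},
}


def gains_over_baseline(scores, baseline="exp1"):
    """Absolute BLEU gain of each non-baseline system, rounded to 2 decimals."""
    base = scores[baseline]
    return {
        name: round(value - base, 2)
        for name, value in scores.items()
        if name != baseline
    }


gains = {corpus: gains_over_baseline(scores) for corpus, scores in bleu.items()}
for corpus, gain in gains.items():
    print(corpus, gain)
```

On both corpora exp2 gains 0.31 BLEU over the baseline, while exp3 gains 0.60 on CWMT and 0.46 on ASPEC.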
Experiment
•  Examples
Source (Japanese, tokens indexed 1 to 23): 北京 市は さらに 郊外 区 県 の 病院 に 依拠する区域 医療センターの 建設 を 通じて 医療サービスの 能力 と水準 を 高めた
Reference: 北京 市 进一步 通过 依据 郊外 区 县 的 医院 的 区域 医疗 中心 的 建设 提高 医疗 服务 的 能力 和 水准
Baseline: 北京 市政府 进一步 郊外 区 县 医院 施政 的 区域 医疗 中心 通过 建设 增加 医疗 服务 的 能力 和 水平
Hiero+CFR: 北京 市政府 通过 郊外区县 医院 依靠 区域 医疗 中心 的建设 进一步 提高了 医疗服务的能力 和 水平
•  Transformed case rule X -> (X1 X2 X3 通じて, X1 通过 X3 X2) is applied in span [1, 17]
•  Transformed case rule X -> (X 高めた, 提高 了 X) is applied in span [18, 23]
Conclusion
•  This paper presented an approach to improving HPB translation systems by augmenting the SCFG with Japanese case frame rules (CFRs).
•  The case frames introduce new, linguistically sensible hypotheses into the translation search space while maintaining the robustness of Hiero and avoiding a computational explosion.
•  We obtain significant improvements over a strong HPB baseline in the Japanese-to-Chinese task (+0.31 BLEU for exp2 on both CWMT and ASPEC).
Future work
•  Soft/hard constraints on case frame rule matching
•  Handling tense, aspect, and related phenomena
Q&A

Jinan Xu - 2015 - Integrating Case Frame into Japanese to Chinese Hierarchical Phrase-based Translation Model
