15/09/10 Copyright (C) 2015 by Yusuke Oda, AHC-Lab, IS, NAIST 1
A Context-Aware Topic Model
for Statistical Machine Translation
Jinsong Su, Deyi Xiong, Yang Liu, Xianpei Han, Hongyu Lin,
Junfeng Yao, Min Zhang
ACL 2015
Introduced by Yusuke Oda
@odashi_t
2015/9/10 NAIST MT-Study Group
Lexical Selection for SMT
● Lexical selection is important for SMT
● Two categories of previous studies on lexical selection:
– Incorporating sentence-level (local) contexts
– Integrating document-level (global) topics
● Correlation between local and global information
– Has never been explored
– But the two are highly correlated
[Figure: sentence-level contexts and document-level topics]
Proposed Model
● Context-aware topic model (CATM)
– Jointly models both local and global contexts for lexical selection
– Based on topic modeling
– Uses Gibbs sampling to learn the model parameters
● Terms
– Topical words: related to the topics of the document
● In this study, we use content words (= noun, verb, adjective, adverb)
– Contextual words: affect the translation selection of topical words
● In this study, we use all words in the sentence
– Target-side topical items: translation candidates of source topical words
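The split between topical and contextual words can be sketched in a few lines. This is only an illustration: the Universal POS tag names (NOUN, VERB, ADJ, ADV) and the function name `split_words` are assumptions, since the slide does not say which tagger or tagset was used.

```python
# Tags treated as "content words". Assumption: Universal POS tag names;
# the tagset actually used in the paper is not given on the slide.
CONTENT_TAGS = {"NOUN", "VERB", "ADJ", "ADV"}


def split_words(tagged_sentence):
    """Return (topical_words, contextual_words) for one tagged sentence.

    tagged_sentence: list of (word, pos_tag) pairs.
    Topical words are the content words; contextual words are all words.
    """
    topical = [word for word, tag in tagged_sentence if tag in CONTENT_TAGS]
    contextual = [word for word, _ in tagged_sentence]
    return topical, contextual


# Hypothetical example sentence, tagged by hand.
sentence = [("the", "DET"), ("bank", "NOUN"), ("quickly", "ADV"),
            ("raised", "VERB"), ("rates", "NOUN")]
topical, contextual = split_words(sentence)
```

Here every word is a contextual word, and the four content words are also topical words.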
Assumptions
● Assumptions
– Topic consistency: all target-side topical items should be consistent with the topic of the document
– Context compatibility: all target-side topical items should be compatible with neighboring contextual words
[Figure legend: topical words, target-side topical items, contextual words, topic]
Graphical Representation of Proposed Model
[Figure: graphical model. Node labels: topic distribution of the document; topic; target-side topical item; neighboring target-side topical item. The remaining distribution labels are truncated in the source.]
Generation Steps
Joint Probability
● Objective: fit the joint probability distribution below to the given training data: [equation not preserved]
● …OMG, too complex.
Gibbs Sampling (1)
● Fitting the joint probability directly is intractable
● Use Gibbs sampling instead
● Given the training data, the joint distribution of [symbols not preserved] is proportional to: [equation not preserved]
Gibbs Sampling (2)
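● Sampling [equation not preserved]
● Sampling [equation not preserved]
● Sampling [equation not preserved]
● [Count symbol not preserved] indicates the count of b in a range a
● (-i) indicates ignoring the i-th item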
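The CATM sampling equations are not preserved here, so the sketch below uses collapsed Gibbs sampling for plain LDA instead. It is not the paper's sampler. It only illustrates the two notational ideas on this slide: counts of one thing within a range (for example, topic k within document d), and the (-i) convention of removing the current item from the counts before resampling it. All names are hypothetical.

```python
import random


def gibbs_lda(docs, num_topics, vocab_size, alpha, beta, iters=50, seed=0):
    """Collapsed Gibbs sampling for plain LDA (NOT the CATM sampler).

    docs: list of documents, each a list of integer word ids.
    Returns (z, n_dk, n_kw): topic assignments and count tables.
    """
    rng = random.Random(seed)
    n_dk = [[0] * num_topics for _ in docs]               # topic k in document d
    n_kw = [[0] * vocab_size for _ in range(num_topics)]  # word w under topic k
    n_k = [0] * num_topics                                # all words under topic k
    z = []

    # Random initialization of topic assignments.
    for d, doc in enumerate(docs):
        z.append([])
        for w in doc:
            k = rng.randrange(num_topics)
            z[d].append(k)
            n_dk[d][k] += 1
            n_kw[k][w] += 1
            n_k[k] += 1

    for _ in range(iters):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                # "(-i)": remove the i-th token from every count.
                k = z[d][i]
                n_dk[d][k] -= 1
                n_kw[k][w] -= 1
                n_k[k] -= 1
                # Unnormalized conditional probability of each topic.
                weights = [
                    (n_dk[d][t] + alpha)
                    * (n_kw[t][w] + beta) / (n_k[t] + vocab_size * beta)
                    for t in range(num_topics)
                ]
                k = rng.choices(range(num_topics), weights=weights)[0]
                # Add the token back under its newly sampled topic.
                z[d][i] = k
                n_dk[d][k] += 1
                n_kw[k][w] += 1
                n_k[k] += 1
    return z, n_dk, n_kw


# Two toy documents over a 4-word vocabulary.
docs = [[0, 1, 0, 1], [2, 3, 2, 3]]
z, n_dk, n_kw = gibbs_lda(docs, num_topics=2, vocab_size=4, alpha=0.5, beta=0.1)
```

The count tables stay consistent with the assignments at every step, so each row of `n_dk` always sums to the length of its document.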
Experiments
● Language pair: Chinese to English
● Corpus:
– Training: FBIS / Hansards (1M sent., 54.6k doc.)
– Dev: NIST MT05
– Test: NIST MT06 / 08
● Alignment: GIZA++ / grow-diag-final-and
● Hyperparameters:
– number of topics = 25
– α = 50 / number of topics
– β = 0.1
– γ = 1.0 / number of topical words
– δ = 2000 / number of contextual words
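As a worked example of the settings above, the following sketch computes the four hyperparameters. The function name and the two word counts passed in are hypothetical. The slide does not state the actual numbers of topical and contextual words, so only α and β are determined by the slide alone.

```python
def catm_hyperparameters(num_topical_words, num_contextual_words, num_topics=25):
    """Hyperparameter values following the formulas on the slide."""
    return {
        "alpha": 50.0 / num_topics,
        "beta": 0.1,
        "gamma": 1.0 / num_topical_words,
        "delta": 2000.0 / num_contextual_words,
    }


# The two counts below are made-up values for illustration only.
params = catm_hyperparameters(num_topical_words=10_000,
                              num_contextual_words=40_000)
```

With 25 topics, α = 50 / 25 = 2.0. Under the made-up counts above, γ = 0.0001 and δ = 0.05.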
Result: Impact of Window Size
● Best performance under window size = 12
– Sufficient for predicting target-side translations for ambiguous source-side topical words
[Figure labels: 12 words / Attention / 12 words]
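A minimal sketch of how such a window might be collected. It assumes the window is symmetric, with up to 12 words on each side of the word under attention, and that it is truncated at sentence boundaries. The function name `context_window` is hypothetical.

```python
def context_window(tokens, index, window_size=12):
    """Return the words within window_size positions of tokens[index].

    Assumption: a symmetric window, truncated at the sentence boundaries,
    that excludes the attended word itself.
    """
    left = tokens[max(0, index - window_size):index]
    right = tokens[index + 1:index + 1 + window_size]
    return left + right


tokens = ["a", "b", "c", "d", "e", "f", "g"]
neighbors = context_window(tokens, index=3, window_size=2)
```

A window size of 0 returns no contextual words at all.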
Result: Overall Performance
● Proposed method achieves the best performance, with statistical significance
[Table: BLEU4 scores; values not preserved]
Result: Effect of Correlation Modeling
● Comparison with separate models
– CATM (Context): substitutes a uniform distribution for [symbol not preserved]
● Omits the effect of topics
– CATM (Topic): window size = 0
● Omits the effect of contexts
– CATM (Log-linear): combines the two models above in a log-linear manner
● Proposed model achieves best performance
– Jointly learning both context and topic is effective for lexical selection.
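For readers unfamiliar with the log-linear baseline, here is a simplified sketch of combining two separately trained models. It is an illustration under assumptions, not the paper's setup. In a real SMT system the two scores would more likely enter the decoder as separate features with tuned weights. The function name, weights, and toy probabilities are all hypothetical.

```python
import math


def log_linear_combine(p_context, p_topic, w_context=1.0, w_topic=1.0):
    """Combine two distributions over the same candidates log-linearly.

    score(c) = exp(w_context * log p_context(c) + w_topic * log p_topic(c)),
    renormalized over the candidates. Assumes all probabilities are > 0.
    """
    scores = {
        cand: math.exp(w_context * math.log(p_context[cand])
                       + w_topic * math.log(p_topic[cand]))
        for cand in p_context
    }
    total = sum(scores.values())
    return {cand: score / total for cand, score in scores.items()}


# Toy translation candidates for one ambiguous source word.
p_context = {"bank": 0.6, "shore": 0.4}
p_topic = {"bank": 0.5, "shore": 0.5}
combined = log_linear_combine(p_context, p_topic)
```

Because each model is trained on its own here, neither can influence what the other learns. The jointly trained model is meant to address that limitation.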
Topic Examples
Summary
● Context-aware topic model (CATM)
– Jointly learns context and topic information
– The first such work, to the authors' knowledge
– Achieves higher translation performance than
using only context or topic information,
or naively combining them in a log-linear manner
● Future work
– Modeling at the phrase level as well as the word level
– Improving model with monolingual corpora
Impressions
● Is a sequence-of-words window the right choice of context?
– How about using syntactic information?
● This model uses word alignments (GIZA++)
to select translation candidates
– How much does alignment accuracy affect the result?

More Related Content

Similar to [Paper Introduction] A Context-Aware Topic Model for Statistical Machine Translation

Learning to Generate Pseudo-code from Source Code using Statistical Machine T...
Learning to Generate Pseudo-code from Source Code using Statistical Machine T...Learning to Generate Pseudo-code from Source Code using Statistical Machine T...
Learning to Generate Pseudo-code from Source Code using Statistical Machine T...
Yusuke Oda
 
Use of Definitive Screening Designs to Optimize an Analytical Method
Use of Definitive Screening Designs to Optimize an Analytical MethodUse of Definitive Screening Designs to Optimize an Analytical Method
Use of Definitive Screening Designs to Optimize an Analytical Method
Philip Ramsey
 
Transformer Seq2Sqe Models: Concepts, Trends & Limitations (DLI)
Transformer Seq2Sqe Models: Concepts, Trends & Limitations (DLI)Transformer Seq2Sqe Models: Concepts, Trends & Limitations (DLI)
Transformer Seq2Sqe Models: Concepts, Trends & Limitations (DLI)
Deep Learning Italia
 
estimation-for-software-projects-chapter-26-ppt.pptx
estimation-for-software-projects-chapter-26-ppt.pptxestimation-for-software-projects-chapter-26-ppt.pptx
estimation-for-software-projects-chapter-26-ppt.pptx
ubaidullah75790
 
NLG, Training, Inference & Evaluation
NLG, Training, Inference & Evaluation NLG, Training, Inference & Evaluation
NLG, Training, Inference & Evaluation
Deep Learning Italia
 
Early Analysis and Debuggin of Linked Open Data Cubes
Early Analysis and Debuggin of Linked Open Data CubesEarly Analysis and Debuggin of Linked Open Data Cubes
Early Analysis and Debuggin of Linked Open Data Cubes
Enrico Daga
 
AI-SDV 2022: Accommodating the Deep Learning Revolution by a Development Proc...
AI-SDV 2022: Accommodating the Deep Learning Revolution by a Development Proc...AI-SDV 2022: Accommodating the Deep Learning Revolution by a Development Proc...
AI-SDV 2022: Accommodating the Deep Learning Revolution by a Development Proc...
Dr. Haxel Consult
 
A Survey on Software Release Planning Models - Slides for the Presentation @ ...
A Survey on Software Release Planning Models - Slides for the Presentation @ ...A Survey on Software Release Planning Models - Slides for the Presentation @ ...
A Survey on Software Release Planning Models - Slides for the Presentation @ ...
Supersede
 
Negative Total Float to Improve a Multi-objective Integer Non-linear Program...
Negative Total Float to Improve a Multi-objective Integer  Non-linear Program...Negative Total Float to Improve a Multi-objective Integer  Non-linear Program...
Negative Total Float to Improve a Multi-objective Integer Non-linear Program...
IJECEIAES
 
NL to OCL via SBVR
NL to OCL via SBVRNL to OCL via SBVR
NL to OCL via SBVR
Imran Bajwa
 
MS Presentation
MS PresentationMS Presentation
MS Presentation
rajeeja
 
Study on Structural Optimization of truss members using Meta- heuristic Algor...
Study on Structural Optimization of truss members using Meta- heuristic Algor...Study on Structural Optimization of truss members using Meta- heuristic Algor...
Study on Structural Optimization of truss members using Meta- heuristic Algor...
IRJET Journal
 
Optimization 1
Optimization 1Optimization 1
Optimization 1
Amit Sharma
 
Introduction to Model-Based Machine Learning
Introduction to Model-Based Machine LearningIntroduction to Model-Based Machine Learning
Introduction to Model-Based Machine Learning
Daniel Emaasit
 
Francesco Serafin
Francesco Serafin Francesco Serafin
Francesco Serafin
Riccardo Rigon
 
Game Assignments in computer Science
Game Assignments in computer ScienceGame Assignments in computer Science
Game Assignments in computer Science
Katrin Becker
 
005614116.pdf
005614116.pdf005614116.pdf
005614116.pdf
EidTahir
 
BOIL: Towards Representation Change for Few-shot Learning
BOIL: Towards Representation Change for Few-shot LearningBOIL: Towards Representation Change for Few-shot Learning
BOIL: Towards Representation Change for Few-shot Learning
Hyungjun Yoo
 
Adopting a situated learning framework for (big) data projects
Adopting a situated learning framework for (big) data projectsAdopting a situated learning framework for (big) data projects
Adopting a situated learning framework for (big) data projects
Cranfield University
 
OOAD & ST LAB MANUAL.pdfOose feasibility study in detail Oose feasibility stu...
OOAD & ST LAB MANUAL.pdfOose feasibility study in detail Oose feasibility stu...OOAD & ST LAB MANUAL.pdfOose feasibility study in detail Oose feasibility stu...
OOAD & ST LAB MANUAL.pdfOose feasibility study in detail Oose feasibility stu...
shohi1
 

Similar to [Paper Introduction] A Context-Aware Topic Model for Statistical Machine Translation (20)

Learning to Generate Pseudo-code from Source Code using Statistical Machine T...
Learning to Generate Pseudo-code from Source Code using Statistical Machine T...Learning to Generate Pseudo-code from Source Code using Statistical Machine T...
Learning to Generate Pseudo-code from Source Code using Statistical Machine T...
 
Use of Definitive Screening Designs to Optimize an Analytical Method
Use of Definitive Screening Designs to Optimize an Analytical MethodUse of Definitive Screening Designs to Optimize an Analytical Method
Use of Definitive Screening Designs to Optimize an Analytical Method
 
Transformer Seq2Sqe Models: Concepts, Trends & Limitations (DLI)
Transformer Seq2Sqe Models: Concepts, Trends & Limitations (DLI)Transformer Seq2Sqe Models: Concepts, Trends & Limitations (DLI)
Transformer Seq2Sqe Models: Concepts, Trends & Limitations (DLI)
 
estimation-for-software-projects-chapter-26-ppt.pptx
estimation-for-software-projects-chapter-26-ppt.pptxestimation-for-software-projects-chapter-26-ppt.pptx
estimation-for-software-projects-chapter-26-ppt.pptx
 
NLG, Training, Inference & Evaluation
NLG, Training, Inference & Evaluation NLG, Training, Inference & Evaluation
NLG, Training, Inference & Evaluation
 
Early Analysis and Debuggin of Linked Open Data Cubes
Early Analysis and Debuggin of Linked Open Data CubesEarly Analysis and Debuggin of Linked Open Data Cubes
Early Analysis and Debuggin of Linked Open Data Cubes
 
AI-SDV 2022: Accommodating the Deep Learning Revolution by a Development Proc...
AI-SDV 2022: Accommodating the Deep Learning Revolution by a Development Proc...AI-SDV 2022: Accommodating the Deep Learning Revolution by a Development Proc...
AI-SDV 2022: Accommodating the Deep Learning Revolution by a Development Proc...
 
A Survey on Software Release Planning Models - Slides for the Presentation @ ...
A Survey on Software Release Planning Models - Slides for the Presentation @ ...A Survey on Software Release Planning Models - Slides for the Presentation @ ...
A Survey on Software Release Planning Models - Slides for the Presentation @ ...
 
Negative Total Float to Improve a Multi-objective Integer Non-linear Program...
Negative Total Float to Improve a Multi-objective Integer  Non-linear Program...Negative Total Float to Improve a Multi-objective Integer  Non-linear Program...
Negative Total Float to Improve a Multi-objective Integer Non-linear Program...
 
NL to OCL via SBVR
NL to OCL via SBVRNL to OCL via SBVR
NL to OCL via SBVR
 
MS Presentation
MS PresentationMS Presentation
MS Presentation
 
Study on Structural Optimization of truss members using Meta- heuristic Algor...
Study on Structural Optimization of truss members using Meta- heuristic Algor...Study on Structural Optimization of truss members using Meta- heuristic Algor...
Study on Structural Optimization of truss members using Meta- heuristic Algor...
 
Optimization 1
Optimization 1Optimization 1
Optimization 1
 
Introduction to Model-Based Machine Learning
Introduction to Model-Based Machine LearningIntroduction to Model-Based Machine Learning
Introduction to Model-Based Machine Learning
 
Francesco Serafin
Francesco Serafin Francesco Serafin
Francesco Serafin
 
Game Assignments in computer Science
Game Assignments in computer ScienceGame Assignments in computer Science
Game Assignments in computer Science
 
005614116.pdf
005614116.pdf005614116.pdf
005614116.pdf
 
BOIL: Towards Representation Change for Few-shot Learning
BOIL: Towards Representation Change for Few-shot LearningBOIL: Towards Representation Change for Few-shot Learning
BOIL: Towards Representation Change for Few-shot Learning
 
Adopting a situated learning framework for (big) data projects
Adopting a situated learning framework for (big) data projectsAdopting a situated learning framework for (big) data projects
Adopting a situated learning framework for (big) data projects
 
OOAD & ST LAB MANUAL.pdfOose feasibility study in detail Oose feasibility stu...
OOAD & ST LAB MANUAL.pdfOose feasibility study in detail Oose feasibility stu...OOAD & ST LAB MANUAL.pdfOose feasibility study in detail Oose feasibility stu...
OOAD & ST LAB MANUAL.pdfOose feasibility study in detail Oose feasibility stu...
 

More from NAIST Machine Translation Study Group

[Paper Introduction] Efficient Lattice Rescoring Using Recurrent Neural Netwo...
[Paper Introduction] Efficient Lattice Rescoring Using Recurrent Neural Netwo...[Paper Introduction] Efficient Lattice Rescoring Using Recurrent Neural Netwo...
[Paper Introduction] Efficient Lattice Rescoring Using Recurrent Neural Netwo...
NAIST Machine Translation Study Group
 
[Paper Introduction] Distant supervision for relation extraction without labe...
[Paper Introduction] Distant supervision for relation extraction without labe...[Paper Introduction] Distant supervision for relation extraction without labe...
[Paper Introduction] Distant supervision for relation extraction without labe...
NAIST Machine Translation Study Group
 
On using monolingual corpora in neural machine translation
On using monolingual corpora in neural machine translationOn using monolingual corpora in neural machine translation
On using monolingual corpora in neural machine translation
NAIST Machine Translation Study Group
 
RNN-based Translation Models (Japanese)
RNN-based Translation Models (Japanese)RNN-based Translation Models (Japanese)
RNN-based Translation Models (Japanese)
NAIST Machine Translation Study Group
 
[Paper Introduction] Efficient top down btg parsing for machine translation p...
[Paper Introduction] Efficient top down btg parsing for machine translation p...[Paper Introduction] Efficient top down btg parsing for machine translation p...
[Paper Introduction] Efficient top down btg parsing for machine translation p...
NAIST Machine Translation Study Group
 
[Paper Introduction] Translating into Morphologically Rich Languages with Syn...
[Paper Introduction] Translating into Morphologically Rich Languages with Syn...[Paper Introduction] Translating into Morphologically Rich Languages with Syn...
[Paper Introduction] Translating into Morphologically Rich Languages with Syn...
NAIST Machine Translation Study Group
 
[Paper Introduction] Supervised Phrase Table Triangulation with Neural Word E...
[Paper Introduction] Supervised Phrase Table Triangulation with Neural Word E...[Paper Introduction] Supervised Phrase Table Triangulation with Neural Word E...
[Paper Introduction] Supervised Phrase Table Triangulation with Neural Word E...
NAIST Machine Translation Study Group
 
[Paper Introduction] Evaluating MT Systems with Second Language Proficiency T...
[Paper Introduction] Evaluating MT Systems with Second Language Proficiency T...[Paper Introduction] Evaluating MT Systems with Second Language Proficiency T...
[Paper Introduction] Evaluating MT Systems with Second Language Proficiency T...
NAIST Machine Translation Study Group
 
[Paper Introduction] Bilingual word representations with monolingual quality ...
[Paper Introduction] Bilingual word representations with monolingual quality ...[Paper Introduction] Bilingual word representations with monolingual quality ...
[Paper Introduction] Bilingual word representations with monolingual quality ...
NAIST Machine Translation Study Group
 
[Book Reading] 機械翻訳 - Section 3 No.1
[Book Reading] 機械翻訳 - Section 3 No.1[Book Reading] 機械翻訳 - Section 3 No.1
[Book Reading] 機械翻訳 - Section 3 No.1
NAIST Machine Translation Study Group
 
[Paper Introduction] Training a Natural Language Generator From Unaligned Data
[Paper Introduction] Training a Natural Language Generator From Unaligned Data[Paper Introduction] Training a Natural Language Generator From Unaligned Data
[Paper Introduction] Training a Natural Language Generator From Unaligned Data
NAIST Machine Translation Study Group
 
[Book Reading] 機械翻訳 - Section 5 No.2
[Book Reading] 機械翻訳 - Section 5 No.2[Book Reading] 機械翻訳 - Section 5 No.2
[Book Reading] 機械翻訳 - Section 5 No.2
NAIST Machine Translation Study Group
 
[Book Reading] 機械翻訳 - Section 7 No.1
[Book Reading] 機械翻訳 - Section 7 No.1[Book Reading] 機械翻訳 - Section 7 No.1
[Book Reading] 機械翻訳 - Section 7 No.1
NAIST Machine Translation Study Group
 
[Book Reading] 機械翻訳 - Section 2 No.2
 [Book Reading] 機械翻訳 - Section 2 No.2 [Book Reading] 機械翻訳 - Section 2 No.2
[Book Reading] 機械翻訳 - Section 2 No.2
NAIST Machine Translation Study Group
 

More from NAIST Machine Translation Study Group (14)

[Paper Introduction] Efficient Lattice Rescoring Using Recurrent Neural Netwo...
[Paper Introduction] Efficient Lattice Rescoring Using Recurrent Neural Netwo...[Paper Introduction] Efficient Lattice Rescoring Using Recurrent Neural Netwo...
[Paper Introduction] Efficient Lattice Rescoring Using Recurrent Neural Netwo...
 
[Paper Introduction] Distant supervision for relation extraction without labe...
[Paper Introduction] Distant supervision for relation extraction without labe...[Paper Introduction] Distant supervision for relation extraction without labe...
[Paper Introduction] Distant supervision for relation extraction without labe...
 
On using monolingual corpora in neural machine translation
On using monolingual corpora in neural machine translationOn using monolingual corpora in neural machine translation
On using monolingual corpora in neural machine translation
 
RNN-based Translation Models (Japanese)
RNN-based Translation Models (Japanese)RNN-based Translation Models (Japanese)
RNN-based Translation Models (Japanese)
 
[Paper Introduction] Efficient top down btg parsing for machine translation p...
[Paper Introduction] Efficient top down btg parsing for machine translation p...[Paper Introduction] Efficient top down btg parsing for machine translation p...
[Paper Introduction] Efficient top down btg parsing for machine translation p...
 
[Paper Introduction] Translating into Morphologically Rich Languages with Syn...
[Paper Introduction] Translating into Morphologically Rich Languages with Syn...[Paper Introduction] Translating into Morphologically Rich Languages with Syn...
[Paper Introduction] Translating into Morphologically Rich Languages with Syn...
 
[Paper Introduction] Supervised Phrase Table Triangulation with Neural Word E...
[Paper Introduction] Supervised Phrase Table Triangulation with Neural Word E...[Paper Introduction] Supervised Phrase Table Triangulation with Neural Word E...
[Paper Introduction] Supervised Phrase Table Triangulation with Neural Word E...
 
[Paper Introduction] Evaluating MT Systems with Second Language Proficiency T...
[Paper Introduction] Evaluating MT Systems with Second Language Proficiency T...[Paper Introduction] Evaluating MT Systems with Second Language Proficiency T...
[Paper Introduction] Evaluating MT Systems with Second Language Proficiency T...
 
[Paper Introduction] Bilingual word representations with monolingual quality ...
[Paper Introduction] Bilingual word representations with monolingual quality ...[Paper Introduction] Bilingual word representations with monolingual quality ...
[Paper Introduction] Bilingual word representations with monolingual quality ...
 
[Book Reading] 機械翻訳 - Section 3 No.1
[Book Reading] 機械翻訳 - Section 3 No.1[Book Reading] 機械翻訳 - Section 3 No.1
[Book Reading] 機械翻訳 - Section 3 No.1
 
[Paper Introduction] Training a Natural Language Generator From Unaligned Data
[Paper Introduction] Training a Natural Language Generator From Unaligned Data[Paper Introduction] Training a Natural Language Generator From Unaligned Data
[Paper Introduction] Training a Natural Language Generator From Unaligned Data
 
[Book Reading] 機械翻訳 - Section 5 No.2
[Book Reading] 機械翻訳 - Section 5 No.2[Book Reading] 機械翻訳 - Section 5 No.2
[Book Reading] 機械翻訳 - Section 5 No.2
 
[Book Reading] 機械翻訳 - Section 7 No.1
[Book Reading] 機械翻訳 - Section 7 No.1[Book Reading] 機械翻訳 - Section 7 No.1
[Book Reading] 機械翻訳 - Section 7 No.1
 
[Book Reading] 機械翻訳 - Section 2 No.2
 [Book Reading] 機械翻訳 - Section 2 No.2 [Book Reading] 機械翻訳 - Section 2 No.2
[Book Reading] 機械翻訳 - Section 2 No.2
 

Recently uploaded

1FIDIC-CONSTRUCTION-CONTRACT-2ND-ED-2017-RED-BOOK.pdf
1FIDIC-CONSTRUCTION-CONTRACT-2ND-ED-2017-RED-BOOK.pdf1FIDIC-CONSTRUCTION-CONTRACT-2ND-ED-2017-RED-BOOK.pdf
1FIDIC-CONSTRUCTION-CONTRACT-2ND-ED-2017-RED-BOOK.pdf
MadhavJungKarki
 
CompEx~Manual~1210 (2).pdf COMPEX GAS AND VAPOURS
CompEx~Manual~1210 (2).pdf COMPEX GAS AND VAPOURSCompEx~Manual~1210 (2).pdf COMPEX GAS AND VAPOURS
CompEx~Manual~1210 (2).pdf COMPEX GAS AND VAPOURS
RamonNovais6
 
Pressure Relief valve used in flow line to release the over pressure at our d...
Pressure Relief valve used in flow line to release the over pressure at our d...Pressure Relief valve used in flow line to release the over pressure at our d...
Pressure Relief valve used in flow line to release the over pressure at our d...
cannyengineerings
 
Generative AI Use cases applications solutions and implementation.pdf
Generative AI Use cases applications solutions and implementation.pdfGenerative AI Use cases applications solutions and implementation.pdf
Generative AI Use cases applications solutions and implementation.pdf
mahaffeycheryld
 
Supermarket Management System Project Report.pdf
Supermarket Management System Project Report.pdfSupermarket Management System Project Report.pdf
Supermarket Management System Project Report.pdf
Kamal Acharya
 
Introduction to verilog basic modeling .ppt
Introduction to verilog basic modeling   .pptIntroduction to verilog basic modeling   .ppt
Introduction to verilog basic modeling .ppt
AmitKumar730022
 
Software Engineering and Project Management - Introduction, Modeling Concepts...
Software Engineering and Project Management - Introduction, Modeling Concepts...Software Engineering and Project Management - Introduction, Modeling Concepts...
Software Engineering and Project Management - Introduction, Modeling Concepts...
Prakhyath Rai
 
VARIABLE FREQUENCY DRIVE. VFDs are widely used in industrial applications for...
VARIABLE FREQUENCY DRIVE. VFDs are widely used in industrial applications for...VARIABLE FREQUENCY DRIVE. VFDs are widely used in industrial applications for...
VARIABLE FREQUENCY DRIVE. VFDs are widely used in industrial applications for...
PIMR BHOPAL
 
LLM Fine Tuning with QLoRA Cassandra Lunch 4, presented by Anant
LLM Fine Tuning with QLoRA Cassandra Lunch 4, presented by AnantLLM Fine Tuning with QLoRA Cassandra Lunch 4, presented by Anant
LLM Fine Tuning with QLoRA Cassandra Lunch 4, presented by Anant
Anant Corporation
 
NATURAL DEEP EUTECTIC SOLVENTS AS ANTI-FREEZING AGENT
NATURAL DEEP EUTECTIC SOLVENTS AS ANTI-FREEZING AGENTNATURAL DEEP EUTECTIC SOLVENTS AS ANTI-FREEZING AGENT
NATURAL DEEP EUTECTIC SOLVENTS AS ANTI-FREEZING AGENT
Addu25809
 
AI + Data Community Tour - Build the Next Generation of Apps with the Einstei...
AI + Data Community Tour - Build the Next Generation of Apps with the Einstei...AI + Data Community Tour - Build the Next Generation of Apps with the Einstei...
AI + Data Community Tour - Build the Next Generation of Apps with the Einstei...
Paris Salesforce Developer Group
 
Design and optimization of ion propulsion drone
Design and optimization of ion propulsion droneDesign and optimization of ion propulsion drone
Design and optimization of ion propulsion drone
bjmsejournal
 
Gas agency management system project report.pdf
Gas agency management system project report.pdfGas agency management system project report.pdf
Gas agency management system project report.pdf
Kamal Acharya
 
smart pill dispenser is designed to improve medication adherence and safety f...
smart pill dispenser is designed to improve medication adherence and safety f...smart pill dispenser is designed to improve medication adherence and safety f...
smart pill dispenser is designed to improve medication adherence and safety f...
um7474492
 
5g-5G SA reg. -standalone-access-registration.pdf
5g-5G SA reg. -standalone-access-registration.pdf5g-5G SA reg. -standalone-access-registration.pdf
5g-5G SA reg. -standalone-access-registration.pdf
devtomar25
 
Use PyCharm for remote debugging of WSL on a Windo cf5c162d672e4e58b4dde5d797...
Use PyCharm for remote debugging of WSL on a Windo cf5c162d672e4e58b4dde5d797...Use PyCharm for remote debugging of WSL on a Windo cf5c162d672e4e58b4dde5d797...
Use PyCharm for remote debugging of WSL on a Windo cf5c162d672e4e58b4dde5d797...
shadow0702a
 
SCALING OF MOS CIRCUITS m .pptx
SCALING OF MOS CIRCUITS m                 .pptxSCALING OF MOS CIRCUITS m                 .pptx
SCALING OF MOS CIRCUITS m .pptx
harshapolam10
 
原版制作(Humboldt毕业证书)柏林大学毕业证学位证一模一样
原版制作(Humboldt毕业证书)柏林大学毕业证学位证一模一样原版制作(Humboldt毕业证书)柏林大学毕业证学位证一模一样
原版制作(Humboldt毕业证书)柏林大学毕业证学位证一模一样
ydzowc
 
一比一原版(uofo毕业证书)美国俄勒冈大学毕业证如何办理
一比一原版(uofo毕业证书)美国俄勒冈大学毕业证如何办理一比一原版(uofo毕业证书)美国俄勒冈大学毕业证如何办理
一比一原版(uofo毕业证书)美国俄勒冈大学毕业证如何办理
upoux
 
一比一原版(USF毕业证)旧金山大学毕业证如何办理
一比一原版(USF毕业证)旧金山大学毕业证如何办理一比一原版(USF毕业证)旧金山大学毕业证如何办理
一比一原版(USF毕业证)旧金山大学毕业证如何办理
uqyfuc
 

Recently uploaded (20)

1FIDIC-CONSTRUCTION-CONTRACT-2ND-ED-2017-RED-BOOK.pdf
1FIDIC-CONSTRUCTION-CONTRACT-2ND-ED-2017-RED-BOOK.pdf1FIDIC-CONSTRUCTION-CONTRACT-2ND-ED-2017-RED-BOOK.pdf
1FIDIC-CONSTRUCTION-CONTRACT-2ND-ED-2017-RED-BOOK.pdf
 
CompEx~Manual~1210 (2).pdf COMPEX GAS AND VAPOURS
CompEx~Manual~1210 (2).pdf COMPEX GAS AND VAPOURSCompEx~Manual~1210 (2).pdf COMPEX GAS AND VAPOURS
CompEx~Manual~1210 (2).pdf COMPEX GAS AND VAPOURS
 
Pressure Relief valve used in flow line to release the over pressure at our d...
Pressure Relief valve used in flow line to release the over pressure at our d...Pressure Relief valve used in flow line to release the over pressure at our d...
Pressure Relief valve used in flow line to release the over pressure at our d...
 
Generative AI Use cases applications solutions and implementation.pdf
Generative AI Use cases applications solutions and implementation.pdfGenerative AI Use cases applications solutions and implementation.pdf
Generative AI Use cases applications solutions and implementation.pdf
 
Supermarket Management System Project Report.pdf
Supermarket Management System Project Report.pdfSupermarket Management System Project Report.pdf
Supermarket Management System Project Report.pdf
 
Introduction to verilog basic modeling .ppt
Introduction to verilog basic modeling   .pptIntroduction to verilog basic modeling   .ppt
Introduction to verilog basic modeling .ppt
 
Software Engineering and Project Management - Introduction, Modeling Concepts...
Software Engineering and Project Management - Introduction, Modeling Concepts...Software Engineering and Project Management - Introduction, Modeling Concepts...
Software Engineering and Project Management - Introduction, Modeling Concepts...
 
VARIABLE FREQUENCY DRIVE. VFDs are widely used in industrial applications for...
VARIABLE FREQUENCY DRIVE. VFDs are widely used in industrial applications for...VARIABLE FREQUENCY DRIVE. VFDs are widely used in industrial applications for...
VARIABLE FREQUENCY DRIVE. VFDs are widely used in industrial applications for...
 
LLM Fine Tuning with QLoRA Cassandra Lunch 4, presented by Anant
LLM Fine Tuning with QLoRA Cassandra Lunch 4, presented by AnantLLM Fine Tuning with QLoRA Cassandra Lunch 4, presented by Anant
LLM Fine Tuning with QLoRA Cassandra Lunch 4, presented by Anant
 
NATURAL DEEP EUTECTIC SOLVENTS AS ANTI-FREEZING AGENT
NATURAL DEEP EUTECTIC SOLVENTS AS ANTI-FREEZING AGENTNATURAL DEEP EUTECTIC SOLVENTS AS ANTI-FREEZING AGENT
NATURAL DEEP EUTECTIC SOLVENTS AS ANTI-FREEZING AGENT
 
AI + Data Community Tour - Build the Next Generation of Apps with the Einstei...
AI + Data Community Tour - Build the Next Generation of Apps with the Einstei...AI + Data Community Tour - Build the Next Generation of Apps with the Einstei...
AI + Data Community Tour - Build the Next Generation of Apps with the Einstei...
 
Design and optimization of ion propulsion drone
Design and optimization of ion propulsion droneDesign and optimization of ion propulsion drone
Design and optimization of ion propulsion drone
 
Gas agency management system project report.pdf
Gas agency management system project report.pdfGas agency management system project report.pdf
Gas agency management system project report.pdf
 
smart pill dispenser is designed to improve medication adherence and safety f...
smart pill dispenser is designed to improve medication adherence and safety f...smart pill dispenser is designed to improve medication adherence and safety f...
smart pill dispenser is designed to improve medication adherence and safety f...
 
5g-5G SA reg. -standalone-access-registration.pdf
5g-5G SA reg. -standalone-access-registration.pdf5g-5G SA reg. -standalone-access-registration.pdf
5g-5G SA reg. -standalone-access-registration.pdf
 
Use PyCharm for remote debugging of WSL on a Windo cf5c162d672e4e58b4dde5d797...
Use PyCharm for remote debugging of WSL on a Windo cf5c162d672e4e58b4dde5d797...Use PyCharm for remote debugging of WSL on a Windo cf5c162d672e4e58b4dde5d797...
Use PyCharm for remote debugging of WSL on a Windo cf5c162d672e4e58b4dde5d797...
 
SCALING OF MOS CIRCUITS m .pptx
SCALING OF MOS CIRCUITS m                 .pptxSCALING OF MOS CIRCUITS m                 .pptx
SCALING OF MOS CIRCUITS m .pptx
 
原版制作(Humboldt毕业证书)柏林大学毕业证学位证一模一样
原版制作(Humboldt毕业证书)柏林大学毕业证学位证一模一样原版制作(Humboldt毕业证书)柏林大学毕业证学位证一模一样
原版制作(Humboldt毕业证书)柏林大学毕业证学位证一模一样
 
一比一原版(uofo毕业证书)美国俄勒冈大学毕业证如何办理
一比一原版(uofo毕业证书)美国俄勒冈大学毕业证如何办理一比一原版(uofo毕业证书)美国俄勒冈大学毕业证如何办理
一比一原版(uofo毕业证书)美国俄勒冈大学毕业证如何办理
 
一比一原版(USF毕业证)旧金山大学毕业证如何办理
一比一原版(USF毕业证)旧金山大学毕业证如何办理一比一原版(USF毕业证)旧金山大学毕业证如何办理
一比一原版(USF毕业证)旧金山大学毕业证如何办理
 

[Paper Introduction] A Context-Aware Topic Model for Statistical Machine Translation

  • 1. 15/09/10 Copyright (C) 2015 by Yusuke Oda, AHC-Lab, IS, NAIST 1 A Context-Aware Topic Model for Statistical Machine Translation Jinsong Su, Deyi Xiong, Yang Liu, Xianpei Han, Hongyu Lin, Junfeng Yao, Min Zhang ACL 2015 Introduced by Yusuke Oda @odashi_t 2015/9/10 NAIST MT-Study Group
  • 2. 15/09/10 Copyright (C) 2015 by Yusuke Oda, AHC-Lab, IS, NAIST 2 Lexical Selection for SMT ● Lexical selection is important for SMT ● Two categories in previous studies for lexical selection: – Incorporating sentence-level (local) contexts – Integrating document-level (global) topics ● Considering the correlation between local and global information – Have never been explored – But both are highly correlated sentence-level contexts document-level topics
  • 3. 15/09/10 Copyright (C) 2015 by Yusuke Oda, AHC-Lab, IS, NAIST 3 Proposed Model ● Context-aware topic model (CATM) – Jointly model both local and global contexts for lexical selection – Based on topic modeling – Performing Gibbs sampling to learn parameters of the model ● Terms – Topical words: telated to topics of the document ● In this study, we use content words (= noun, verb, adjective, adverb) – Contextual words: effect translation selections of topical words ● In this study, we use all words in the sentence – Target-side topical items: are translation candidates of source topical words
  • 4. Assumption
    ● Assumptions
      – Topic consistency: all target-side topical items should be consistent with the topic of the document
      – Context compatibility: all target-side topical items should be compatible with neighboring contextual words
    [Figure labels: topical words; target-side topical items; contextual words; topic]
  • 5. Graphical Representation of Proposed Model
    [Figure: graphical model. Labels: topic distribution of the document; topic; target-side topical item; neighboring target-side topical item; topic distribution over …; distribution of …; distribution of …]
  • 6. Generation Steps
  • 7. Joint Probability
    ● Objective: fit the joint probability distribution of the model, given the training data
    [Equation: joint probability]
    ● …OMG, too complex.
  • 8. Gibbs Sampling (1)
    ● Fitting the joint probability directly is intractable
    ● Use Gibbs sampling instead
    ● Given the training data, the joint distribution of the sampled variables is proportional to:
    [Equation]
  • 9. Gibbs Sampling (2)
    ● Three sampling steps, one per sampled variable
    [Equations: three sampling distributions]
    ● The count notation indicates the count of b in a range a
    ● The superscript (-i) indicates that the i-th item is ignored
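The counting convention on slide 9 (counts of b within a, with the i-th item excluded) is the usual bookkeeping of collapsed Gibbs sampling. The sketch below shows that bookkeeping for plain LDA, not for CATM itself: the CATM sampling equations are not recoverable from the slides, so the model, the function name `gibbs_lda`, and the toy data are all stand-in assumptions.

```python
import random


def gibbs_lda(docs, num_topics, vocab_size, alpha, beta, iters=20, seed=0):
    """Collapsed Gibbs sampling for plain LDA (a stand-in, not the CATM sampler).

    docs is a list of documents, each a list of integer word ids.
    """
    rng = random.Random(seed)
    n_dk = [[0] * num_topics for _ in docs]              # topic counts per document
    n_kw = [[0] * vocab_size for _ in range(num_topics)]  # word counts per topic
    n_k = [0] * num_topics                                # total words per topic
    z = []

    # Random initialization of topic assignments.
    for d, doc in enumerate(docs):
        z_d = []
        for w in doc:
            k = rng.randrange(num_topics)
            z_d.append(k)
            n_dk[d][k] += 1
            n_kw[k][w] += 1
            n_k[k] += 1
        z.append(z_d)

    for _ in range(iters):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                k = z[d][i]
                # Remove the i-th item from all counts: these are the "(-i)" counts.
                n_dk[d][k] -= 1
                n_kw[k][w] -= 1
                n_k[k] -= 1
                # Unnormalized conditional probability of each topic for this word.
                weights = [
                    (n_dk[d][t] + alpha)
                    * (n_kw[t][w] + beta)
                    / (n_k[t] + vocab_size * beta)
                    for t in range(num_topics)
                ]
                k = rng.choices(range(num_topics), weights=weights)[0]
                # Add the item back under its newly sampled topic.
                z[d][i] = k
                n_dk[d][k] += 1
                n_kw[k][w] += 1
                n_k[k] += 1
    return z, n_dk, n_kw


# Toy corpus: two documents over a vocabulary of five word ids.
docs = [[0, 1, 2, 0], [3, 4, 3, 4]]
z, n_dk, n_kw = gibbs_lda(docs, num_topics=2, vocab_size=5, alpha=0.5, beta=0.1)
```

The decrement, sample, increment pattern is the part that carries over: each variable is resampled from a distribution computed with its own contribution removed from the counts.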
  • 10. Experiments
    ● Language pair: Chinese to English
    ● Corpus:
      – Training: FBIS / Hansards (1M sentences, 54.6k documents)
      – Dev: NIST MT05
      – Test: NIST MT06 / MT08
    ● Alignment: GIZA++ with grow-diag-final-and
    ● Hyperparameters:
      – Number of topics = 25
      – α = 50 / number of topics
      – β = 0.1
      – γ = 1.0 / number of topical words
      – δ = 2000 / number of contextual words
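The hyperparameter settings on slide 10 depend on the number of topics and on vocabulary sizes, so a small helper makes the arithmetic explicit. The function name and the two vocabulary sizes below are hypothetical; the slides give only the formulas and the number of topics.

```python
def catm_hyperparameters(num_topics, num_topical_words, num_contextual_words):
    """Compute the prior settings listed on slide 10 from the model sizes."""
    return {
        "alpha": 50.0 / num_topics,
        "beta": 0.1,
        "gamma": 1.0 / num_topical_words,
        "delta": 2000.0 / num_contextual_words,
    }


# 25 topics is from the slides; the vocabulary sizes are made-up placeholders.
params = catm_hyperparameters(
    num_topics=25,
    num_topical_words=10_000,
    num_contextual_words=50_000,
)
```

With 25 topics, α works out to 50 / 25 = 2.0. The values of γ and δ depend on the actual vocabulary sizes, which the slides do not report.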
  • 11. Result: Impact of Window Size
    ● Best performance at window size = 12
      – Sufficient for predicting target-side translations of ambiguous source-side topical words
    [Figure: 12 words | Attention | 12 words]
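A minimal sketch of how a context window could be extracted around a topical word. It assumes the window size counts words on each side of the word in focus, which is how the figure on slide 11 reads; the slides do not state this explicitly, and `context_window` is a hypothetical helper.

```python
def context_window(words, index, window_size):
    """Return up to window_size words on each side of words[index]."""
    left = words[max(0, index - window_size):index]
    right = words[index + 1:index + 1 + window_size]
    return left + right


words = ["a", "b", "c", "d", "e", "f", "g"]
neighbors = context_window(words, index=3, window_size=2)  # around "d"
no_context = context_window(words, index=3, window_size=0)
```

A window size of 0 yields no contextual words at all, so only topic information remains available to the model.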
  • 12. Result: Overall Performance
    ● The proposed method achieves the best performance, with statistical significance
    [Table: BLEU-4 scores]
  • 13. Result: Effect of Correlation Modeling
    ● Comparison with separated models
      – CATM (Context): replaces the topic-related distribution with a uniform distribution
        ● Omits the effect of topics
      – CATM (Topic): window size = 0
        ● Omits the effect of contexts
      – CATM (Log-linear): combines the two models above in a log-linear manner
    ● The proposed model achieves the best performance
      – Jointly learning both context and topic is effective for lexical selection
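For readers unfamiliar with the log-linear baseline, the sketch below combines two probability tables over the same translation candidates by a weighted sum of log-probabilities, then renormalizes. The weights, the candidate words, and the probabilities are invented for illustration; the slides do not say how the actual combination was weighted or tuned.

```python
import math


def log_linear_combine(context_probs, topic_probs, w_context=1.0, w_topic=1.0):
    """Combine two distributions over the same candidates log-linearly.

    Both inputs map candidate -> probability and must be strictly positive.
    """
    scores = {
        cand: w_context * math.log(context_probs[cand])
        + w_topic * math.log(topic_probs[cand])
        for cand in context_probs
    }
    # Subtract the maximum before exponentiating for numerical stability.
    top = max(scores.values())
    total = sum(math.exp(s - top) for s in scores.values())
    return {cand: math.exp(s - top) / total for cand, s in scores.items()}


# Made-up scores from a context-only model and a topic-only model.
context_probs = {"bank": 0.6, "shore": 0.4}
topic_probs = {"bank": 0.2, "shore": 0.8}
combined = log_linear_combine(context_probs, topic_probs)
```

In this combination the two models are trained separately and only their scores interact, which is the contrast the slide draws with learning context and topic jointly.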
  • 14. Topic Examples
  • 15. Summary
    ● Context-aware topic model (CATM)
      – Jointly learns context and topic information
      – The first such work, to the authors' knowledge
      – Achieves higher translation performance than using only context or only topic information, or naively combining the two in a log-linear manner
    ● Future work
      – Modeling at the phrase level as well as the word level
      – Improving the model with monolingual corpora
  • 16. Impressions
    ● Is a sequence-of-words window the right choice of context?
      – How about using syntactic information?
    ● The model relies on word alignment (GIZA++) to select translation candidates
      – How much does alignment accuracy affect the results?