Plan

A Machine Learning-Based Automatic Evaluation System for Machine Translation
Advisor: 夏天
杨子枢
Internet Center

• Machine Learning
• Automatic Machine Translation Evaluation Basics
• Key techniques
• State-of-the-art
• Goal & Motivation
• Study plan

Machine Learning
• A branch of AI: a scientific discipline concerned with the design
and development of algorithms that take empirical data as input
and yield patterns or predictions thought to be features of
the underlying mechanism that generated the data

Machine Learning
• In Action

Machine Learning
• Broad categories
– Supervised learning
• Classification, Regression
– Unsupervised learning
• Density estimation, Clustering, Dimensionality reduction
– Semi-supervised learning
– Active learning
– Reinforcement learning
– Transfer learning
– Many more …

Supervised Learning
• Task: learn a predictive function h: X → Y
• Classification: discrete labels
• Regression: continuous labels

Classifier?
• Non-parametric approach
• Parametric approach

Automatic Machine Translation
Basics
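• What is Machine Translation (MT)?
• What is Machine Translation Evaluation?
More art than science:
mathematics + linguistics + computer science
• Faithfulness, expressiveness, elegance (信、达、雅)? That asks too much of output quality. The practical question is how well the output serves a given task
• System developers: expose system problems so the translation system can be fixed and improved
• Users: system usability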

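Evaluation Basics
• Human evaluation
• High accuracy, but costs a great deal of labor and resources
• Fluency: grammatical correctness, idiomatic word choice
• Adequacy: does the output convey the meaning of the source? Is information lost, added, or distorted?
• Grading: see the fluency scale below
• Automatic evaluation
• Targets:
• Low cost and fast, but lower accuracy (correlation with human judgments)
• Precision and recall: defined below
Machine learning based
• Regression model
• Classification model

Fluency scale
5 Fluent English
4 Good English
3 Non-native English
2 Disfluent English
1 Incomprehensible English

Precision and recall
$$\text{precision} = \frac{\text{correct}}{\text{output length}}$$
$$\text{recall} = \frac{\text{correct}}{\text{reference length}}$$
where correct is the number of output words that also occur in the reference.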
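The precision and recall definitions above can be sketched in a few lines of Python. This is a minimal illustration, not the implementation of any particular metric. It assumes whitespace tokenization and clipped word matching (a reference word can be matched at most as often as it occurs), and the function name `precision_recall` is made up for this sketch. The sentences are the System A / System B example from the State-of-the-art slide.

```python
from collections import Counter


def precision_recall(candidate: str, reference: str) -> tuple[float, float]:
    """Word-level precision and recall of a candidate against one reference."""
    cand_words = candidate.split()
    ref_words = reference.split()
    # Multiset intersection: a word counts as correct at most as many
    # times as it appears in the reference (assumption of this sketch).
    correct = sum((Counter(cand_words) & Counter(ref_words)).values())
    precision = correct / len(cand_words) if cand_words else 0.0
    recall = correct / len(ref_words) if ref_words else 0.0
    return precision, recall


reference = "Israeli officials are responsible for airport security"
system_a = "Israeli officials responsibility of airport safety"
system_b = "airport security Israeli officials are responsible"

for name, output in [("A", system_a), ("B", system_b)]:
    p, r = precision_recall(output, reference)
    print(f"System {name}: precision={p:.2f} recall={r:.2f}")
```

Word-level precision and recall ignore word order, which is why System B scores perfectly on precision here even though its word order differs from the reference.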



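Basics
• Automatic evaluation:
• Methods based on linguistic checkpoints
• Methods based on string similarity
– Distance-based methods
» The more easily one sentence can be rewritten into another, the more similar the two sentences are
» The distance between translations is computed from a defined set of rewrite operations and their costs
» Reported in the form of an error rate
– n-gram-based methods
» The basic pattern is to match the n-gram sets of the candidate translation and the reference translation,
then compute a score from the matches
– Word-alignment-based methods
• Machine learning methods
– Classification
– Regression
– Ranking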
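The two string-similarity families above can be sketched as follows. This is an illustrative sketch under stated assumptions: the distance-based part uses word-level Levenshtein distance with unit costs, reported as a word error rate, and the n-gram part computes a clipped n-gram precision for a single n. The function names and the toy sentences are invented for this example, and the n-gram function is only one ingredient of metrics such as BLEU, not a full implementation.

```python
from collections import Counter


def edit_distance(hyp: list[str], ref: list[str]) -> int:
    """Word-level Levenshtein distance with unit cost per operation."""
    prev = list(range(len(ref) + 1))
    for i, h in enumerate(hyp, start=1):
        curr = [i]
        for j, r in enumerate(ref, start=1):
            cost = 0 if h == r else 1
            curr.append(min(prev[j] + 1,          # delete h from the hypothesis
                            curr[j - 1] + 1,      # insert r into the hypothesis
                            prev[j - 1] + cost))  # match or substitute
        prev = curr
    return prev[-1]


def word_error_rate(candidate: str, reference: str) -> float:
    """Edit distance normalized by reference length (reference must be non-empty)."""
    ref = reference.split()
    return edit_distance(candidate.split(), ref) / len(ref)


def ngram_precision(candidate: str, reference: str, n: int) -> float:
    """Fraction of candidate n-grams that also occur in the reference (clipped)."""
    def ngrams(words: list[str]) -> Counter:
        return Counter(tuple(words[i:i + n]) for i in range(len(words) - n + 1))

    cand = ngrams(candidate.split())
    ref = ngrams(reference.split())
    total = sum(cand.values())
    matched = sum((cand & ref).values())
    return matched / total if total else 0.0


# Toy sentences, made up for illustration.
reference = "the cat is on the mat"
candidate = "the cat sat on the mat"

print(word_error_rate(candidate, reference))      # one substitution over six words
print(ngram_precision(candidate, reference, 2))   # three of five bigrams match
```

A lower error rate means the candidate is closer to the reference, while a higher n-gram precision means more overlap, so the two scores point in opposite directions.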

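Basics
• Machine learning methods
• Classification
– Distinguish machine translations from reference translations
– Select the best translation from among several systems
– SVM
– The key lies in the choice of classification features
• Regression
– Training data is treated as a set of pairs (x, h), where x is a feature vector and h is the human score; training
adjusts the feature weights so that the score produced by the feature combination is as close as possible to the human scores in the training data
– SVM regression model
• Combining multiple evaluation methods
– Linear combination
– Sort the methods by correlation in descending order and add them one by one to the best set until performance stops improving
• Integrating multiple features
– Besides features that represent translation quality at the lexical, syntactic, and semantic levels, machine learning can also use features that
do not evaluate quality directly, such as word count, sentence length, unknown words, properties of the parse tree, accuracy of the parsing tools,
language-model features, and content-word and function-word features
Word-alignment distribution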
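The greedy combination step described above (sort by correlation, add one by one until performance stops improving) can be sketched as follows. This is a rough illustration under several assumptions that are not in the slides: the linear combination uses equal weights (a plain average), performance is measured by Pearson correlation with human scores, and all metric scores are already on a comparable scale. The metric names and numbers are invented toy data.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy) if sx and sy else 0.0


def combine(metric_scores, names):
    """Equal-weight linear combination: per-sentence average of the chosen metrics."""
    columns = [metric_scores[name] for name in names]
    return [sum(values) / len(values) for values in zip(*columns)]


def greedy_select(metric_scores, human):
    """Add metrics in descending order of correlation until it stops improving."""
    ranked = sorted(metric_scores,
                    key=lambda name: pearson(metric_scores[name], human),
                    reverse=True)
    selected = [ranked[0]]
    best = pearson(metric_scores[ranked[0]], human)
    for name in ranked[1:]:
        trial = pearson(combine(metric_scores, selected + [name]), human)
        if trial <= best:
            break  # performance no longer improves, so stop
        selected.append(name)
        best = trial
    return selected, best


# Hypothetical per-sentence scores, made up for illustration.
human = [1, 2, 3, 4, 5]
metric_scores = {
    "metric_a": [1, 2, 3, 5, 4],
    "metric_b": [2, 1, 3, 4, 5],
    "metric_c": [5, 4, 3, 2, 1],
}

selected, best = greedy_select(metric_scores, human)
print(selected, round(best, 3))
```

In this toy data the two imperfect metrics make different mistakes, so their average correlates better with the human scores than either one alone, while adding the third metric would lower the correlation.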

State-of-the-art
• BLEU (Bilingual Evaluation Understudy)
• Precision and recall
• N-gram match
• METEOR
• Adds stemming and synonym matching on top of BLEU
• Evaluation criteria
System A: Israeli officials responsibility of airport safety
Reference: Israeli officials are responsible for airport security
System B: airport security Israeli officials are responsible
BLEU: A is better
METEOR: B is better

ML MTE EXAMPLE
(English→Japanese, 2010)
• Classification, yet regression-like: it outputs a continuous score
• Literal translation → unnaturalness in MT
• Reveal unnatural translations through classification features
• Classification accuracy + comparison with manual evaluation results

Note: corpus characteristics differ across source-target language pairs

EXAMPLE:
• Solution:
• Classification feature: aligned word pairs
• SVM
• Scores? The distance to the hyperplane
• Experiments and tests:
Procedure:
Compare the performance of the models using correlation coefficients
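The idea of using the distance to the hyperplane as a continuous score can be sketched as below. This is a minimal sketch, not the method of the paper: it trains a linear soft-margin SVM by plain subgradient descent on the hinge loss instead of using an SVM package, the two-dimensional feature vectors are invented (real features would come from aligned word pairs), and the label convention (+1 for natural translations, -1 for unnatural ones) is an assumption of this example.

```python
from math import sqrt


def train_linear_svm(samples, labels, lr=0.01, lam=0.01, epochs=200):
    """Linear soft-margin SVM via subgradient descent on the hinge loss."""
    w = [0.0] * len(samples[0])
    b = 0.0
    for _ in range(epochs):
        for x, y in zip(samples, labels):
            margin = y * (sum(wi * xi for wi, xi in zip(w, x)) + b)
            # The L2 regularizer shrinks w on every step.
            w = [wi * (1 - lr * lam) for wi in w]
            if margin < 1:
                # Sample is misclassified or inside the margin: move the hyperplane.
                w = [wi + lr * y * xi for wi, xi in zip(w, x)]
                b += lr * y
    return w, b


def signed_distance(w, b, x):
    """Signed Euclidean distance from x to the hyperplane w.x + b = 0."""
    norm = sqrt(sum(wi * wi for wi in w))
    return (sum(wi * xi for wi, xi in zip(w, x)) + b) / norm


# Hypothetical feature vectors: +1 = natural translation, -1 = unnatural.
samples = [(1.0, 1.0), (2.0, 1.0), (1.0, 2.0),
           (-1.0, -1.0), (-2.0, -1.0), (-1.0, -2.0)]
labels = [1, 1, 1, -1, -1, -1]

w, b = train_linear_svm(samples, labels)
for x in [(2.0, 2.0), (1.0, 1.0), (-2.0, -2.0)]:
    print(x, round(signed_distance(w, b, x), 2))
```

The sign of the distance gives the class, and its magnitude acts as the continuous score: a sentence far on the positive side is judged more natural than one close to the boundary.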

MT Evaluation Key Techniques
(so far)
• including but not limited to:
SVM
I. popular/successful
II. handles various problems:
a) linearly separable: maximal margin hyperplane
b) linearly inseparable: soft margin, noise
c) nonlinear: attribute transformation, or a kernel
III. robust
IV. learning problem solved using
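The nonlinear case in item II c) can be illustrated with the XOR problem, a standard textbook example that is not taken from the slides. The sketch below assumes a hand-picked attribute transformation (appending the product of the two inputs) and a hand-picked hyperplane in the transformed space; nothing here is learned. The Gaussian (RBF) kernel is included only to show what a kernel function computes.

```python
from math import exp


def feature_map(x):
    """Attribute transformation: append the product x1 * x2 as a third feature."""
    x1, x2 = x
    return (x1, x2, x1 * x2)


def decide(w, b, z):
    """Classify z by the side of the hyperplane w.z + b = 0 it falls on."""
    return 1 if sum(wi * zi for wi, zi in zip(w, z)) + b > 0 else -1


def rbf_kernel(x, y, gamma=1.0):
    """Gaussian (RBF) kernel: similarity of x and y in an implicit feature space."""
    return exp(-gamma * sum((xi - yi) ** 2 for xi, yi in zip(x, y)))


# XOR: no straight line in the plane separates the two classes.
points = [(0, 0), (0, 1), (1, 0), (1, 1)]
labels = [-1, 1, 1, -1]

# Hand-picked (not learned) hyperplane in the transformed 3-D space.
w, b = (1.0, 1.0, -2.0), -0.5
predictions = [decide(w, b, feature_map(p)) for p in points]
print(predictions == labels)
```

A kernel achieves a similar effect without building the transformed features explicitly: the classifier only ever needs values such as `rbf_kernel(x, y)` between pairs of points.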

Goal & Motivation
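• 1. Implement the classifier-based evaluation algorithm from the paper above
• 2. Implement a regression-based evaluation system from other papers
• 3. Combine the two evaluation systems
Motivation:
• I am interested in language itself and curious about the machine translation problem.
• Machine learning will be my main tool for future study and research; this graduation project serves as an introduction and
hands-on practice
• Challenge: evaluation is more art than science and hard to quantify. I need to learn more about
how others approach it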

Study plan
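• Andrew Ng's Stanford open course
• Reference books
• Paper study and implementation
• Parallel corpora
• Existing machine translation systems
• Existing automatic machine translation evaluation systems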

Preliminary Work
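• C++ and Matlab
• Learning by using them
• Andrew Ng's Stanford open course
• Clear and easy to follow
• Coursera version: pros: newer, concise, has a forum; cons: fairly cursory
• NetEase Open Course (网易公开课) version: pros: includes formula derivations and in-class Q&A, thorough; cons: no forum
• Goal: consolidate mathematical knowledge, understand how it is applied in machine learning, and build the basic concepts
• Reference books and literature
• Introduce the basic principles, terminology, and ways of thinking in machine translation
• 《人工智能与机器翻译》 (Artificial Intelligence and Machine Translation), 《统计机器翻译》 (Statistical Machine Translation), and others; the latter, written by a German authority in the field,
covers a wealth of machine translation fundamentals, core methods, and frontier research.
• Reports from international evaluation campaigns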

Future Work
• Paper study and implementation
• Classification-based method: A Machine Learning-Based
Evaluation Method for Machine Translation
– Classification-based method, SVM, English→Japanese (should be tried on English→Chinese)
– Tools: GIZA++ (word aligner), morphological analysis system, TinySVM,
BLEU
– Spearman rank correlation, Fisher Z-transformation
• A Learning Approach to Improving Sentence-Level MT
Evaluation
– SVM, Chinese→English
• Regression-based method: A Paraphrase-Based Approach to
Machine Translation Evaluation
• Evaluating the methods: learn to score with BLEU, NIST, and METEOR
• Build a system around these papers
Combine the classification and regression methods
? Biggest open question: how do we evaluate an evaluation system?
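One common way to approach the open question is to measure how well a metric's scores agree with human judgments, using the Spearman rank correlation and Fisher Z-transformation listed above. The sketch below is an illustration under stated assumptions: ties receive average ranks, Spearman's coefficient is computed as the Pearson correlation of the ranks, and the scores are invented toy data. Note that `fisher_z` is undefined for a correlation of exactly 1 or -1.

```python
from math import atanh, sqrt


def average_ranks(values):
    """Rank values from 1 to n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy) if sx and sy else 0.0


def spearman(xs, ys):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(xs), average_ranks(ys))


def fisher_z(r):
    """Fisher Z-transformation, z = atanh(r), defined for -1 < r < 1."""
    return atanh(r)


# Hypothetical scores for five translations, made up for illustration.
human_scores = [5, 4, 3, 2, 1]
metric_values = [0.9, 0.7, 0.8, 0.3, 0.1]

rho = spearman(human_scores, metric_values)
print(round(rho, 3), round(fisher_z(rho), 3))
```

The transformed values are approximately normally distributed, which is what makes it possible to compare the correlations of two evaluation methods statistically.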
