PubhD talk: MT serving the society
1. MT serving the society
(Aaron) Lifeng Han / ADAPT @DCU
LIFENG.HAN@adaptcentre.ie
linkedin.com/in/aaronhan
PubhD, Dublin 2017.03.1st
The ADAPT Centre is funded under the SFI Research Centres Programme (Grant 13/RC/2106) and is co-funded under the European Regional Development Fund.
2. Presenter
www.adaptcentre.ie
Lifeng Han (or Aaron)
2016.12-present: PhD student in ADAPT Centre @ DCU
2016.10-2016.11: RA researcher in ADAPT Centre
2016.03-2016.07: Guest researcher at the University of Amsterdam
2014.09-2016.02: Employee at the University of Amsterdam
2014.07: Master of Computer Science, Bachelor in Mathematics
News: https://aaronlifenghan.jimdo.com/news/
Poet: https://poethan.wordpress.com/
Likes: sports/arts/music/photography/poetry/cooking/cycling/drawing
4. What is MT
MT means Machine Translation.
Using a machine / computer to translate human (natural) languages
- e.g. from English to German/French/Spanish/Irish/Chinese
- and in the opposite directions
To make MT work:
- teach the computer to understand human languages
- teach the computer to learn grammar / semantics
- teach the computer to learn algorithms
5. How MT began and developed - began
The original idea comes from ‘the Tower of Babel’ (Genesis)
- 11:5 The LORD came down to see the city and the tower the people were
building.
- 11:6 The LORD said, "If as one people speaking the same language
they have begun to do this, then nothing they plan to do will be
impossible for them."
The second idea is from René Descartes (1629)
- a universal language,
- with equivalent ideas in different tongues sharing one symbol.
- Philosophical statement: ‘I think, therefore I am’
The third idea is from Warren Weaver: ‘machine translation’
- appeared in his ‘Memorandum on Translation’ (1949)
6. How MT began and developed - developed
MT Models:
Rule-based MT (RBMT)
Statistical MT (SMT)
Example-based MT (EBMT)
Hybrid MT (HMT)
Neural MT (NMT)
7. How MT began and developed - RBMT
RBMT paradigm: used mostly in the creation of dictionaries and
grammar programs.
- transfer-based machine translation
- interlingual machine translation
- dictionary-based machine translation
Approaches:
- linking the structure of the input sentence with the structure of the output
sentence
- using a parser and analyser for the source language, a generator for the
target language, and a transfer lexicon for the actual translation
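The analyse / transfer / generate pipeline above can be sketched in a few lines of Python. This is only a toy illustration, not a real RBMT system: the English-to-French lexicon, the part-of-speech tags and the single reordering rule are all invented for this example.

```python
# Toy transfer-based RBMT: analyse -> transfer -> generate.
# The lexicon and the reordering rule below are hypothetical examples.

# Transfer lexicon: English word -> (French word, part of speech)
LEXICON = {
    "the": ("le", "DET"),
    "black": ("noir", "ADJ"),
    "cat": ("chat", "NOUN"),
    "sleeps": ("dort", "VERB"),
}

def analyse(sentence):
    """Source-language analysis: tokenise and tag each word via the lexicon."""
    tokens = sentence.lower().split()
    return [(tok, LEXICON[tok][1]) for tok in tokens]

def transfer(tagged):
    """Structural transfer: English ADJ NOUN becomes French NOUN ADJ."""
    out = list(tagged)
    i = 0
    while i < len(out) - 1:
        if out[i][1] == "ADJ" and out[i + 1][1] == "NOUN":
            out[i], out[i + 1] = out[i + 1], out[i]
            i += 2  # skip past the pair that was just swapped
        else:
            i += 1
    return out

def generate(tagged):
    """Target-language generation: replace each word with its translation."""
    return " ".join(LEXICON[word][0] for word, _ in tagged)

def translate(sentence):
    return generate(transfer(analyse(sentence)))

print(translate("the black cat sleeps"))  # le chat noir dort
```

A word that is missing from the lexicon raises a KeyError here, which is a small-scale picture of why every case has to be written out explicitly in a rule-based system.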
8. How MT began and developed - RBMT
RBMT is linguistically motivated:
- uses more information about the linguistics of the source and target
languages
- uses the morphological and syntactic rules and semantic analysis of
both languages
Shortcomings:
- everything must be made explicit
- orthographical variation and erroneous input must be made part of
the source language analyser in order to cope with it
- lexical selection rules must be written for all instances of ambiguity
9. How MT began and developed - SMT
The ideas behind SMT were introduced by Warren Weaver in 1949
- including the idea of applying Claude Shannon's information theory.
SMT was re-introduced in the late 1980s and early 1990s
- by researchers at IBM's Thomas J. Watson Research Center
- which contributed to the significant resurgence of interest in MT
10. How MT began and developed - SMT
A document is translated according to the probability distribution p(e|f):
- the probability that a string e in the target language (e.g. English) is the
translation of a string f in the source language (e.g. French). By Bayes' theorem:
p(e|f) ∝ p(f|e) p(e)
- translation model p(f|e): the probability that the source string is the
translation of the target string
- language model p(e): the probability of seeing that target language
string.
This splits the problem into two sub-problems. The best translation ê is the
candidate e with the highest probability: ê = argmax_e p(f|e) p(e)
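As a minimal sketch of this decision rule, the Python snippet below scores a handful of candidate translations with a translation model and a language model and picks the highest-scoring one. The candidate strings and all probability values are invented for illustration; a real SMT system estimates them from parallel and monolingual corpora and searches a far larger candidate space.

```python
import math

# Hypothetical toy models for the French source string f = "maison bleue".
# Translation model p(f|e): how well each English candidate e explains f.
TRANSLATION_MODEL = {
    "blue house": 0.4,
    "house blue": 0.5,
    "blue home": 0.1,
}
# Language model p(e): how plausible each candidate is as English.
LANGUAGE_MODEL = {
    "blue house": 0.05,
    "house blue": 0.001,
    "blue home": 0.01,
}

def score(e):
    """log p(f|e) + log p(e), which ranks candidates the same way as p(e|f)."""
    return math.log(TRANSLATION_MODEL[e]) + math.log(LANGUAGE_MODEL[e])

def best_translation(candidates):
    """Return the candidate e with the highest p(f|e) * p(e)."""
    return max(candidates, key=score)

best = best_translation(TRANSLATION_MODEL)
print(best)  # blue house
```

In this toy setting "house blue" has the highest translation-model score, but the language model penalises its word order, so the product favours "blue house". That is the benefit of splitting the problem into two sub-problems.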
11. How MT began and developed - SMT
SMT variants:
- Word-based
- Phrase-based
- Hierarchical phrase-based
- Syntax-based (constituency structure vs dependency structure)
- Semantic integration
Problems of the syntax-based model:
- long-distance dependencies are still a problem
- no linguistic restrictions are imposed on the variables
- when the translated piece of text is longer than a threshold, models
cannot use syntax-based rules and instead fall back on so-called ‘glue rules’
12. How MT began and developed - NMT
Neural MT:
A deep-learning-based approach to MT
- a radical departure from phrase-based statistical translation
approaches, in which a translation system consists of subcomponents
that are separately engineered
- all parts of the neural translation model are trained jointly (end-to-end)
to maximize the translation performance
Reference: https://en.wikipedia.org/wiki/Neural_machine_translation
13. How MT began and developed - NMT
Began with ‘word-to-vector’ representations learned by neural networks (NN)
Word embedding
Neural language model
Encoder-decoder model
New: attention mechanism, e.g. adding alignment information
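To make the attention idea concrete, here is a small sketch of one decoding step in plain Python. It assumes dot-product scoring, which is only one of several scoring variants, and the two-dimensional encoder and decoder vectors are invented for illustration. The resulting weights act like a soft alignment between the current output position and the source words.

```python
import math

def softmax(scores):
    """Turn raw scores into weights that are positive and sum to 1."""
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [x / total for x in exps]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def attend(decoder_state, encoder_states):
    """One attention step: return (alignment weights, context vector)."""
    scores = [dot(decoder_state, h) for h in encoder_states]
    weights = softmax(scores)
    dim = len(encoder_states[0])
    # The context vector is the weighted average of the encoder states.
    context = [
        sum(w * h[k] for w, h in zip(weights, encoder_states))
        for k in range(dim)
    ]
    return weights, context

# Hypothetical encoder states, one vector per source word.
encoder_states = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
# Hypothetical decoder state at the current output position.
decoder_state = [0.0, 2.0]

weights, context = attend(decoder_state, encoder_states)
print(weights)  # the second source word receives the largest weight
```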
14. How MT began and developed - NMT
Benefits of NMT:
Each output word is predicted from
- an encoding of the full input sentence
- all previously produced output words (theoretically)
Word embeddings allow generalization
- ‘cat’ and ‘cats’ can have similar representations
- the same goes for ‘home’ and ‘house’
Better fluency
Better handling of sentence-level context
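The word-embedding point above can be illustrated with cosine similarity between word vectors. The three-dimensional vectors below are made up for this sketch; real embeddings are learned from data and usually have hundreds of dimensions.

```python
import math

# Hypothetical word embeddings, invented for illustration only.
EMBEDDINGS = {
    "cat": [0.9, 0.1, 0.0],
    "cats": [0.85, 0.15, 0.05],
    "house": [0.0, 0.2, 0.9],
    "home": [0.05, 0.25, 0.85],
}

def cosine(u, v):
    """Cosine similarity: close to 1 for vectors pointing the same way."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def similarity(w1, w2):
    return cosine(EMBEDDINGS[w1], EMBEDDINGS[w2])

# Related words score high, unrelated words score low.
print(similarity("cat", "cats"))
print(similarity("house", "home"))
print(similarity("cat", "house"))
```

Because related words sit close together in the vector space, what the model learns about one of them carries over to the other.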
15. How MT began and developed - NMT
Disadvantages:
- limited vocabulary: the model only allows a limited vocabulary size
- no explicit modeling of coverage / bad with rare words
- development challenges: speed, hardware, and a process that is not transparent
- traditional SMT allows customization (own terminology, customer
domains, rules for dates and units, markup tag handling, etc.),
but NMT does not.
16. How MT works now
For language pairs with large amounts of data available, e.g.
French/German/Spanish/Chinese-English:
- for Chinese-English, the output preserves the meaning in most
cases, but word reordering and grammar are not good enough
For low-resource language pairs:
- both adequacy and fluency still need to be improved considerably
As for cost, MT still needs big machines working behind the scenes
17. How MT serves the society
Scientific communication
- researchers understand each other’s work/papers/theories
Technological communication
- engineers help each other to fix projects
Commercial communication
- buying goods from other countries, patent translations
Cultural communication
- social networks, news, travel, costumes, arts
18. How you are connected with MT daily
The papers you read every day:
- even though you read English articles, the authors probably gained their
ideas from articles in different languages
The food you buy every day:
- produced by international companies that always need translations
The furniture/clothes you buy:
- product introductions translated into multiple languages
The letters you receive monthly: Waternet / trash / etc. in NL (Dutch)
- use multimodal MT: just take a picture and the translation comes
The social networks/news you read online:
- reporters from different countries writing in their own languages
19. References
Qun Liu. Dependence-based SMT talk. ILLC, UvA. 2014.Nov.
Philipp Koehn. Neural MT web seminar. Omniscientech. 2017.Jan.25th.
(Aaron) Lifeng Han. ‘Neural Machine Translation: Are we building 'The
Tower of Babel' again?’ Talk. DCU, Dublin. 2017.01.25th.
https://en.wikipedia.org/wiki/Machine_translation
https://en.wikipedia.org/wiki/Rule-based_machine_translation
https://en.wikipedia.org/wiki/Statistical_machine_translation