The document examines what neural machine translation (NMT) models learn about morphology through experiments that analyze the hidden states of NMT models. It finds that character-based word representations capture morphological information better than word-based representations, and that lower encoder layers learn more about a word's structure while higher layers improve translation. The target language does not significantly affect how much the model learns about source-language morphology, and decoder states do not capture rich morphological information.
[ACL2017 Reading Group] What do Neural Machine Translation Models Learn about Morphology?
1. What do Neural Machine Translation Models Learn about Morphology?
Yonatan Belinkov, Nadir Durrani, Fahim Dalvi, Hassan Sajjad and James Glass
@ 8/11 ACL2017 Reading Group
M1 Hayahide Yamagishi
2. Introduction
● “Little is known about what and how much NMT models learn about each language and its features.”
● They try to answer the following questions:
1. Which parts of the NMT architecture capture word structure?
2. What is the division of labor between different components?
3. How do different word representations help learn better morphology and modeling of infrequent words?
4. How does the target language affect the learning of word structure?
● Task: Part-of-Speech tagging and morphological tagging
3. Task
● Part-of-Speech (POS) tagging
○ computer → NN
○ computers → NNS
● Morphological tagging
○ he → 3rd person, singular, masculine, subject
○ him → 3rd person, singular, masculine, object
● Task: hidden states → tag
○ They would like to test each hidden state.
○ If the classification accuracy is high, the hidden states encode information about word structure.
4. Methodology
1. Training the NMT models (Bahdanau attention, LSTM)
2. Using the trained models as feature extractors
3. Training a feedforward NN on the state-tag pairs
○ One hidden layer: input layer, hidden layer, output layer
4. Testing
● “Our goal is not to beat the state-of-the-art on a given task.”
● “We also experimented with a linear classifier and observed similar trends to the non-linear case.”
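To make the probing pipeline concrete, here is a minimal sketch of step 3 in PyTorch. It is not the authors' code: the state dimension, tag-set size, and the toy data are placeholder assumptions, and it presumes the hidden states of a trained, frozen NMT encoder have already been extracted for each annotated word.

```python
import torch
import torch.nn as nn

STATE_DIM = 500   # assumed size of an encoder hidden state
NUM_TAGS = 42     # assumed size of the POS / morphological tag set
HIDDEN = 500      # a single hidden layer, as on the slide

# Placeholder data standing in for (hidden state, tag) pairs collected by
# running the frozen NMT encoder over the annotated corpus.
states = torch.randn(10_000, STATE_DIM)
tags = torch.randint(0, NUM_TAGS, (10_000,))

# Input layer -> one hidden layer -> output layer.
classifier = nn.Sequential(
    nn.Linear(STATE_DIM, HIDDEN),
    nn.ReLU(),
    nn.Linear(HIDDEN, NUM_TAGS),
)
optimizer = torch.optim.Adam(classifier.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()

for epoch in range(5):
    optimizer.zero_grad()
    loss = loss_fn(classifier(states), tags)
    loss.backward()
    optimizer.step()

# Held-out tagging accuracy is then read as "how much morphology the frozen
# NMT representations encode" (measured on the training data here, for brevity).
accuracy = (classifier(states).argmax(dim=-1) == tags).float().mean()
print(float(accuracy))
```

The NMT model itself is never updated in this setup; only the small classifier is trained, so its accuracy reflects what the frozen states already contain.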
5. Data
● Language pairs:
○ {Arabic, German, French, Czech} - English
○ Arabic - Hebrew (both languages are morphologically rich and similar)
○ Arabic - German (both languages are morphologically rich but different)
● Parallel corpus: TED
● POS-annotated data
○ Gold: included in some datasets
○ Predicted: produced by freely available taggers
6. Char-based Encoder
● Character-aware Neural Language Model [Kim+, AAAI2016]
● Character-based Neural Machine Translation [Costa-jussa and Fonollosa, ACL2016]
● Character embeddings → word embedding
● The obtained word embeddings are fed into the word-based RNN-LM.
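As a rough illustration of how character embeddings are composed into a single word embedding, here is a small char-CNN sketch in the spirit of [Kim+, AAAI2016]; all sizes, the single filter width, and the character vocabulary are assumptions made for the example.

```python
import torch
import torch.nn as nn

class CharWordEncoder(nn.Module):
    """Builds one word vector from the word's character sequence."""

    def __init__(self, num_chars=100, char_dim=25, num_filters=100, width=3):
        super().__init__()
        self.char_emb = nn.Embedding(num_chars, char_dim)
        # Convolution over the character sequence; max-pooling over time
        # then yields one fixed-size vector per word.
        self.conv = nn.Conv1d(char_dim, num_filters, kernel_size=width, padding=1)

    def forward(self, char_ids):              # char_ids: (batch, word_length)
        x = self.char_emb(char_ids)           # (batch, word_length, char_dim)
        x = x.transpose(1, 2)                 # (batch, char_dim, word_length)
        x = torch.relu(self.conv(x))          # (batch, num_filters, word_length)
        word_vec, _ = x.max(dim=2)            # (batch, num_filters)
        return word_vec                       # fed into the word-level RNN

# A single 9-character word with dummy character ids:
encoder = CharWordEncoder()
print(encoder(torch.randint(0, 100, (1, 9))).shape)   # torch.Size([1, 100])
```

Because the filters see character n-grams, affixes such as the Arabic "Al-" discussed later can be picked up even for rare or unseen word forms.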
7. Effect of word representation (Encoder)
● Word-based vs. Char-based model
● Char-based models are stronger.
8. Impact of word frequency
● Frequent words don’t need the character information.
● “The char-based model is able to learn character n-gram patterns that are important for identifying word structure.”
10. Analyzing specific tags
● Arabic → Determiner “Al-” becomes a prefix.
● Char-based model can distinguish “DT+NNS” from “NNS”.
11. Effect of encoder depth
● The LSTM carries context information → layer 0 (the word embeddings) is worse.
● States from layer 1 are more effective than states from layer 2.
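One simple way to expose per-layer states for the probing classifier is to stack single-layer LSTMs, so that layer 0, layer 1, and layer 2 outputs are all available per source word. The sketch below is an assumption-laden illustration (the actual encoder is also used with attention during translation, which is not shown):

```python
import torch
import torch.nn as nn

VOCAB, EMB_DIM, HID = 30_000, 500, 500      # illustrative sizes
embed = nn.Embedding(VOCAB, EMB_DIM)        # "layer 0": word embeddings, no context
lstm1 = nn.LSTM(EMB_DIM, HID, batch_first=True)
lstm2 = nn.LSTM(HID, HID, batch_first=True)

tokens = torch.randint(0, VOCAB, (1, 7))    # a 7-word source sentence (dummy ids)
layer0 = embed(tokens)                       # worst for tagging: no contextual information
layer1, _ = lstm1(layer0)                    # most effective for morphology in these probes
layer2, _ = lstm2(layer1)                    # deeper states, more useful for translation

# layer0 is (1, 7, EMB_DIM) and layer1/layer2 are (1, 7, HID), giving one
# candidate feature vector per source word for the tagging classifier.
```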
12. Effect of encoder depth
● Char-based models show similar tendencies.
13. Effect of encoder depth
● BLEU: 2-layer NMT > 1-layer NMT
○ word / char : +1.11 / +0.56
● Layer 1 learns the word representation (structure).
● Layer 2 learns the word meaning.
● Word representation alone < word representation + word meaning, hence the 2-layer model translates better.
14. Effect of target language
● Translating into a morphologically-rich language is harder.
○ Arabic-English: 24.69
○ English-Arabic: 13.37
● “How does the target language affect the learned source language representations?”
○ “Does translating into a morphologically-rich language require more knowledge about source language morphology?”
● Experiment: Arabic - {Arabic, Hebrew, German, English}
○ Arabic-Arabic: Autoencoder
16. Effect of target languages
● They expected that translating into morphologically-rich languages would make the model learn more about morphology. → No
● The accuracy doesn't correlate with the BLEU score.
○ The autoencoder couldn't learn the morphological representation.
○ If the model only works as a reconstructor of its input, it doesn't have to learn morphology.
○ “A better translation model learns more informative representation.”
● Possible explanation
○ The Arabic-English model is simply better than the Arabic-Hebrew and Arabic-German ones.
○ These models may not have capacity to spare for learning representations of word structure.
17. Decoder Analysis
● Similar experiments
○ The decoder's input is the correct previous word (teacher forcing).
○ The char-based decoder's input is the char-based representation.
○ The char-based decoder's output is still at the word level.
● Arabic-English or English-Arabic
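A sketch of how the decoder states could be collected under this setup: the gold previous target word is fed at each step, and the resulting state is paired with the current word's tag. Names, sizes, and the `<s>` index are assumptions, and the context vector an attentional decoder would also receive is omitted.

```python
import torch
import torch.nn as nn

VOCAB, EMB_DIM, HID = 30_000, 500, 500
tgt_embed = nn.Embedding(VOCAB, EMB_DIM)
decoder_rnn = nn.LSTM(EMB_DIM, HID, batch_first=True)

gold_target = torch.randint(1, VOCAB, (1, 8))           # reference translation (dummy ids)
bos = torch.zeros(1, 1, dtype=torch.long)                # assumed <s> index 0
prev_words = torch.cat([bos, gold_target[:, :-1]], 1)    # correct previous word at each step

# One decoder state per target word; the real attentional decoder would also
# consume a context vector from the encoder at every step (not shown here).
decoder_states, _ = decoder_rnn(tgt_embed(prev_words))   # (1, 8, HID)
# decoder_states[0, t] is then classified into the tag of gold_target[0, t].
```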
18. Effect of decoder states
● Decoder states don't carry much morphological information.
● BLEU doesn't correlate with the tagging accuracy.
○ French-English: 37.8 BLEU / 54.26% accuracy
19. Effect of attention
● Encoder states
○ Task: creating a generic, close to language-independent representation of the source sentence.
○ With attention attached, the encoder states act as a memory that the attention consults.
○ When the model translates a noun, the attention looks at the source noun words.
● Decoder states
○ Task: using the encoder's representation to generate the target sentence in a specific language.
○ “Without the attention mechanism, the decoder is forced to learn more informative representations of the target language.”
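To make "the encoder states act as a memory" concrete, here is a minimal additive (Bahdanau-style) attention sketch for a single decoder step; the dimensions and the random states are illustrative assumptions.

```python
import torch
import torch.nn as nn

HID = 500
W_enc = nn.Linear(HID, HID)
W_dec = nn.Linear(HID, HID)
v = nn.Linear(HID, 1)

enc_states = torch.randn(1, 7, HID)   # one state per source word: the "memory"
dec_state = torch.randn(1, HID)       # current decoder state

# Additive attention: score every source position against the decoder state.
scores = v(torch.tanh(W_enc(enc_states) + W_dec(dec_state).unsqueeze(1)))   # (1, 7, 1)
weights = torch.softmax(scores, dim=1)                                       # over source words
context = (weights * enc_states).sum(dim=1)                                  # (1, HID)

# The decoder conditions on this context vector, so much of the source-side
# information can remain in the encoder states rather than in the decoder state.
```

This is consistent with the quoted claim: without attention, the decoder itself would have to hold that information and would be pushed to learn richer representations.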
20. Effect of word representation (Decoder)
● Char-based representations don't help the decoder.
○ The decoder's predictions are still made at the word level.
○ “In Arabic-English the char-based model reduces the number of generated unknown words in the MT test set by 25%.”
○ “In English-Arabic the number of unknown words remains roughly the same between word-based and char-based models.”
21. Conclusion
● Their results lead to the following conclusions:
○ Char-based representations are better than word-based ones.
○ Lower layers capture morphology, while deeper layers improve translation performance.
○ Translating into morphologically-poorer languages leads to better source representations.
○ The attentional decoder learns impoverished representations that do not carry much information about morphology.
● “Jointly learning translation and morphology can possibly lead to better representations and improved translation.”