8. Qun Liu (DCU) Hybrid Solutions for Translation - RIILP
The document provides an overview of hybrid machine translation approaches. It discusses selective machine translation which selects the best translation from multiple systems. Pipelined machine translation uses one system for pre-processing or post-processing of another system. Statistical post-editing uses statistical machine translation as a post-editor for rule-based machine translation outputs to improve the translation quality.
These slides cover an introduction to machine translation, some techniques used in MT such as example-based MT and statistical MT, the main challenges in machine translation, and some examples of MT applications.
PR-315: Taming Transformers for High-Resolution Image Synthesis - Hyeongmin Lee
These days there are many attempts to apply the Transformer architecture to all sorts of tasks, regardless of whether they are language or vision, so in this week's presentation I introduce a paper that uses it for high-resolution image synthesis and will be presented in a CVPR 2021 Oral Session!
** Due to a problem with the recording equipment, this video is presented without iPad handwriting!! **
Paper link: https://arxiv.org/abs/2012.09841
Video link: https://youtu.be/GcbT0IGt0xE
Transition-based dependency parsing uses a stack and buffer to incrementally build dependency trees for sentences. It handles both projective and non-projective structures by allowing swap transitions to reorder words. Integrating graph-based and transition-based parsing leads to improved accuracy by combining their strengths. While transition-based parsing runs in linear time on projective structures, non-projective parsing has a worst case quadratic time complexity but in practice remains linear time.
The document discusses natural language and natural language processing (NLP). It defines natural language as languages used for everyday communication like English, Japanese, and Swahili. NLP is concerned with enabling computers to understand and interpret natural languages. The summary explains that NLP involves morphological, syntactic, semantic, and pragmatic analysis of text to extract meaning and understand context. The goal of NLP is to allow humans to communicate with computers using their own language.
DELAB - sequence generation seminar
Title
Open vocabulary problem
Table of contents
1. Open vocabulary problem
1-1. Open vocabulary problem
1-2. Ignore rare words
1-3. Approximative Softmax
1-4. Back-off Models
1-5. Character-level model
2. Solution1: Byte Pair Encoding(BPE)
3. Solution2: WordPieceModel(WPM)
Intro to Auto Speech Recognition -- How ML Learns Speech-to-Text - Yoshiyuki Igarashi
Automatic speech recognition (ASR) is the technology that converts speech to written text. There are two main approaches: static systems that use acoustic, pronunciation, and language models sequentially; and end-to-end neural networks that use deep neural networks for feature extraction, acoustic modeling, and language modeling. Challenges for ASR systems include noise, variations in accents and ages, transferring learning across dialects, and operating locally on devices without internet.
Natural Language Processing (NLP) - Introduction - Aritra Mukherjee
This presentation provides a beginner-friendly introduction towards Natural Language Processing in a way that arouses interest in the field. I have made the effort to include as many easy to understand examples as possible.
The document describes algorithms for solving the union-find problem, which involves maintaining disjoint sets under union and find operations. It introduces the quick-find, quick-union, and weighted quick-union algorithms. Quick-find is too slow for union operations, which can require quadratic time. Quick-union improves on this but find operations can be slow. Weighted quick-union balances trees during union to keep depths logarithmic, improving performance of both operations.
This report shows what a dependency structure is, why a dependency structure is useful, and how to parse natural sentences into dependency structures. The report describes two state-of-the-art dependency parsers, MaltParser and MSTParser, and shows comparisons between the parsers and ways to integrate them. Finally, it suggests a new parsing algorithm and possible applications using dependency structures.
The document summarizes the BLEU method for automatically evaluating machine translation systems. BLEU calculates n-gram precision between a candidate translation and multiple reference translations, with modifications to address weaknesses. It combines the average logarithm of modified n-gram precisions with a brevity penalty for translations longer than references. Evaluation tests on multiple translation systems found BLEU scores reliably distinguished system quality and correlated well with human judgements.
This document provides an overview of the OpenNLP natural language processing tool. It discusses the various NLP tasks that OpenNLP can perform, including tokenization, POS tagging, named entity recognition, chunking, parsing, and co-reference resolution. It also describes how models for these tasks are trained in OpenNLP using annotated training data. The document concludes by listing some advantages and limitations of OpenNLP.
The document provides an overview of question answering systems, including their evolution from information retrieval, common evaluation benchmarks like TREC and CLEF, and examples of major QA projects like Watson. It also discusses the movement towards leveraging semantic technologies and linked open data to power next generation QA systems, as seen in projects like SINA which transform natural language queries into formal queries over structured knowledge bases.
The document discusses dependency parsing in natural language processing. It begins by defining dependency as a syntactic or semantic relation between tokens. It then contrasts constituent structure, which groups tokens into phrases bottom-up, with dependency structure, which builds a graph connecting tokens with edges. The document goes on to describe the components of a dependency graph, including vertices, arcs, and relations. It also discusses projectivity, head rules to convert constituent trees to dependency trees, and different approaches to dependency parsing like transition-based and graph-based parsing.
This lecture provides students with an introduction to natural language processing, with a specific focus on the basics of two applications: vector semantics and text classification.
(Lecture at the QUARTZ PhD Winter School, http://www.quartz-itn.eu/training/winter-school/, in Padua, Italy, on February 12, 2018)
Natural language processing and transformer models - Ding Li
The document discusses several approaches for text classification using machine learning algorithms:
1. Count the frequency of individual words in tweets and sum for each tweet to create feature vectors for classification models like regression. However, this loses some word context information.
2. Use Bayes' rule and calculate word probabilities conditioned on class to perform naive Bayes classification. Laplacian smoothing is used to handle zero probabilities.
3. Incorporate word n-grams and context by calculating word probabilities within n-gram contexts rather than independently. This captures more linguistic information than the first two approaches.
Introduction to natural language processing (NLP) - Alia Hamwi
The document provides an introduction to natural language processing (NLP). It defines NLP as a field of artificial intelligence devoted to creating computers that can use natural language as input and output. Some key NLP applications mentioned include data analysis of user-generated content, conversational agents, translation, classification, information retrieval, and summarization. The document also discusses various linguistic levels of analysis like phonology, morphology, syntax, and semantics that involve ambiguity challenges. Common NLP tasks like part-of-speech tagging, named entity recognition, parsing, and information extraction are described. Finally, the document outlines the typical steps in an NLP pipeline including data collection, text cleaning, preprocessing, feature engineering, modeling and evaluation.
This is material I created for a lab seminar about the "Transformer", which has become the basis of recent NLP x Deep Learning research. I have tried to be accurate in citing reference material, but please point out any errors.
https://telecombcn-dl.github.io/2017-dlai/
Deep learning technologies are at the core of the current revolution in artificial intelligence for multimedia data analysis. The convergence of large-scale annotated datasets and affordable GPU hardware has allowed the training of neural networks for data analysis tasks which were previously addressed with hand-crafted features. Architectures such as convolutional neural networks, recurrent neural networks or Q-nets for reinforcement learning have shaped a brand new scenario in signal processing. This course will cover the basic principles of deep learning from both algorithmic and computational perspectives.
Following are the questions I tried to answer in this presentation:
What is text summarization?
What is automatic text summarization?
How has it evolved over time?
What are the different methods?
How is deep learning used for text summarization?
What are the business applications?
In the first few slides extractive summarization is explained, with its pros and cons; in the next section abstractive summarization is explained.
In the last section the business applications of each are highlighted.
Big Data and Natural Language Processing - Michel Bruley
Natural Language Processing (NLP) is the branch of computer science focused on developing systems that allow computers to communicate with people using everyday language.
Yurii Pashchenko: Zero-shot learning capabilities of CLIP model from OpenAI - Lviv Startup Club
Yurii Pashchenko: Zero-shot learning capabilities of CLIP model from OpenAI
AI & BigData Online Day 2021
Website - https://aiconf.com.ua/
Youtube - https://www.youtube.com/startuplviv
FB - https://www.facebook.com/aiconf
This document provides an outline on natural language processing and machine vision. It begins with an introduction to different levels of natural language analysis, including phonetic, syntactic, semantic, and pragmatic analysis. Phonetic analysis constructs words from phonemes using frequency spectrograms. Syntactic analysis builds a structural description of sentences through parsing. Semantic analysis generates a partial meaning representation from syntax, while pragmatic analysis uses context. The document also introduces machine vision as a technology using optical sensors and cameras for industrial quality control through detection of faults. It operates through sensing images, processing/analyzing images, and various applications.
Conversational AI with Rasa - PyData Workshop - Tom Bocklisch
Workshop building a simple chatbot with Rasa NLU and Core. Additional resources can be found in the repository https://github.com/tmbo/rasa-demo-pydata18/edit/master/README.md
Introduction to Transformers for NLP - Olga Petrova - Alexey Grigorev
Olga Petrova gives an introduction to transformers for natural language processing (NLP). She begins with an overview of representing words using tokenization, word embeddings, and one-hot encodings. Recurrent neural networks (RNNs) are discussed as they are important for modeling sequential data like text, but they struggle with long-term dependencies. Attention mechanisms were developed to address this by allowing the model to focus on relevant parts of the input. Transformers use self-attention and have achieved state-of-the-art results in many NLP tasks. Bidirectional Encoder Representations from Transformers (BERT) provides contextualized word embeddings trained on large corpora.
Natural language processing (NLP) involves developing systems that allow computers to understand and communicate using human language. NLP aims to understand syntax, semantics, and pragmatics. It addresses challenges like ambiguity, where a sentence can have multiple possible meanings. Syntactic parsing is the process of analyzing a sentence's structure using a context-free grammar to produce a parse tree. Top-down and bottom-up parsing are two approaches to syntactic parsing where top-down starts with the start symbol and bottom-up starts with the sentence's terminal symbols.
The document discusses various challenges in natural language processing (NLP) including ambiguity in speech, text, and word senses. It provides examples of how ambiguity can occur with homophones, attachment of prepositions, and multiple meanings of words. Resolving these ambiguities is important for tasks in NLP like question answering, machine translation, and information extraction.
Searching for the best translation combination - Matīss
Matīss Rikters presents research on combining translations from multiple machine translation systems. The document discusses several approaches to multi-system machine translation including combining whole translations, combining translations of sentence chunks, and combining translations of linguistically motivated chunks. It evaluates these approaches on Latvian-English translation tasks and explores using neural network language models and other enhancements to select the best translation combinations.
The document summarizes Matīss Rikters' research into hybrid machine translation systems that combine translations from multiple machine translation systems. It discusses different approaches to combining translations, including combining full sentence translations, combining translations of linguistically motivated sentence chunks, and selecting the best translation. It provides examples of results comparing different combination methods and systems. It also outlines future work, including adding handling of multi-word expressions and using neural network language models to select the best translation.
The document describes an interactive multi-system machine translation tool called K-Translate that combines translations from multiple machine translation systems. K-Translate first splits input sentences into syntactic chunks, translates each chunk using different online MT APIs, and then selects the best translated chunks to recombine into the final translation. It provides a user-friendly interface that visualizes the chunking and system selection. The document outlines previous work on multi-system MT and discusses future plans to improve K-Translate through additional parsing methods, language models, and quality estimation techniques.
Hybrid machine translation by combining multiple machine translation systems - Matīss
This document discusses methods for combining multiple machine translation systems to produce a superior final translation. It begins with introducing rule-based machine translation, statistical machine translation, and neural machine translation. Then it discusses various methods for combining outputs from different MT systems, including combining translations at the full sentence level, sentence fragment level using simple or advanced chunking, and exhaustive search. It presents experiments comparing the BLEU scores of single systems, dual/triple hybrid systems, and the full exhaustive search on various language corpora. The document aims to develop methods for combining Latvian and other morphologically rich and less-resourced language MT systems.
Searching for the Best Machine Translation Combination - Matīss
Matīss Rikters is researching hybrid machine translation methods. He used a count-based language model for candidate selection from full translations, combining translations of sentence chunks, and combining translations of linguistically motivated chunks. He also used a character-level neural language model for candidate selection. His methods achieved BLEU scores up to 19.51. Future work includes completing experiments on English-Estonian, winning the WMT17 news translation task for English-Latvian, performing chunking on the target side, and experimenting with other language models for candidate selection.
The document describes a voice activated text editing software project that uses speech recognition, speech synthesis, and machine learning based text summarization. The software allows users to create notes, import documents, and perform text functions using voice commands. It was created to reduce the time spent manually typing documents and to provide independent digital note-taking for visually impaired individuals. The system design and algorithms for extractive and abstractive text summarization are presented along with experimental results and comparisons to other systems.
An overview of recent developments in text-to-speech.
+ Audio files with synthesized speech
https://soundcloud.com/user-872135531/sets/samples-of-synthesized-speech-for-modern-text-to-speech-systems-review
Integration of speech recognition with computer assisted translation - Chamani Shiranthika
This document discusses the integration of speech recognition with computer-assisted translation. It begins by introducing machine translation and computer-assisted translation, then describes how automatic speech recognition works and how it can be integrated with translation. Key approaches to integration include using word graphs from ASR and MT systems or rescoring ASR hypotheses with translation models. Neural machine translation models that use attention mechanisms are also discussed. The document concludes by noting areas for further development in reducing human effort in translation and increasing quality and effectiveness of speech-to-text and translation tools.
Slides for the following paper: NLP Data Cleansing Based on Linguistic Ontology Constraints
Abstract: Linked Data comprises an unprecedented volume of structured data on the Web and is adopted by an increasing number of domains. However, the varying quality of published data forms a barrier to further adoption, especially for Linked Data consumers. In this paper, we extend a previously developed methodology of Linked Data quality assessment, which is inspired by test-driven software development. Specifically, we enrich it with ontological support and different levels of result reporting and describe how the method is applied in the Natural Language Processing (NLP) area. NLP is – compared to other domains, such as biology – a late Linked Data adopter. However, it has seen a steep rise of activity in the creation of data and ontologies. NLP data quality assessment has become an important need for NLP datasets. In our study, we analysed 11 datasets using the lemon and NIF vocabularies in 277 test cases and point out common quality issues.
This document provides a tutorial on recurrent neural networks (RNNs) and long short-term memory (LSTM) networks. It begins with introductions to the speaker and an overview of the content. It then explains RNNs and how they work sequentially through hidden layers. Issues like vanishing gradients are discussed. LSTMs are introduced as an advanced RNN that can retain information over longer periods of time using gates. Pre-trained word embeddings like Word2Vec, GloVe, and FastText are briefly explained. Finally, homework is assigned to build a sentiment analysis model using an LSTM and pre-trained word embeddings on a Chinese text dataset.
Training at AI Frontiers 2018 - Lukasz Kaiser: Sequence to Sequence Learning ... - AI Frontiers
This document discusses sequence to sequence learning with Tensor2Tensor (T2T) and sequence models. It provides an overview of T2T, which is a library for deep learning models and datasets. It discusses basics of sequence models including recurrent neural networks (RNNs), convolutional models, and the Transformer model based on attention. It encourages experimenting with different sequence models and datasets in T2T.
https://telecombcn-dl.github.io/2017-dlsl/
Winter School on Deep Learning for Speech and Language. UPC BarcelonaTech ETSETB TelecomBCN.
The aim of this course is to train students in methods of deep learning for speech and language. Recurrent Neural Networks (RNN) will be presented and analyzed in detail to understand the potential of these state of the art tools for time series processing. Engineering tips and scalability issues will be addressed to solve tasks such as machine translation, speech recognition, speech synthesis or question answering. Hands-on sessions will provide development skills so that attendees can become competent in contemporary data analytics tools.
The document discusses formal language theory and its applications in natural language processing (NLP). It covers two main goals in computational linguistics - theoretical interest in formally characterizing natural language and practical interest in using well-understood frameworks like finite state models to solve NLP problems. Finite state devices are widely used in NLP tasks due to their efficiency and ability to model linguistic phenomena like words through dictionaries and rules. While finite state models provide a useful approximation of language, natural languages pose challenges like ambiguity, long distance dependencies and non-regular features that require extensions to basic finite state models.
The document summarizes an academic thesis defense presentation on evaluating machine translation. It introduces the background of machine translation evaluation (MTE), existing MTE methods like BLEU, METEOR, WER, and their weaknesses. It then outlines the designed model for a new MTE metric called LEPOR, including designed factors like an enhanced length penalty and n-gram position difference penalty. The document concludes by discussing experiments, enhanced models, and applications in shared tasks to evaluate LEPOR's performance.
LEPOR: an augmented machine translation evaluation metric - Thesis PPT - Lifeng (Aaron) Han
The document provides an overview of machine translation evaluation (MTE). It discusses existing MTE methods like BLEU, METEOR, WER, and their weaknesses. The author's thesis proposes a new metric called LEPOR that incorporates additional factors to address weaknesses. The additional factors include an enhanced length penalty, n-gram position difference penalty, and tunable parameters to handle cross-language performance differences. The thesis will experiment with LEPOR on various language pairs and shared tasks to evaluate its performance.
Multi-system machine translation using online APIs for English-Latvian - Matīss
This paper describes a hybrid machine translation (HMT) system that employs several online MT system application program interfaces (APIs), forming a Multi-System Machine Translation (MSMT) approach. The goal is to improve the automated translation of English – Latvian texts over each of the individual MT APIs. The selection of the best hypothesis translation is done by calculating the perplexity for each hypothesis. Experiment results show a slight improvement in BLEU score and WER (word error rate).
Michigan Information Retrieval Enthusiasts Group Meetup - August 19, 2010 - ivan provalov
Two presentations from the Michigan Information Retrieval Enthusiasts Group Meetup on August 19 by the Cengage Learning search platform development team.
Scaling Performance Tuning With Lucene by John Nader discusses primary performance hot spots related to scaling to a multi-million document collection. This includes the team's experiences with memory consumption, GC tuning, query expansion, and filter performance. Discusses both the tools used to identify issues and the techniques used to address them.
Relevance Tuning Using TREC Dataset by Rohit Laungani and Ivan Provalov describes the TREC dataset used by the team to improve the relevance of the Lucene-based search platform. Goes over IBM paper and describe the approaches tried: Lexical Affinities, Stemming, Pivot Length Normalization, Sweet Spot Similarity, Term Frequency Average Normalization. Talks about Pseudo Relevance Feedback.
240311_JW_labseminar[Sequence to Sequence Learning with Neural Networks].pptx - thanhdowork
This document summarizes a research paper on using sequence to sequence learning with LSTM for machine translation. It introduces the motivation for using LSTMs for problems with variable length sequences. The model uses an encoder LSTM to generate a fixed-length context vector from the input sequence, and a decoder LSTM to generate the output sequence from the context vector. The experiments apply this model to the WMT'14 English-to-French translation task and achieve better results than statistical machine translation baselines. The researchers conclude LSTMs should perform well on other sequence learning problems with sufficient training data.
Similar to Hybrid Machine Translation by Combining Multiple Machine Translation Systems (20)
This document summarizes a research project that analyzed language related to food on Twitter. The project collected over 2.5 million Latvian tweets mentioning food over several years. Researchers performed sentiment analysis, named entity recognition, and question answering on annotated subsets of the data. Analysis found relationships between food mentions and weather, time of day, and price increases. The researchers published several papers examining how sentiment about different foods changes over time and in various weather conditions. All data and code for the project are available on GitHub.
How Masterly Are People at Playing with Their Vocabulary? - Matīss
In this paper, we describe adaptation of a simple word guessing game that occupied the hearts and minds of people around the world. There are versions for all three Baltic countries and even several versions of each. We specifically pay attention to the Latvian version and look into how people form their guesses given any already uncovered hints. The paper analyses guess patterns, easy and difficult word characteristics, and player behaviour and response.
Tracing multisensory food experience on Twitter - Matīss
- This document summarizes a project that analyzes over 2.4 million Latvian language tweets related to food and eating collected since 2011.
- The tweets have been annotated for sentiment, named entities, and question-answer pairs and analyzed to identify trends, seasonal patterns for foods, and relationships between foods and senses of smell, taste, and temperature.
- Key findings include identifying popular foods and drinks mentioned over time, seasonal patterns for foods like chocolate and berries, and changing sentiment toward foods like meat detected on Twitter. Publications and code related to the project are available on GitHub.
This document provides information about tips and tools for training neural machine translation systems. It discusses trends seen at the WMT 2019 conference, including techniques like back-translation being used more than once, careful corpus filtering, using extra languages during training, and target-to-source reranking. The document also describes the author's experiences improving their English-Estonian machine translation system by applying techniques like multi-pass training, filtering corpora, and back-translation. Visualization tools for analyzing neural machine translation outputs and attention are also introduced.
The Impact of Corpora Quality on Neural Machine Translation - Matīss
The document discusses how filtering and cleaning a training corpus improved the author's neural machine translation results. Specifically, applying filters to remove duplicate, unaligned, and noisy sentence pairs and using language identification tools produced a cleaner parallel corpus. Models trained on this filtered corpus performed significantly better, moving the author's system from weak to top results on the WMT 2018 evaluation. The document emphasizes that preprocessing and cleaning data is important for high quality neural machine translation models.
This document discusses advancing Estonian machine translation by collecting more data, improving neural machine translation systems, and creating a mobile translation application. It summarizes efforts to gather additional parallel text data from public sources and Estonian institutions. It also describes experiments with multilingual neural machine translation models and creating scripts to facilitate multilingual training. The document concludes by noting new state-of-the-art machine translation systems for Estonian-Russian and Estonian-English are now available online and through a new Android translation app.
This document summarizes Matīss Rikters' presentation on debugging neural machine translations. It discusses different types of machine translation including statistical MT and neural MT. It describes attention mechanisms and how they allow neural MT to consider context. The document demonstrates examples of MT outputs and compares them to references. It introduces a visualization tool that can analyze soft alignments. It provides characteristics to look for when debugging such as short sentences with low confidence. The conclusion is that the tool allows direct comparison of outputs without references and provides additional analysis when references are available.
Effective online learning implementation for statistical machine translationMatīss
This document discusses effective online learning for statistical machine translation. It presents research on using online learning to improve machine translation quality during runtime by learning from corrected translations sent back from computer-assisted translation tools. The researchers implemented three translation scenarios: a baseline without online learning, online learning without feedback, and online learning with feedback. Their two-step tuning method improved over the baseline, increasing quality by up to 12 BLEU points depending on the language domain and size of the training data used for tuning.
Processing of multi-word expressions (MWEs) is a known problem for any natural language processing task. Even neural machine translation (NMT) struggles to overcome it. This paper presents results of experiments on investigating NMT attention allocation to MWEs and on improving automated translation of sentences that contain MWEs in English -> Latvian and English -> Czech NMT systems. Two improvement strategies were explored: (1) bilingual pairs of automatically extracted MWE candidates were added to the parallel corpus used to train the NMT system, and (2) full sentences containing the automatically extracted MWE candidates were added to the parallel corpus. Both approaches increased automated evaluation results. The best result - a 0.99 BLEU point increase - was reached with the first approach, while the second approach achieved only minimal improvements. We also provide open-source software and tools used for MWE extraction and alignment inspection.
The document summarizes the 26th International Conference on Computational Linguistics (COLING) held in Osaka, Japan in December 2016. Over 1100 presenters attended, with 1039 papers submitted and a 32% acceptance rate. Key areas included neural networks, machine translation, dialog systems, and natural language processing applications. Plenary speakers addressed topics such as universal dependencies in parsing and grounded semantics for hybrid machine translation. The conference featured presentations and posters on recent research advances, including character-level named entity recognition, interactive attention for neural machine translation, and improving attention modeling for machine translation.
Neural Network Language Models for Candidate Scoring in Multi-System Machine... - Matīss
This document summarizes Matīss Rikters' presentation on using neural network language models for candidate scoring in multi-system machine translation. It discusses using character-level recurrent and memory neural networks to score translations from multiple online machine translation systems. The best-performing models were a character-level RNN and a memory network, with the RNN achieving the highest BLEU score of 19.53 on a Latvian-English task. Future work discussed expanding the approach to other languages and tasks like quality estimation.
Hybrid Machine Translation by Combining Multiple Machine Translation Systems
1. Matīss Rikters
Supervisor: Dr. sc. comp., prof. Inguna Skadiņa
May 10, 2019
Hybrid Machine Translation by Combining Multiple Machine Translation Systems
3. Automatic Evaluation of MT
BLEU - one of the first metrics to report high correlation with human judgments
• One of the most popular in the field
• The closer MT is to a professional human translation, the better it is
• Scores a translation on a scale of 0 to 100
Introduction
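As an illustration (not part of the original slides), a BLEU score on the 0 to 100 scale mentioned above can be computed with the sacrebleu Python package; the sentences below are made-up placeholders.

```python
# Minimal sketch: scoring system output against reference translations with sacrebleu.
import sacrebleu

hypotheses = ["the cat sat on the mat"]            # system translations
references = [["the cat is sitting on the mat"]]   # one list per reference stream

bleu = sacrebleu.corpus_bleu(hypotheses, references)
print(f"BLEU: {bleu.score:.2f}")                   # reported on a 0-100 scale
```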
4. The aim is to research and develop methods and tools able to improve the quality of MT output for the Baltic languages, which are small, morphologically rich, and have few resources available.
For this research, the author has suggested the following hypothesis:
Combining output from multiple different MT systems makes it possible to produce higher quality translations for the Baltic languages than the output produced by each component system individually.
Aim and objectives
Objectives
• Analyse RBMT, SMT and NMT methods as well as existing HMT methods
• Experiment with different methods of combining translations
• Evaluate quality of the resulting translations
• Investigate applicability of methods for Latvian and other morphologically rich languages
• Provide practical applications of MT combining
6. Full sentence translations
Sentence tokenization
Translation with APIs: Google Translate, Bing Translator, LetsMT
Selection of the best translation
Output
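A minimal sketch of this workflow (not from the slides): translators stands in for wrappers around the Google Translate, Bing Translator and LetsMT APIs, and select_best is the perplexity-based selection sketched after the next slide; all names are hypothetical.

```python
# Minimal pipeline sketch, assuming hypothetical API wrappers and a select_best()
# helper (see the perplexity sketch below). Requires the NLTK punkt tokenizer data.
from nltk import sent_tokenize

def hybrid_translate(text, translators, select_best):
    """Tokenize into sentences, translate each with every system, keep the best hypothesis."""
    output = []
    for sentence in sent_tokenize(text):
        hypotheses = [translate(sentence) for translate in translators]
        output.append(select_best(hypotheses))
    return " ".join(output)

# Usage with placeholder callables:
# result = hybrid_translate(text, [translate_google, translate_bing, translate_letsmt], select_best)
```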
7. Perplexity on a test set is calculated using the language model as the inverse probability (P) of that test set, normalized by the number of words (N) (Jurafsky and Martin, 2014). For a test set W = w_1, w_2, ..., w_N:

perplexity(W) = P(w_1, w_2, \ldots, w_N)^{-\frac{1}{N}} = \sqrt[N]{\frac{1}{P(w_1, w_2, \ldots, w_N)}}

Perplexity can also be defined as the exponential function of the cross-entropy:

H(W) = -\frac{1}{N} \log_2 P(w_1, w_2, \ldots, w_N)

perplexity(W) = 2^{H(W)}
Selection of the best translation
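A sketch of this selection step, assuming the kenlm Python bindings and a pre-trained binary language model; the file name is a placeholder.

```python
# Sketch: pick the hypothesis with the lowest perplexity under a target-language
# KenLM model ('lv.bin' is a hypothetical binary of the 5-gram model described below).
import kenlm

lm = kenlm.Model("lv.bin")

def select_best(hypotheses):
    """Return the candidate translation with the lowest LM perplexity."""
    return min(hypotheses, key=lm.perplexity)
```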
8. En → Lv data
• 5-gram Language model trained with KenLM for the target language (Lv)
• JRC Acquis corpus version 3.0 (legal domain - 1.4M sentences)
• Parallel test sets
• ACCURAT balanced test corpus for under resourced languages (general domain - 512 sentences)
• Random (held-out) sentences from the JRC Acquis 3.0 (legal domain - 1581 sentences)
Experiments
Automatic evaluation
System                   BLEU
Google Translate         16.92
Bing Translator          17.16
LetsMT                   28.27
Google + Bing            17.28
Google + LetsMT          22.89
LetsMT + Bing            22.83
Google + Bing + LetsMT   21.08

Human evaluation
System    AVG human   Hybrid    BLEU
Bing      31.88%      28.93%    16.92
Google    30.63%      34.31%    17.16
LetsMT    37.50%      33.98%    28.27
9. Sentence fragments
Workflow: sentence tokenization → syntactic parsing → sentence chunking (decomposition) → translation with APIs (Google Translate, Bing Translator, LetsMT) → selection of the best translated chunk → sentence recomposition → output
10. Experiments
Automatic evaluation:
System | BLEU (Full) | BLEU (Chunks)
Google Translate | 18.09 | -
Bing Translator | 18.87 | -
LetsMT! | 30.28 | -
Google + Bing | 18.73 | 21.27
Google + LetsMT | 24.50 | 26.24
LetsMT! + Bing | 24.66 | 26.63
Google + Bing + LetsMT! | 22.69 | 24.72

Human evaluation:
System | Fluency AVG | Accuracy AVG | Hybrid selection | BLEU
Google | 35.29% | 34.93% | 16.83% | 18.09
Bing | 23.53% | 23.97% | 17.94% | 18.87
LetsMT | 20.00% | 21.92% | 65.23% | 30.28
Hybrid | 21.18% | 19.18% | - | 24.72
Syntactic parsing
• Berkeley Parser
• Sentences are split into chunks from the top level subtrees of the syntax tree
Selection of the best chunk and test data
• Same as in the previous experiment
11. An advanced approach to chunking
• Traverse the syntax tree bottom-up, from right to left
• Add a word to the current chunk if
• the current chunk is not too long (sentence word count / 4)
• the word is non-alphabetic or only one symbol long
• the word begins a genitive phrase («of »)
• Otherwise, initialize a new chunk with the word
• If chunking results in too many chunks, repeat the process, allowing more words in a chunk (sentence word count / 4); a sketch of this heuristic follows below
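Below is a minimal sketch of that heuristic. It operates left to right over a flat token list instead of traversing a real syntax tree bottom-up, so the traversal order and the exact limits are simplifications, not the thesis implementation.

```python
def chunk_tokens(tokens, divisor=4):
    """Group tokens into chunks of at most len(tokens) // divisor words, keeping
    non-alphabetic / one-symbol tokens and genitive "of"-phrases in the current chunk."""
    max_len = max(1, len(tokens) // divisor)
    chunks, current = [], []
    for token in tokens:
        keep_in_current = (
            len(current) < max_len          # chunk is not too long yet
            or not token.isalpha()          # punctuation, numbers, symbols
            or len(token) == 1              # one-symbol tokens
            or token.lower() == "of"        # start of a genitive phrase
        )
        if current and not keep_in_current:
            chunks.append(current)          # close the current chunk, start a new one
            current = []
        current.append(token)
    if current:
        chunks.append(current)
    return chunks

# print(chunk_tokens("The president of the republic signed the new law yesterday .".split()))
```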
Changes in the MT API systems
• LetsMT! API temporarily replaced with Hugo.lv API
• Added Yandex API
Advanced sentence fragments
12. Experiments
System | BLEU
Full: Google + Bing | 17.70
Full: Google + Bing + LetsMT | 17.63
Chunks: Google + Bing | 17.95
Chunks: Google + Bing + LetsMT | 17.30
Advanced Chunks: Google + Bing | 18.29
Advanced Chunks: all four | 19.21
15. Goals
• Improve translation of multiword expressions (MWEs)
• Keep track of changes in attention alignments
Workflow
Experimenting with NMT attention alignments
• Tag corpora with morphological taggers (UDPipe, LV Tagger)
• Identify MWE candidates (MWE Toolkit)
• Align identified MWE candidates (MPAligner)
• Shuffle MWEs into the training corpora and train NMT systems (Neural Monkey)
• Identify changes
16. Training
En → Lv
4.5M parallel sentences for the baseline
4.8M after adding MWE sentences
En → Cs
49M parallel sentences for the baseline
17M after adding MWE sentences
Evaluation
En → Lv
2003 sentences in total
611 sentences with at least one MWE
En → Cs
6000 sentences in total
112 sentences with at least one MWE
Data
WMT17 News Translation Task
En → Lv: 1M | 1xMWE 1M | 2xMWE 2M | 2xMWE 0.5M
En → Cs: 2.5M | 1xMWE 2.5M | 2xMWE 5M | 2xMWE 5M
17. Two forms of presenting MWEs to the NMT system
• Adding only the parallel MWEs themselves (MWE phrases)
• each pair forming a new “sentence pair” in the parallel corpus
• Adding full sentences that contain the identified MWEs (MWE sentences)
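A toy sketch of these two augmentation variants, with invented variable names, assuming the MWE pairs have already been extracted and aligned as in the workflow above:

```python
def augment_training_data(parallel_corpus, mwe_pairs, mwe_sentence_pairs, mode):
    """parallel_corpus, mwe_pairs, mwe_sentence_pairs: lists of (source, target) tuples.
    mode='phrases'   -> append each aligned MWE pair as a new "sentence pair".
    mode='sentences' -> append full sentence pairs that contain an identified MWE."""
    if mode == "phrases":
        return parallel_corpus + mwe_pairs
    if mode == "sentences":
        return parallel_corpus + mwe_sentence_pairs
    return parallel_corpus
```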
Experiments
Languages | En → Cs | En → Lv
Dataset | Dev | MWE | Dev | MWE
Baseline | 13.71 | 10.25 | 11.29 | 9.32
+MWE phrases | - | - | 11.94 | 10.31
+MWE sentences | 13.99 | 10.44 | - | -
21. Workflow
• Translate the same sentence with two different NMT systems and one SMT system
• Save attention alignment data from the NMT systems
• Choose output from the system that does not
• Align most of its attention to a single token
• Have only very strong one-to-one alignments
• Otherwise - back off to the output of the SMT system
System combination using NMT attention
System | En→Lv Dev | En→Lv Test | Lv→En Dev | Lv→En Test
LetsMT! | 19.8 | 12.9 | 24.3 | 13.4
Neural Monkey | 16.7 | 13.5 | 15.7 | 14.3
Nematus | 16.9 | 13.6 | 15.0 | 13.8
NM+NT+LMT | - | 13.6 | - | 14.3
Data – WMT17 News Translation Task
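The sketch below illustrates one plausible way to implement the two attention checks from this workflow on an attention matrix; the thresholds and exact criteria are assumptions made for illustration, not the values used in the thesis.

```python
import numpy as np

def attention_looks_healthy(attn, single_token_share=0.5, one_to_one_share=0.9):
    """attn: array of shape (target_len, source_len), each row summing to 1."""
    attn = np.asarray(attn, dtype=float)
    # Check 1: does one source token receive most of the total attention mass?
    source_mass = attn.sum(axis=0) / attn.shape[0]
    if source_mass.max() > single_token_share:
        return False
    # Check 2: are almost all target tokens aligned one-to-one with very high weight?
    if (attn.max(axis=1) > one_to_one_share).mean() > one_to_one_share:
        return False
    return True

def combine_outputs(nmt_candidates, smt_output):
    """nmt_candidates: list of (translation, attention_matrix) pairs.
    Prefer an NMT output with healthy attention; otherwise back off to SMT."""
    for translation, attn in nmt_candidates:
        if attention_looks_healthy(attn):
            return translation
    return smt_output
```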
23. System combination by estimating confidence
Source Viņš bija labs cilvēks ar plašu sirdi.
Reference He was a kind spirit with a big heart.
Hypothesis He was a good man with a wide heart.
CDP -0.099
APout -1.077
APin -0.847
Confidence -2.024
24. System combination by estimating confidence
Source Aizvadītajā diennaktī Latvijā reģistrēts 71 ceļu satiksmes negadījumos, kuros cietuši 16 cilvēki.
Reference 71 traffic accidents in which 16 persons were injured have happened in Latvia during the last 24 hours.
Hypothesis The first day of the EU’European Parliament is the first of the three years of the European Union .
CDP -0.900
APout -2.809
APin -2.137
Confidence -5.846
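A sketch in the spirit of these penalties is given below: a coverage deviation penalty plus two attention-dispersion penalties whose sum gives the confidence score, as in the two examples above. The exact formulas are assumptions made for illustration, not copied from the thesis.

```python
import numpy as np

def coverage_deviation_penalty(attn):
    # Penalise source tokens whose total received attention deviates from 1.
    coverage = attn.sum(axis=0)
    return float(-np.log(1.0 + (1.0 - coverage) ** 2).mean())

def absentmindedness_penalty(attn, eps=1e-12):
    # Mean (negated) dispersion of each row's attention distribution:
    # scattered attention pushes this further below zero.
    rows = attn / attn.sum(axis=1, keepdims=True)
    return float((rows * np.log(rows + eps)).sum(axis=1).mean())

def attention_confidence(attn):
    """attn[i, j]: attention of output token i to input token j."""
    attn = np.asarray(attn, dtype=float)
    cdp = coverage_deviation_penalty(attn)      # CDP
    ap_out = absentmindedness_penalty(attn)     # APout, over output tokens
    ap_in = absentmindedness_penalty(attn.T)    # APin, over input tokens
    return cdp + ap_out + ap_in                 # all three are <= 0; closer to 0 is better
```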
25. Experiments
BLEU
System En->Lv Lv->En
Neural Monkey 13.74 11.09
Nematus 13.80 12.64
Hybrid 14.79 12.65
Human 15.12 13.24
Data – WMT17 News translation Task
En->Lv Lv->En
LM-based overlap with human 58% 56%
Attention-based overlap with human 52% 60%
LM-based overlap with Attention-based 34% 22%
26. • Poor MT between two non-English languages due to limited parallel data
• Improve X↔Y MT by adding X↔En and Y↔En data
• Experiment with various NMT architectures
Data combination for multilingual NMT
Language pair | Before filtering (Total / Unique) | After filtering (Unique)
English ↔ Estonian | 62.5M / 24.3M | 18.9M
English ↔ Russian | 60.7M / 39.2M | 29.4M
Russian ↔ Estonian | 6.5M / 4.4M | 3.5M
27. Data combination for multilingual NMT
System | Dev Ru→Et | Dev Et→Ru | Dev En→Et | Dev Et→En | Eval Ru→Et | Eval Et→Ru | Eval En→Et | Eval Et→En
MLSTM-SO | 17.51 | 18.46 | 23.79 | 34.45 | 11.11 | 12.32 | 26.14 | 36.78
GRU-SM | 13.70 | 13.71 | 17.95 | 27.84 | 10.66 | 11.17 | 19.22 | 27.85
GRU-DO | 17.03 | 17.42 | 23.53 | 33.63 | 10.33 | 12.36 | 25.25 | 36.86
GRU-DM | 17.07 | 17.93 | 23.37 | 33.52 | 13.75 | 14.57 | 25.76 | 36.93
FConv-O | 15.24 | 16.17 | 21.63 | 33.84 | 7.56 | 8.83 | 24.87 | 36.96
FConv-M | 14.92 | 15.80 | 18.99 | 30.25 | 10.65 | 10.99 | 21.65 | 31.79
Transf.-O | 17.44 | 18.90 | 25.27 | 37.12 | 9.10 | 11.17 | 28.43 | 40.08
Transf.-M | 18.03 | 19.18 | 23.99 | 35.15 | 14.38 | 15.48 | 25.56 | 37.97
• New state-of-the-art Russian ↔ Estonian MT
• Significantly better than the previous systems
28. • NMT is more sensitive to noisy data – requires stricter data filtering
• Back-translation has proven to be an easy way to increase MT quality
Incremental multi-pass training for NMT
Workflow: filter data → train NMT → back-translate → train final NMT
• Excessive filtering (details in future slides)
• Multiple pass-throughs of back-translation
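In outline, the loop can be sketched as below; filter_corpus, train_nmt and back_translate stand in for the real filtering and NMT training tooling and are passed in as callables, so this is a structural sketch rather than the actual training scripts.

```python
def multi_pass_training(parallel_data, mono_target, filter_corpus, train_nmt, back_translate, passes=2):
    """Incremental multi-pass training: filter, train, back-translate, retrain."""
    data = filter_corpus(parallel_data)
    model = train_nmt(data)
    for _ in range(passes):
        # Turn target-side monolingual text into synthetic (source, target) pairs
        # with a reverse-direction model, then retrain on the enlarged corpus.
        synthetic_pairs = back_translate(mono_target)
        data = filter_corpus(data + synthetic_pairs)
        model = train_nmt(data)
    return model
```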
29. • New state-of-the-art English ↔ Estonian MT
• Tied for 1st place in WMT 2018
• Significantly better (p<=0.05) than the competition
Incremental multi-pass training for NMT
System | BLEU Score | Rank
Estonian → English | 28.0 | 7 of 23
English → Estonian | 23.6 | 3 of 18
Finnish → English | 23.0 | 5 of 17
English → Finnish | 16.9 | 5 of 18

System | BLEU Rank | Human Cluster | Human Ave %
Estonian → English | 7 of 23 | 1-7 of 9 | 3 of 9
English → Estonian | 3 of 18 | 1-3 of 9 | 3 of 9
31. • A user-friendly web interface for ChunkMT
• Draws a syntax tree with chunks highlighted
• Designates which chunks were chosen from which component system
• Provides a confidence score for the choices
• Allows using online APIs or user provided translations
Interactive multi-system machine translation
Interface pages: Start page; Input source sentence; Translate with online systems; Input translations to combine; Input translated chunks; Settings; Translation results
32. Works with attention alignment data from
• Nematus
• Neural Monkey
• Marian
• OpenNMT
• Sockeye
Visualise translations in
• Linux Terminal or Windows PowerShell
• Web browser
• Line form or matrix form
• Save as PNG
• Sort and navigate dataset by confidence scores
• Directly compare translations of one sentence from two different systems
Visualising NMT attention and confidence
34. Cleaning corpora to improve NMT performance
English Estonian
Add to my wishlist Hommikul (200 + 200 = 400 kcal)
Dec 2009 ÊßÇÌí 2009
I voted in favour. kirjalikult. – (IT) Hääletasin poolt.
I voted in favour. Ma andsin oma poolthääle.
That is the wrong way to go. See ei ole õge.
This is simply wrong. See ei ole õge.
Zaghachi See okwu 3 Comments
Täna mängitud: 25 910 Täna mängitud: 25 929
1 If , , and are the roots of , compute . 1 Juhul kui , Ja on juured , Arvutama .
we have that and or or or . meil on, et ja või või või .
NXT Spray - NAPURA NXT SPRAY NXT SPRAY
35. • Unique parallel sentence filter – removes duplicate source-target sentence pairs
• Equal source-target filter - removes sentences that are identical in the source side and the
target side of the corpus
• Multiple sources - one target and multiple targets - one source filters – remove repeating
sentence pairs where the same source sentence is aligned to multiple different target sentences
and multiple source sentences aligned to the same target sentence
• Non-alphabetical filters – remove sentences that contain > 50% non-alphabetical symbols on
the source or the target side, and sentence pairs that have significantly more (at least 1:3)
non-alphabetical symbols in the source side than in the target side (or vice versa)
• Repeating token filter – removes sentences with consecutive repeating tokens or phrases.
• Correct language filter – estimates the language of each sentence (Lui and Baldwin, 2012)
and removes any sentence that has a different identified language from the one specified
Cleaning corpora to improve NMT performance
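The sketch below shows how a few of these filters could be expressed for (source, target) sentence pairs. The 50% and 1:3 thresholds come from the description above; everything else (helper names, the exact way non-alphabetical symbols are counted and compared) is a simplifying assumption, with langid standing in for the language identifier of Lui and Baldwin (2012).

```python
import langid  # language identification (Lui and Baldwin, 2012)

def non_alpha_share(sentence):
    chars = [c for c in sentence if not c.isspace()]
    return sum(1 for c in chars if not c.isalpha()) / max(1, len(chars))

def keep_pair(src, tgt, src_lang, tgt_lang, seen_pairs):
    """Return True if the sentence pair passes a subset of the filters above."""
    if (src, tgt) in seen_pairs:                       # unique parallel sentence filter
        return False
    if src.strip() == tgt.strip():                     # equal source-target filter
        return False
    src_share, tgt_share = non_alpha_share(src), non_alpha_share(tgt)
    if src_share > 0.5 or tgt_share > 0.5:             # > 50% non-alphabetical symbols
        return False
    if max(src_share, tgt_share) >= 3 * max(min(src_share, tgt_share), 0.01):
        return False                                   # at least 1:3 imbalance between the sides
    if langid.classify(src)[0] != src_lang or langid.classify(tgt)[0] != tgt_lang:
        return False                                   # correct language filter
    seen_pairs.add((src, tgt))
    return True

# seen = set()
# keep_pair("I voted in favour.", "Ma andsin oma poolthääle.", "en", "et", seen)
```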
38. • Hybrid MT combination via chunking outperformed individual systems in
translating long SMT sentences
• Hybrid combination for NMT via attention alignments fits the emerging neural network MT technology and can distinguish low-quality translations from high-quality ones
• It has also been used in MT quality estimation research (Ive et al., 2018; Yankovskaya et al., 2018)
• The graphical tools help to inspect how translations are composed from component systems and to overview generated translations so that better or worse results can be located quickly
• The NMT visualization and debugging tool is used to teach students at Charles University, the University of Tartu and the University of Zurich. It is also currently the author's most cited (8) publication and has received the most stars (31) and forks (12) on GitHub, indicating that it is appreciated by the research community
Conclusions
39. • A method for hybrid MT combination using chunking and neural LMs
• A method for hybrid neural machine translation combination via attention
• A method for multi-pass incremental training for NMT
• Graphical tools for overviewing and debugging the translation process
• State-of-the-art MT systems (Estonian ↔ Russian and Estonian ↔ English)
along with details and required tools for reproducibility
Main results
The proposed hypothesis, that combining output from multiple different MT systems makes it possible to produce higher quality translations for the Baltic languages than each component system produces individually, can be considered proven.
40. • Research project “Neural Network Modelling for Inflected Natural Languages” No.
1.1.1.1/16/A/215. Research activity No. 3 “Usability of neural networks for automated
translation”. Project supported by the European Regional Development Fund.
• ICT COST Action IC1207 “ParseME: Parsing and multi-word expressions. Towards
linguistic precision and computational efficiency in natural language processing.”
• Research project “Forest Sector Competence Centre” No. 1.2.1.1/16/A/009. Project
supported by the European Regional Development Fund.
Approbation in Research Projects
• 17 publications
• 10 indexed in Web of Science and / or in Scopus
• 7 in other peer-reviewed conference proceedings
• Presented in
• 10 international conferences
• 3 international workshops
• 2 local conferences
Publications
Rule-based MT (RBMT) is based on linguistic information covering the main semantic, morphological, and syntactic regularities of source and target languages
Statistical MT (SMT) consists of subcomponents that are separately engineered to learn how to translate from vast amounts of translated text
Neural MT (NMT) consists of a large neural network in which weights are trained jointly to maximize the translation performance
Hybrid MT (HMT) employs different MT approaches in the same system to complement each other’s strengths to boost the accuracy level of the translation
Full sentence translations
Sentence fragments
Advanced sentence fragments
Neural network language models
Experimenting with NMT attention alignments
System combination using neural network attention
System combination by estimating confidence from neural network attention
Data combination for training multilingual NMT systems
Incremental multi-pass training for NMT systems
Interactive multi-system machine translation
Visualising and debugging neural machine translations
Cleaning corpora to improve neural machine translation performance