NAVER developers who attended ACL 2018 share what they learned there.
1. Overview - Lucy Park
2. Tutorials - Xiaodong Gu
3. Main conference
a. Semantic parsing - Soonmin Bae
b. Dialogue - Kyungduk Kim
c. Machine translation - Zae Myung Kim
d. Summarization - Hye-Jin Min
4. Workshops - Minjoon Seo
[KDD 2018 tutorial] End-to-end goal-oriented question answering systems - Qi He
End-to-end goal-oriented question answering systems
Version 2.0: an updated version, with references, of the old version (https://www.slideshare.net/QiHe2/kdd-2018-tutorial-end-toend-goaloriented-question-answering-systems).
08/22/2018: the old version was deleted to reduce confusion.
The document provides information about an upcoming bootcamp on natural language processing (NLP) being conducted by Anuj Gupta. It discusses Anuj Gupta's background and experience in machine learning and NLP. The objective of the bootcamp is to provide a deep dive into state-of-the-art text representation techniques in NLP and help participants apply these techniques to solve their own NLP problems. The bootcamp will be very hands-on and cover topics like word vectors, sentence/paragraph vectors, and character vectors over two days through interactive Jupyter notebooks.
The document presents a neural network architecture for various natural language processing (NLP) tasks such as part-of-speech tagging, chunking, named entity recognition, and semantic role labeling. It shows results comparable to state-of-the-art using word embeddings learned from a large unlabeled corpus, and improved results from joint training of the tasks. The network transforms words into feature vectors, extracts higher-level features through neural layers, and is trained via backpropagation. Benchmark results demonstrate performance on par with traditional task-specific systems without heavy feature engineering.
CUHK intern presentation. Machine Translation Evaluation: Methods and Tools - Lifeng (Aaron) Han
Abstract of Aaron Han’s Presentation
The main topic of this presentation is the evaluation of machine translation. With the rapid development of machine translation (MT), MT evaluation has become increasingly important for telling whether systems are actually improving. Traditional human judgment is time-consuming and expensive. On the other hand, existing automatic MT evaluation metrics have several weaknesses:
– they perform well on certain language pairs but poorly on others, which we call the language-bias problem;
– they consider no linguistic information (leading to low correlation with human judgments) or too many linguistic features (making replication difficult), which we call the extremism problem;
– they are designed around incomplete factors (e.g. precision only).
To address these problems, he has developed several automatic evaluation metrics featuring:
– tunable parameters to address the language-bias problem;
– concise linguistic features to address the extremism problem;
– augmented factors beyond precision alone.
Experiments on the ACL-WMT corpora show that the proposed metrics yield higher correlation with human judgments. The metrics have been published at top international conferences, e.g. COLING and MT Summit. Evaluation work is closely related to similarity measurement, so it can be carried over to other areas such as information retrieval, question answering, and search.
A brief introduction to some of his other research will also be given, covering Chinese named entity recognition, word segmentation, and multilingual treebanks, published in the Springer LNCS and LNAI series. Suggestions and comments are much appreciated, as are opportunities for further cooperation.
Michael Manukyan and Hrayr Harutyunyan gave a talk on sentence representations in the context of deep learning at the Armenian NLP Meetup. They also reviewed a recent paper on machine comprehension (Wang and Jiang, 2016).
This document provides advice on how to prepare for the GATE CSE exam, including recommended books and study materials for each subject. It emphasizes the importance of practicing problems at the end of chapters in reference books to develop technical knowledge and the ability to apply concepts. Specific books and topics are recommended for subjects like Discrete Mathematics, Algorithms, Data Structures, Theoretical Computer Science, and Database Management Systems. It also stresses the value of video lecture courses to supplement reading. The overall message is to focus on understanding fundamental concepts and gaining problem-solving skills through extensive practice.
Open-domain Question Answering System - Research project in NLP - GVS Chaitanya
Using a computer to answer questions has been a human dream since the beginning of the digital era. A first step toward this ambitious goal is handling natural language so that the computer can understand what its user asks. The discipline that studies the connection between natural language and the representation of its meaning via computational models is computational linguistics. In those terms, question answering can be defined as the task of finding one or more concise answers to a question formulated in natural language. Improvements in technology and explosive demand for better information access have reignited interest in QA systems. The wealth of information on the web makes it an attractive resource for seeking quick answers to factual questions such as "Who was the first American in space?" or "What is the second-tallest mountain in the world?", yet today's most advanced web search engines (Bing, Google, Yahoo) make it surprisingly tedious to locate the answers. QA systems aim to develop techniques that go beyond retrieval of relevant documents and return exact answers to natural-language factoid questions.
Question Answering System using a machine learning approach - Garima Nanda
This compact presentation shows how a machine learning approach, built on classification techniques, can support effective and efficient question-answering interaction.
This document provides an overview of deep learning techniques for natural language processing. It begins with an introduction to distributed word representations like word2vec and GloVe. It then discusses methods for generating sentence embeddings, including paragraph vectors and recursive neural networks. Character-level models are presented as an alternative to word embeddings that can handle morphology and out-of-vocabulary words. Finally, some general deep learning approaches for NLP tasks like text generation and word sense disambiguation are briefly outlined.
Monthly AI Tech Talks in Toronto 2019-08-28
https://www.meetup.com/aittg-toronto
The talk will cover the end-to-end details including contextual and linguistic feature extraction, vectorization, n-grams, topic modeling, named entity resolution which are based on concepts from mathematics, information retrieval and natural language processing. We will also be diving into more advanced feature engineering strategies such as word2vec, GloVe and fastText that leverage deep learning models.
In addition, attendees will learn how to combine NLP features with numeric and categorical features and analyze the feature importance from the resulting models.
The following libraries will be used to demonstrate the aforementioned feature engineering techniques: spaCy, Gensim, fastText, and Keras in Python.
https://www.meetup.com/aittg-toronto/events/261940480/
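As a rough illustration of the n-gram extraction and vectorization steps the talk covers, here is a minimal pure-Python sketch (the function names are invented for illustration; libraries like spaCy and Gensim provide far more complete versions):

```python
from collections import Counter

def ngrams(tokens, n):
    """Return the list of n-grams (as tuples) in a token sequence."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def bag_of_ngrams(text, n_values=(1, 2)):
    """Count unigrams and bigrams in a whitespace-tokenized text,
    giving a sparse bag-of-n-grams feature representation."""
    tokens = text.lower().split()
    counts = Counter()
    for n in n_values:
        counts.update(ngrams(tokens, n))
    return counts

features = bag_of_ngrams("the cat sat on the mat")
# the unigram ('the',) occurs twice; the bigram ('the', 'cat') once
```

Counts like these would then be fed to a vectorizer (TF-IDF weighting, hashing, etc.) before combining with numeric and categorical features.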
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding - Seonghyun Kim
The document discusses BERT, which stands for Bidirectional Encoder Representations from Transformers. BERT uses bidirectional Transformers to pre-train deep contextual representations of language. It was trained on two unsupervised prediction tasks using large text corpora. Experimental results showed that BERT achieved state-of-the-art results on eleven natural language understanding tasks, including question answering and textual inference. The document outlines the model architecture of BERT and the pre-training and fine-tuning methods used.
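One of BERT's two unsupervised prediction tasks, masked language modeling, can be sketched in a few lines (a hypothetical helper with fixed mask positions for clarity; real BERT masks roughly 15% of WordPiece tokens at random):

```python
def mask_tokens(tokens, positions, mask_token="[MASK]"):
    """Replace tokens at the given positions with [MASK]; the model is
    then trained to predict the original tokens at those positions
    from bidirectional context."""
    masked = list(tokens)
    labels = {}
    for i in positions:
        labels[i] = masked[i]   # prediction target for this position
        masked[i] = mask_token
    return masked, labels

tokens = ["the", "man", "went", "to", "the", "store"]
masked, labels = mask_tokens(tokens, positions=[1, 5])
# masked -> ["the", "[MASK]", "went", "to", "the", "[MASK]"]
# labels -> {1: "man", 5: "store"}
```

Because the targets are hidden, the encoder must use context on both sides of each mask, which is what makes the learned representations deeply bidirectional.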
In this presentation we discuss several concepts that include Word Representation using SVD as well as neural networks based techniques. In addition we also cover core concepts such as cosine similarity, atomic and distributed representations.
Convolutional neural networks (CNNs) have traditionally been used for computer vision tasks but recent work has applied them to language modeling as well. CNNs treat sequences of words as signals over time rather than independent units. They use convolution and pooling layers to identify important n-gram features. Results show CNNs can be effective for classification tasks like sentiment analysis but have had less success with sequence modeling tasks. Overall, CNNs provide an alternative to recurrent neural networks for certain natural language processing problems and help understand each model's strengths and weaknesses.
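The convolution-plus-pooling idea described above can be shown on a toy scalar signal (a deliberately simplified sketch; real text CNNs convolve over stacked word-embedding vectors with many learned filters):

```python
def conv1d(seq, kernel):
    """Valid 1-D convolution (cross-correlation, as in deep learning)
    of a scalar sequence with a kernel of width k: each output is the
    response of the filter to one k-gram window."""
    k = len(kernel)
    return [sum(seq[i + j] * kernel[j] for j in range(k))
            for i in range(len(seq) - k + 1)]

def max_pool(values):
    """Global max pooling: keep only the strongest n-gram response."""
    return max(values)

# toy signal: per-word sentiment scores for a 5-word sentence
scores = [0.1, 0.9, 0.8, 0.0, 0.2]
bigram_filter = [0.5, 0.5]          # width-2 kernel ~ a bigram detector
feature = max_pool(conv1d(scores, bigram_filter))
# strongest bigram response is at words 2-3: 0.5*0.9 + 0.5*0.8 = 0.85
```

Max pooling is why CNNs suit classification (the strongest n-gram evidence survives, position discarded) but struggle with sequence modeling, where position matters.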
The document discusses several different machine learning approaches to plain text information extraction, including SRV, RAPIER, WHISK, AutoSlog, and CRYSTAL. These systems use both top-down and bottom-up approaches to induce rules or patterns for extracting structured information from unstructured text. The document compares the different systems and their rule representations, learning algorithms, experiments and performance on various information extraction tasks.
This document discusses natural language processing (NLP) and feature extraction. It explains that NLP can be used for applications like search, translation, and question answering. The document then discusses extracting features from text like paragraphs, sentences, words, parts of speech, entities, sentiment, topics, and assertions. Specific features discussed in more detail include frequency, relationships between words, language features, supervised machine learning, classifiers, encoding words, word vectors, and parse trees. Tools mentioned for NLP include Google Cloud NLP, Spacy, OpenNLP, and Stanford Core NLP.
Seq2seq Model to Tokenize the Chinese Language - Jinho Choi
This document describes using a sequence-to-sequence (seq2seq) model to tokenize Chinese text without spaces between words. It discusses how the seq2seq model works, previous work using it for machine translation, and the goal of testing if it can also be used for Chinese tokenization. An analysis is provided on why applying the seq2seq model directly did not achieve ideal results for tokenization, and suggestions are made for future work such as using beam search and other algorithms.
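One common way to frame Chinese tokenization as sequence prediction (shown here as general background, not necessarily the exact formulation in these slides) is to emit one B/M/E/S tag per input character and decode the tags into words:

```python
def tags_to_words(chars, tags):
    """Convert per-character B/M/E/S tags (Begin/Middle/End/Single)
    into a segmented word list, as a character-level sequence model
    would after emitting one tag per input character."""
    words, current = [], ""
    for ch, tag in zip(chars, tags):
        if tag == "S":              # single-character word
            words.append(ch)
        elif tag == "B":            # word begins
            current = ch
        elif tag == "M":            # word continues
            current += ch
        elif tag == "E":            # word ends
            words.append(current + ch)
            current = ""
    return words

print(tags_to_words("我爱北京", ["S", "S", "B", "E"]))
# ['我', '爱', '北京']
```

A plain seq2seq model has no guarantee its output aligns one-to-one with the input characters, which is one reason direct application can fall short of this tagging formulation.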
Natural language processing for requirements engineering: ICSE 2021 Technical... - alessio_ferrari
These are the slides for the technical briefing at ICSE 2021, given by Alessio Ferrari, Liping Zhao, and Waad Alhoshan.
It covers RE tasks to which NLP is applied, an overview of a recent systematic mapping study on the topic, and a hands-on tutorial on using transfer learning for requirements classification.
Please find the links to the colab notebooks here:
https://colab.research.google.com/drive/158H-lEJE1pc-xHc1ISBAKGDHMt_eg4Gn?usp=sharing
https://colab.research.google.com/drive/1B_5ow3rvS0Qz1y-KyJtlMNnmgmx9w3kJ?usp=sharing
https://colab.research.google.com/drive/1Xrm0gNaa41YwlM5g2CRYYXcRvpbDnTRT?usp=sharing
Enriching Word Vectors with Subword Information - Seonghyun Kim
1) The document proposes a new word vector model that represents words as the sum of their character n-gram vectors to better capture morphological information.
2) It tests this model on nine languages and shows it outperforms previous models on word similarity and analogy tasks.
3) Representing words as combinations of character n-grams allows the model to learn representations for out-of-vocabulary words.
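The character n-gram decomposition behind this model is easy to sketch (following the fastText convention of marking word boundaries with < and >):

```python
def char_ngrams(word, n_min=3, n_max=3):
    """Character n-grams with boundary markers < and >. The word
    vector is (conceptually) the sum of the vectors of these n-grams,
    so even an out-of-vocabulary word gets a representation from the
    n-grams it shares with known words."""
    marked = f"<{word}>"
    grams = []
    for n in range(n_min, n_max + 1):
        grams += [marked[i:i + n] for i in range(len(marked) - n + 1)]
    return grams

print(char_ngrams("where"))
# ['<wh', 'whe', 'her', 'ere', 're>']
```

Note how the boundary markers let the model distinguish the trigram "her" inside "where" from the standalone word "her", which appears as "<her>".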
This document provides a summary of topics covered in a deep neural networks tutorial, including:
- A brief introduction to artificial intelligence, machine learning, and artificial neural networks.
- An overview of common deep neural network architectures like convolutional neural networks, recurrent neural networks, autoencoders, and their applications in areas like computer vision and natural language processing.
- Advanced techniques for training deep neural networks like greedy layer-wise training, regularization methods like dropout, and unsupervised pre-training.
- Applications of deep learning beyond traditional discriminative models, including image synthesis, style transfer, and generative adversarial networks.
This document summarizes two Arabic question answering systems: QASAL and QARAB. It describes the main components of each system, including question analysis, passage retrieval, and answer extraction. It also discusses how each system handles yes/no questions in Arabic. The document concludes by comparing the performance of the two systems and different techniques for Arabic question answering.
This document proposes an approach to automatically build term hierarchies from large patent datasets. It involves a three-stage process: term extraction, hierarchy building, and hierarchy enrichment. Terms are first extracted from patent titles, abstracts, and claims. The hierarchy is built by classifying terms into unigrams, bigrams, and trigrams to reflect different levels of generality. The hierarchy is then enriched using a word embedding model to add related terms. Results on sample patent subgroups show the approach can identify generic and specific terms, though human evaluation and more linguistic study on patents are needed.
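The hierarchy-building step, classifying terms into unigrams, bigrams, and trigrams as levels of generality, can be sketched as follows (a minimal illustration with made-up sample terms, not the paper's pipeline):

```python
def hierarchy_levels(terms):
    """Group extracted terms into hierarchy levels by token count:
    unigrams are treated as the most generic terms, trigrams as the
    most specific."""
    levels = {1: [], 2: [], 3: []}
    for term in terms:
        n = len(term.split())
        if n in levels:
            levels[n].append(term)
    return levels

terms = ["battery", "lithium battery", "lithium ion battery"]
levels = hierarchy_levels(terms)
# {1: ['battery'], 2: ['lithium battery'], 3: ['lithium ion battery']}
```

The enrichment stage would then attach additional related terms to each node, e.g. nearest neighbors under a word embedding model.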
Improving Neural Question Generation using Answer Separation - Yang Hoon Kim
1) The document proposes an improved neural question generation model called answer-separated seq2seq that treats the target answer and passage separately to better generate questions.
2) It introduces three key components - an answer-separated encoder, keyword-net to extract key information from the target answer, and a retrieval-style word generator to consider word meanings.
3) Experimental results show the proposed model outperforms previous work and has a better ability to generate questions without including the target answer words and predict the correct question type.
Deep Learning Enabled Question Answering System to Automate Corporate Helpdesk - Saurabh Saxena
Studied the feasibility of applying state-of-the-art deep learning models, such as end-to-end memory networks and neural attention-based models, to machine comprehension and subsequent question answering in corporate settings with huge amounts of unstructured textual data. Used pre-trained embeddings such as word2vec and GloVe to avoid huge training costs.
This document provides an overview of the CS760 Machine Learning course taught by David Page at the University of Wisconsin. The course will cover a broad survey of machine learning algorithms and applications over 30 class meetings. Topics will include both theoretical and practical aspects of supervised learning algorithms like naive Bayes, decision trees, neural networks, and support vector machines. Students will complete programming homework assignments applying various machine learning algorithms and a midterm exam. The primary goals of the course are to understand what learning systems should do and how existing systems work.
This document outlines the syllabus for a machine learning course. It introduces the instructor, teaching assistant, required textbook, and meeting schedule. It describes the course style as primarily algorithmic and experimental, covering many ML subfields. The goals are to understand what a learning system should do and how existing systems work. Background knowledge in languages, AI topics, and math is assumed, but no prior ML experience is needed. Requirements include biweekly programming homework, a midterm exam, and a final project. Grading will be based on homework, exam, project, and discussion participation. Policies on late homework and academic misconduct are also provided.
Module handout for COM839 - Intelligent Systems [Word format] - butest
This document provides information about the COM850C2/COM839J2 Intelligent Systems module taught in the 2003-4 academic year at Ulster University. It outlines the teaching staff, consultation times, and the aims and learning outcomes of the module. The timetable details that, after an initial class, the module will be delivered online through WebCT. There are three assignments worth 50% of the marks and an exam worth the other 50%; the assignments include a Prolog mini-project, an essay, and a Clementine mini-project.
Transfer learning in NLP involves pre-training large language models on unlabeled text and then fine-tuning them on downstream tasks. Current state-of-the-art models such as BERT, GPT-2, and XLNet use bidirectional transformers pretrained using techniques like masked language modeling. These models have billions of parameters and require huge amounts of compute but have achieved SOTA results on many NLP tasks. Researchers are exploring ways to reduce model sizes through techniques like distillation while maintaining high performance. Open questions remain around model interpretability and generalization.
The document discusses the foundations of artificial intelligence including contributions from philosophy, mathematics, economics, and neuroscience. It describes how these fields helped establish the concepts of logic, computation, probability, decision theory, and rational agents that are important to AI. The document also provides examples of early work in these fields that helped lay the groundwork for modern artificial intelligence.
Natural Language Generation / Stanford cs224n 2019w lecture 15 Reviewchangedaeoh
This document discusses natural language generation (NLG) tasks and neural approaches. It begins with a recap of language models and decoding algorithms like beam search and sampling. It then covers NLG tasks like summarization, dialogue generation, and storytelling. For summarization, it discusses extractive vs. abstractive approaches and neural methods like pointer-generator networks. For dialogue, it discusses challenges like genericness, irrelevance and repetition that neural models face. It concludes with trends in NLG evaluation difficulties and the future of the field.
Ontology-based data access: why it is so cool!Josef Hardi
A brief introduction about ontology-based data access (shortly OBDA) and its core implementation. I presented too a recent simple benchmark between -ontop- and Semantika---two most available software for OBDA framework---in term of query performance (including details in the appendix section). The slides were presented for Friday Research Meeting in Stanford Center for Biomedical Informatics Research (BMIR).
License: Creative Commons by Attribution 3.0
This was presented to software developers with the goal of introducing them to basic machine learning workflow, code snippets, possibilities and state-of-the-art in NLP and give some clues on where to get started.
A Machine learning approach to classify a pair of sentence as duplicate or not.Pankaj Chandan Mohapatra
The team presented their machine learning project on predicting question pairs on Quora. They used logistic regression, random forest, and XGBoost models with manually engineered features like word count and word match. XGBoost performed best with an AUC score of 0.936. Key lessons were the importance of preprocessing, using words as features requires dimension reduction, and feature hashing improves scalability over storing vocabularies. Future work could experiment with convolutional neural networks for sentence similarity as proposed by H. Hua et al.
Deep Learning for NLP (without Magic) - Richard Socher and Christopher ManningBigDataCloud
The document discusses deep learning for natural language processing. It provides 5 reasons why deep learning is well-suited for NLP tasks: 1) it can automatically learn representations from data rather than relying on human-designed features, 2) it uses distributed representations that address issues with symbolic representations, 3) it can perform unsupervised feature and weight learning on unlabeled data, 4) it learns multiple levels of representation that are useful for multiple tasks, and 5) recent advances in methods like unsupervised pre-training have made deep learning models more effective for NLP. The document outlines some successful applications of deep learning to tasks like language modeling and speech recognition.
Naver learning to rank question answer pairs using hrde-ltcNAVER Engineering
The automatic question answering (QA) task has long been considered a primary objective of artificial intelligence.
Among the QA sub-systems, we focused on answer-ranking part. In particular, we investigated a novel neural network architecture with additional data clustering module to improve the performance in ranking answer candidates which are longer than a single sentence. This work can be used not only for the QA ranking task, but also to evaluate the relevance of next utterance with given dialogue generated from the dialogue model.
In this talk, I'll present our research results (NAACL 2018), and also its potential use cases (i.e. fake news detection). Finally, I'll conclude by introducing some issues on previous research, and by introducing recent approach in academic.
This document provides an overview of an IST 380 data science course. It introduces the instructor, Zach Dodds, and discusses topics that will be covered over the 15 weeks including using R, descriptive statistics, predictive modeling, machine learning algorithms, and a final project. Assignments are due weekly and students can work individually or in pairs. The course aims to provide both specific skills in data analysis and a broad background in data science.
This document provides an overview of an introductory data science course (IST 380). It discusses the course content which includes learning the R programming language, descriptive statistics, predictive modeling, and machine learning algorithms. It also covers course logistics like assignments, grading, and academic honesty policies. The goal of the course is to provide students with practical data science skills that can be applied to real-world problems and datasets.
This document provides an overview of an introductory data science course (IST 380). It discusses the course content which includes learning the R programming language, descriptive statistics, predictive modeling, and machine learning algorithms. It also covers the grading scheme, assignments, and final project where students can apply what they learned to a dataset of their choice.
This document provides an overview of an IST 380 data science course. It introduces the instructor, Zach Dodds, and discusses topics that will be covered over the 15 weeks including using R, descriptive statistics, predictive modeling, machine learning algorithms, and a final project. Assignments are due weekly and students can work individually or in pairs. The course aims to provide both specific skills in data analysis and a broad background in data science.
This course teaches technical writing skills needed for a career in technology. It covers writing processes, styles, and formats for various technical documents. Key topics include organizing information, writing for different audiences, using graphics and visual elements, writing instructions, descriptions, reports, and other common technical document types. Students learn grammar and mechanics rules specific to technical writing. The goal is for students to gain proficiency in written communication skills required in technical fields.
ZUIX is a design system created by Zigbang's CTO team to standardize design across all of Zigbang's services. It uses React Native for responsive, multi-platform components and includes tools like Storybook for development and a design review infrastructure for validation. The deployment process involves code reviews, CI/CD pipelines, and publishing to a npm registry. Training and documentation is provided through tools like Google Classroom and Notion. The team aims to further develop ZUIX by improving the design review tools, adding end-to-end testing, and analyzing component usage. The goal is to solve Zigbang's unique challenges through an agile, collaborative approach between designers and developers.
This document discusses Kakao's search platform front-end project. It describes the architecture of an integrated search service using microservices and the need for a design system due to fragmented UIs. It introduces the KST (Kakao Search Template) project for creating a design system including 200+ UI blocks and templates. The KST Builder, Logger, and Dashboard are discussed for managing templates, logging usage, and monitoring coverage. Maintaining a consistent design system is important for operating diverse search services and platforms.
This document discusses Banksalad Product Language (BPL), which is a method used at Banksalad to standardize UI text, elements, and components. It allows designers and developers to use consistent terms, while abstracting UI elements to different levels suitable for their roles. Examples of standardized elements are provided, as well as external resources that discuss concepts like tree shaking that are relevant to BPL. While BPL has benefits, the document considers whether there may be better approaches than BPL.
This document summarizes a presentation about using Stitches, a React styling library, and Storybook for component design.
The presentation introduces Stitches as the styling library used for its support of React, easy usage, and themes. Key features of Stitches discussed include creating styled components, variants, and comparisons to other libraries.
Storybook is presented as a way to improve communication between designers and developers by allowing visualization of components alongside their stories. Clean communication through a shared Storybook is emphasized.
Reflections on initially creating a design system note the benefits of consistency and speed but also identify areas for improvement like documentation, process alignment, and understanding each other's roles. Establishing trust and understanding between
비행기 설계를 왜 통일 해야 할까?
디자인 시스템을 하는 이유
비행기들이 다 용도가 다르다...어떻게 설계하지?
맥락이 다른 페이지와 패턴
경유지까지 아직 멀었다... 언제 수리하지?
디자인 시스템을 적용하는 시점
엔지니어랑 얘기해서 정비해야하는데...어떻게 수리하지?
디자인 시스템을 적용하는 프로세스
비행기 설계가 바뀐걸 어떻게 알리지?
디자인 시스템의 전파
The document discusses Kotlin coroutines and how they can be used to write asynchronous code in a synchronous, sequential way. It explains what coroutines are, how they work internally using continuation-passing style (CPS) transformation and state machines, and compares them to callbacks. It also outlines some of the benefits of using coroutines, such as structured concurrency, light weight execution, built-in cancellation, and simplifying asynchronous code. Finally, it provides examples of how to use common coroutine builders like launch, async, and coroutineScope in a basic Android application with ViewModels.
This document contains the transcript from a presentation given by Wonsuk Lim from Naver on tips for debugging and analyzing Android applications. Some key tips discussed include fully utilizing the Android emulator's capabilities like 2-finger touch control, clipboard sharing between the emulator and host PC, and mocking locations. Advanced settings for the emulator like foldable and camera emulation are also covered. The presenter recommends ways to configure developer options and use tools like LeakCanary, the Android profiler, and Stetho for testing app stability. Methods for understanding the Android framework by reviewing system services and managers via AIDL files and logcat dumps are presented. Finally, reverse engineering tools like APK Extractor and decompilers are introduced.
Sudheer Mechineni, Head of Application Frameworks, Standard Chartered Bank
Discover how Standard Chartered Bank harnessed the power of Neo4j to transform complex data access challenges into a dynamic, scalable graph database solution. This keynote will cover their journey from initial adoption to deploying a fully automated, enterprise-grade causal cluster, highlighting key strategies for modelling organisational changes and ensuring robust disaster recovery. Learn how these innovations have not only enhanced Standard Chartered Bank’s data infrastructure but also positioned them as pioneers in the banking sector’s adoption of graph technology.
Unlock the Future of Search with MongoDB Atlas_ Vector Search Unleashed.pdfMalak Abu Hammad
Discover how MongoDB Atlas and vector search technology can revolutionize your application's search capabilities. This comprehensive presentation covers:
* What is Vector Search?
* Importance and benefits of vector search
* Practical use cases across various industries
* Step-by-step implementation guide
* Live demos with code snippets
* Enhancing LLM capabilities with vector search
* Best practices and optimization strategies
Perfect for developers, AI enthusiasts, and tech leaders. Learn how to leverage MongoDB Atlas to deliver highly relevant, context-aware search results, transforming your data retrieval process. Stay ahead in tech innovation and maximize the potential of your applications.
#MongoDB #VectorSearch #AI #SemanticSearch #TechInnovation #DataScience #LLM #MachineLearning #SearchTechnology
Observability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdfPaige Cruz
Monitoring and observability aren’t traditionally found in software curriculums and many of us cobble this knowledge together from whatever vendor or ecosystem we were first introduced to and whatever is a part of your current company’s observability stack.
While the dev and ops silo continues to crumble….many organizations still relegate monitoring & observability as the purview of ops, infra and SRE teams. This is a mistake - achieving a highly observable system requires collaboration up and down the stack.
I, a former op, would like to extend an invitation to all application developers to join the observability party will share these foundational concepts to build on:
TrustArc Webinar - 2024 Global Privacy SurveyTrustArc
How does your privacy program stack up against your peers? What challenges are privacy teams tackling and prioritizing in 2024?
In the fifth annual Global Privacy Benchmarks Survey, we asked over 1,800 global privacy professionals and business executives to share their perspectives on the current state of privacy inside and outside of their organizations. This year’s report focused on emerging areas of importance for privacy and compliance professionals, including considerations and implications of Artificial Intelligence (AI) technologies, building brand trust, and different approaches for achieving higher privacy competence scores.
See how organizational priorities and strategic approaches to data security and privacy are evolving around the globe.
This webinar will review:
- The top 10 privacy insights from the fifth annual Global Privacy Benchmarks Survey
- The top challenges for privacy leaders, practitioners, and organizations in 2024
- Key themes to consider in developing and maintaining your privacy program
Unlocking Productivity: Leveraging the Potential of Copilot in Microsoft 365, a presentation by Christoforos Vlachos, Senior Solutions Manager – Modern Workplace, Uni Systems
UiPath Test Automation using UiPath Test Suite series, part 5DianaGray10
Welcome to UiPath Test Automation using UiPath Test Suite series part 5. In this session, we will cover CI/CD with devops.
Topics covered:
CI/CD with in UiPath
End-to-end overview of CI/CD pipeline with Azure devops
Speaker:
Lyndsey Byblow, Test Suite Sales Engineer @ UiPath, Inc.
GraphSummit Singapore | The Art of the Possible with Graph - Q2 2024Neo4j
Neha Bajwa, Vice President of Product Marketing, Neo4j
Join us as we explore breakthrough innovations enabled by interconnected data and AI. Discover firsthand how organizations use relationships in data to uncover contextual insights and solve our most pressing challenges – from optimizing supply chains, detecting fraud, and improving customer experiences to accelerating drug discoveries.
Introducing Milvus Lite: Easy-to-Install, Easy-to-Use vector database for you...Zilliz
Join us to introduce Milvus Lite, a vector database that can run on notebooks and laptops, share the same API with Milvus, and integrate with every popular GenAI framework. This webinar is perfect for developers seeking easy-to-use, well-integrated vector databases for their GenAI apps.
Climate Impact of Software Testing at Nordic Testing DaysKari Kakkonen
My slides at Nordic Testing Days 6.6.2024
Climate impact / sustainability of software testing discussed on the talk. ICT and testing must carry their part of global responsibility to help with the climat warming. We can minimize the carbon footprint but we can also have a carbon handprint, a positive impact on the climate. Quality characteristics can be added with sustainability, and then measured continuously. Test environments can be used less, and in smaller scale and on demand. Test techniques can be used in optimizing or minimizing number of tests. Test automation can be used to speed up testing.
Encryption in Microsoft 365 - ExpertsLive Netherlands 2024Albert Hoitingh
In this session I delve into the encryption technology used in Microsoft 365 and Microsoft Purview. Including the concepts of Customer Key and Double Key Encryption.
Maruthi Prithivirajan, Head of ASEAN & IN Solution Architecture, Neo4j
Get an inside look at the latest Neo4j innovations that enable relationship-driven intelligence at scale. Learn more about the newest cloud integrations and product enhancements that make Neo4j an essential choice for developers building apps with interconnected data and generative AI.
Cosa hanno in comune un mattoncino Lego e la backdoor XZ?Speck&Tech
ABSTRACT: A prima vista, un mattoncino Lego e la backdoor XZ potrebbero avere in comune il fatto di essere entrambi blocchi di costruzione, o dipendenze di progetti creativi e software. La realtà è che un mattoncino Lego e il caso della backdoor XZ hanno molto di più di tutto ciò in comune.
Partecipate alla presentazione per immergervi in una storia di interoperabilità, standard e formati aperti, per poi discutere del ruolo importante che i contributori hanno in una comunità open source sostenibile.
BIO: Sostenitrice del software libero e dei formati standard e aperti. È stata un membro attivo dei progetti Fedora e openSUSE e ha co-fondato l'Associazione LibreItalia dove è stata coinvolta in diversi eventi, migrazioni e formazione relativi a LibreOffice. In precedenza ha lavorato a migrazioni e corsi di formazione su LibreOffice per diverse amministrazioni pubbliche e privati. Da gennaio 2020 lavora in SUSE come Software Release Engineer per Uyuni e SUSE Manager e quando non segue la sua passione per i computer e per Geeko coltiva la sua curiosità per l'astronomia (da cui deriva il suo nickname deneb_alpha).
A tale of scale & speed: How the US Navy is enabling software delivery from l...sonjaschweigert1
Rapid and secure feature delivery is a goal across every application team and every branch of the DoD. The Navy’s DevSecOps platform, Party Barge, has achieved:
- Reduction in onboarding time from 5 weeks to 1 day
- Improved developer experience and productivity through actionable findings and reduction of false positives
- Maintenance of superior security standards and inherent policy enforcement with Authorization to Operate (ATO)
Development teams can ship efficiently and ensure applications are cyber ready for Navy Authorizing Officials (AOs). In this webinar, Sigma Defense and Anchore will give attendees a look behind the scenes and demo secure pipeline automation and security artifacts that speed up application ATO and time to production.
We will cover:
- How to remove silos in DevSecOps
- How to build efficient development pipeline roles and component templates
- How to deliver security artifacts that matter for ATO’s (SBOMs, vulnerability reports, and policy evidence)
- How to streamline operations with automated policy checks on container images
2. 1. Overview - Lucy Park
2. Tutorials - Xiaodong Gu
3. Main conference
a. Semantic parsing - Soonmin Bae
b. Dialogue - Kyungduk Kim
c. Machine translation - Zae Myung Kim
d. Summarization - Hye-Jin Min
4. Workshops - Minjoon Seo
7. Tutorials
Morning:
● 100 Things You Always Wanted to Know about Semantics & Pragmatics But Were Afraid to Ask
● Neural Approaches to Conversational AI
● Variational Inference and Deep Generative Models
● Connecting Language and Vision to Actions
Afternoon:
● Beyond Multiword Expressions: Processing Idioms and Metaphors
● Neural Semantic Parsing
● Deep Reinforcement Learning for NLP
● Multi-lingual Entity Discovery and Linking
To be covered by Xiaodong
8. Thursday:
● BioNLP 2018 (BioNLP)
● Cognitive Aspects of Computational Language Learning and Processing (CogCL)
● Deep Learning Approaches for Low Resource Natural Language Processing (DeepLo)
● Multilingual Surface Realization: Shared Task and Beyond (MSR)
● The 5th Workshop on Natural Language Processing Techniques for Educational Applications (NLPTEA)
● Third Workshop on Computational Approaches to Linguistic Code-Switching (CALCS)
● Workshop on Machine Reading for Question Answering (MRQA)
● Workshop on Relevance of Linguistic Structure in Neural Architectures for NLP (RELNLP)
Friday:
● 1st Workshop on Economics and Natural Language Processing (ECONLP)
● 3rd Workshop on Representation Learning for NLP (RepL4NLP)
● First Workshop on Computational Modeling of Human Multimodal Language (MML_Challenge)
● Sixth International Workshop on Natural Language Processing for Social Media (SocialNLP)
● The 2nd Workshop on Neural Machine Translation and Generation (NMT)
● The Seventh Named Entities Workshop (NEWS)
● Workshop for NLP Open Source Software (NLPOSS)
Workshops (7 out of 15 are recurring workshops)
9. Same Thursday/Friday workshop lists as slide 8 (BioNLP 2018 in Room 207, MCEC)
Workshops
To be covered by Minjoon
11. Best papers
Short:
● Know What You Don’t Know: Unanswerable Questions for SQuAD.
Pranav Rajpurkar, Robin Jia and Percy Liang
● ‘Lighter’ Can Still Be Dark: Modeling Comparative Color Descriptions.
Olivia Winn and Smaranda Muresan
Long:
● Finding syntax in human encephalography with beam search.
John Hale, Chris Dyer, Adhiguna Kuncoro and Jonathan Brennan.
● Learning to Ask Good Questions: Ranking Clarification Questions using Neural Expected Value of Perfect Information.
Sudha Rao and Hal Daumé III.
● Let’s do it “again”: A First Computational Approach to Detecting Adverbial Presupposition Triggers.
Andre Cianflone,* Yulan Feng,* Jad Kabbara* and Jackie Chi Kit Cheung. (* equal contribution)
New tasks + new data
12. Know what you don’t know:
Unanswerable Questions for SQuAD
Background: MRC (machine reading comprehension)
Problem: Unanswerable questions in MRC
Proposal: SQuAD 2.0
- 50,000 unanswerable questions written
adversarially by crowdworkers
- a model with 86% F1 on SQuAD 1.1
achieves only 66% F1 on SQuAD 2.0
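Systems tackling SQuAD 2.0 typically abstain when a "no-answer" score beats the best span score. A minimal sketch, assuming hypothetical model scores and a tunable threshold (none of these names come from the paper):

```python
def predict_with_abstention(best_span, span_score, null_score, tau=0.0):
    """Return the extracted span, or "" when the question is judged unanswerable.

    span_score / null_score are assumed to come from a reading-comprehension
    model; tau is a threshold tuned on dev data to trade answer F1 for abstention.
    """
    if null_score - span_score > tau:
        return ""  # abstain: the no-answer hypothesis wins
    return best_span

# Toy usage with made-up scores:
print(predict_with_abstention("Denver Broncos", span_score=7.2, null_score=3.1))
print(predict_with_abstention("1999", span_score=2.0, null_score=5.5))
```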
13. Learning to Ask Good Questions:
Ranking Clarification Questions using Neural Expected Value of Perfect Information
Problem: Make a machine generate questions
- Previous work: reading comprehension
- Proposal: find missing information
Data: 77k Unix-related posts from StackExchange
Method:
1. Extract candidate q's from the 10 questions most similar to p
2. Derive an answer a suited to each q
3. Rank each q by how much a would improve p (EVPI)
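The three-step method above can be sketched as follows. The corpus, the word-overlap similarity, and the overlap-based "utility" are toy stand-ins for the paper's retrieval step and learned EVPI models:

```python
from collections import Counter

def overlap(a, b):
    """Toy similarity: shared-word count (stand-in for real retrieval/EVPI models)."""
    wa, wb = Counter(a.lower().split()), Counter(b.lower().split())
    return sum((wa & wb).values())

def rank_clarification_questions(post, corpus, k=3):
    """Sketch of the EVPI pipeline under toy assumptions.

    corpus: list of (post, question, answer) triples, standing in for the 77k
    StackExchange posts. Real EVPI scores the expected utility of the post
    updated with answer a; here utility is faked as word overlap with the post.
    """
    # 1. candidate questions from the k most similar posts
    neighbors = sorted(corpus, key=lambda t: overlap(post, t[0]), reverse=True)[:k]
    candidates = [(q, a) for _, q, a in neighbors]
    # 2.-3. rank each question by how useful its answer would be for this post
    scored = sorted(candidates, key=lambda qa: overlap(post, qa[1]), reverse=True)
    return [q for q, _ in scored]
```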
15. Best demo: uroman
● Romanization for 290 langs
● Unicode-based, but with 1,088 hand-crafted rules added
● https://www.isi.edu/~ulf/uroman.html
16. Lifetime Achievement Award
* https://aclweb.org/aclwiki/ACL_Lifetime_Achievement_Award_Recipients
“Algorithms like LSTM and RNN
may work in practice.
But do they work in theory?”
● Mark Steedman (U. Edinburgh)
● Known for work on CCG
○ Combinatory Categorial Grammar
○ A type of phrase structure grammar
18. Variational Inference and Deep Generative Models
Let X and Z be random variables, where X is our observed data and Z is a latent variable. A generative model is any model
that defines their joint distribution: p(x, z)
Our goal: estimate the posterior over latent variables p(z|x), which is often intractable!
Our solution: approximate p(z|x) with q(z;θ), which is computable!
ELBO: L(θ) = log p(x) - KL(q(z;θ) || p(z|x)) <= log p(x)
VI turns inference into optimization
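The bound can be checked numerically. A toy sketch, assuming a conjugate Gaussian model (p(z) = N(0,1), p(x|z) = N(z,1), not from the tutorial) where log p(x) and the exact posterior N(x/2, 1/2) are known in closed form:

```python
import math, random

random.seed(0)

def log_norm(v, mean, var):
    """Log density of N(mean, var) at v."""
    return -0.5 * (math.log(2 * math.pi * var) + (v - mean) ** 2 / var)

def elbo(x, m, s, n=50_000):
    """Monte Carlo ELBO E_q[log p(x,z) - log q(z)] with q(z) = N(m, s^2)."""
    total = 0.0
    for _ in range(n):
        z = random.gauss(m, s)
        log_joint = log_norm(z, 0.0, 1.0) + log_norm(x, z, 1.0)
        total += log_joint - log_norm(z, m, s * s)
    return total / n

x = 1.0
log_evidence = log_norm(x, 0.0, 2.0)      # exact log p(x) for this model
print(elbo(x, m=0.5, s=math.sqrt(0.5)))   # q matches the posterior: equals log p(x)
print(elbo(x, m=-1.0, s=1.0))             # worse q: below log p(x) by KL(q || p(z|x))
```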
19. Variational Autoencoder
Solution: VAE
- Inference network gradient and generative network gradient
- Again requires approximation by sampling
(Figure: inference network and generative network)
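The "approximation by sampling" for the inference-network gradient is commonly implemented with the reparameterization trick. A minimal scalar sketch, with a toy objective that is not from the slides:

```python
import random

random.seed(1)

def grad_reparam(f_prime, m, s, n=50_000):
    """Reparameterization gradient of E_{z~N(m, s^2)}[f(z)] w.r.t. m.

    Writing z = m + s*eps with eps ~ N(0,1) makes the sample differentiable
    in m, so d/dm E[f(z)] = E[f'(m + s*eps)], estimated by Monte Carlo.
    """
    return sum(f_prime(m + s * random.gauss(0.0, 1.0)) for _ in range(n)) / n

# Toy check: f(z) = z^2 gives E[z^2] = m^2 + s^2, so d/dm = 2m.
g = grad_reparam(lambda z: 2 * z, m=1.5, s=0.7)
print(g)  # close to 3.0
```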
20. Deep Reinforcement Learning for NLP
- RL introduction
- Fundamentals and Overview
- Deep Reinforcement Learning for Dialogue
- Challenges
21. Fundamentals and Overview
Why use RL in NLP?
1. Learning to search and reason
2. Instead of minimizing the surrogate loss (e.g., XE, hinge loss), optimize the end
metric (e.g., BLEU, ROUGE) directly.
3. Select the right (unlabeled) data.
4. Back-propagate the reward to update the model.
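Points 2 and 4 above are typically realized with REINFORCE: sample an output, score it with the end metric, and scale the log-probability gradient by that reward. A self-contained toy sketch (the one-token "model" and 0/1 reward stand in for a seq2seq model and BLEU/ROUGE; nothing here is from the tutorial):

```python
import math, random

random.seed(0)

VOCAB = ["good", "bad", "</s>"]

def softmax(logits):
    m = max(logits)
    exps = [math.exp(l - m) for l in logits]
    z = sum(exps)
    return [e / z for e in exps]

def sample_token(probs):
    r, c = random.random(), 0.0
    for i, p in enumerate(probs):
        c += p
        if r < c:
            return i
    return len(probs) - 1

# Toy "policy": a single softmax over the vocab, trained so the sequence-level
# reward (not a token-wise surrogate loss) drives the update.
logits = [0.0, 0.0, 0.0]
reward = lambda tok: 1.0 if VOCAB[tok] == "good" else 0.0  # stand-in for BLEU

lr = 0.5
for _ in range(200):
    probs = softmax(logits)
    tok = sample_token(probs)
    r = reward(tok)
    # REINFORCE: grad = reward * d log p(tok) / d logits
    # for a softmax, d log p(tok)/d logit_i = 1[i == tok] - p_i
    for i in range(len(logits)):
        logits[i] += lr * r * ((1.0 if i == tok else 0.0) - probs[i])

print(softmax(logits)[0])  # probability of the high-reward token grows toward 1
```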
22. Deep Reinforcement Learning for Dialogue
Sequence-to-Sequence Model for Dialogue
Problem 1: Repetitive responses
Shut up! -> You shut up! -> You shut up! -> ...
Problem 2: Short-sighted conversation decisions
How old are you? -> I'm 16 -> 16? -> I don't know what you are talking about
-> you don't know what you're saying -> I don't know ... -> ...
Reinforcement Learning
State: message encoding
Action: response decoding
Rewards:
- Ease of answering: r1 = -log p(dull utterance | response)
- Information flow: r2 = -log sigmoid(cos(s1, s2))
- Meaningfulness: r3 = log p(response | message) + log p(message | response)
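The three rewards are combined into one scalar for the policy gradient. A sketch with toy inputs; the equal weighting and all input values are illustrative, while real log-probabilities and turn encodings s1, s2 would come from trained seq2seq models:

```python
import math

def dialogue_reward(logp_dull_given_resp, s1, s2,
                    logp_resp_given_msg, logp_msg_given_resp,
                    weights=(1.0 / 3, 1.0 / 3, 1.0 / 3)):
    """Weighted sum of the three RL-for-dialogue rewards (toy inputs)."""
    # Ease of answering: penalize responses that invite dull follow-ups
    r1 = -logp_dull_given_resp
    # Information flow: penalize consecutive turns that are too similar
    dot = sum(a * b for a, b in zip(s1, s2))
    norm = math.sqrt(sum(a * a for a in s1)) * math.sqrt(sum(b * b for b in s2))
    cos = dot / norm
    r2 = -math.log(1.0 / (1.0 + math.exp(-cos)))  # -log sigmoid(cos(s1, s2))
    # Meaningfulness: mutual information between message and response
    r3 = logp_resp_given_msg + logp_msg_given_resp
    w1, w2, w3 = weights
    return w1 * r1 + w2 * r2 + w3 * r3
```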
23. Challenges
NLP problems that present new challenges to RL
• An unbounded action space defined by natural language
• Dealing with combinatorial actions and external knowledge
• Learning reward functions for NLG
RL problems that are particularly relevant to NLP
• Sample complexity
• Model-based vs. model free RL
• Acquiring rewards
25. Intro
Semantic Parsing : mapping natural language utterances to machine interpretable representations
(e.g., executable queries or logical forms)
Semantic Role Labeling : shallow semantic parsing, assigning labels to words
(e.g., predicate-argument structure or “who does what to whom, when, and where”)
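For illustration, mapping an utterance to an executable query can be sketched with patterns; the patterns, table, and column names below are invented, and the systems presented at ACL 2018 are neural rather than rule-based:

```python
import re

# Toy pattern-based semantic parser: utterance -> executable-style query.
PATTERNS = [
    (re.compile(r"show flights from (\w+) to (\w+)", re.I),
     lambda m: f"SELECT * FROM flights WHERE src='{m.group(1)}' AND dst='{m.group(2)}'"),
    (re.compile(r"how many flights to (\w+)", re.I),
     lambda m: f"SELECT COUNT(*) FROM flights WHERE dst='{m.group(1)}'"),
]

def parse(utterance):
    for pat, build in PATTERNS:
        m = pat.search(utterance)
        if m:
            return build(m)
    return None  # utterance not covered by any pattern

print(parse("Show flights from Boston to Denver"))
# SELECT * FROM flights WHERE src='Boston' AND dst='Denver'
```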
Semantic Parsing @ ACL 2018
3 Talk Sessions for Semantic Parsing (2A, 6A, 7A) : 12 papers
2 Poster Sessions for Semantics (1B, 3C) : 15 + 14 papers ⇒ half SP/SRL, half embeddings
1 Tutorial Session for Neural Semantic Parsing
2 Talk Sessions for Semantics (5A, 8A) : 8 papers ⇒ mostly embeddings, but a few SRL
2 Talk Sessions for Parsing (3F, 4F) : 8 papers ⇒ grammars/syntax + dependency parsing
1 Poster Session for Morphology, Tagging, Parsing (3F)
26. Improving Text-to-SQL Evaluation Methodology [1/3] - Intro
https://arxiv.org/abs/1806.09029
“How to evaluate Text-to-SQL” must be improved.
- DBs are not realistic
- question-based split vs query-based split
A template-based, slot-filling baseline should fail,
but it works under the question-based split.
1B
27. Improving Text-to-SQL Evaluation Methodology [2/3] - DB Issues
Issue 1: Human-written questions require more complex queries than automatically generated data
(e.g., joins, nesting)
Issue 2: Training and test sets should be split based on the SQL query instead of the question.
1B
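The query-based split from Issue 2 can be sketched as grouping examples by an anonymized SQL template before splitting, so no template leaks from train to test. The canonicalization rule here is a toy assumption, not the paper's procedure:

```python
import re
from collections import defaultdict

def canonical(sql):
    """Anonymize string and number literals so near-duplicate queries share a template."""
    return re.sub(r"'[^']*'|\b\d+\b", "?", sql.lower())

def query_based_split(pairs, test_fraction=0.5):
    """Split (question, sql) pairs so that no SQL template appears in both sets.

    A question-based split would instead let the same template occur in train
    and test, which a slot-filling baseline can exploit.
    """
    by_template = defaultdict(list)
    for q, s in pairs:
        by_template[canonical(s)].append((q, s))
    templates = sorted(by_template)  # deterministic order for the demo
    cut = int(len(templates) * (1 - test_fraction))
    train = [p for t in templates[:cut] for p in by_template[t]]
    test = [p for t in templates[cut:] for p in by_template[t]]
    return train, test
```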
28. Improving Text-to-SQL Evaluation Methodology [3/3]
- Template-based, Slot-filling Baseline must fail, but works
Oracle Entities : To provide accurate anonymization, we annotated query variables using a combination of
automatic and manual processing.
DB from NLP communities DB from DB communities
1B
29. Coarse-to-Fine Decoding for Neural Semantic Parsing [1/2]
- Honourable Mention for Best Paper
https://arxiv.org/abs/1805.04793
Structure-aware neural architecture
1. Generate a rough sketch of input utterance’s meaning, where low-level info is glossed over
2. Experimental results on four datasets
2A
31. Jointly Predicting Predicates and Arguments in Neural
Semantic Role Labeling
https://arxiv.org/abs/1805.04787
End-to-end approach for jointly predicting all predicates, argument spans, and the relations between them
5A
32. Large-Scale QA-SRL Parsing [1/3] - Honourable Mention for Best Paper
https://arxiv.org/abs/1805.05377
Contributions
1. Scalable crowdsourcing QA-SRL
2. New models for QA-SRL parser :
span detection → question generation
Challenges
1. Scale up QA-SRL data annotation
2. Train a QA-SRL Parser
3. Improve Recall
Results
- 133,479 verbs from 64,018 sentences
- across 3 domains (Wikipedia, Wikinews, Science)
- 265,140 question-answer pairs in 9 days
7A
33. Large-Scale QA-SRL Parsing [2/3] - Honourable Mention for Best Paper
Crowdsourcing pipeline
1. Generation
1.1. Write QA-SRL questions for each verb
⇒ Auto-suggesting complete questions reduces the number of keystrokes
1.2. Highlight answers in the sentence
2. Validation
2.1. Answer each question or mark it invalid
Model
7A
37. Dialog System @ ACL 2018
Neural Approaches
- Neural Response Generation
- Dialog Policy Optimization
- Personalized Dialog System
Discourse Analysis
Dialog System
- Oral presentation: 12 long, 4 short papers
- Poster presentation: 16 papers (SIGDIAL 2018 is co-located)
Question Answering
- Oral presentation: 8 long, 4 short papers
- Poster presentation: 11+ papers
Intro
38. Intro
● Conclusions
○ Methodologies for applying neural approaches to task-oriented dialogue systems are being actively researched
○ Work presented by industry players such as Amazon Alexa, practical enough to apply in production (including to Clova), also stood out
○ Evaluation methodology for dialogue systems is still being settled
■ Perplexity is sometimes used, but humans (e.g., via MTurk) still verify performance directly
■ Shared tasks such as ConvAI2 also run their evaluation through MTurk
39. Exemplar Encoder-Decoder for Neural Conversation Generation
Neural Conversation Generation using example contexts and their responses
43. Efficient Large-Scale Neural Domain Classification with Personalized
Attention
Evaluation
Weak: 100K+ utterances from 600K+ users across 1,000+ Alexa skills
Generated from utterances with explicit invocation patterns.
e.g.) “Ask TAXI to get me a ride” → (get me a ride, TAXI)
MTurk: 20K+ human-paraphrased utterances of randomly sampled commands
from test set of Weak data
e.g.) “get me a ride” → I need a ride, can I get a taxi, order a car for me, …
Evaluate performance with natural language
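The explicit invocation-pattern extraction in the example above can be sketched with a regular expression. The pattern covers only "Ask/Tell X to Y" and is an assumption for illustration, not Alexa's actual invocation grammar:

```python
import re

# Toy extractor mirroring: "Ask TAXI to get me a ride" -> ("get me a ride", "TAXI")
INVOKE = re.compile(r"^(?:ask|tell) (?P<skill>\w+) to (?P<utt>.+)$", re.I)

def extract(utterance):
    """Return (inner utterance, skill name) for an explicit invocation, else None."""
    m = INVOKE.match(utterance.strip())
    if not m:
        return None
    return (m.group("utt"), m.group("skill").upper())

print(extract("Ask TAXI to get me a ride"))  # ('get me a ride', 'TAXI')
print(extract("play some music"))            # None: no explicit invocation
```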
44. Efficient Large-Scale Neural Domain Classification with Personalized
Attention
Baseline model
1-bit model
(user-enabled domain bitmap)
Attention model
(user-enabled domain attention)
45. Improving Slot Filling in Spoken Language Understanding with
Joint Pointer and Attention
46. Improving Slot Filling in Spoken Language Understanding with
Joint Pointer and Attention
47. Improving Slot Filling in Spoken Language Understanding with
Joint Pointer and Attention
49. Intro
● Machine translation
○ Translating a sequence of tokens in lang. A into that of lang. B
○ All MT papers @ ACL 2018 focused on neural MT (NMT) as opposed to
statistical MT (SMT)
● 11 long, 8 shorts, 1 demo, 1 NMT workshop (16 papers)
○ Main conf. papers (see next page)
○ Papers @ NMT workshop
○ Talks @ NMT workshop
● Trends in MT @ ACL 2018
○ Less exploration on DL architectures than last year, but more NLP-oriented
○ Linguistic structure, document-level translation, data augmentation,
efficient computation, domain adaptation, handling inadequate resources,
and analysis of models
50. List of MT related papers (main conf.)
1. Unsupervised Neural Machine Translation with
Weight Sharing. Yang et al. [link]
2. Subword Regularization: Improving Neural Network
Translation Models with Multiple Subword
Candidates. Kudo. [link]
3. Forest-Based Neural Machine Translation. Ma et al.
[link]
4. Attention Focusing for Neural Machine Translation
by Bridging Source and Target Embeddings. Kuang
et al. [link]
5. Context-Aware Neural Machine Translation Learns
Anaphora Resolution. Voita et al. [link]
6. The Best of Both Worlds: Combining Recent
Advances in Neural Machine Translation. Chen et al.
[link]
7. Towards Robust Neural Machine Translation. Cheng
et al. [link]
8. Document Context Neural Machine Translation with
Memory Networks. Maruf and Haffari. [link]
9. A Stochastic Decoder for Neural Machine
Translation. Schulz et al. [link]
10. Accelerating Neural Transformer via an Average
Attention Network. Zhang et al. [link]
11. Are BLEU and Meaning Representation in
Opposition?. Cífka and Bojar. [link]
12. A Simple and Effective Approach to Coverage-Aware
Neural Machine Translation. Li et al. [link]
13. Compositional Representation of
Morphologically-Rich Input for Neural Machine
Translation. Ataman and Federico. [link]
14. Extreme Adaptation for Personalized Neural Machine
Translation. Michel and Neubig. [link]
15. Learning from Chunk-based Feedback in Neural
Machine Translation. Petrushkov et al. [link]
16. Sparse and Constrained Attention for Neural
Machine Translation. Malaviya et al. [link]
17. Bag-of-Words as Target for Neural Machine
Translation. Ma et al. [link]
18. Improving Beam Search by Removing Monotonic
Constraint for Neural Machine Translation. Shu and
Nakayama. [link]
19. Adaptive Knowledge Sharing in Multi-Task Learning:
Improving Low-Resource Neural Machine
Translation. Zaremoodi et al. [link]
20. Marian: Fast Neural Machine Translation in C++.
Junczys-Dowmunt et al. [link]
51. Document-level translation
Context-Aware Neural Machine Translation Learns Anaphora
Resolution. Voita et al. [link]
● OpenSubtitles 2018; En-Ru; takes the previous
subtitle as “context”
● Modifies Transformer’s encoder part to incorporate
the context
● Model attends to the context when translating
ambiguous pronouns
52. Data augmentation
Towards Robust Neural Machine Translation. Cheng et al. [link]
● Small perturbations of an input can dramatically degrade its
translation
● Perturb input by 1) randomly replacing words, or 2)
adding Gaussian noise to word embedding
● Adversarial learning trains a perturbation-invariant encoder, so
the NMT model behaves consistently for a source sentence x and
its perturbed counterpart x′
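The two perturbation strategies from the slide can be sketched as follows. The replacement probability `p` and noise scale `sigma` are illustrative hyperparameters, not values from the paper:

```python
import numpy as np

rng = np.random.default_rng(0)

def perturb_tokens(tokens, vocab, p=0.1):
    """Perturbation 1: replace each word with a random vocabulary
    word with probability p."""
    return [rng.choice(vocab) if rng.random() < p else t for t in tokens]

def perturb_embeddings(emb, sigma=0.01):
    """Perturbation 2: add Gaussian noise to the word embeddings."""
    return emb + rng.normal(scale=sigma, size=emb.shape)
```

During training, both x and its perturbation x′ are fed through the model, and the adversarial objective penalizes the encoder when the two can be told apart.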
55. Overview of Text Summarization
● TASK
○ Extractive summarization / Abstractive Summarization
● DATA
○ CNN/Daily Mail data sets / English giga word corpus
● ISSUES
○ OOV (Out-Of-Vocabulary) / Repetition
● Techniques
○ Seq2seq model + α
56. 3 selected papers
| Model         | Task                      | Data                  | Issues / Focus                                  | Techniques                              |
|---------------|---------------------------|-----------------------|-------------------------------------------------|-----------------------------------------|
| Unified Model | Abstractive summarization | CNN/Daily Mail        | combining advantages of extractor and abstractor | unified model & inconsistency mechanism |
| SWAP-NET      | Extractive summarization  | CNN/Daily Mail        | interaction of key words and salient sentences   | 2-level pointer network                 |
| Re3Sum        | Abstractive summarization | English Gigaword      | repetition, short length                         | IR approach + seq2seq                   |
57. Unified Model
● Combining sentence-level and word-level attentions to take advantage of both
extractive and abstractive summarization approaches
58. Unified Model
● Combining sentence-level and word-level attentions to take advantage of both
extractive and abstractive summarization approaches
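One way the sentence- and word-level attentions can be combined, sketched under the assumption that each word's attention is scaled by the attention of its containing sentence and renormalized (the inconsistency loss itself is omitted here):

```python
import numpy as np

def combine_attentions(word_attn, sent_attn, sent_of_word):
    """Scale each word's attention by the sentence-level attention of
    the sentence containing it, then renormalize to a distribution."""
    scaled = word_attn * sent_attn[sent_of_word]
    return scaled / scaled.sum()

word_attn = np.array([0.3, 0.2, 0.1, 0.4])
sent_attn = np.array([0.8, 0.2])
sent_of_word = np.array([0, 0, 1, 1])  # first two words are in sentence 0
combined = combine_attentions(word_attn, sent_attn, sent_of_word)
# → [0.48, 0.32, 0.04, 0.16]
```

Words in the sentence the extractor favors (sentence 0) gain attention mass, which is how the extractive signal guides the abstractive decoder.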
59. SWAP-NET
● Sentences and Words
from Alternating
Pointer Network
● 2 level pointer network
○ The sentence-level
○ The word-level
● Switch mechanism
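One decoding step of a two-level pointer network with a switch can be sketched as below. This is a minimal illustration, not SWAP-NET's exact parameterization; `W_switch` and the dot-product pointer scores are assumptions:

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def swapnet_step(h, sent_states, word_states, W_switch):
    """One step of a 2-level pointer network with a switch (sketch):
    q is the probability of pointing at a word rather than a sentence,
    and each pointer distribution is weighted accordingly."""
    q = 1.0 / (1.0 + np.exp(-(W_switch @ h)))   # P(point at a word)
    word_probs = q * softmax(word_states @ h)
    sent_probs = (1.0 - q) * softmax(sent_states @ h)
    return sent_probs, word_probs

rng = np.random.default_rng(0)
h = rng.normal(size=8)                 # decoder state
sent_states = rng.normal(size=(3, 8))  # 3 sentence encodings
word_states = rng.normal(size=(10, 8)) # 10 word encodings
W_switch = rng.normal(size=8)
sent_probs, word_probs = swapnet_step(h, sent_states, word_states, W_switch)
```

The two weighted distributions together sum to one, so the model jointly decides *what kind* of unit to select and *which* unit of that kind.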
60. Retrieve, Rerank and Rewrite (Re3Sum)
● Soft Template Based Neural Summarization
○ to improve the readability and stability of seq2seq summarization systems
● fuse the popular IR-based and seq2seq-based summarization systems
61. Retrieve, Rerank and Rewrite (Re3Sum)
● Jointly Rerank and Rewrite
○ A bi-directional RNN encoder reads the input x and the template r
○ Reranking learns to select a soft template r that resembles the actual summary y∗ as much as possible
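The "Retrieve" stage can be sketched with a simple lexical retriever: fetch the training summaries whose source sides are most similar to the input, and use them as candidate soft templates. This assumes plain cosine similarity over term counts (the paper uses a full IR system), and `retrieve_templates` is a hypothetical helper:

```python
from collections import Counter
import math

def cosine(a, b):
    """Cosine similarity between two term-count dictionaries."""
    num = sum(a[t] * b[t] for t in a if t in b)
    denom = (math.sqrt(sum(v * v for v in a.values()))
             * math.sqrt(sum(v * v for v in b.values())))
    return num / denom if denom else 0.0

def retrieve_templates(input_text, corpus, k=2):
    """Return the summaries of the k training (source, summary) pairs
    whose sources are most similar to the input text."""
    q = Counter(input_text.lower().split())
    scored = sorted(corpus,
                    key=lambda pair: cosine(q, Counter(pair[0].lower().split())),
                    reverse=True)
    return [summary for source, summary in scored[:k]]

corpus = [("the cat sat on the mat", "cat sits"),
          ("stock prices fell sharply", "stocks fall"),
          ("the dog sat", "dog sits")]
print(retrieve_templates("the cat sat", corpus, k=2))
# → ['cat sits', 'dog sits']
```

The Rerank and Rewrite stages then score these candidates jointly with generation, which is what stabilizes the seq2seq output.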
62. References
● Unified Model: A Unified Model for Extractive and Abstractive Summarization using Inconsistency
Loss
○ https://arxiv.org/abs/1805.06266
○ https://hsuwanting.github.io/unified_summ/
● SWAP-NET: Extractive Summarization with SWAP-NET: Sentences and Words from Alternating Pointer
Networks
○ http://aclweb.org/anthology/P18-1014
○ https://github.com/aishj10/swap-net/tree/master/swap-net-summer
● Re3Sum: Retrieve, Rerank and Rewrite: Soft Template Based Neural Summarization
○ http://aclweb.org/anthology/P18-1015
○ http://www4.comp.polyu.edu.hk/~cszqcao/