1. Deep Learning for Dialogue Systems
Liangqun Lu
PhD program in Biology/Bioinformatics
MS program in Computer Science
2. JARVIS (Just Another Rather Very Intelligent System)
"J.A.R.V.I.S., are you up?"
"For you sir, always."
"J.A.R.V.I.S.? You ever hear the
tale of Jonah?"
"I wouldn't consider him a role
model."
"J.A.R.V.I.S., where's my flight
power?!"
"Working on it, sir. This is a
prototype."
https://www.youtube.com/watch?v=ZwOxM0-byvc
10. Summaries
● The Seq2seq model can generate output sentences conditioned on input sentences.
● The maximum likelihood estimation (MLE) objective function does not guarantee good responses in real-world conversations with humans.
● It tends to generate dull, generic responses such as "I don't know" regardless of the input, which kills a conversation.
● A Mutual Information (MI) objective can avoid roughly 30% of the dull responses (a toy reranking sketch follows this list).
● The model is also prone to getting stuck in an infinite loop of repetitive responses.
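To make the MI idea concrete, here is a minimal Python sketch of Maximum Mutual Information (MMI) reranking in the spirit of Li et al. (2016). Both scoring functions are toy stand-ins for the trained Seq2Seq and language models; every name below is an illustrative assumption, not the paper's code.

# Sketch of MMI reranking: score each candidate response T for message S by
# log P(T|S) - lambda * log P(T), so responses that are likely under ANY
# input ("i don't know") are demoted.

def log_p_response_given_input(response: str, message: str) -> float:
    """Toy stand-in for log P(T|S) from a trained Seq2Seq model."""
    overlap = len(set(response.split()) & set(message.split()))
    return -len(response.split()) + 0.5 * overlap

def log_p_response(response: str) -> float:
    """Toy stand-in for log P(T) from an unconditional language model."""
    generic = {"i", "don't", "know"}
    bonus = sum(1 for w in response.lower().split() if w in generic)
    return -len(response.split()) + bonus  # generic phrases score high

def mmi_rerank(message, candidates, lam=0.5):
    # MMI-antiLM objective: penalizing log P(T) demotes dull responses.
    return max(candidates,
               key=lambda t: log_p_response_given_input(t, message)
                             - lam * log_p_response(t))

print(mmi_rerank("where do you live now ?",
                 ["i don't know", "i live in los angeles"]))

Under plain MLE the dull candidate would win; with the anti-language-model penalty the specific answer is selected.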
11. 2. RL for sentence generation
[Diagram: an encoder-decoder generator maps the input sentence h to an output sentence x, and a human assigns a reward R(h, x) that is fed back to train the generator; a toy sketch of this loop follows.]
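A minimal Python sketch of the policy-gradient loop in the diagram: the generator produces a response x to input h, a (here simulated) human returns a scalar reward R(h, x), and REINFORCE nudges the generator toward higher-reward responses. The unigram "generator" and the toy reward are assumptions for illustration, not the actual chatbot.

import numpy as np

rng = np.random.default_rng(0)
vocab = ["hi", "bye", "i", "don't", "know"]
logits = np.zeros(len(vocab))   # degenerate "generator": a unigram policy

def simulated_human_reward(h, x):
    # Stand-in for R(h, x): the human dislikes the dull token "know".
    return 0.0 if x == "know" else 1.0

lr = 0.1
for step in range(500):
    probs = np.exp(logits) / np.exp(logits).sum()
    i = rng.choice(len(vocab), p=probs)        # sample x ~ pi(.|h)
    r = simulated_human_reward("how are you?", vocab[i])
    grad = -probs                              # d log pi(i) / d logits ...
    grad[i] += 1.0                             # ... equals e_i - probs
    logits += lr * r * grad                    # REINFORCE update

print({w: round(p, 2)
       for w, p in zip(vocab, np.exp(logits) / np.exp(logits).sum())})

Tokens that never earn reward lose probability mass over training, which is the core mechanic the dialogue papers build on.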
12-14. Hung-yi Lee: RL and GAN for Sentence Generation and Chat-bot
16. Evaluation
● Training: OpenSubtitles dataset (0.8M message-response pairs)
● Testing: 1,000 input messages
● Metrics: dialogue length, lexical diversity, and human evaluation (a metric sketch follows this list)
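As a concrete illustration of the lexical-diversity metric, here is a small Python sketch of distinct-n, the ratio of unique n-grams to all generated n-grams; I am assuming the common formulation from Li et al. (2016). Dialogue length, the other automatic metric, counts simulated turns before a dull or repetitive response ends the exchange.

def distinct_n(responses, n=1):
    # distinct-n = unique n-grams / total n-grams across all responses
    ngrams, total = set(), 0
    for r in responses:
        toks = r.split()
        grams = list(zip(*[toks[i:] for i in range(n)]))
        ngrams.update(grams)
        total += len(grams)
    return len(ngrams) / total if total else 0.0

responses = ["i don't know", "i don't know", "i live in los angeles"]
print("distinct-1:", round(distinct_n(responses, 1), 3))
print("distinct-2:", round(distinct_n(responses, 2), 3))

Repeating the same dull response drags both scores down, which is why the metric discriminates between the MLE and RL models.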
17. Summaries
● Reinforcement learning applied to dialogue generation rewards the conversation for three properties: informativity, coherence, and ease of answering (a toy reward sketch follows this list).
● The model shows advantages in diversity, dialogue length, human-judged quality, and interactivity of responses.
● This approach makes it possible to generate long-term dialogues.
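A toy Python sketch of the combined reward from Li et al. (2016), r = lambda1*r1 + lambda2*r2 + lambda3*r3 with the paper's reported weights (0.25, 0.25, 0.5). The three component functions below are simplified surface heuristics standing in for the paper's likelihood- and embedding-based definitions, so treat them as assumptions.

DULL = {"i don't know", "i have no idea"}

def r_ease_of_answering(action: str) -> float:
    # Paper: negative log-likelihood of replying to `action` with a dull
    # response. Toy version: penalize actions that are themselves dull.
    return -1.0 if action.lower() in DULL else 0.5

def r_information_flow(prev_turn: str, action: str) -> float:
    # Paper: penalize semantic similarity to the agent's previous turn.
    a, b = set(prev_turn.split()), set(action.split())
    return -len(a & b) / max(len(a | b), 1)

def r_coherence(message: str, action: str) -> float:
    # Paper: mutual information between message and action. Toy version:
    # reward word overlap with the incoming message.
    return (len(set(message.split()) & set(action.split()))
            / max(len(action.split()), 1))

def reward(message, prev_turn, action, lambdas=(0.25, 0.25, 0.5)):
    l1, l2, l3 = lambdas
    return (l1 * r_ease_of_answering(action)
            + l2 * r_information_flow(prev_turn, action)
            + l3 * r_coherence(message, action))

print(reward("where are you going ?", "see you later", "i don't know"))
print(reward("where are you going ?", "see you later",
             "i am going to the store"))

The dull reply scores negative while the informative, on-topic reply scores positive, which is the signal the policy gradient then optimizes.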
23. Baselines and metrics in SeqGAN's synthetic-data experiment:
● Random: random token generation
● MLE: Seq2Seq trained with the MLE objective function
● SS: scheduled sampling
● PG-BLEU: policy gradient with BLEU* as the reward
* BLEU: bilingual evaluation understudy
* NLL oracle: the negative log-likelihood of generated sequences under the oracle model (a toy sketch follows)
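A toy Python sketch of the NLL-oracle metric: in the paper the oracle is a randomly initialized LSTM that defines the "true" data distribution, and generated sequences are scored by their negative log-likelihood under it. Here both oracle and generator are simple unigram distributions (an assumption) so the sketch stays runnable.

import numpy as np

rng = np.random.default_rng(1)
V = 10
oracle_probs = rng.dirichlet(np.ones(V))   # stand-in for the oracle LSTM
gen_probs = rng.dirichlet(np.ones(V))      # stand-in for the generator

def nll_oracle(samples):
    # Average negative log-likelihood of generated tokens under the oracle.
    return -np.mean(np.log(oracle_probs[samples]))

samples = rng.choice(V, size=(100, 20), p=gen_probs)     # 100 sequences, length 20
print("NLL_oracle (generator):", round(nll_oracle(samples), 3))
samples_true = rng.choice(V, size=(100, 20), p=oracle_probs)
print("NLL_oracle (oracle itself):", round(nll_oracle(samples_true), 3))

The closer the generator's NLL_oracle gets to the oracle's own score, the better it has matched the true distribution.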
24. ● The stability of SeqGAN depends on the training strategy: the number of g-steps, d-steps, and the number of epochs k the discriminator is trained per d-step (a loop skeleton follows)
● The setting g-steps = 1, d-steps = 5, k = 3 gives the best performance
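A Python skeleton of the alternating schedule, showing where g-steps, d-steps, and k enter the loop. The two stub classes are placeholders (assumptions) standing in for SeqGAN's real generator and discriminator.

class Generator:
    def policy_gradient_update(self, discriminator):
        pass  # REINFORCE update with rollout rewards from the discriminator
    def sample(self, n):
        return ["fake"] * n

class Discriminator:
    def fit(self, real, fake):
        pass  # one epoch of real-vs-fake training

def train_seqgan(gen, disc, data, total_iters=50, g_steps=1, d_steps=5, k=3):
    for _ in range(total_iters):
        for _ in range(g_steps):       # generator updates per iteration
            gen.policy_gradient_update(disc)
        for _ in range(d_steps):       # discriminator updates per iteration
            fake = gen.sample(len(data))
            for _ in range(k):         # k epochs per d-step
                disc.fit(real=data, fake=fake)

train_seqgan(Generator(), Discriminator(), data=["real"] * 8)

Training the discriminator harder than the generator (d-steps = 5, k = 3 versus g-steps = 1) keeps its reward signal informative, which is what stabilizes SeqGAN.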
25. Real-world data for SeqGAN:
● Table 2: 16,394 Chinese quatrains (poem composition)
● Table 3: 11,092 paragraphs (speech generation)
● Table 4: 695 music pieces
26. Summaries
● The Generative Adversarial Net (GAN), which uses a discriminative model to guide the training of a generative model, has enjoyed considerable success in generating real-valued data.
● SeqGAN, which applies policy gradients to carry the discriminative model's signal back to the generative model, demonstrates significant improvements on both synthetic and real-world data.
27. References
1. Li, Jiwei, et al. "Deep reinforcement learning for dialogue generation." arXiv preprint arXiv:1606.01541 (2016).
2. Yu, Lantao, et al. "SeqGAN: sequence generative adversarial nets with policy gradient." arXiv preprint arXiv:1609.05473 (2016).
3. Stanford CS224d: Deep Learning for Natural Language Processing.
4. DL/ML tutorials from Hung-yi Lee.
Editor's Notes
My interest in this topic actually comes from the Iron Man movies. In the movies, Iron Man (Tony Stark) has an intelligent assistant called JARVIS, and the two have many interesting conversations. It would be a pleasure to have such a smart virtual friend.
Deep learning techniques have been applied successfully in many areas, including natural language processing. These two papers from the past two years have played an important role in dialogue systems, bringing in advanced techniques from reinforcement learning (RL) and generative adversarial networks (GANs).
In my understanding, the deep learning toolbox provides tools that can be applied to dialogue in at least these three steps.
So far, these techniques have produced some intelligent sentence generation for dialogue.
In seq2seq generation, the simplified architecture looks like this.
Here is an example:
There are two optimizations in this system: MLE and MI.
The seq2seq model is based on an RNN with LSTM cells.
An RNN's structure includes an input and an output; unfolding it shows the details. At step t, the inputs are x_t and the previous state s_(t-1), and the outputs are o_t and the new state s_t. The state s_(t-1) records information from earlier steps, which is important in sequence tasks (a one-step sketch follows).
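A minimal numpy sketch of one unrolled RNN step as just described; the weight shapes and names here are illustrative assumptions.

import numpy as np

rng = np.random.default_rng(0)
U = rng.normal(size=(8, 4))   # input-to-state weights
W = rng.normal(size=(8, 8))   # state-to-state weights (carries the memory)
V = rng.normal(size=(3, 8))   # state-to-output weights

def rnn_step(x_t, s_prev):
    s_t = np.tanh(U @ x_t + W @ s_prev)   # new state mixes input and memory
    o_t = V @ s_t                         # output is read off the state
    return o_t, s_t

s = np.zeros(8)
for x_t in rng.normal(size=(5, 4)):       # a length-5 input sequence
    o_t, s = rnn_step(x_t, s)
print(o_t.shape, s.shape)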
The advantage of RNNs, compared to other deep learning models, is that they are well suited to processing sequence data.
However, RNNs suffer from exploding or vanishing gradients when the sequence is long, because optimization has to account for the memory of all previous steps. LSTM was developed to address this memory problem with three gates in each cell.
The encoder and decoder are functions used to model this complex system.
An encoder-decoder example in Keras shows the parameters in each layer; the encoder and the decoder use the same hidden size, 256 (a sketch follows).
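A sketch of such an encoder-decoder in Keras, following the standard Keras seq2seq example; the token counts are illustrative assumptions, and both encoder and decoder use 256 hidden units.

from keras.layers import Input, LSTM, Dense
from keras.models import Model

num_encoder_tokens, num_decoder_tokens, latent_dim = 71, 93, 256

# Encoder: keep only the final LSTM states as the "thought vector".
encoder_inputs = Input(shape=(None, num_encoder_tokens))
_, state_h, state_c = LSTM(latent_dim, return_state=True)(encoder_inputs)

# Decoder: initialized with the encoder states, predicts the next token.
decoder_inputs = Input(shape=(None, num_decoder_tokens))
decoder_outputs, _, _ = LSTM(latent_dim, return_sequences=True,
                             return_state=True)(decoder_inputs,
                                                initial_state=[state_h, state_c])
decoder_outputs = Dense(num_decoder_tokens, activation="softmax")(decoder_outputs)

model = Model([encoder_inputs, decoder_inputs], decoder_outputs)
model.compile(optimizer="rmsprop", loss="categorical_crossentropy")
model.summary()  # prints the per-layer parameter counts shown on the slide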
Evaluating dialogue systems is difficult. Metrics such as BLEU (Papineni et al., 2002) and perplexity have been widely used for dialogue quality evaluation (Li et al., 2016a; Vinyals and Le, 2015; Sordoni et al., 2015), but it is widely debated how well these automatic metrics are correlated with true response quality (Liu et al., 2016; Galley et al., 2015). Since the goal of the proposed system is not to predict the highest probability response, but rather the long-term success of the dialogue, we do not employ BLEU or perplexity for evaluation.
We propose to measure the ease of answering a generated turn by using the negative log likelihood of responding to that utterance with a dull response.