A PRIMER ON NEURAL NETWORK MODELS FOR
NATURAL LANGUAGE PROCESSING
Copyright 2018 QuantUniversity LLC.
Sri Krishnamurthy, CFA, CAP
sri@quantuniversity.com
www.analyticscertificate.com
2
QuantUniversity
• Analytics and Fintech Advisory
• Trained more than 1000 students in
Quantitative methods, Data Science
and Big Data & Fintech
• Programs
▫ Analytics Certificate Program
▫ Fintech Certification program
• Solutions
• Founder of QuantUniversity LLC. and
www.analyticscertificate.com
• Advisory and Consultancy for Financial Analytics
• Prior experience at MathWorks, Citigroup, and Endeca, and with
25+ financial services and energy customers.
• Regular Columnist for the Wilmott Magazine
• Chartered Financial Analyst and Certified Analytics
Professional
• Teaches Analytics in the Babson College MBA program and
at Northeastern University, Boston
Sri Krishnamurthy
Founder and CEO
3
4
Code and slides for today’s
workshop:
Request at:
https://tinyurl.com/QUNLP2018
5
• Intro to Natural Language Processing
• Intro to Neural Networks and Deep Neural Networks
• Networks that “understand” language!
• Embeddings: clever representation of words
• Recurrent Neural Networks: remembering history
• Encoder-Decoder architectures
• So many models! So little time! - QuSandbox
In this session
6
Why NLP?
7
What is NLP?
AI
Linguistics
Computer
Science
8
• Q/A
• Dialog systems - Chatbots
• Topic summarization
• Sentiment analysis
• Classification
• Keyword extraction - Search
• Information extraction – Prices, Dates, People etc.
• Tone Analysis
• Machine Translation
• Document comparison – Similar/Dissimilar
Sample applications
9
NLP in Finance
10
• If computers can understand language, huge possibilities open up
▫ Read and summarize
▫ Translate
▫ Describe what’s happening
▫ Understand commands
▫ Answer questions
▫ Respond in plain language
Language allows understanding
11
• Describe rules of grammar
• Describe meanings of words and their
relationships
• …including all the special cases
• ...and idioms
• ...and special cases for the idioms
• ...
• ...understand language!
Traditional language AI
https://en.wikipedia.org/wiki/Formal_language
12
What is NLP?
Jumping NLP Curves
https://ieeexplore.ieee.org/document/6786458/
13
Q: What’s hard about writing programs
to understand text?
14
• Ambiguity:
▫ “ground”
▫ “jaguar”
▫ “The car hit the pole while it was moving”
▫ “One morning I shot an elephant in my pajamas. How he got into my
pajamas, I’ll never know.”
▫ “The tank is full of soldiers.”
“The tank is full of nitrogen.”
Language is hard to deal with
15
16
• Many ways to say the same thing
▫ “the same thing can be said in many ways”
▫ “language is versatile”
▫ “The same words can be arranged in many different ways to express
the same idea”
▫ …
Language is hard to deal with
17
• Context matters: “I pressed a suit”
Language is hard to deal with
Images: wikipedia and pixabay
18
Why are these funny?
“Time to do my homework #yay”
“It's a small world...
...but I wouldn't want to have to paint it.”
“Time flies like an arrow. Fruit flies like a banana.”
19
• Learn by “reading” lots of text, some labeled.
• Less precise
• Deals with ambiguity better
Neural networks and other statistical approaches
20
• Unsupervised Algorithms
▫ Given a dataset with variables 𝑥𝑖, build a model that captures the
similarities in different observations and assigns them to different
buckets => Clustering, etc.
▫ Create a transformed representation of the original data => PCA
Machine Learning
[Diagram: Obs1, Obs2, Obs3, … → Model → Obs1: Class 1, Obs2: Class 2, Obs3: Class 1]
21
• Supervised Algorithms
▫ Given a set of variables 𝑥𝑖, predict the value of another variable 𝑦 in a
given data set such that
▫ If y is numeric => Prediction
▫ If y is categorical => Classification
Machine Learning
[Diagram: x1, x2, x3, … → Model F(X) → y]
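For concreteness, here is a minimal sketch of the two cases using scikit-learn (the library and the toy numbers are assumptions for illustration, not part of the slides):

    from sklearn.linear_model import LinearRegression, LogisticRegression

    X = [[1.0, 2.0], [2.0, 1.0], [3.0, 4.0], [4.0, 3.0]]   # variables x1, x2

    # y numeric => prediction (regression)
    reg = LinearRegression().fit(X, [1.5, 1.8, 3.9, 4.1])
    print(reg.predict([[2.5, 2.5]]))

    # y categorical => classification
    clf = LogisticRegression().fit(X, ["low", "low", "high", "high"])
    print(clf.predict([[2.5, 2.5]]))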
22
Start with labeled pairs (Xi, Yi)
([image], "kitten"), ([image], "puppy")
…
23
Success: predict new examples
([image], ?)
24
https://commons.wikimedia.org/wiki/Neural_network
[Diagram: a neural network mapping input features ("has fur?", "pointy ears?", "dangerously cute?") to output labels ("kitten", "puppy")]
Neural Networks
25
Linear regression = a weighted sum of inputs
[Diagram: the inputs plus a constant input 1 feed a single output node; image from
http://stackoverflow.com/questions/40537503/deep-neural-networks-precision-for-image-recognition-float-or-double]
26
Linear regression: learning = "find good weights"
27
Binary linear classifier: to classify, test whether y > 0
28
Binary linear classifier: the weight on the constant input 1 is the bias weight
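To make the weighted-sum picture concrete, a minimal sketch (the feature values and weights are invented for illustration):

    import numpy as np

    # toy features for one example, e.g. ["has fur?", "pointy ears?", "dangerously cute?"]
    x = np.array([1.0, 1.0, 0.8])
    w = np.array([0.4, 0.3, 0.9])   # learned weights (made up here)
    b = -0.5                        # bias weight: the weight on the constant input 1

    y = np.dot(w, x) + b            # weighted sum = linear regression output
    label = "kitten" if y > 0 else "puppy"   # binary linear classifier: threshold at 0
    print(y, label)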
29
30
Hardware
31
Data
http://www.theneweconomy.com/strategy/big-data-is-not-without-its-problems
32
New Approaches
http://deeplearning.net/reading-list/
33
Given (lots of) data, DNNs learn a good representation
automatically.
34
http://www.asimovinstitute.org/neural-network-zoo/
35
• MLP:
▫ Works with fixed-size inputs; the network learns to combine inputs in
a meaningful way
• CNNs:
▫ Specialized feed-forward architectures that extract local patterns
in the data
• RNNs:
▫ Take a sequence of items as input and produce a fixed-size
vector that summarizes that sequence
Key NN architectures for NLP
36
MLP
37
• Can be used with fixed/variable input sizes
• Can be used wherever linear models were used
• Useful in integrating pre-trained word embeddings
MLP in NLP
38
Convolutional Neural Networks
Convolution
Specialized feed-forward architectures that excel at extracting local
patterns in the data
39
Max pooling
40
Convolutional Neural Networks
easily integrate pre-trained word embeddings
41
▫ Specialized feed-forward architectures that extract local patterns
in the data
▫ Fixed/variable-sized inputs
▫ Work well for identifying phrases/idioms
CNNs in NLP
42
Recurrent Neural Networks
• A recurrent neural network can be thought of as multiple copies of
the same network, each passing a message to a successor. 1
http://colah.github.io/posts/2015-08-Understanding-LSTMs/
43
Used to generate representations that are typically used in
conjunction with MLPs
Great for sequences
Address many challenges in language modeling (Markov
assumptions, sparsity, etc.)
RNNs in NLP
44
• Sequence-to-sequence models (Encoder-Decoder) for machine
translation
• Learning from external, unannotated data (Semi-supervised models)
Other NN model applications
45
• Input: posts, labels as positive / negative.
• Goal: build a classifier to classify new posts
• IMDB Dataset: http://ai.stanford.edu/~amaas/data/sentiment/
• 25,000 highly polar movie reviews for training, and 25,000 for
testing.
Sample application: sentiment detection
46
• Goal: get familiar with the problem and establish a simple baseline.
• Overview:
▫ Load the data
▫ Look at a sample of positive and negative reviews
▫ Look at some distributional data
• Code: 08-imdb-explore.ipynb
Demo: IMDB dataset exploration
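A standalone sketch of this kind of exploration, assuming the copy of the IMDB dataset packaged with Keras rather than the raw Stanford download (the actual notebook may differ):

    import numpy as np
    from tensorflow.keras.datasets import imdb

    # load the 25,000/25,000 train/test split, keeping the 10,000 most frequent words
    (x_train, y_train), (x_test, y_test) = imdb.load_data(num_words=10000)

    print(len(x_train), "training reviews,", len(x_test), "test reviews")
    print("label balance:", np.bincount(y_train))                 # 0 = negative, 1 = positive
    print("review length quartiles:",
          np.percentile([len(r) for r in x_train], [25, 50, 75]))

    # decode one review back into words to eyeball it
    word_index = imdb.get_word_index()
    inv_index = {i + 3: w for w, i in word_index.items()}         # indices are offset by 3
    print(" ".join(inv_index.get(i, "?") for i in x_train[0][:30]))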
47
48
• Can’t learn them all individually…
• Instead, want to have a representation that encodes relationships
between words, so we can learn e.g. that all “negative” words make
it more likely the review is negative.
Challenge: many ways to say same thing
49
• Want computer to understand word relationships
▫ Man : King; Woman : ???
▫ Fish : Ocean; Gazelle : ???
• Goals:
▫ Encode semantic relationship between words: similarity, differences,
etc.
▫ Represent each word in a concise way
Let’s start “simple”: understanding individual words
50
• An embedding is a map word -> vector that makes similar words
have similar vectors, and encodes semantic relationships.
• Creating an embedding:
▫ Look at a lot of text.
 “there was a frog in the swamp”
 “artificial intelligence has a long way to go”
 “whether ’tis nobler in the mind to suffer the slings and arrows of
outrageous fortune”
▫ Learn which words tend to go together, and which don't.
Approach: embeddings
51
• Learn to predict neighbors of a word.
• Compute co-occurrence counts:
• “there was a frog in a swamp”
• P(swamp,frog) = …
• P(artificial,frog) = …
• …
• Train a model word -> vector to minimize d(v1,v2) where P(w1,w2) is
high.
Creating an embedding
52
Creating an embedding
1) Initial vectors:
   Frog:     [0.2, 0.7, 0.11, …, 0.52]
   Swamp:    [0.9, 0.55, 0.4, …, 0.8]
   Computer: [0.3, 0.6, 0.01, …, 0.7]
   …
2) Compute error in predicting P(w1,w2) given d(v1,v2).
3) Update weights:
   Frog:     [0.3, 0.65, 0.3, …, 0.6]
   Swamp:    [0.7, 0.6, 0.4, …, 0.7]
   Computer: [0.5, 0.3, 0.02, …, 0.4]
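A toy sketch of the whole loop, purely illustrative (real systems such as word2vec or GloVe use much larger corpora and a more careful loss):

    import numpy as np
    from collections import Counter
    from itertools import combinations

    corpus = ["there was a frog in the swamp",
              "artificial intelligence has a long way to go"]

    # 1) co-occurrence counts within a sentence window
    pairs = Counter()
    for sentence in corpus:
        words = sentence.split()
        for w1, w2 in combinations(words, 2):
            if w1 != w2:
                pairs[tuple(sorted((w1, w2)))] += 1

    vocab = sorted({w for s in corpus for w in s.split()})
    rng = np.random.default_rng(0)
    vectors = {w: rng.normal(size=8) for w in vocab}      # random initial vectors

    # 2)+3) nudge vectors of co-occurring words toward each other (crude update step)
    for _ in range(50):
        for (w1, w2), count in pairs.items():
            diff = vectors[w1] - vectors[w2]              # d(v1, v2) we want to shrink
            vectors[w1] -= 0.01 * count * diff
            vectors[w2] += 0.01 * count * diff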
53
http://multithreaded.stitchfix.com/assets/images/blog/vectors.gif
Embeddings capture conceptual relationships
54
http://nlp.yvespeirsman.be/blog/visualizing-word-embeddings-with-tsne/
55
http://nlp.yvespeirsman.be/blog/visualizing-word-embeddings-with-tsne/
56
• Pre-trained embeddings are available:
▫ Google News (100B words)
▫ Twitter (27B words)
▫ Wikipedia + Gigaword (newswire corpus) (6B words)
• It’s better to train/fine-tune for your specific application, but these
are a good place to start
▫ Especially if you don’t have much data
You don’t have to train your own embedding
List from https://github.com/3Top/word2vec-api
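For example, loading one of the listed pre-trained embeddings with the gensim library (the library choice is an assumption; any embedding-loading tool works):

    import gensim.downloader as api

    # downloads the vectors the first time; 100-dimensional GloVe trained on 6B tokens
    vectors = api.load("glove-wiki-gigaword-100")

    print(vectors.most_similar("frog", topn=3))
    # the classic analogy: man : king :: woman : ?
    print(vectors.most_similar(positive=["king", "woman"], negative=["man"], topn=1))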
57
• Let’s apply the approaches we already know to our movie review
sentiment task
Ok, now we have a reasonable way to represent words
58
• Goal: use familiar network architectures for text classification
• Overview:
▫ Prepare the dataset
▫ Use a pre-trained embedding
▫ Train an MLP
▫ Train a 1D CNN
• Code: 09-imdb-mlp-cnn.ipynb
Demo: MLPs and CNNs for sentiment analysis
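A condensed sketch of what such a notebook might contain; the layer sizes are hypothetical and the actual 09-imdb-mlp-cnn.ipynb may differ:

    from tensorflow.keras import layers, models
    from tensorflow.keras.datasets import imdb
    from tensorflow.keras.preprocessing.sequence import pad_sequences

    (x_train, y_train), (x_test, y_test) = imdb.load_data(num_words=10000)
    x_train = pad_sequences(x_train, maxlen=200)   # fixed-size input
    x_test = pad_sequences(x_test, maxlen=200)

    # MLP over averaged word embeddings
    mlp = models.Sequential([
        layers.Embedding(10000, 50),
        layers.GlobalAveragePooling1D(),
        layers.Dense(64, activation="relu"),
        layers.Dense(1, activation="sigmoid"),
    ])

    # 1D CNN: convolution extracts local (n-gram-like) patterns, max pooling summarizes them
    cnn = models.Sequential([
        layers.Embedding(10000, 50),
        layers.Conv1D(64, 5, activation="relu"),
        layers.GlobalMaxPooling1D(),
        layers.Dense(1, activation="sigmoid"),
    ])

    for model in (mlp, cnn):
        model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
        model.fit(x_train, y_train, epochs=2, batch_size=128,
                  validation_data=(x_test, y_test))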
59
60
“In 2009, I went to Nepal”
“I went to Nepal in 2009”
“I had high expectations, and this movie exceeded them.”
• Need to remember what we saw earlier.
• Time series → predict next element
Challenge: the state-time continuum
61
Solution: let the network represent the past
62
Our networks so far
[Diagram: Input → Hidden layers → Output]
63
Recurrent Neural Networks (RNNs)
[Diagram: Input → Hidden layers → Output, with a recurrent connection from the hidden layers back to themselves]
64
Another view of RNNs
[Diagram: the network unrolled in time over the words "This movie monkeys…": Input 1, Input 2, …, Input N each feed a copy of the hidden layers, each copy emits an Output, and recurrent connections pass the hidden state from one step to the next]
65
Variant: one output
[Diagram: the same unrolled network over "This movie monkeys…", but only the final step (Input N) produces an Output]
66
New parameters:
[Diagram: Input → Hidden layers → Output, with a recurrent connection on the hidden layers]
• Input-to-hidden weights
• Hidden-to-hidden weights
• Hidden-to-output weights
67
New parameters:
[Same diagram as above]
• Input-to-hidden weights
• Hidden-to-hidden weights
• Hidden-to-output weights
How do we combine the two arrows leading into the hidden state?
Add the contribution of the input to that of the previous hidden state.
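A sketch of a single time step with those three weight matrices (the dimensions are made up; tanh is one common choice of nonlinearity):

    import numpy as np

    rng = np.random.default_rng(0)
    input_dim, hidden_dim, output_dim = 50, 100, 1

    W_xh = rng.normal(size=(hidden_dim, input_dim))    # input-to-hidden weights
    W_hh = rng.normal(size=(hidden_dim, hidden_dim))   # hidden-to-hidden weights
    W_hy = rng.normal(size=(output_dim, hidden_dim))   # hidden-to-output weights
    b_h = np.zeros(hidden_dim)

    def rnn_step(x_t, h_prev):
        # add the contribution of the input to that of the previous hidden state
        h_t = np.tanh(W_xh @ x_t + W_hh @ h_prev + b_h)
        y_t = W_hy @ h_t
        return h_t, y_t

    # the same parameters are reused at every time step
    h = np.zeros(hidden_dim)
    for x_t in rng.normal(size=(5, input_dim)):        # a toy sequence of 5 word vectors
        h, y = rnn_step(x_t, h)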
68
Question: where is the parameter sharing in an RNN?
• The same state transformation (the same parameters) is applied at each time step
[Diagram: in the unrolled view, every copy of the hidden layers from Input 1 through Input N shares the same parameters]
69
• Again, backpropagation just works!
• In theory…
• Long-term dependencies are a problem
▫ Vanishing gradients
▫ Exploding gradients
• Solutions:
▫ Careful initialization
▫ Short sequences
▫ More advanced techniques, such as LSTM
Training RNNs
70
• As mentioned, RNNs have a problem with long-term dependencies
▫ Gradients vanish or blow up
• One solution: LSTM – let the network learn when to remember and when to
forget
• Used in practice
LSTM – Long Short-Term Memory networks
71
Demo: simple RNN for text generation
72
• https://github.com/fchollet/keras/blob/master/examples/imdb_lstm.py
Demo: RNN for sentiment classification
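The core of that Keras example, condensed into a sketch (hyperparameters approximate those in the linked script):

    from tensorflow.keras import layers, models
    from tensorflow.keras.datasets import imdb
    from tensorflow.keras.preprocessing.sequence import pad_sequences

    max_features, maxlen = 20000, 80
    (x_train, y_train), (x_test, y_test) = imdb.load_data(num_words=max_features)
    x_train = pad_sequences(x_train, maxlen=maxlen)
    x_test = pad_sequences(x_test, maxlen=maxlen)

    model = models.Sequential([
        layers.Embedding(max_features, 128),
        layers.LSTM(128, dropout=0.2, recurrent_dropout=0.2),  # reads the sequence, keeps a summary state
        layers.Dense(1, activation="sigmoid"),                  # positive / negative
    ])
    model.compile(loss="binary_crossentropy", optimizer="adam", metrics=["accuracy"])
    model.fit(x_train, y_train, batch_size=32, epochs=3,
              validation_data=(x_test, y_test))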
73
74
• Translate (seq2seq)
• Caption (vec2seq)
• Visualize or classify text (seq2vec)
What if input + output have different length, or type?
75
Encoder-decoder architecture
[Diagram: an encoder RNN reads Input 1 … Input N and compresses the sequence into an encoding (a "thought vector"); a decoder RNN unrolls that encoding into Output 1 … Output M]
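A sketch of this architecture in Keras; the layer sizes and tensor names are illustrative, not taken from the slides:

    from tensorflow.keras import layers, models

    num_tokens, latent_dim = 100, 256   # illustrative vocabulary size and state size

    # encoder: read the input sequence, keep only the final states (the "thought vector")
    encoder_inputs = layers.Input(shape=(None, num_tokens))
    _, state_h, state_c = layers.LSTM(latent_dim, return_state=True)(encoder_inputs)

    # decoder: generate the output sequence, starting from the encoder's states
    decoder_inputs = layers.Input(shape=(None, num_tokens))
    decoder_lstm = layers.LSTM(latent_dim, return_sequences=True, return_state=True)
    decoder_outputs, _, _ = decoder_lstm(decoder_inputs, initial_state=[state_h, state_c])
    decoder_outputs = layers.Dense(num_tokens, activation="softmax")(decoder_outputs)

    model = models.Model([encoder_inputs, decoder_inputs], decoder_outputs)
    model.compile(optimizer="adam", loss="categorical_crossentropy")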
76
Encoder-decoder variant: vec2seq
[Diagram: a single input vector (e.g. an image representation) is encoded into a "thought vector"; a decoder RNN unrolls it into Output 1 … Output M]
77
• Goal: learn to caption images
• Overview:
▫ Learn abstract representations of images using a CNN
▫ Learn to map those abstract representations to sentences
▫ Train the system end-to-end
• Code sketch: 10-image-captioning.ipynb
Demo: captioning images
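One common way to wire this up, sketched below; the specific layers and sizes are assumptions and may not match 10-image-captioning.ipynb:

    from tensorflow.keras import layers, models
    from tensorflow.keras.applications import InceptionV3

    vocab_size, max_caption_len = 5000, 20

    # encoder: a pre-trained CNN turns the image into an abstract feature vector
    cnn = InceptionV3(weights="imagenet", include_top=False, pooling="avg")
    image_input = layers.Input(shape=(299, 299, 3))
    image_code = layers.Dense(256, activation="relu")(cnn(image_input))

    # decoder: an RNN maps that representation plus the words so far to the next word
    caption_input = layers.Input(shape=(max_caption_len,))
    word_vectors = layers.Embedding(vocab_size, 256)(caption_input)
    sequence = layers.Concatenate(axis=1)(
        [layers.Reshape((1, 256))(image_code), word_vectors])
    h = layers.LSTM(256)(sequence)
    next_word = layers.Dense(vocab_size, activation="softmax")(h)

    model = models.Model([image_input, caption_input], next_word)
    model.compile(optimizer="adam", loss="sparse_categorical_crossentropy")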
QuSandbox
79
• Code + Environment
• Dynamic scalability
• Enterprise collaboration
• Model Management
• One platform for all your analytical needs
Why QuSandbox?
Create Projects
➢ Instructors can create projects using AMIs, DockerHub, and GitHub as resources.
➢ Additional information such as the project type (JNS, Jupyter Lab, etc.), description, and name can be
specified here.
Run Projects
➢ QuSandbox allows users to run a wide variety of projects hosted on various platforms such as AMIs, Docker Hub, and Git repos.
➢ While launching, the user can configure specifications such as the project source, the machine type, the duration, and the credits used for the session.
➢ Users are allowed to run more than one project at a time.
Launch Labs
On launching the lab, users can:
- Modify and run Jupyter notebook files, labs, and other components linked to the project.
- Explore the project structure, create new files, and keep track of work from previous sessions.
➢ Set up account information: username, personal details, and password.
➢ Specify courses that the user wants to register for.
➢ Multi-role profiles allow a user to register in one or more roles using the same account.
Enterprise features – User and Roles
Enterprise features – Credential management
Amazon Credentials
- Update AWS keys and the PEM file to grant permission to use EC2 services for running, stopping, terminating, and extending instances.
Github Credentials
- Update the GitHub username and password to allow saving project work on GitHub.
* All credentials are securely encrypted and stored in the database.
Admin tools - Manage Tasks
- Running projects can be managed on the Tasks page. Information such as task and instance status, time remaining, and past project details can be viewed here.
- The core project actions (LAUNCH, EXTEND, STOP and KILL) can be performed with the designated buttons in the actions field of each task.
Academic use case - Courses
Instructors can use the course page to create and edit
lecture components such as slides, reading materials and
quizzes.
Students can view the uploaded material and submit
assignments for the lectures if they are registered for the
respective courses.
Command Line Interface on QuSandbox
The Command Line Interface is a unified tool that provides a consistent interface for interacting with all parts of
QuSandbox.
Run a specific project defined by a JSON file. After configuration completes, an
IP address is provided and the user can use the public IP address to access the
project.
Python / JavaScript
More Features on CLI
Use >Qusandbox -help to get details on more features.
Research Hub on QuSandbox
The research hub on QUSandbox allows a group of people working on a project to share and run it seamlessly.
https://researchhub.herokuapp.com/homepage
1. Button linking the project to QUSandbox. 2. View the project on QUSandbox.
Research Hub on QuSandbox
The research hub on QUSandbox allows a group of people working on a project to share and run it seamlessly.
➢ Each project is associated with a unique ProjectName.
➢ Create an embed link for each project.
➢ Use the link from anywhere to hit QUSandbox.
Coming soon!
92
Logistics:
When: June 14-15
Where: Boston MA
Registration: http://qu-nlp.eventbrite.com/
Discount code: QU25, 25% off all ticket levels, valid until 5/4/2018
Code and slides for today’s workshop:
Request at: https://tinyurl.com/QUNLP2018
93
Thank you!
Presentations will be posted here:
www.analyticscertificate.com
Sri Krishnamurthy, CFA, CAP
Founder and CEO
QuantUniversity LLC.
srikrishnamurthy
www.QuantUniversity.com
Information, data and drawings embodied in this presentation are strictly a property of QuantUniversity LLC. and shall not be
distributed or used in any other publication without the prior written consent of QuantUniversity LLC.