SlideShare a Scribd company logo
NLP & Deep
Learning for
non-experts
Sanghamitra Deb
Staff Data Scientist
Chegg Inc
How to start projects in machine learning?
• Kaggle competitions ---
• Make sure to solve the ML problems for concept development
before competing
How to start projects in machine learning?
• Kaggle competitions ---
• Make sure to solve the ML
problems for concept
development before
competing
How to start projects in machine learning?
• Self guided workshops/projects ---
lets say you have data from Zomato
• Restaurant recommendation --
user based, content similarity
based.
• Restaurant tags from reviews.
• Sentiment analysis from reviews.
Outline
• What is NLP
• Bag of Words model for sentiment analysis using scikit learn
• DeepDive into deep learning
• Solve the sentiment analysis problem using keras
• A short into Convolution Neural Networks (CNN)
What is Natural
Language Processing?
• Giving structure to unstructured data
• Learn properties of the data that makes
decision making simple
• Provide concise information to drive
intelligence of different systems.
Why?
• Unstructured data cannot be consumed
directly
• Automate simple and complex
functionalities
• Inferences from text data becomes
queriable. This could help with regular BU
reports
• Understand customers better and take
necessary actions for better experience.
Applications
• Categorization of text
• Building domain specific Knowledge Graph
• Recommendations
• Web --- Search
• HR --- people analytics
• Medical --- drug discovery, automated
diagnosis
• ………..
What are the underlying tasks?
• Syntactic Parsing of sentences --- parsing based on structure
• Part of Speech Tagging
• Semantic Parsing -- mapping text directly into formal query language,
e.g. SQL queries for a pre-determined database schema.
• Dialogue state tracking --- chatbots
• Machine Translation
• Language modeling
• Text extraction
• Classification
Text Classification
Text Pre - processing Collecting Training Data Model Building
Offline
SME
• Reduces noise
• Ensures quality
• Improves overall performance
• Training Data Collection / Examples
of classes that we are trying to model
• Model performance is directly
correlated with quality of training
data
• Model selection
• Architecture
• Parameter Tuning
User
Online
Model Evaluation
Text Data
Data Source -- https://archive.ics.uci.edu/ml/datasets/Sentiment+Labelled+Sentences
Model Building: a simple Bag of words (BOW)
model
https://realpython.com/python-keras-text-classification/
Model Building: a simple BOW model
https://realpython.com/python-keras-text-classification/
Deep
Learning
Deep learning algorithms seek
to exploit the unknown
structure in the input
distribution in order to discover
good representations, often at
multiple levels, with higher-level
learned features defined in
terms of lower-level features.
--- Yoshua Bengio
a kind of
learning where
the
representation
you form have
several levels of
abstraction,
rather than a
direct input to
output --- Peter
Norvig
When you hear the term deep learning, just think
of a large deep neural net. Deep refers to the
number of layers typically and so this kind of the
popular term that’s been adopted in the press. I
think of them as deep neural networks generally.
--- Andrew Ng
Why now?
• Explosion in labelled data.
• Exponential growth in
computation power with
cloud computing and
availability of GPUs
• Improvements in setting
initial conditions and
activation functions
Neural Network
Simulate the brain and get neurons densely interconnected in a
computer such that it can learn things, recognize patterns and take
decisions?
Neural Network
Simulate the brain and get neurons densely interconnected in a
computer such that it can learn things, recognize patterns and take
decisions?
What is a neuron?
Neural Network
Simulate the brain and get neurons densely interconnected in a
computer such that it can learn things, recognize patterns and take
decisions?
What is a neuron?
What is neuron?
https://www.slideshare.net/tw_dsconf/ss-62245351
a1
a2
a3
What is neuron?
https://www.slideshare.net/tw_dsconf/ss-62245351
a1
a2
a3
Neural Network
a1
a2
a3
Neural Network
• Each node is a function with input
and output vectors
• Every network structure is defined
by a set of functions
Output Layer
• Loss is minimized using
Gradient Descent
• Find network parameters
such that the loss is
minimized
• This is done by taking
derivatives of the loss wrt
parameters.
• Next the parameters are
updated by subtracting
learning rate times the
derivative
Commonly
used loss
functions
• Mean Squared Error Loss
• Mean Squared Logarithmic Error Loss
• Mean Absolute Error Loss
Regression Loss Functions
• Binary Cross-Entropy
• Hinge Loss
• Squared Hinge Loss
Binary Classification Loss Functions
• Multi-Class Cross-Entropy Loss
• Sparse Multiclass Cross-Entropy Loss
• Kullback Leibler Divergence Loss
Multi-Class Classification Loss Functions
Cost
Function –
Cross
Entropy
Dropout -- avoid overfitting
• Large weights in a neural network are a
sign of a more complex network that has
overfit the training data.
• Probabilistically dropping out nodes in the
network is a simple and effective
regularization method.
• A large network with more training and the
use of a weight constraint are suggested
when using dropout.
Optimization
Techniques
Gradient Descent
Adagrad
RMSprop
Adam
…
Adam Optimization
• adaptive moment estimation
• The method computes individual adaptive learning rates for different
parameters from estimates of first and second moments of the
gradients.
• Calculates an exponential moving average of the gradient and the
squared gradient, parameters control the decay rates of these moving
averages.
https://machinelearningmastery.com/adam-optimization-algorithm-for-deep-learning/
Activation Functions
• Sigmoid/ Softmax
• Tanh
• Relu
• Leaky Relu
• Swish
Activation Functions
• Sigmoid/ Softmax
• Tanh
• Relu
• Leaky Relu
• Swish
Activation Functions
• Sigmoid/ Softmax
• Tanh
• Relu
• Leaky Relu
• Swish
Derivative
Activation Functions
• Sigmoid/ Softmax
• Tanh
• Relu
• Leaky Relu
• Swish
Derivative
a = max(0,z)
Activation Functions
• Sigmoid/ Softmax
• Tanh
• Relu
• Leaky Relu
• Swish
Activation Functions
• Sigmoid/ Softmax
• Tanh
• Relu
• Leaky Relu
• Swish
https://arxiv.org/abs/1710.05941v1
Text Classification Reminder!
https://realpython.com/python-keras-text-classification/
Text Classification using feed forward NN
https://realpython.com/python-keras-text-classification/
Text Classification using feed forward NN
Fit & measure accuracy!
plot_history(history)
Clearly overfits the data!
Can we do better? Word Embeddings
• Words are represented as dense
vectors
• These vectors are
• Learned during the training
task by the neural network
• Pre-trained, learned from
Language Models
• Encode the semantic meaning of
the word.
Text Pre-processing with Keras
PaddingTokenizing
Start with an Embedding Layer
• Embedding Layer of Keras which takes the previously calculated integers and
maps them to a dense vector of the embedding.
o Parameters
Ø input_dim: the size of the vocabulary
Ø output_dim: the size of the dense vector
Ø input_length: the length of the sequence
Hope to see you soon
Nice to see you again
After training
https://stats.stackexchange.com/questions/270546/how-does-keras-embedding-layer-work
Add a pooling layer
• MaxPooling1D/AveragePooling1D or
a GlobalMaxPooling1D/GlobalAveragePooling1D layer
• way to downsample (a way to reduce the size of) the incoming
feature vectors.
• Global max/average pooling takes the maximum/average of all
features whereas in the other case you have to define the pool size.
Definition of
the entire
model
Training
Using pre-trained word embeddings will lead to an accuracy of
0.82. This is a case of transfer learning.
https://realpython.com/python-keras-text-classification
Embeddings + Maxpooling -- Benifits
• Power of generalization --- embeddings are able to share information
across similar features.
• Fewer nodes with zero values.
Convolution Neural Network
Detect features ! Downsample.
What is a CNN?
In a traditional feedforward neural network we connect each
input neuron to each output neuron in the next layer. That’s
also called a fully connected layer, or affine layer.
• We use convolutions over the input layer to compute the
output. This results in local connections, where each region
of the input is connected to a neuron in the output. Each
layer applies different filters and combines the result
• During the training phase, a CNN automatically learns the
values of its filters based on the task you want to perform.
Tricky --- dimensions keep changing as we go from one layer to another
Model definition
Embedding_dim = 50
maxlen=10
Advantages
of CNN
• Character Based CNN
• Has the ability to deal with out of vocabulary
words. This makes it particularly suitable for user
generated raw text.
• Works for multiple languages.
• Model size is small since the tokens are limited to
the number of characters ~ 70. This makes real
life deployments easier and faster.
• Networks with convolutional and pooling
layers are useful for classification tasks in
which we expect to find strong local clues
regarding class membership.
Takeaways!
• If you have text data you need to use NLP
• Try a simple bag of words model for your data
• Having a high level understanding of deep learning will help with
better judgement in architecture design and choice of parameters.
• Deep Learning has the potential to give high performance, you do
need large amount of training data for the benefits.
Thank You
@sangha_deb
sangha123@gmail.com
Visualization of the architecture
50
10
GlobalMaxPool1D
DenseLayer
Sigmoid
Cov1D
Some helpful courses
https://www.coursera.org/learn/classification-vector-spaces-in-nlp
Appendix
Transfer Learning
Character Based CNNs.
https://papers.nips.cc/paper/5782-character-level-convolutional-networks-for-text-classification.pdf
• Embedding Layer
• Six convolutional layers, and 3 convolutional layers followed by a max pooling layer
• Two fully connected layer(dense layer in keras), neuron units are 1024.
• Output layer(dense layer), neuron units depends on classes. In this task, we set it 4.
Pre-processing
Setting Embedding Weights
Model
https://towardsdatascience.com/character-level-cnn-with-keras-50391c3adf33

More Related Content

What's hot

Word embedding
Word embedding Word embedding
Word embedding
ShivaniChoudhary74
 
Week 4 advanced labeling, augmentation and data preprocessing
Week 4   advanced labeling, augmentation and data preprocessingWeek 4   advanced labeling, augmentation and data preprocessing
Week 4 advanced labeling, augmentation and data preprocessing
Ajay Taneja
 
LearningKit.ppt
LearningKit.pptLearningKit.ppt
LearningKit.pptbutest
 
Introduction to Machine Learning
Introduction to Machine LearningIntroduction to Machine Learning
Introduction to Machine LearningLior Rokach
 
Course 2 Machine Learning Data LifeCycle in Production - Week 1
Course 2   Machine Learning Data LifeCycle in Production - Week 1Course 2   Machine Learning Data LifeCycle in Production - Week 1
Course 2 Machine Learning Data LifeCycle in Production - Week 1
Ajay Taneja
 
Applied Artificial Intelligence Unit 4 Semester 3 MSc IT Part 2 Mumbai Univer...
Applied Artificial Intelligence Unit 4 Semester 3 MSc IT Part 2 Mumbai Univer...Applied Artificial Intelligence Unit 4 Semester 3 MSc IT Part 2 Mumbai Univer...
Applied Artificial Intelligence Unit 4 Semester 3 MSc IT Part 2 Mumbai Univer...
Madhav Mishra
 
Introduction to Machine Learning
Introduction to Machine LearningIntroduction to Machine Learning
Introduction to Machine Learning
Shahar Cohen
 
C3 w4
C3 w4C3 w4
Deep Dive into Hyperparameter Tuning
Deep Dive into Hyperparameter TuningDeep Dive into Hyperparameter Tuning
Deep Dive into Hyperparameter Tuning
Shubhmay Potdar
 
Lecture 5 machine learning updated
Lecture 5   machine learning updatedLecture 5   machine learning updated
Lecture 5 machine learning updated
Vajira Thambawita
 
DagdelenSiriwardaneY..
DagdelenSiriwardaneY..DagdelenSiriwardaneY..
DagdelenSiriwardaneY..butest
 
"An Introduction to Machine Learning and How to Teach Machines to See," a Pre...
"An Introduction to Machine Learning and How to Teach Machines to See," a Pre..."An Introduction to Machine Learning and How to Teach Machines to See," a Pre...
"An Introduction to Machine Learning and How to Teach Machines to See," a Pre...
Edge AI and Vision Alliance
 
Applied Artificial Intelligence Unit 3 Semester 3 MSc IT Part 2 Mumbai Univer...
Applied Artificial Intelligence Unit 3 Semester 3 MSc IT Part 2 Mumbai Univer...Applied Artificial Intelligence Unit 3 Semester 3 MSc IT Part 2 Mumbai Univer...
Applied Artificial Intelligence Unit 3 Semester 3 MSc IT Part 2 Mumbai Univer...
Madhav Mishra
 
Machine Learning Data Life Cycle in Production (Week 2 feature engineering...
 Machine Learning Data Life Cycle in Production (Week 2   feature engineering... Machine Learning Data Life Cycle in Production (Week 2   feature engineering...
Machine Learning Data Life Cycle in Production (Week 2 feature engineering...
Ajay Taneja
 
Analytics Boot Camp - Slides
Analytics Boot Camp - SlidesAnalytics Boot Camp - Slides
Analytics Boot Camp - Slides
Aditya Joshi
 
Real-time DirectTranslation System for Sinhala and Tamil Languages.
Real-time DirectTranslation System for Sinhala and Tamil Languages.Real-time DirectTranslation System for Sinhala and Tamil Languages.
Real-time DirectTranslation System for Sinhala and Tamil Languages.
Sheeyam Shellvacumar
 
C3 w3
C3 w3C3 w3
Deep Learning Interview Questions and Answers | Edureka
Deep Learning Interview Questions and Answers | EdurekaDeep Learning Interview Questions and Answers | Edureka
Deep Learning Interview Questions and Answers | Edureka
Edureka!
 
Feature Engineering
Feature EngineeringFeature Engineering
Feature Engineering
HJ van Veen
 
Machine Learning Applications in NLP.ppt
Machine Learning Applications in NLP.pptMachine Learning Applications in NLP.ppt
Machine Learning Applications in NLP.pptbutest
 

What's hot (20)

Word embedding
Word embedding Word embedding
Word embedding
 
Week 4 advanced labeling, augmentation and data preprocessing
Week 4   advanced labeling, augmentation and data preprocessingWeek 4   advanced labeling, augmentation and data preprocessing
Week 4 advanced labeling, augmentation and data preprocessing
 
LearningKit.ppt
LearningKit.pptLearningKit.ppt
LearningKit.ppt
 
Introduction to Machine Learning
Introduction to Machine LearningIntroduction to Machine Learning
Introduction to Machine Learning
 
Course 2 Machine Learning Data LifeCycle in Production - Week 1
Course 2   Machine Learning Data LifeCycle in Production - Week 1Course 2   Machine Learning Data LifeCycle in Production - Week 1
Course 2 Machine Learning Data LifeCycle in Production - Week 1
 
Applied Artificial Intelligence Unit 4 Semester 3 MSc IT Part 2 Mumbai Univer...
Applied Artificial Intelligence Unit 4 Semester 3 MSc IT Part 2 Mumbai Univer...Applied Artificial Intelligence Unit 4 Semester 3 MSc IT Part 2 Mumbai Univer...
Applied Artificial Intelligence Unit 4 Semester 3 MSc IT Part 2 Mumbai Univer...
 
Introduction to Machine Learning
Introduction to Machine LearningIntroduction to Machine Learning
Introduction to Machine Learning
 
C3 w4
C3 w4C3 w4
C3 w4
 
Deep Dive into Hyperparameter Tuning
Deep Dive into Hyperparameter TuningDeep Dive into Hyperparameter Tuning
Deep Dive into Hyperparameter Tuning
 
Lecture 5 machine learning updated
Lecture 5   machine learning updatedLecture 5   machine learning updated
Lecture 5 machine learning updated
 
DagdelenSiriwardaneY..
DagdelenSiriwardaneY..DagdelenSiriwardaneY..
DagdelenSiriwardaneY..
 
"An Introduction to Machine Learning and How to Teach Machines to See," a Pre...
"An Introduction to Machine Learning and How to Teach Machines to See," a Pre..."An Introduction to Machine Learning and How to Teach Machines to See," a Pre...
"An Introduction to Machine Learning and How to Teach Machines to See," a Pre...
 
Applied Artificial Intelligence Unit 3 Semester 3 MSc IT Part 2 Mumbai Univer...
Applied Artificial Intelligence Unit 3 Semester 3 MSc IT Part 2 Mumbai Univer...Applied Artificial Intelligence Unit 3 Semester 3 MSc IT Part 2 Mumbai Univer...
Applied Artificial Intelligence Unit 3 Semester 3 MSc IT Part 2 Mumbai Univer...
 
Machine Learning Data Life Cycle in Production (Week 2 feature engineering...
 Machine Learning Data Life Cycle in Production (Week 2   feature engineering... Machine Learning Data Life Cycle in Production (Week 2   feature engineering...
Machine Learning Data Life Cycle in Production (Week 2 feature engineering...
 
Analytics Boot Camp - Slides
Analytics Boot Camp - SlidesAnalytics Boot Camp - Slides
Analytics Boot Camp - Slides
 
Real-time DirectTranslation System for Sinhala and Tamil Languages.
Real-time DirectTranslation System for Sinhala and Tamil Languages.Real-time DirectTranslation System for Sinhala and Tamil Languages.
Real-time DirectTranslation System for Sinhala and Tamil Languages.
 
C3 w3
C3 w3C3 w3
C3 w3
 
Deep Learning Interview Questions and Answers | Edureka
Deep Learning Interview Questions and Answers | EdurekaDeep Learning Interview Questions and Answers | Edureka
Deep Learning Interview Questions and Answers | Edureka
 
Feature Engineering
Feature EngineeringFeature Engineering
Feature Engineering
 
Machine Learning Applications in NLP.ppt
Machine Learning Applications in NLP.pptMachine Learning Applications in NLP.ppt
Machine Learning Applications in NLP.ppt
 

Similar to NLP and Deep Learning for non_experts

Natural Language Processing Advancements By Deep Learning: A Survey
Natural Language Processing Advancements By Deep Learning: A SurveyNatural Language Processing Advancements By Deep Learning: A Survey
Natural Language Processing Advancements By Deep Learning: A Survey
Rimzim Thube
 
presentation.ppt
presentation.pptpresentation.ppt
presentation.ppt
MadhuriChandanbatwe
 
Deep Learning Made Easy with Deep Features
Deep Learning Made Easy with Deep FeaturesDeep Learning Made Easy with Deep Features
Deep Learning Made Easy with Deep FeaturesTuri, Inc.
 
Deep Learning for Machine Translation
Deep Learning for Machine TranslationDeep Learning for Machine Translation
Deep Learning for Machine Translation
Matīss ‎‎‎‎‎‎‎  
 
How to Build Deep Learning Models
How to Build Deep Learning ModelsHow to Build Deep Learning Models
How to Build Deep Learning Models
Josh Patterson
 
AI powered emotion recognition: From Inception to Production - Global AI Conf...
AI powered emotion recognition: From Inception to Production - Global AI Conf...AI powered emotion recognition: From Inception to Production - Global AI Conf...
AI powered emotion recognition: From Inception to Production - Global AI Conf...
Vandana Kannan
 
AI powered emotion recognition: From Inception to Production - Global AI Conf...
AI powered emotion recognition: From Inception to Production - Global AI Conf...AI powered emotion recognition: From Inception to Production - Global AI Conf...
AI powered emotion recognition: From Inception to Production - Global AI Conf...
Apache MXNet
 
A Deeper Dive into Apache MXNet - March 2017 AWS Online Tech Talks
A Deeper Dive into Apache MXNet - March 2017 AWS Online Tech TalksA Deeper Dive into Apache MXNet - March 2017 AWS Online Tech Talks
A Deeper Dive into Apache MXNet - March 2017 AWS Online Tech Talks
Amazon Web Services
 
A Deeper Dive into Apache MXNet - March 2017 AWS Online Tech Talks
A Deeper Dive into Apache MXNet - March 2017 AWS Online Tech TalksA Deeper Dive into Apache MXNet - March 2017 AWS Online Tech Talks
A Deeper Dive into Apache MXNet - March 2017 AWS Online Tech Talks
Amazon Web Services
 
Synthetic dialogue generation with Deep Learning
Synthetic dialogue generation with Deep LearningSynthetic dialogue generation with Deep Learning
Synthetic dialogue generation with Deep Learning
S N
 
Deep Learning For Practitioners, lecture 2: Selecting the right applications...
Deep Learning For Practitioners,  lecture 2: Selecting the right applications...Deep Learning For Practitioners,  lecture 2: Selecting the right applications...
Deep Learning For Practitioners, lecture 2: Selecting the right applications...
ananth
 
Deep Domain
Deep DomainDeep Domain
Deep Domain
Zachary S. Brown
 
Introduction to deep learning
Introduction to deep learningIntroduction to deep learning
Introduction to deep learning
Abhishek Bhandwaldar
 
Machine learning for IoT - unpacking the blackbox
Machine learning for IoT - unpacking the blackboxMachine learning for IoT - unpacking the blackbox
Machine learning for IoT - unpacking the blackbox
Ivo Andreev
 
DeepLearning001&ApacheMXNetWithSparkForInference-ACNA2018
DeepLearning001&ApacheMXNetWithSparkForInference-ACNA2018DeepLearning001&ApacheMXNetWithSparkForInference-ACNA2018
DeepLearning001&ApacheMXNetWithSparkForInference-ACNA2018
Apache MXNet
 
OWF14 - Big Data : The State of Machine Learning in 2014
OWF14 - Big Data : The State of Machine  Learning in 2014OWF14 - Big Data : The State of Machine  Learning in 2014
OWF14 - Big Data : The State of Machine Learning in 2014
Paris Open Source Summit
 
Engineering Intelligent NLP Applications Using Deep Learning – Part 2
Engineering Intelligent NLP Applications Using Deep Learning – Part 2 Engineering Intelligent NLP Applications Using Deep Learning – Part 2
Engineering Intelligent NLP Applications Using Deep Learning – Part 2
Saurabh Kaushik
 
Distributed Deep Learning with Docker at Salesforce
Distributed Deep Learning with Docker at SalesforceDistributed Deep Learning with Docker at Salesforce
Distributed Deep Learning with Docker at Salesforce
Docker, Inc.
 
Artificial Intelligence and Deep Learning in Azure, CNTK and Tensorflow
Artificial Intelligence and Deep Learning in Azure, CNTK and TensorflowArtificial Intelligence and Deep Learning in Azure, CNTK and Tensorflow
Artificial Intelligence and Deep Learning in Azure, CNTK and Tensorflow
Jen Stirrup
 
Startup.Ml: Using neon for NLP and Localization Applications
Startup.Ml: Using neon for NLP and Localization Applications Startup.Ml: Using neon for NLP and Localization Applications
Startup.Ml: Using neon for NLP and Localization Applications
Intel Nervana
 

Similar to NLP and Deep Learning for non_experts (20)

Natural Language Processing Advancements By Deep Learning: A Survey
Natural Language Processing Advancements By Deep Learning: A SurveyNatural Language Processing Advancements By Deep Learning: A Survey
Natural Language Processing Advancements By Deep Learning: A Survey
 
presentation.ppt
presentation.pptpresentation.ppt
presentation.ppt
 
Deep Learning Made Easy with Deep Features
Deep Learning Made Easy with Deep FeaturesDeep Learning Made Easy with Deep Features
Deep Learning Made Easy with Deep Features
 
Deep Learning for Machine Translation
Deep Learning for Machine TranslationDeep Learning for Machine Translation
Deep Learning for Machine Translation
 
How to Build Deep Learning Models
How to Build Deep Learning ModelsHow to Build Deep Learning Models
How to Build Deep Learning Models
 
AI powered emotion recognition: From Inception to Production - Global AI Conf...
AI powered emotion recognition: From Inception to Production - Global AI Conf...AI powered emotion recognition: From Inception to Production - Global AI Conf...
AI powered emotion recognition: From Inception to Production - Global AI Conf...
 
AI powered emotion recognition: From Inception to Production - Global AI Conf...
AI powered emotion recognition: From Inception to Production - Global AI Conf...AI powered emotion recognition: From Inception to Production - Global AI Conf...
AI powered emotion recognition: From Inception to Production - Global AI Conf...
 
A Deeper Dive into Apache MXNet - March 2017 AWS Online Tech Talks
A Deeper Dive into Apache MXNet - March 2017 AWS Online Tech TalksA Deeper Dive into Apache MXNet - March 2017 AWS Online Tech Talks
A Deeper Dive into Apache MXNet - March 2017 AWS Online Tech Talks
 
A Deeper Dive into Apache MXNet - March 2017 AWS Online Tech Talks
A Deeper Dive into Apache MXNet - March 2017 AWS Online Tech TalksA Deeper Dive into Apache MXNet - March 2017 AWS Online Tech Talks
A Deeper Dive into Apache MXNet - March 2017 AWS Online Tech Talks
 
Synthetic dialogue generation with Deep Learning
Synthetic dialogue generation with Deep LearningSynthetic dialogue generation with Deep Learning
Synthetic dialogue generation with Deep Learning
 
Deep Learning For Practitioners, lecture 2: Selecting the right applications...
Deep Learning For Practitioners,  lecture 2: Selecting the right applications...Deep Learning For Practitioners,  lecture 2: Selecting the right applications...
Deep Learning For Practitioners, lecture 2: Selecting the right applications...
 
Deep Domain
Deep DomainDeep Domain
Deep Domain
 
Introduction to deep learning
Introduction to deep learningIntroduction to deep learning
Introduction to deep learning
 
Machine learning for IoT - unpacking the blackbox
Machine learning for IoT - unpacking the blackboxMachine learning for IoT - unpacking the blackbox
Machine learning for IoT - unpacking the blackbox
 
DeepLearning001&ApacheMXNetWithSparkForInference-ACNA2018
DeepLearning001&ApacheMXNetWithSparkForInference-ACNA2018DeepLearning001&ApacheMXNetWithSparkForInference-ACNA2018
DeepLearning001&ApacheMXNetWithSparkForInference-ACNA2018
 
OWF14 - Big Data : The State of Machine Learning in 2014
OWF14 - Big Data : The State of Machine  Learning in 2014OWF14 - Big Data : The State of Machine  Learning in 2014
OWF14 - Big Data : The State of Machine Learning in 2014
 
Engineering Intelligent NLP Applications Using Deep Learning – Part 2
Engineering Intelligent NLP Applications Using Deep Learning – Part 2 Engineering Intelligent NLP Applications Using Deep Learning – Part 2
Engineering Intelligent NLP Applications Using Deep Learning – Part 2
 
Distributed Deep Learning with Docker at Salesforce
Distributed Deep Learning with Docker at SalesforceDistributed Deep Learning with Docker at Salesforce
Distributed Deep Learning with Docker at Salesforce
 
Artificial Intelligence and Deep Learning in Azure, CNTK and Tensorflow
Artificial Intelligence and Deep Learning in Azure, CNTK and TensorflowArtificial Intelligence and Deep Learning in Azure, CNTK and Tensorflow
Artificial Intelligence and Deep Learning in Azure, CNTK and Tensorflow
 
Startup.Ml: Using neon for NLP and Localization Applications
Startup.Ml: Using neon for NLP and Localization Applications Startup.Ml: Using neon for NLP and Localization Applications
Startup.Ml: Using neon for NLP and Localization Applications
 

More from Sanghamitra Deb

odsc_2023.pdf
odsc_2023.pdfodsc_2023.pdf
odsc_2023.pdf
Sanghamitra Deb
 
Multi-modal sources for predictive modeling using deep learning
Multi-modal sources for predictive modeling using deep learningMulti-modal sources for predictive modeling using deep learning
Multi-modal sources for predictive modeling using deep learning
Sanghamitra Deb
 
Computer Vision Landscape : Present and Future
Computer Vision Landscape : Present and FutureComputer Vision Landscape : Present and Future
Computer Vision Landscape : Present and Future
Sanghamitra Deb
 
Intro to NLP: Text Categorization and Topic Modeling
Intro to NLP: Text Categorization and Topic ModelingIntro to NLP: Text Categorization and Topic Modeling
Intro to NLP: Text Categorization and Topic Modeling
Sanghamitra Deb
 
Computer Vision for Beginners
Computer Vision for BeginnersComputer Vision for Beginners
Computer Vision for Beginners
Sanghamitra Deb
 
NLP and Machine Learning for non-experts
NLP and Machine Learning for non-expertsNLP and Machine Learning for non-experts
NLP and Machine Learning for non-experts
Sanghamitra Deb
 
Democratizing NLP content modeling with transfer learning using GPUs
Democratizing NLP content modeling with transfer learning using GPUsDemocratizing NLP content modeling with transfer learning using GPUs
Democratizing NLP content modeling with transfer learning using GPUs
Sanghamitra Deb
 
Natural Language Comprehension: Human Machine Collaboration.
Natural Language Comprehension: Human Machine Collaboration.Natural Language Comprehension: Human Machine Collaboration.
Natural Language Comprehension: Human Machine Collaboration.
Sanghamitra Deb
 
Data day2017
Data day2017Data day2017
Data day2017
Sanghamitra Deb
 
Extracting knowledgebase from text
Extracting knowledgebase from textExtracting knowledgebase from text
Extracting knowledgebase from text
Sanghamitra Deb
 
Extracting medical attributes and finding relations
Extracting medical attributes and finding relationsExtracting medical attributes and finding relations
Extracting medical attributes and finding relations
Sanghamitra Deb
 
From Rocket Science to Data Science
From Rocket Science to Data ScienceFrom Rocket Science to Data Science
From Rocket Science to Data Science
Sanghamitra Deb
 
Understanding Product Attributes from Reviews
Understanding Product Attributes from ReviewsUnderstanding Product Attributes from Reviews
Understanding Product Attributes from Reviews
Sanghamitra Deb
 

More from Sanghamitra Deb (13)

odsc_2023.pdf
odsc_2023.pdfodsc_2023.pdf
odsc_2023.pdf
 
Multi-modal sources for predictive modeling using deep learning
Multi-modal sources for predictive modeling using deep learningMulti-modal sources for predictive modeling using deep learning
Multi-modal sources for predictive modeling using deep learning
 
Computer Vision Landscape : Present and Future
Computer Vision Landscape : Present and FutureComputer Vision Landscape : Present and Future
Computer Vision Landscape : Present and Future
 
Intro to NLP: Text Categorization and Topic Modeling
Intro to NLP: Text Categorization and Topic ModelingIntro to NLP: Text Categorization and Topic Modeling
Intro to NLP: Text Categorization and Topic Modeling
 
Computer Vision for Beginners
Computer Vision for BeginnersComputer Vision for Beginners
Computer Vision for Beginners
 
NLP and Machine Learning for non-experts
NLP and Machine Learning for non-expertsNLP and Machine Learning for non-experts
NLP and Machine Learning for non-experts
 
Democratizing NLP content modeling with transfer learning using GPUs
Democratizing NLP content modeling with transfer learning using GPUsDemocratizing NLP content modeling with transfer learning using GPUs
Democratizing NLP content modeling with transfer learning using GPUs
 
Natural Language Comprehension: Human Machine Collaboration.
Natural Language Comprehension: Human Machine Collaboration.Natural Language Comprehension: Human Machine Collaboration.
Natural Language Comprehension: Human Machine Collaboration.
 
Data day2017
Data day2017Data day2017
Data day2017
 
Extracting knowledgebase from text
Extracting knowledgebase from textExtracting knowledgebase from text
Extracting knowledgebase from text
 
Extracting medical attributes and finding relations
Extracting medical attributes and finding relationsExtracting medical attributes and finding relations
Extracting medical attributes and finding relations
 
From Rocket Science to Data Science
From Rocket Science to Data ScienceFrom Rocket Science to Data Science
From Rocket Science to Data Science
 
Understanding Product Attributes from Reviews
Understanding Product Attributes from ReviewsUnderstanding Product Attributes from Reviews
Understanding Product Attributes from Reviews
 

Recently uploaded

weather web application report.pdf
weather web application report.pdfweather web application report.pdf
weather web application report.pdf
Pratik Pawar
 
Railway Signalling Principles Edition 3.pdf
Railway Signalling Principles Edition 3.pdfRailway Signalling Principles Edition 3.pdf
Railway Signalling Principles Edition 3.pdf
TeeVichai
 
The Benefits and Techniques of Trenchless Pipe Repair.pdf
The Benefits and Techniques of Trenchless Pipe Repair.pdfThe Benefits and Techniques of Trenchless Pipe Repair.pdf
The Benefits and Techniques of Trenchless Pipe Repair.pdf
Pipe Restoration Solutions
 
Immunizing Image Classifiers Against Localized Adversary Attacks
Immunizing Image Classifiers Against Localized Adversary AttacksImmunizing Image Classifiers Against Localized Adversary Attacks
Immunizing Image Classifiers Against Localized Adversary Attacks
gerogepatton
 
MCQ Soil mechanics questions (Soil shear strength).pdf
MCQ Soil mechanics questions (Soil shear strength).pdfMCQ Soil mechanics questions (Soil shear strength).pdf
MCQ Soil mechanics questions (Soil shear strength).pdf
Osamah Alsalih
 
The role of big data in decision making.
The role of big data in decision making.The role of big data in decision making.
The role of big data in decision making.
ankuprajapati0525
 
Event Management System Vb Net Project Report.pdf
Event Management System Vb Net  Project Report.pdfEvent Management System Vb Net  Project Report.pdf
Event Management System Vb Net Project Report.pdf
Kamal Acharya
 
COLLEGE BUS MANAGEMENT SYSTEM PROJECT REPORT.pdf
COLLEGE BUS MANAGEMENT SYSTEM PROJECT REPORT.pdfCOLLEGE BUS MANAGEMENT SYSTEM PROJECT REPORT.pdf
COLLEGE BUS MANAGEMENT SYSTEM PROJECT REPORT.pdf
Kamal Acharya
 
Pile Foundation by Venkatesh Taduvai (Sub Geotechnical Engineering II)-conver...
Pile Foundation by Venkatesh Taduvai (Sub Geotechnical Engineering II)-conver...Pile Foundation by Venkatesh Taduvai (Sub Geotechnical Engineering II)-conver...
Pile Foundation by Venkatesh Taduvai (Sub Geotechnical Engineering II)-conver...
AJAYKUMARPUND1
 
road safety engineering r s e unit 3.pdf
road safety engineering  r s e unit 3.pdfroad safety engineering  r s e unit 3.pdf
road safety engineering r s e unit 3.pdf
VENKATESHvenky89705
 
Sachpazis:Terzaghi Bearing Capacity Estimation in simple terms with Calculati...
Sachpazis:Terzaghi Bearing Capacity Estimation in simple terms with Calculati...Sachpazis:Terzaghi Bearing Capacity Estimation in simple terms with Calculati...
Sachpazis:Terzaghi Bearing Capacity Estimation in simple terms with Calculati...
Dr.Costas Sachpazis
 
ASME IX(9) 2007 Full Version .pdf
ASME IX(9)  2007 Full Version       .pdfASME IX(9)  2007 Full Version       .pdf
ASME IX(9) 2007 Full Version .pdf
AhmedHussein950959
 
Design and Analysis of Algorithms-DP,Backtracking,Graphs,B&B
Design and Analysis of Algorithms-DP,Backtracking,Graphs,B&BDesign and Analysis of Algorithms-DP,Backtracking,Graphs,B&B
Design and Analysis of Algorithms-DP,Backtracking,Graphs,B&B
Sreedhar Chowdam
 
Quality defects in TMT Bars, Possible causes and Potential Solutions.
Quality defects in TMT Bars, Possible causes and Potential Solutions.Quality defects in TMT Bars, Possible causes and Potential Solutions.
Quality defects in TMT Bars, Possible causes and Potential Solutions.
PrashantGoswami42
 
Forklift Classes Overview by Intella Parts
Forklift Classes Overview by Intella PartsForklift Classes Overview by Intella Parts
Forklift Classes Overview by Intella Parts
Intella Parts
 
NO1 Uk best vashikaran specialist in delhi vashikaran baba near me online vas...
NO1 Uk best vashikaran specialist in delhi vashikaran baba near me online vas...NO1 Uk best vashikaran specialist in delhi vashikaran baba near me online vas...
NO1 Uk best vashikaran specialist in delhi vashikaran baba near me online vas...
Amil Baba Dawood bangali
 
Top 10 Oil and Gas Projects in Saudi Arabia 2024.pdf
Top 10 Oil and Gas Projects in Saudi Arabia 2024.pdfTop 10 Oil and Gas Projects in Saudi Arabia 2024.pdf
Top 10 Oil and Gas Projects in Saudi Arabia 2024.pdf
Teleport Manpower Consultant
 
CME397 Surface Engineering- Professional Elective
CME397 Surface Engineering- Professional ElectiveCME397 Surface Engineering- Professional Elective
CME397 Surface Engineering- Professional Elective
karthi keyan
 
Hybrid optimization of pumped hydro system and solar- Engr. Abdul-Azeez.pdf
Hybrid optimization of pumped hydro system and solar- Engr. Abdul-Azeez.pdfHybrid optimization of pumped hydro system and solar- Engr. Abdul-Azeez.pdf
Hybrid optimization of pumped hydro system and solar- Engr. Abdul-Azeez.pdf
fxintegritypublishin
 
Courier management system project report.pdf
Courier management system project report.pdfCourier management system project report.pdf
Courier management system project report.pdf
Kamal Acharya
 

Recently uploaded (20)

weather web application report.pdf
weather web application report.pdfweather web application report.pdf
weather web application report.pdf
 
Railway Signalling Principles Edition 3.pdf
Railway Signalling Principles Edition 3.pdfRailway Signalling Principles Edition 3.pdf
Railway Signalling Principles Edition 3.pdf
 
The Benefits and Techniques of Trenchless Pipe Repair.pdf
The Benefits and Techniques of Trenchless Pipe Repair.pdfThe Benefits and Techniques of Trenchless Pipe Repair.pdf
The Benefits and Techniques of Trenchless Pipe Repair.pdf
 
Immunizing Image Classifiers Against Localized Adversary Attacks
Immunizing Image Classifiers Against Localized Adversary AttacksImmunizing Image Classifiers Against Localized Adversary Attacks
Immunizing Image Classifiers Against Localized Adversary Attacks
 
MCQ Soil mechanics questions (Soil shear strength).pdf
MCQ Soil mechanics questions (Soil shear strength).pdfMCQ Soil mechanics questions (Soil shear strength).pdf
MCQ Soil mechanics questions (Soil shear strength).pdf
 
The role of big data in decision making.
The role of big data in decision making.The role of big data in decision making.
The role of big data in decision making.
 
Event Management System Vb Net Project Report.pdf
Event Management System Vb Net  Project Report.pdfEvent Management System Vb Net  Project Report.pdf
Event Management System Vb Net Project Report.pdf
 
COLLEGE BUS MANAGEMENT SYSTEM PROJECT REPORT.pdf
COLLEGE BUS MANAGEMENT SYSTEM PROJECT REPORT.pdfCOLLEGE BUS MANAGEMENT SYSTEM PROJECT REPORT.pdf
COLLEGE BUS MANAGEMENT SYSTEM PROJECT REPORT.pdf
 
Pile Foundation by Venkatesh Taduvai (Sub Geotechnical Engineering II)-conver...
Pile Foundation by Venkatesh Taduvai (Sub Geotechnical Engineering II)-conver...Pile Foundation by Venkatesh Taduvai (Sub Geotechnical Engineering II)-conver...
Pile Foundation by Venkatesh Taduvai (Sub Geotechnical Engineering II)-conver...
 
road safety engineering r s e unit 3.pdf
road safety engineering  r s e unit 3.pdfroad safety engineering  r s e unit 3.pdf
road safety engineering r s e unit 3.pdf
 
Sachpazis:Terzaghi Bearing Capacity Estimation in simple terms with Calculati...
Sachpazis:Terzaghi Bearing Capacity Estimation in simple terms with Calculati...Sachpazis:Terzaghi Bearing Capacity Estimation in simple terms with Calculati...
Sachpazis:Terzaghi Bearing Capacity Estimation in simple terms with Calculati...
 
ASME IX(9) 2007 Full Version .pdf
ASME IX(9)  2007 Full Version       .pdfASME IX(9)  2007 Full Version       .pdf
ASME IX(9) 2007 Full Version .pdf
 
Design and Analysis of Algorithms-DP,Backtracking,Graphs,B&B
Design and Analysis of Algorithms-DP,Backtracking,Graphs,B&BDesign and Analysis of Algorithms-DP,Backtracking,Graphs,B&B
Design and Analysis of Algorithms-DP,Backtracking,Graphs,B&B
 
Quality defects in TMT Bars, Possible causes and Potential Solutions.
Quality defects in TMT Bars, Possible causes and Potential Solutions.Quality defects in TMT Bars, Possible causes and Potential Solutions.
Quality defects in TMT Bars, Possible causes and Potential Solutions.
 
Forklift Classes Overview by Intella Parts
Forklift Classes Overview by Intella PartsForklift Classes Overview by Intella Parts
Forklift Classes Overview by Intella Parts
 
NO1 Uk best vashikaran specialist in delhi vashikaran baba near me online vas...
NO1 Uk best vashikaran specialist in delhi vashikaran baba near me online vas...NO1 Uk best vashikaran specialist in delhi vashikaran baba near me online vas...
NO1 Uk best vashikaran specialist in delhi vashikaran baba near me online vas...
 
Top 10 Oil and Gas Projects in Saudi Arabia 2024.pdf
Top 10 Oil and Gas Projects in Saudi Arabia 2024.pdfTop 10 Oil and Gas Projects in Saudi Arabia 2024.pdf
Top 10 Oil and Gas Projects in Saudi Arabia 2024.pdf
 
CME397 Surface Engineering- Professional Elective
CME397 Surface Engineering- Professional ElectiveCME397 Surface Engineering- Professional Elective
CME397 Surface Engineering- Professional Elective
 
Hybrid optimization of pumped hydro system and solar- Engr. Abdul-Azeez.pdf
Hybrid optimization of pumped hydro system and solar- Engr. Abdul-Azeez.pdfHybrid optimization of pumped hydro system and solar- Engr. Abdul-Azeez.pdf
Hybrid optimization of pumped hydro system and solar- Engr. Abdul-Azeez.pdf
 
Courier management system project report.pdf
Courier management system project report.pdfCourier management system project report.pdf
Courier management system project report.pdf
 

NLP and Deep Learning for non_experts

  • 1. NLP & Deep Learning for non-experts Sanghamitra Deb Staff Data Scientist Chegg Inc
  • 2. How to start projects in machine learning? • Kaggle competitions --- • Make sure to solve the ML problems for concept development before competing
  • 3. How to start projects in machine learning? • Kaggle competitions --- • Make sure to solve the ML problems for concept development before competing
  • 4. How to start projects in machine learning? • Self guided workshops/projects --- lets say you have data from Zomato • Restaurant recommendation -- user based, content similarity based. • Restaurant tags from reviews. • Sentiment analysis from reviews.
  • 5. Outline • What is NLP • Bag of Words model for sentiment analysis using scikit learn • DeepDive into deep learning • Solve the sentiment analysis problem using keras • A short into Convolution Neural Networks (CNN)
  • 6. What is Natural Language Processing? • Giving structure to unstructured data • Learn properties of the data that makes decision making simple • Provide concise information to drive intelligence of different systems.
  • 7. Why? • Unstructured data cannot be consumed directly • Automate simple and complex functionalities • Inferences from text data becomes queriable. This could help with regular BU reports • Understand customers better and take necessary actions for better experience.
  • 8. Applications • Categorization of text • Building domain specific Knowledge Graph • Recommendations • Web --- Search • HR --- people analytics • Medical --- drug discovery, automated diagnosis • ………..
  • 9. What are the underlying tasks? • Syntactic Parsing of sentences --- parsing based on structure • Part of Speech Tagging • Semantic Parsing -- mapping text directly into formal query language, e.g. SQL queries for a pre-determined database schema. • Dialogue state tracking --- chatbots • Machine Translation • Language modeling • Text extraction • Classification
  • 10. Text Classification Text Pre - processing Collecting Training Data Model Building Offline SME • Reduces noise • Ensures quality • Improves overall performance • Training Data Collection / Examples of classes that we are trying to model • Model performance is directly correlated with quality of training data • Model selection • Architecture • Parameter Tuning User Online Model Evaluation
  • 11. Text Data Data Source -- https://archive.ics.uci.edu/ml/datasets/Sentiment+Labelled+Sentences
  • 12. Model Building: a simple Bag of words (BOW) model https://realpython.com/python-keras-text-classification/
  • 13. Model Building: a simple BOW model https://realpython.com/python-keras-text-classification/
  • 14. Deep Learning Deep learning algorithms seek to exploit the unknown structure in the input distribution in order to discover good representations, often at multiple levels, with higher-level learned features defined in terms of lower-level features. --- Yoshua Bengio a kind of learning where the representation you form have several levels of abstraction, rather than a direct input to output --- Peter Norvig When you hear the term deep learning, just think of a large deep neural net. Deep refers to the number of layers typically and so this kind of the popular term that’s been adopted in the press. I think of them as deep neural networks generally. --- Andrew Ng
  • 15. Why now? • Explosion in labelled data. • Exponential growth in computation power with cloud computing and availability of GPUs • Improvements in setting initial conditions and activation functions
  • 16. Neural Network Simulate the brain and get neurons densely interconnected in a computer such that it can learn things, recognize patterns and take decisions?
  • 17. Neural Network Simulate the brain and get neurons densely interconnected in a computer such that it can learn things, recognize patterns and take decisions? What is a neuron?
  • 18. Neural Network Simulate the brain and get neurons densely interconnected in a computer such that it can learn things, recognize patterns and take decisions? What is a neuron?
  • 22. Neural Network • Each node is a function with input and output vectors • Every network structure is defined by a set of functions
  • 24. • Loss is minimized using Gradient Descent • Find network parameters such that the loss is minimized • This is done by taking derivatives of the loss wrt parameters. • Next the parameters are updated by subtracting learning rate times the derivative
  • 25. Commonly used loss functions • Mean Squared Error Loss • Mean Squared Logarithmic Error Loss • Mean Absolute Error Loss Regression Loss Functions • Binary Cross-Entropy • Hinge Loss • Squared Hinge Loss Binary Classification Loss Functions • Multi-Class Cross-Entropy Loss • Sparse Multiclass Cross-Entropy Loss • Kullback Leibler Divergence Loss Multi-Class Classification Loss Functions
  • 27. Dropout -- avoid overfitting • Large weights in a neural network are a sign of a more complex network that has overfit the training data. • Probabilistically dropping out nodes in the network is a simple and effective regularization method. • A large network with more training and the use of a weight constraint are suggested when using dropout.
  • 29. Adam Optimization • adaptive moment estimation • The method computes individual adaptive learning rates for different parameters from estimates of first and second moments of the gradients. • Calculates an exponential moving average of the gradient and the squared gradient, parameters control the decay rates of these moving averages. https://machinelearningmastery.com/adam-optimization-algorithm-for-deep-learning/
  • 30. Activation Functions • Sigmoid/ Softmax • Tanh • Relu • Leaky Relu • Swish
  • 31. Activation Functions • Sigmoid/ Softmax • Tanh • Relu • Leaky Relu • Swish
  • 32. Activation Functions • Sigmoid/ Softmax • Tanh • Relu • Leaky Relu • Swish Derivative
  • 33. Activation Functions • Sigmoid/ Softmax • Tanh • Relu • Leaky Relu • Swish Derivative a = max(0,z)
  • 34. Activation Functions • Sigmoid/ Softmax • Tanh • Relu • Leaky Relu • Swish
  • 35. Activation Functions • Sigmoid/ Softmax • Tanh • Relu • Leaky Relu • Swish https://arxiv.org/abs/1710.05941v1
  • 37. Text Classification using feed forward NN https://realpython.com/python-keras-text-classification/
  • 38. Text Classification using feed forward NN
  • 39. Fit & measure accuracy! plot_history(history) Clearly overfits the data!
  • 40. Can we do better? Word Embeddings • Words are represented as dense vectors • These vectors are • Learned during the training task by the neural network • Pre-trained, learned from Language Models • Encode the semantic meaning of the word.
  • 41. Text Pre-processing with Keras PaddingTokenizing
  • 42. Start with an Embedding Layer • Embedding Layer of Keras which takes the previously calculated integers and maps them to a dense vector of the embedding. o Parameters Ø input_dim: the size of the vocabulary Ø output_dim: the size of the dense vector Ø input_length: the length of the sequence Hope to see you soon Nice to see you again After training https://stats.stackexchange.com/questions/270546/how-does-keras-embedding-layer-work
  • 43. Add a pooling layer • MaxPooling1D/AveragePooling1D or a GlobalMaxPooling1D/GlobalAveragePooling1D layer • way to downsample (a way to reduce the size of) the incoming feature vectors. • Global max/average pooling takes the maximum/average of all features whereas in the other case you have to define the pool size.
  • 45. Training Using pre-trained word embeddings will lead to an accuracy of 0.82. This is a case of transfer learning. https://realpython.com/python-keras-text-classification
  • 46. Embeddings + Maxpooling -- Benifits • Power of generalization --- embeddings are able to share information across similar features. • Fewer nodes with zero values.
  • 47. Convolution Neural Network Detect features ! Downsample.
  • 48. What is a CNN? In a traditional feedforward neural network we connect each input neuron to each output neuron in the next layer. That’s also called a fully connected layer, or affine layer. • We use convolutions over the input layer to compute the output. This results in local connections, where each region of the input is connected to a neuron in the output. Each layer applies different filters and combines the result • During the training phase, a CNN automatically learns the values of its filters based on the task you want to perform. Tricky --- dimensions keep changing as we go from one layer to another
  • 50. Advantages of CNN • Character Based CNN • Has the ability to deal with out of vocabulary words. This makes it particularly suitable for user generated raw text. • Works for multiple languages. • Model size is small since the tokens are limited to the number of characters ~ 70. This makes real life deployments easier and faster. • Networks with convolutional and pooling layers are useful for classification tasks in which we expect to find strong local clues regarding class membership.
  • 51. Takeaways! • If you have text data you need to use NLP • Try a simple bag of words model for your data • Having a high level understanding of deep learning will help with better judgement in architecture design and choice of parameters. • Deep Learning has the potential to give high performance, you do need large amount of training data for the benefits.
  • 53. Visualization of the architecture 50 10 GlobalMaxPool1D DenseLayer Sigmoid Cov1D
  • 57. Character Based CNNs. https://papers.nips.cc/paper/5782-character-level-convolutional-networks-for-text-classification.pdf • Embedding Layer • Six convolutional layers, and 3 convolutional layers followed by a max pooling layer • Two fully connected layer(dense layer in keras), neuron units are 1024. • Output layer(dense layer), neuron units depends on classes. In this task, we set it 4.