THE WONDERS OF DEEP LEARNING: HOW TO LEVERAGE IT FOR NLP
DATAXDAY 2018
Paris
17/05/2018
DR. ANA PELETEIRO RAMALLO
DATA SCIENCE DIRECTOR
@PeleteiroAna
@TendamRetail
@DataXDay
Founding year: 1880
Physical shops: 2,000
Countries: 89
Employees: 10,394
Own shops: 1,299
Franchises: 683
DEEP LEARNING FOR NLP
Deep learning is having a transformative impact in many areas where machine learning has been
applied.
NLP was somewhat behind other fields in terms of adopting deep learning for applications.
However, this has changed over the last few years, thanks to the use of RNNs, specifically LSTMs,
as well as word embeddings.
Deep learning can benefit many distinct NLP tasks, such as named entity recognition, machine translation, language modelling, parsing, chunking and POS tagging, amongst others.
WORD EMBEDDINGS
Representing words as ids:
• Encodings are arbitrary.
• No information about the relationship between words.
• Data sparsity.
https://www.tensorflow.org/tutorials/word2vec
Better representation for words.
Words in a continuous vector space where semantically similar words are mapped to nearby points.
Learn dense embedding vectors.
Skip-gram and CBOW
• CBOW predicts target words from the context. E.g., Tendam ?? Talk
• Skip-gram predicts source context-words from the target words. E.g., ?? conference ??
Standard preprocessing step for NLP.
Also used as features in other approaches, both supervised (e.g., classification) and unsupervised (e.g., clustering).
Several parameters we can experiment with, e.g., the size of the word embedding or the context window.
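The difference between CBOW and skip-gram is easiest to see in the training pairs each one builds. The sketch below is only illustrative: the function name `training_pairs` and the three-word sentence (guessed from the "Tendam ?? Talk" and "?? conference ??" examples) are assumptions, and no embedding is actually trained here.

```python
def training_pairs(tokens, window=1):
    """Return (cbow_pairs, skipgram_pairs) for one tokenized sentence.

    `window` is the context window: how many words on each side of the
    target count as its context.
    """
    cbow, skipgram = [], []
    for i, target in enumerate(tokens):
        left = tokens[max(0, i - window):i]
        right = tokens[i + 1:i + 1 + window]
        context = left + right
        if not context:
            continue
        # CBOW: the whole context predicts the target word.
        cbow.append((context, target))
        # Skip-gram: the target word predicts each context word.
        skipgram.extend((target, c) for c in context)
    return cbow, skipgram


# Hypothetical sentence, assumed from the examples on the slide.
cbow, skipgram = training_pairs(["Tendam", "conference", "talk"], window=1)
for pair in cbow:
    print("CBOW     ", pair)
for pair in skipgram:
    print("skip-gram", pair)
```

A larger `window` gives each word more context, at the cost of more training pairs.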
CHARACTER EMBEDDINGS
Word embeddings are able to capture syntactic and semantic information.
For tasks such as POS tagging and NER this is not enough:
• They do not capture intra-word morphological and shape information, such as sub-token patterns (suffixes, prefixes), etc.
• Out-of-vocabulary (OOV) word issue.
• In some languages, text is not composed of separated words but of individual characters (e.g., Chinese).
We can overcome these problems by using character embeddings.
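As a rough illustration of why character embeddings help with OOV words, the sketch below builds a word vector from per-character vectors, so any string gets a representation. Everything here is an assumption made for the example: the vectors are random and untrained, `DIM` and `word_vector` are hypothetical names, and plain averaging stands in for the character-level CNN or RNN that would normally combine the characters.

```python
import random
import string

random.seed(0)
DIM = 4  # hypothetical embedding size

# One randomly initialized (untrained) vector per lowercase character.
char_table = {
    ch: [random.uniform(-1.0, 1.0) for _ in range(DIM)]
    for ch in string.ascii_lowercase
}
UNKNOWN = [0.0] * DIM  # fallback for characters outside the table


def word_vector(word):
    """Average the character vectors of `word` into one word vector."""
    vectors = [char_table.get(ch, UNKNOWN) for ch in word.lower()]
    if not vectors:
        return list(UNKNOWN)
    return [sum(column) / len(vectors) for column in zip(*vectors)]


# A word never seen in any word-level vocabulary still gets a vector.
print(word_vector("jumpsuits"))
```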
CNNs in NLP
CNNs: effectiveness in computer vision tasks.
Ability to extract salient n-gram features from the input sentence to create an informative latent semantic representation of the sentence for downstream tasks.
Several tasks: sentence classification, summarization.
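The n-gram feature extraction described above can be sketched as a single convolution filter slid over the word vectors of a sentence, followed by max-over-time pooling. This is a minimal illustration, not a full model: the function name, the toy 2-dimensional "embeddings" and the filter weights are all made up for the example.

```python
def conv1d_max_pool(embeddings, kernel):
    """Apply one n-gram filter to a sentence and max-pool over positions.

    embeddings: list of word vectors, one per token.
    kernel: list of n weight vectors, so the filter spans n consecutive words.
    """
    n = len(kernel)
    if len(embeddings) < n:
        raise ValueError("sentence is shorter than the filter width")
    scores = []
    for start in range(len(embeddings) - n + 1):
        window = embeddings[start:start + n]
        score = sum(
            w * x
            for kernel_row, word_vec in zip(kernel, window)
            for w, x in zip(kernel_row, word_vec)
        )
        scores.append(max(0.0, score))  # ReLU non-linearity
    # Max-over-time pooling keeps the strongest n-gram response.
    return max(scores)


# Toy 4-word sentence with 2-dimensional word vectors (made-up values).
sentence = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.0, 0.0]]
bigram_filter = [[1.0, 0.0], [0.0, 1.0]]
print(conv1d_max_pool(sentence, bigram_filter))
```

A real sentence classifier would use many such filters of several widths and feed the pooled values to a classification layer.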
RECURRENT NEURAL NETWORKS
Why not basic Deep Nets or CNNs?
Traditional neural networks and CNNs do not use information from the past: each entry is treated as independent.
This is fine for several applications, such as classifying images.
However, several applications, such as video or language modelling, rely on what has happened in the past to predict the future.
Recurrent Neural Networks (RNNs) are capable of conditioning the model on previous units in the corpus.
They are also capable of handling inputs of arbitrary length.
RNNs
Make use of sequential information.
Output is dependent on the previous information.
An RNN shares the same parameter W at each step, so there are fewer parameters to learn.
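The parameter sharing can be made concrete with a minimal forward pass of a vanilla RNN, h_t = tanh(W_xh x_t + W_hh h_{t-1}). This is a sketch under simplifying assumptions (no bias, no output layer, plain Python lists); the names `rnn_forward`, `w_xh` and `w_hh` are chosen for the example.

```python
import math


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(a * b for a, b in zip(row, vector)) for row in matrix]


def rnn_forward(inputs, w_xh, w_hh, h0):
    """Run a vanilla RNN over a sequence and return every hidden state.

    The same w_xh and w_hh are reused at every time step, which is why
    the sequence can have any length without adding parameters.
    """
    h = h0
    states = []
    for x in inputs:
        pre = [a + b for a, b in zip(matvec(w_xh, x), matvec(w_hh, h))]
        h = [math.tanh(p) for p in pre]
        states.append(h)
    return states


# Toy weights: 1-dimensional input, 2-dimensional hidden state.
w_xh = [[1.0], [0.0]]
w_hh = [[0.0, 0.0], [1.0, 0.0]]
states = rnn_forward([[1.0], [0.0]], w_xh, w_hh, h0=[0.0, 0.0])
print(states)  # the second state still carries information from the first input
```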
http://cs224d.stanford.edu/lectures/CS224d-Lecture8.pdf
http://torch.ch/blog/2016/07/25/nce.html
RNNs (II)
In theory, RNNs are absolutely capable of handling such long-term dependencies.
Practice is "a bit" different.
Parameters are shared by all time steps in the network, so the gradient at each output depends not only on the calculations of the current time step, but also on those of the previous time steps.
Exploding gradients:
• Easier to spot.
• Clip the gradient to a maximum.
Vanishing gradients:
• Harder to identify.
• Initialization of the matrix to the identity matrix.
• ReLUs instead of sigmoid.
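The remedies listed here are small enough to sketch directly. The code below assumes clipping by the L2 norm of the gradient, which is one common way to "clip to a maximum" (clipping each component to a range is another); the function names are chosen for the example.

```python
import math


def clip_by_norm(grad, max_norm):
    """Rescale `grad` so that its L2 norm is at most `max_norm`."""
    norm = math.sqrt(sum(g * g for g in grad))
    if norm <= max_norm:
        return list(grad)
    scale = max_norm / norm
    return [g * scale for g in grad]


def identity_matrix(n):
    """Identity initialization for the recurrent weight matrix."""
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]


def relu(x):
    """ReLU activation, used in place of the sigmoid."""
    return max(0.0, x)


print(clip_by_norm([3.0, 4.0], max_norm=1.0))  # direction kept, norm reduced to 1
print(identity_matrix(2))
print(relu(-2.0), relu(2.0))
```

Clipping keeps the direction of the update and only limits its size, which is why it addresses exploding gradients but does nothing for vanishing ones.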
The oversized mannish coats looked positively edible over the bun-skimming dresses while combined with novelty knitwear such as punk-like fisherman's sweaters. As other look, the ballet pink Elizabeth and James jacket provides a cozy cocoon for the 20-year-old to top off her ensemble of a T-shirt and Parker Smith jeans. But I have to admit that my favorite is the bun-skimming dresses with the ??
• In theory, RNNs can handle such long-term dependencies.
• However, in practice, they struggle to learn them.
• LSTMs and GRUs are designed to avoid the long-term dependency problem.
• They remove or add information to the cell state, carefully regulated by structures called gates.
• Gates are a way to optionally let information through.
LSTMs
http://cs224d.stanford.edu/lecture_notes/notes4.pdf
http://colah.github.io/posts/2015-08-Understanding-LSTMs/
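To make the role of the gates concrete, here is a minimal sketch of one step of a standard LSTM cell in plain Python. The parameter names (`Wf`, `Wi`, `Wo`, `Wg` and the matching biases) and the dictionary layout are assumptions made for this example, not taken from the slides.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def affine(weights, vector, bias):
    """Compute weights @ vector + bias with plain lists."""
    return [sum(w * v for w, v in zip(row, vector)) + b
            for row, b in zip(weights, bias)]


def lstm_step(x, h_prev, c_prev, p):
    """One LSTM time step; returns the new hidden state and cell state."""
    z = x + h_prev  # concatenate the input with the previous hidden state
    f = [sigmoid(a) for a in affine(p["Wf"], z, p["bf"])]    # forget gate
    i = [sigmoid(a) for a in affine(p["Wi"], z, p["bi"])]    # input gate
    o = [sigmoid(a) for a in affine(p["Wo"], z, p["bo"])]    # output gate
    g = [math.tanh(a) for a in affine(p["Wg"], z, p["bg"])]  # candidate values
    # Gates take values in (0, 1): they decide how much old information is
    # kept in the cell state and how much new information is added.
    c = [fk * ck + ik * gk for fk, ck, ik, gk in zip(f, c_prev, i, g)]
    h = [ok * math.tanh(ck) for ok, ck in zip(o, c)]
    return h, c


# Toy cell: 1-dimensional input and state, all weights and biases zero,
# so every gate is exactly half open.
zeros = {name: [[0.0, 0.0]] for name in ("Wf", "Wi", "Wo", "Wg")}
zeros.update({name: [0.0] for name in ("bf", "bi", "bo", "bg")})
h, c = lstm_step([1.0], [0.0], [2.0], zeros)
print(h, c)
```

A GRU follows the same idea with two gates and no separate cell state.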
GRUs
http://cs224d.stanford.edu/lecture_notes/notes4.pdf
RNN architectures
http://karpathy.github.io/2015/05/21/rnn-effectiveness/
ATTENTION MECHANISM
http://www.wildml.com/2016/01/attention-and-memory-in-deep-learning-and-nlp/
https://medium.com/@Synced/a-brief-overview-of-attention-mechanism-13c578ba9129
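As a rough sketch of the idea behind attention, the code below scores each encoder state against a query, turns the scores into weights with a softmax, and returns the weighted sum as a context vector. Dot-product scoring is an assumption made for the example (other scoring functions are also used), and the names `attend`, `query` and `states` are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(query, states):
    """Return (weights, context) for one query over a list of states."""
    scores = [sum(q * s for q, s in zip(query, state)) for state in states]
    weights = softmax(scores)
    dim = len(states[0])
    context = [
        sum(w * state[d] for w, state in zip(weights, states))
        for d in range(dim)
    ]
    return weights, context


# Two toy encoder states; the query is much closer to the first one.
states = [[1.0, 0.0], [0.0, 1.0]]
weights, context = attend([10.0, 0.0], states)
print(weights, context)
```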
APPLICATIONS
Word level classification: NER
Sentence classification: tweet sentiment polarity. Semantic matching between texts
Text classification
Language modelling
Speech recognition
Caption generation
Machine translation
Document summarization
Question answering
EX1: TEXT GENERATION
All text from Shakespeare (4.4MB)
3-layer RNN with 512 hidden nodes on each layer.
http://karpathy.github.io/2015/05/21/rnn-effectiveness/
https://github.com/martin-gorner/tensorflow-rnn-shakespeare
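Once such a character-level model is trained, text is generated by repeatedly sampling the next character from its output distribution. The sketch below shows only that sampling loop with a temperature parameter; `toy_logits` is a hypothetical stand-in for the trained RNN, and the three-character vocabulary is made up.

```python
import math
import random

VOCAB = ["a", "b", " "]


def toy_logits(text):
    """Stand-in for a trained RNN: strongly prefers alternating a and b."""
    last = text[-1] if text else " "
    return [
        5.0 if last != "a" else 0.0,  # score for "a"
        5.0 if last == "a" else 0.0,  # score for "b"
        0.0,                          # score for " "
    ]


def sample_next(logits, temperature, rng):
    """Sample an index from softmax(logits / temperature); temperature > 0."""
    scaled = [value / temperature for value in logits]
    m = max(scaled)
    weights = [math.exp(s - m) for s in scaled]
    return rng.choices(range(len(logits)), weights=weights, k=1)[0]


def generate(seed, length, temperature=1.0, rng=random):
    """Extend `seed` by `length` sampled characters."""
    text = seed
    for _ in range(length):
        index = sample_next(toy_logits(text), temperature, rng)
        text += VOCAB[index]
    return text


print(generate("a", 20, temperature=1.0, rng=random.Random(0)))
print(generate("a", 20, temperature=0.01))  # low temperature: near-greedy output
```

Lower temperatures make the output more conservative and repetitive, while higher ones make it more varied and more error-prone.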
Q&A
Pedro del Hierro SS18
How can I help you today?
I was wondering what is trending this spring
This spring is all about new wave slip, in for example jumpsuits
Is that appropriate for a work dinner?
Yes, it totally works! I would recommend you to use this chilly oil jumpsuit. You can combine it with a dark brown belt and cherry tomato heels. All from Pedro del Hierro
That sounds great!
PLENTY OF RESOURCES OUT THERE!
• https://distill.pub/2016/misread-tsne/
• http://www.wildml.com
• https://arxiv.org/pdf/1708.02709.pdf
• http://www.jmlr.org/papers/volume12/collobert11a/collobert11a.pdf
• http://colah.github.io/posts/2015-08-Understanding-LSTMs/
• https://nlp.stanford.edu/courses/NAACL2013/
• http://cs224d.stanford.edu/syllabus.html
• https://github.com/kjw0612/awesome-rnn
• https://lvdmaaten.github.io/tsne/
• https://github.com/oxford-cs-deepnlp-2017
THANKS!
@PeleteiroAna