Natural language processing
Parsing and understanding language
Full Day of Applied AI
Morning
Session 1 Intro to Artificial Intelligence
09:00-09:45 Introduction to Applied AI
09:45-10:00 Coffee and break
Session 2 Live Coding a machine learning app
10:00-10:10 Getting your machine ready for machine learning
10:10-10:20 Training and evaluating the model
10:20-10:50 Improving the model
10:50-11:00 Coffee and break
Session 3 Machine learning in the wild - deployment
11:00-11:15 Coding exercise continued
11:15-11:45 Serving your own machine learning model | Code
11:45-11:55 How to solve problems | interactive exercise
11:55-12:00 Q and A
Lunch
12:00-13:00 Lunch
Afternoon
Session 4 Hello World Deep Learning (MNIST)
13:00-13:15 Deep Learning intro
13:00-13:15 Image recognition and CNNs | Talk |
13:15-13:45 Building your own convolutional neural network | Code |
13:45-14:00 Coffee and break
Session 5 Natural Language Processing
14:00-14:30 Natural language processing | Talk |
14:30-14:45 Working with language | Code |
14:45-15:00 Coffee and break
Session 6 Conversational interfaces and Time Series
15:00-15:20 Conversational interfaces
15:20-15:45 Time Series prediction
15:45-16:00 Coffee and break
Session 7 Generative models and style transfer
16:00-16:30 Generative models | Talk |
16:30-16:45 Trying out GANs and style transfer | Code |
16:45-17:00 Coffee and break
Anton Osika AI Research Engineer Sana Labs AB
anton.osika@gmail.com
Birger Moëll Machine Learning Engineer
birger.moell@gmail.com
Deep learning for speech
State-of-the-art machine learning models can understand human speech well enough to perform fairly complicated actions based on spoken commands.
Natural Language Processing
Text to speech (WaveNet, state of the art: https://deepmind.com/blog/wavenet-generative-model-raw-audio)
Speech to text (Alexa, Google Home, Google Cloud Speech-to-Text: https://cloud.google.com/speech-to-text/; see the sketch below)
Text to vector (GloVe, Word2Vec, BERT)
Natural language generation (LSTMs, generative models, GPT-2)
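To make the speech-to-text bullet concrete, here is a minimal sketch using the google-cloud-speech Python client (pip install google-cloud-speech). The file name, audio format and credentials setup are illustrative assumptions, not part of the original slides.

# Minimal speech-to-text sketch with the Google Cloud Speech client.
# Assumes GOOGLE_APPLICATION_CREDENTIALS is set and "command.wav" is a
# hypothetical 16 kHz mono LINEAR16 WAV file.
from google.cloud import speech

client = speech.SpeechClient()

with open("command.wav", "rb") as f:
    audio = speech.RecognitionAudio(content=f.read())

config = speech.RecognitionConfig(
    encoding=speech.RecognitionConfig.AudioEncoding.LINEAR16,
    sample_rate_hertz=16000,
    language_code="en-US",
)

response = client.recognize(config=config, audio=audio)
for result in response.results:
    # each result holds one or more alternative transcripts, best first
    print(result.alternatives[0].transcript)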
Word vectors
Word vectors are high-dimensional representations of language in which each word is assigned a vector based on how close it is to other words in the training text. This gives the model a representation of language that also reflects the biases present in that text and the way word usage changes over time.
Word vector algebra
We can now compare how similar words are, and even do arithmetic on meanings: for example, king - man + woman ≈ queen.
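As a concrete illustration of word-vector similarity and "word algebra", here is a minimal sketch using pre-trained GloVe vectors through gensim's downloader API (pip install gensim). The specific model name and example words are assumptions chosen for illustration.

# Minimal word-vector sketch with gensim and pre-trained GloVe vectors.
import gensim.downloader as api

vectors = api.load("glove-wiki-gigaword-100")   # 100-dimensional GloVe vectors

# how similar are two words? (cosine similarity between their vectors)
print(vectors.similarity("cat", "dog"))

# nearest neighbours in the vector space
print(vectors.most_similar("stockholm", topn=3))

# word-vector algebra: king - man + woman ≈ queen
print(vectors.most_similar(positive=["king", "woman"], negative=["man"], topn=1))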
How are word vectors used in ML?
Answer: Transfer learning
Using a model pre-trained on large text databases and then fine-tuning it on the desired task
has revolutionized NLP
Attention is all you need
The progress in NLP has largely been based on the attention mechanism from "Attention Is All You Need" (https://arxiv.org/abs/1706.03762).
As opposed to directional models, an attention-based model reads the entire sequence of words at once instead of sequentially.
It is therefore often described as "bidirectional", though it would be more accurate to say that it is "non-directional".
The ImageNet moment for NLP
October 2018: BERT arrives
BERT is hailed as an ImageNet moment for natural language processing: https://thegradient.pub/nlp-imagenet/
BERT paper: https://arxiv.org/abs/1810.04805
New state of the art on most NLP tasks
https://rajpurkar.github.io/SQuAD-explorer/explore/1.1/dev/Harvard_University.html?model=BERT%20(ensemble)%20(Google%20AI%20Language)&version=1.1
Training BERT
How BERT works
BERT uses attention to compute how much each word should be combined with every other word when computing their collective meaning.
http://jalammar.github.io/illustrated-bert/
Query and Key vectors "attend" to Value vectors
The Query, Key and Value vectors give each word knowledge about how much it should be combined with the other words.
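To make the Query/Key/Value idea concrete, here is a minimal NumPy sketch of scaled dot-product attention. The toy dimensions and random projection matrices are illustrative assumptions and are not taken from BERT itself.

# Minimal scaled dot-product attention in NumPy.
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def scaled_dot_product_attention(Q, K, V):
    # scores[i, j]: how much word i should attend to word j
    scores = Q @ K.T / np.sqrt(K.shape[-1])
    weights = softmax(scores, axis=-1)      # each row sums to 1
    return weights @ V, weights

rng = np.random.default_rng(0)
x = rng.normal(size=(3, 4))                 # 3 "words", 4-dimensional embeddings
Wq, Wk, Wv = (rng.normal(size=(4, 4)) for _ in range(3))

out, attn = scaled_dot_product_attention(x @ Wq, x @ Wk, x @ Wv)
print(attn)   # 3x3 matrix: how much each word is combined with the others
print(out)    # new, context-mixed representation of each word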
Attention visualized
The trainable Query, Key and
Value vectors result in what we
call “attention”.
The BERT encoder uses stacked attention layers that perform the attention computation many times in parallel.
Output from the encoder is directly
mapped to the predicted output sequence.
Stacked attention heads
Fine-tuning on the desired task
Several implementations of BERT exist that give us access to BERT tensors to work with.
However, in order to use BERT for language tasks we need to train a classifier for our specific problem.
This can be done with a classifier (for example a neural network) that takes BERT-encoded word vectors, together with their labels, as input (for classification); a sketch follows after the links below.
TensorFlow model: https://github.com/google-research/bert
PyTorch model: https://github.com/huggingface/pytorch-pretrained-BERT
BERT as a service: https://github.com/hanxiao/bert-as-service
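As a concrete version of the classifier idea above, here is a minimal sketch that uses bert-as-service (linked above) to obtain BERT sentence vectors and scikit-learn for the classifier. The toy texts, labels and model directory are made-up examples, and a plain logistic regression stands in for the neural network mentioned on the slide.

# Minimal "classifier on top of BERT vectors" sketch.
# Assumes a bert-as-service server is already running, e.g.:
#   bert-serving-start -model_dir /path/to/uncased_L-12_H-768_A-12 -num_worker=1
from bert_serving.client import BertClient
from sklearn.linear_model import LogisticRegression

texts = ["this movie was wonderful", "what a great film",
         "terribly boring", "I hated every minute"]   # hypothetical toy data
labels = [1, 1, 0, 0]                                  # 1 = positive, 0 = negative

bc = BertClient()
features = bc.encode(texts)                # (n_sentences, 768) BERT sentence vectors

clf = LogisticRegression(max_iter=1000)    # simple classifier on top of BERT features
clf.fit(features, labels)

print(clf.predict(bc.encode(["a truly great movie"])))   # expected: [1]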
GPT-2
GPT-2 is a model created by OpenAI that improves the state of the art in language generation.
Because of fears of malicious use, the full model was not initially released.
https://talktotransformer.com/
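For readers who want to try GPT-2 locally rather than through talktotransformer.com, here is a minimal generation sketch using the Hugging Face transformers library; the library choice, model size and prompt are assumptions, not part of the original deck.

# Minimal GPT-2 text-generation sketch (pip install transformers torch).
from transformers import GPT2LMHeadModel, GPT2Tokenizer

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")   # the publicly released small model
model = GPT2LMHeadModel.from_pretrained("gpt2")

prompt = "Natural language processing is"
input_ids = tokenizer.encode(prompt, return_tensors="pt")

# sample a continuation instead of greedy decoding, so the text varies each run
output = model.generate(input_ids, max_length=50, do_sample=True, top_k=50)
print(tokenizer.decode(output[0], skip_special_tokens=True))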
Let's get started coding
Open up the notebooks inside Natural Language Processing to train your own deep neural network for natural language processing.
Almost human level...