PyData Tel Aviv

Kfir Bar,
Chief Scientist, Basis Technology
Named Entity Recognition

Named entity recognition (NER)
Automatically find names of people, organizations, locations, and more in text across many languages.

According to Elon Musk, Mars rocket will fly
‘short flights’ next year.

Context is important
Edward Adelson
Neuroscientist, MIT
Checker shadow illusion
The squares represented by A and B
are of the same color

But sometimes it gets ambiguous...
Can't play Spain? Improve your playing via easy step-by-step video lessons!


NER as a sequence-labeling problem
➔ Processing one word after another
➔ Assigning a label to each word, based on local as well as global features
➔ Labels are B-PER, I-PER, B-LOC, I-LOC, OTHER, etc. (a.k.a. IOB tagging)
I/O am/O working/O for/O Basis/B-ORG Technology/I-ORG
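
As a minimal sketch of how IOB labels can be read back into entities, the following groups each B- tag with the I- tags that follow it. The helper name `iob_to_spans` is hypothetical, and the input is the example sentence from this slide, where O plays the role of OTHER.

```python
def iob_to_spans(tokens, labels):
    """Group IOB-labeled tokens into (entity_type, text) spans."""
    spans = []
    current_type, current_tokens = None, []
    for token, label in zip(tokens, labels):
        if label.startswith("B-"):
            # A B- tag always opens a new entity, closing any open one.
            if current_tokens:
                spans.append((current_type, " ".join(current_tokens)))
            current_type, current_tokens = label[2:], [token]
        elif label.startswith("I-") and current_type == label[2:]:
            # An I- tag of the same type extends the open entity.
            current_tokens.append(token)
        else:
            # O/OTHER (or a stray I- tag) closes any open entity.
            if current_tokens:
                spans.append((current_type, " ".join(current_tokens)))
            current_type, current_tokens = None, []
    if current_tokens:
        spans.append((current_type, " ".join(current_tokens)))
    return spans


tagged = "I/O am/O working/O for/O Basis/B-ORG Technology/I-ORG"
tokens, labels = zip(*(pair.rsplit("/", 1) for pair in tagged.split()))
entities = iob_to_spans(tokens, labels)
print(entities)  # [('ORG', 'Basis Technology')]
```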

Traditional ML vs. Deep Learning
Traditional ML:
I love this movie
➔ Feature extraction: words, part-of-speech tags, lemmas, Brown clusters
➔ Vectorization: [00010010110000101001…..001]
➔ Modeling
➔ ☺ Positive
Deep learning:
I love this movie
➔ Embeddings lookup:
[0.323, -0.3434, 0.901, …, -0.267]
[-0.4923, 0.554, 0.001, …, -0.365]
[1.58845, 0.478, 0.0901, …, -0.171]
…
[-0.0592, 0.588, -0.01, …, -0.111]
➔ Modeling
➔ ☺ Positive
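
To make the contrast concrete, here is a small sketch of the two input representations: a sparse binary vector built from hand-chosen features, and a dense vector per word from an embedding table. The feature inventory, the dimension of 4, and the random values are all toy assumptions; trained embeddings would replace the random numbers.

```python
import random

sentence = "I love this movie".split()

# Traditional pipeline: hand-built features mapped to a sparse binary vector.
# The feature inventory is a toy assumption (word identity only).
feature_index = {"word=I": 0, "word=love": 1, "word=this": 2,
                 "word=movie": 3, "word=hate": 4}
sparse = [0] * len(feature_index)
for word in sentence:
    sparse[feature_index[f"word={word}"]] = 1

# Deep learning pipeline: each word is looked up in a dense embedding table.
# Random values stand in for trained embeddings.
rng = random.Random(0)
dim = 4
embeddings = {w: [rng.uniform(-1, 1) for _ in range(dim)] for w in sentence}
dense = [embeddings[w] for w in sentence]

print(sparse)                     # one slot per known feature
print(len(dense), len(dense[0]))  # one dense vector per word
```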

Word embeddings
Berlin - Germany + Japan = Tokyo
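
The capital-city analogy can be sketched with vector arithmetic and a nearest-neighbor search by cosine similarity. The four-dimensional vectors below are hand-made for illustration (three "country" dimensions and one "capital" dimension); they are not real embeddings, and the function name `analogy` is hypothetical.

```python
import math

# Hand-made toy vectors: [german, japanese, french, is_capital].
vectors = {
    "Germany": [1, 0, 0, 0], "Berlin": [1, 0, 0, 1],
    "Japan":   [0, 1, 0, 0], "Tokyo":  [0, 1, 0, 1],
    "France":  [0, 0, 1, 0], "Paris":  [0, 0, 1, 1],
}

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def analogy(a, b, c):
    """Return the word closest to vec(a) - vec(b) + vec(c), excluding the inputs."""
    query = [x - y + z for x, y, z in zip(vectors[a], vectors[b], vectors[c])]
    candidates = (w for w in vectors if w not in (a, b, c))
    return max(candidates, key=lambda w: cosine(query, vectors[w]))

result = analogy("Berlin", "Germany", "Japan")
print(result)  # Tokyo
```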

Feed forward network for NER
[Figure: the input words (listen, to, while, I, Spain) feed Layer 1 and Layer 2; the output layer scores the labels B-PER, I-PER, B-LOC, ...]
Recurrent neural network (RNN)
[Figure: the input words (listen, to, while, I, Spain) feed a single recurrent layer (Layer 1); the output layer scores the labels B-PER, I-PER, B-LOC, ...]
➔ At each time step we process one word concatenated with the output from previous time steps
➔ It remembers information for many time steps
[Figure: the network unrolled over time steps t-1, t, t+1, emitting B-PER, I-PER, OTHER.]
Long Short Term Memory (LSTM)
➔ It can forget information when necessary
[Figure: LSTM cells unrolled over time steps t-1, t, t+1, emitting B-PER, I-PER, OTHER.]
LSTM for Sequence Labeling
[Figure: one LSTM cell per word, each emitting a label.]
Washington/B-PER said/OTHER in/OTHER Chicago/B-LOC last/OTHER ...
Bidirectional LSTM for Sequence Labeling
[Figure: each word feeds two LSTM cells, one per reading direction; their outputs are combined (+) to produce the word's label.]
Washington/B-PER said/OTHER in/OTHER Chicago/B-LOC last/OTHER ...
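
The bidirectional idea can be sketched as two independent recurrent passes whose per-word states are combined. Two simplifications are assumed here: a plain tanh recurrent cell stands in for an LSTM to keep the sketch short, and the "+" of the diagram is read as concatenation, which the slide does not specify. All sizes and values are toy stand-ins.

```python
import numpy as np

rng = np.random.default_rng(0)
embed_dim, hidden_dim = 5, 4   # toy sizes (assumptions)

def make_cell():
    """A plain tanh recurrent cell, standing in for an LSTM."""
    W = rng.normal(scale=0.1, size=(hidden_dim, embed_dim + hidden_dim))
    return lambda x, h: np.tanh(W @ np.concatenate([x, h]))

def run(cell, inputs):
    """Run a cell over a sequence and return the state after each input."""
    h, states = np.zeros(hidden_dim), []
    for x in inputs:
        h = cell(x, h)
        states.append(h)
    return states

sentence = ["Washington", "said", "in", "Chicago", "last"]
inputs = [rng.normal(size=embed_dim) for _ in sentence]  # stand-in embeddings

forward = run(make_cell(), inputs)               # reads left to right
backward = run(make_cell(), inputs[::-1])[::-1]  # reads right to left, realigned to word order

# Per-word combination of both directions (here: concatenation).
combined = [np.concatenate([f, b]) for f, b in zip(forward, backward)]
print(len(combined), combined[0].shape)
```

With this layout, the vector for "Washington" contains a state that has already seen "said in Chicago last" from the backward pass, which is what lets the right-hand context influence its label.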

Multilayer LSTM for Sequence Labeling
[Figure: further bidirectional LSTM layers, each with its own combination step (+), are stacked on top of the first.]
Washington/B-PER said/OTHER in/OTHER Chicago/B-LOC last/OTHER
Alternative decoding using Conditional Random Fields (CRF)
[Figure: the bidirectional LSTM outputs for all words feed a CRF layer that emits the label sequence as a whole.]
Washington said in Chicago last ...
B-PER OTHER OTHER B-LOC OTHER
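
CRF decoding chooses the best label sequence jointly rather than word by word, typically with the Viterbi algorithm. The sketch below is a generic Viterbi decoder over per-word scores plus label-to-label transition scores; the scores are invented for illustration, and in a real model the per-word scores would come from the LSTM and the transitions would be learned.

```python
def viterbi(emissions, transitions, labels):
    """Return the highest-scoring label sequence.

    emissions: list of dicts, one per word, mapping label -> score
    transitions: dict mapping (prev_label, label) -> score (missing pairs score 0)
    """
    best = [{lab: emissions[0][lab] for lab in labels}]
    back = []
    for scores in emissions[1:]:
        current, pointers = {}, {}
        for lab in labels:
            # Best previous label for arriving at `lab` at this position.
            prev = max(labels, key=lambda p: best[-1][p] + transitions.get((p, lab), 0.0))
            current[lab] = best[-1][prev] + transitions.get((prev, lab), 0.0) + scores[lab]
            pointers[lab] = prev
        best.append(current)
        back.append(pointers)
    # Follow the back-pointers from the best final label.
    path = [max(labels, key=lambda lab: best[-1][lab])]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    return path[::-1]


labels = ["B-LOC", "I-LOC", "OTHER"]
# Invented scores for the two words "in Chicago".
emissions = [
    {"B-LOC": 0.1, "I-LOC": 0.2, "OTHER": 2.0},
    {"B-LOC": 1.0, "I-LOC": 1.2, "OTHER": 0.3},
]
# An I- tag directly after OTHER is invalid in IOB, so it gets a large penalty.
transitions = {("OTHER", "I-LOC"): -10.0, ("B-LOC", "I-LOC"): 0.5}

greedy = [max(e, key=e.get) for e in emissions]
decoded = viterbi(emissions, transitions, labels)
print(greedy)   # ['OTHER', 'I-LOC']  (invalid sequence)
print(decoded)  # ['OTHER', 'B-LOC']
```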

Character encoding
[Figure: the same bidirectional LSTM with CRF decoding, where each word's input is combined (+) with an encoding of its characters, e.g. s a i d.]
Washington said in Chicago last ...
B-PER OTHER OTHER B-LOC OTHER
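
One way to picture the character encoding: every word gets a second vector computed from its characters, which is joined to its word embedding before entering the network. In this sketch, averaging the character vectors stands in for whatever character-level encoder the model actually uses (the slide does not say), and all vectors are random stand-ins. The names `lookup` and `encode_word` are hypothetical.

```python
import random

rng = random.Random(0)
char_dim, word_dim = 3, 4   # toy sizes (assumptions)
char_table, word_table = {}, {}

def lookup(table, key, dim):
    """Return the vector for key, creating a random stand-in on first use."""
    if key not in table:
        table[key] = [rng.uniform(-1, 1) for _ in range(dim)]
    return table[key]

def encode_word(word):
    """Word embedding joined with a character-based vector for the same word."""
    chars = [lookup(char_table, ch, char_dim) for ch in word]
    # Averaging stands in for the character-level encoder (an assumption).
    char_vector = [sum(column) / len(chars) for column in zip(*chars)]
    return lookup(word_table, word, word_dim) + char_vector  # list concatenation

encoded = encode_word("said")
print(len(encoded))  # word_dim + char_dim
```

Because the character vector is built from letters, an unseen word still receives a non-trivial representation from characters it shares with known words.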

Overall: better accuracy in multiple languages for NER, using deep learning!

                      English   Arabic   Korean
Deep learning model   91.3      83.3     86.4
Traditional model     89.3      80.3     80.7

https://developer.rosette.com/

What does LSTM actually learn?

Bidirectional LSTM for NER
[Figure: the bidirectional LSTM shown earlier, with one cell vector highlighted across the words.]
Washington/B-PER said/OTHER in/OTHER Chicago/B-LOC last/OTHER ...
Let’s look at this cell vector over time
Neuron 280 - gets positive around some punctuation marks

Neuron 189 - gets negative around potential locations

Questions?
Thank you!
kfir@basistech.com
@kfirbar
