Building Named Entity Recognition Models Efficiently using NERDS
Sujit Pal, Elsevier Labs
December 2019.
About me
• Work at Elsevier Labs
• (Mostly self-taught) data scientist
• Mostly work with Deep Learning, Machine Learning, Natural Language Processing, and Search.
• Got interested in Named Entity Recognition (NER) and NERDS as part of Search and Knowledge Graph development.
I am NOT the author or maintainer of NERDS!
• Originally built by Panagiotis Eustratiadis.
• See CONTRIBUTORS.md for list of contributors.
• Open sourced by Elsevier July 3, 2018.
Agenda
• What can NER do for you?
• Evolution of NER techniques
• NERDS Architecture.
• NERDS Usage.
• Future Work.
What can NER do for you?
• In general…
  • Foundational task for NLP pipelines.
  • Good NERs available out of the box (OOB) for “standard” named entities.
  • Topic Modeling, Co-reference Resolution, etc.
• Information Retrieval (IR)
  • Chunk entities into meaningful multi-word phrases.
  • Understanding query intent.
• Automated Knowledge Graph Construction (AKBC)
  • NER extracts entities from incoming text.
  • Relationship Extraction extracts relationships between entity pairs.
  • Entity-Relationship triples are inserted into the Knowledge Graph.
ConceptSearch!
Evolution of NER Techniques
• Traditional
  • Rules
  • Regular Expressions
  • Gazetteers
• Statistical
  • Word-based models – PMI, log-likelihood.
  • Sequence models – Conditional Random Fields
• Neural
  • Bi-LSTM
  • Bi-LSTM+CRF
  • Transformer based models
Input Format – BIO Tagging
• BIO – Begin, Inside, Outside.
• Barack/B-PER Obama/I-PER is/O 44th/O United/B-LOC States/I-LOC President/O ./O
• BILOU – a tagging variant that adds two tags:
  • U – Unit token (for single-token entities)
  • L – Last token in a multi-token entity, e.g. Barack/B-PER Obama/L-PER
Barack B-PER
Obama I-PER
is O
44th O
United B-LOC
States I-LOC
President O
. O
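The relationship between the two tagging schemes can be sketched in a few lines of Python. This is an illustrative helper, not part of NERDS; the function name bio_to_bilou is an assumption, and it expects well-formed BIO input.

```python
def bio_to_bilou(tags):
    """Convert one sentence of BIO tags to BILOU tags."""
    bilou = []
    for i, tag in enumerate(tags):
        if tag == "O":
            bilou.append(tag)
            continue
        prefix, label = tag.split("-", 1)
        next_tag = tags[i + 1] if i + 1 < len(tags) else "O"
        # The entity continues only if the next tag is I- with the same label.
        continues = next_tag == "I-" + label
        if prefix == "B":
            bilou.append(("B-" if continues else "U-") + label)
        else:  # prefix == "I"
            bilou.append(("I-" if continues else "L-") + label)
    return bilou


tags = ["B-PER", "I-PER", "O", "O", "B-LOC", "I-LOC", "O", "O"]
print(bio_to_bilou(tags))
# ['B-PER', 'L-PER', 'O', 'O', 'B-LOC', 'L-LOC', 'O', 'O']
```

A single-token entity such as ["B-PER", "O"] comes out as ["U-PER", "O"].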
Gazetteer – Aho Corasick
• Create in-memory data structure from dictionary.
• Stream content against data structure.
• Multiple matches with single pass.
[Figure: Aho-Corasick automaton with states 0 to 5 over the tokens Barack, Obama, United, States, and Airlines, a NOT(Barack, United) transition, and entity labels PER, LOC, ORG.]
Aho, A.V., and Corasick, M.J., 1975. Efficient String Matching: An aid to bibliographic search
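A minimal sketch of the gazetteer idea, written as a token-level Aho-Corasick automaton in plain Python rather than with pyAhoCorasick. The dictionary entries (including "United Airlines" as ORG, guessed from the figure) and the names build_automaton and find_entities are assumptions made for illustration.

```python
from collections import deque


def build_automaton(dictionary):
    """Build goto, failure, and output tables from {token tuple: label}."""
    goto, fail, out = [{}], [0], [[]]
    for phrase, label in dictionary.items():
        state = 0
        for token in phrase:
            if token not in goto[state]:
                goto.append({})
                fail.append(0)
                out.append([])
                goto[state][token] = len(goto) - 1
            state = goto[state][token]
        out[state].append((len(phrase), label))
    # Breadth-first pass fills in failure links, shallow states first.
    queue = deque(goto[0].values())
    while queue:
        state = queue.popleft()
        for token, nxt in goto[state].items():
            queue.append(nxt)
            f = fail[state]
            while f and token not in goto[f]:
                f = fail[f]
            fail[nxt] = goto[f].get(token, 0)
            out[nxt] = out[nxt] + out[fail[nxt]]
    return goto, fail, out


def find_entities(tokens, automaton):
    """Stream tokens through the automaton once; return (start, end, label)."""
    goto, fail, out = automaton
    state, matches = 0, []
    for i, token in enumerate(tokens):
        while state and token not in goto[state]:
            state = fail[state]
        state = goto[state].get(token, 0)
        for length, label in out[state]:
            matches.append((i - length + 1, i + 1, label))
    return matches


dictionary = {
    ("Barack", "Obama"): "PER",
    ("United", "States"): "LOC",
    ("United", "Airlines"): "ORG",
}
automaton = build_automaton(dictionary)
tokens = "Barack Obama is 44th United States President .".split()
print(find_entities(tokens, automaton))
# [(0, 2, 'PER'), (4, 6, 'LOC')]
```

Spans are token indices with an exclusive end. All dictionary entries are matched in one pass over the input, which is the property the slide highlights.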
Sequence Modeling - CRF
• Sequence version of logistic regression.
• Computes optimum labeling l = (y0, …, yn) over entire sentence s.
• Build multiple feature functions f on each token, each returning a real value in range 0..1.
Function parameters:
  • sentence s with tokens (x0, …, xn) – feature can use any token, the entire sentence, or functions computed over the sentence (POS),
  • current position i,
  • current and previous labels yi and yi-1.
• Optimum labeling maximizes the weighted sum of feature values over the sentence; probability computed using softmax over all labelings.
• Weights wj learned using gradient descent.
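The scoring described on this slide can be written out as follows. This is the standard linear-chain CRF formulation, assumed here because the slide's own equation did not survive extraction; each feature function is taken to see the current label and the previous label, with a conventional start label standing in for the label before position 0.

```latex
\begin{align}
\mathrm{score}(l \mid s) &= \sum_{j} \sum_{i=0}^{n} w_j \, f_j(s, i, y_i, y_{i-1}) \\
p(l \mid s) &= \frac{\exp\bigl(\mathrm{score}(l \mid s)\bigr)}
                    {\sum_{l'} \exp\bigl(\mathrm{score}(l' \mid s)\bigr)} \\
l^{*} &= \arg\max_{l} \; p(l \mid s)
\end{align}
```

The denominator sums over every possible labeling l' of the sentence, which is what makes this a softmax over whole sequences rather than over individual tokens.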
Neural Model - BiLSTM
• Input is sequence of tokens, output is sequence of BIO tags.
• Weights trained end-to-end, no feature engineering needed.
• Bidirectional LSTM gets signal from neighboring words on both sides.
[Figure: BiLSTM tagger. Input tokens: Barack Obama is 44th United States President . Output tags: B-PER I-PER O O B-LOC I-LOC O O]
Neural Model – BiLSTM-CRF
• Same as previous model, with additional CRF layer.
• No feature engineering for CRF, unlike CRF only NER model.
• Pre-trained embeddings observed to improve performance.
[Figure: Bi-LSTM layer followed by a CRF layer. Input tokens: Barack Obama is 44th United States President . Output tags: B-PER I-PER O O B-LOC I-LOC O O]
Neural Model – adding char embeddings
• Concatenate char embedding + word embedding and feed to Bi-LSTM-CRF.
• All weights learned end-to-end.
• Handles rare / unknown words; Exploits signal in prefix/suffix.
[Figure: word embeddings and char LSTM/CNN embeddings are concatenated and fed to the Bi-LSTM-CRF. Input tokens: Barack Obama is 44th United States President . Output tags: B-PER I-PER O O B-LOC I-LOC O O]
Neural Model – ELMo preprocessing
[Figure: contextualized word embeddings and char LSTM/CNN embeddings are concatenated and fed to the Bi-LSTM-CRF. Input tokens: Barack Obama is 44th United States President . Output tags: B-PER I-PER O O B-LOC I-LOC O O]
Neural Model – Transformer based
• BERT = Bidirectional Encoder Representations from Transformers.
• Source of embeddings similar to ELMo in standard BiLSTM + CRF models, OR
• Fine-tune LM-backed NERs such as HuggingFace’s BertForTokenClassification.
[Figure: BERT token classifier. Input tokens: [CLS] Barack Obama is 44th United States President . Output tags: B-PER I-PER O O B-LOC I-LOC O O]
More Info on NER Techniques
• High level overview on NER in series of blog posts by Tobias Sterbak (https://bit.ly/2pNdgPG).
• Traditional NER techniques covered in paper by Rahul Sharnagat (2014) – Named Entity Recognition: A Literature Survey (https://bit.ly/2NRaCAg).
• Introduction to Neural Models in paper by Ronan Collobert and Jason Weston (2008) – A Unified Architecture for Natural Language Processing: Deep Neural Networks with Multitask Learning (https://bit.ly/32rRYnO).
• Others (more modern papers) mentioned in slides.
NERDS Overview
• Framework that provides easy to use NER capabilities to Data Scientists.
• Wraps various popular third party NER models.
• Extensible, new third party NER tools can be added as needed.
• Software Engineering tooling to boost Data Science productivity.
• Looking for support, bug reports, contributions, and ideas.
Unification through I/O Format
Wrapped tools: pyAhoCorasick, CRFSuite, SpaCy NER, Anago BiLSTM
AnnotatedDocument(
doc: Document("Barack Obama is 44th United States President ."),
annotations: [
Annotation(start_offset: 0, end_offset: 12, text: "Barack Obama", label: "PER"),
Annotation(start_offset: 21, end_offset: 34, text: "United States", label: "LOC")
])
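To make the offset-based format concrete, here is a small sketch that turns character-offset annotations into per-token BIO tags. The Annotation dataclass below is a hypothetical stand-in with the fields shown on the slide, not the actual NERDS class, and the helpers make_annotation and annotations_to_bio are assumptions. Offsets are computed from the text itself, with an exclusive end offset and whitespace tokenization.

```python
from dataclasses import dataclass


@dataclass
class Annotation:
    """Hypothetical stand-in for the slide's Annotation (end is exclusive)."""
    start_offset: int
    end_offset: int
    text: str
    label: str


def make_annotation(text, phrase, label):
    """Locate the first occurrence of phrase in text and annotate it."""
    start = text.index(phrase)
    return Annotation(start, start + len(phrase), phrase, label)


def annotations_to_bio(text, annotations):
    """Whitespace-tokenize text and emit one BIO tag per token."""
    tokens, tags = [], []
    pos = 0
    for token in text.split():
        start = text.index(token, pos)  # character offset of this token
        end = start + len(token)
        pos = end
        tag = "O"
        for ann in annotations:
            if start >= ann.start_offset and end <= ann.end_offset:
                tag = ("B-" if start == ann.start_offset else "I-") + ann.label
                break
        tokens.append(token)
        tags.append(tag)
    return tokens, tags


text = "Barack Obama is 44th United States President ."
annotations = [
    make_annotation(text, "Barack Obama", "PER"),
    make_annotation(text, "United States", "LOC"),
]
tokens, tags = annotations_to_bio(text, annotations)
print(tags)
# ['B-PER', 'I-PER', 'O', 'O', 'B-LOC', 'I-LOC', 'O', 'O']
```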
Benefits of Unification
• Consistent API – all models are subclasses of NERModel.
• Data prep. done once per project and reused across multiple models.
• Reusable Training and Evaluation code.
• Familiar Scikit-Learn like API, and access to Scikit-Learn utility functions.
• Duck-typing allows us to build Ensembles of NER.
• Easy to benchmark NER label data.
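The payoff of a shared fit/predict interface can be illustrated with two toy models and one evaluation function. Everything below is a hypothetical sketch: the class names are invented, they do not subclass the real NERModel, and token accuracy is used only because it is short to write.

```python
from collections import Counter


class MostFrequentTagNER:
    """Toy baseline: tags each token with the tag it carried most often
    in training, or "O" if the token was never seen."""

    def fit(self, X, y):
        counts = {}
        for tokens, tags in zip(X, y):
            for token, tag in zip(tokens, tags):
                counts.setdefault(token, Counter())[tag] += 1
        self.best_tag_ = {tok: c.most_common(1)[0][0] for tok, c in counts.items()}
        return self

    def predict(self, X):
        return [[self.best_tag_.get(tok, "O") for tok in tokens] for tokens in X]


class AllOutsideNER:
    """Second toy model: predicts "O" for every token."""

    def fit(self, X, y):
        return self

    def predict(self, X):
        return [["O"] * len(tokens) for tokens in X]


def token_accuracy(model, X, y):
    """Shared evaluation code: works for any object with predict(X)."""
    predictions = model.predict(X)
    pairs = [(p, t) for ps, ts in zip(predictions, y) for p, t in zip(ps, ts)]
    return sum(p == t for p, t in pairs) / len(pairs)


X = [["Barack", "Obama", "is", "44th", "United", "States", "President", "."]]
y = [["B-PER", "I-PER", "O", "O", "B-LOC", "I-LOC", "O", "O"]]
for model in (MostFrequentTagNER(), AllOutsideNER()):
    print(type(model).__name__, token_accuracy(model.fit(X, y), X, y))
# MostFrequentTagNER 1.0
# AllOutsideNER 0.5
```

Because the loop only relies on fit and predict, the same data preparation and evaluation code can be reused for any model that exposes those two methods.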
Can we do better?
Data: [["Barack", "Obama", "is", "44th", "United", "States", "President", "."]]
Labels and Predictions: [["B-PER", "I-PER", "O", "O", "B-LOC", "I-LOC", "O", "O"]]
[Figure: DictionaryNER, CrfNER, SpacyNER, and BiLstmCrfNER all consume the token lists and BIO tag lists above; the diagram shows two I/O Convert steps.]
ELMo NER Model from Anago
[Figure: ElmoNER added alongside DictionaryNER, CrfNER, SpacyNER, and BiLstmCrfNER, all sharing the token list and BIO tag list format; the diagram shows two I/O Convert steps.]
Dataset
• Bio Entity recognition task from BioNLP 2004.
• Training and Test sets provided in BIO format.
• 511,097 training examples
• 104,895 test examples.
• Entity Distribution (training set)
• 25,307 DNA
• 2,481 RNA
• 11,217 cell_line
• 15,466 cell_type
• 55,117 protein
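A sketch of how a BIO-format file could be loaded into the token and label lists used by the models. The exact file layout is an assumption here: one token and its tag per line separated by whitespace, with a blank line between sentences. The function names read_bio and count_entities are illustrative, and the sample data is the running example rather than BioNLP 2004 content.

```python
from collections import Counter


def read_bio(lines):
    """Parse token/tag lines into parallel lists of sentences."""
    data, labels = [], []
    tokens, tags = [], []
    for line in lines:
        line = line.strip()
        if not line:  # a blank line ends the current sentence
            if tokens:
                data.append(tokens)
                labels.append(tags)
                tokens, tags = [], []
            continue
        token, tag = line.rsplit(None, 1)  # split on the last whitespace run
        tokens.append(token)
        tags.append(tag)
    if tokens:  # flush the last sentence if the input lacks a trailing blank
        data.append(tokens)
        labels.append(tags)
    return data, labels


def count_entities(labels):
    """Count entities per type by counting B- tags."""
    return Counter(tag[2:] for tags in labels for tag in tags if tag.startswith("B-"))


sample = "Barack\tB-PER\nObama\tI-PER\nis\tO\n\nUnited\tB-LOC\nStates\tI-LOC\n"
data, labels = read_bio(sample.splitlines())
print(data)    # [['Barack', 'Obama', 'is'], ['United', 'States']]
print(count_entities(labels))
```

With a real file, the same function can be called as read_bio(open(path, encoding="utf-8")).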
Dictionary NER
• Wraps pyAhoCorasick Automaton
• Improvements in fork.
• Supports dictionary loading as well as fit(X, y) like other NER models.
• Handles multiple entity classes.
CRF NER
• Wraps sklearn_crfsuite CRF
• Improvements in this fork:
• Removes NLTK dependency, replaces with SpaCy.
• Allows non-default features to be passed in.
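For readers unfamiliar with CRF feature engineering, here is a sketch of what a custom feature extractor might look like. sklearn-crfsuite represents each token as a dict of feature names to values; the particular features and the function names below are illustrative assumptions, not the defaults used by NERDS.

```python
def token_features(tokens, i):
    """Build a feature dict for the token at position i."""
    token = tokens[i]
    features = {
        "bias": 1.0,
        "lower": token.lower(),
        "suffix3": token[-3:],
        "is_title": token.istitle(),
        "is_digit": token.isdigit(),
    }
    # Neighboring words give the CRF context beyond the current token.
    if i > 0:
        features["prev_lower"] = tokens[i - 1].lower()
    else:
        features["BOS"] = True  # beginning of sentence
    if i < len(tokens) - 1:
        features["next_lower"] = tokens[i + 1].lower()
    else:
        features["EOS"] = True  # end of sentence
    return features


def sentence_features(tokens):
    """One feature dict per token, in order."""
    return [token_features(tokens, i) for i in range(len(tokens))]


for features in sentence_features(["Barack", "Obama", "is"]):
    print(features)
```

A list of such per-sentence feature lists, paired with the BIO tag lists, is the shape of input a CRF of this kind is trained on.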
SpaCy NER
• Wraps NER provided by SpaCy toolkit.
• Improvements in this fork:
• More robust to large data sizes, uses mini-batches for training.
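The mini-batch idea is simple enough to show in plain Python. spaCy ships its own batching helper (spacy.util.minibatch); the generator below is only a library-free stand-in to show the pattern, and its name and parameters are assumptions.

```python
import random


def minibatches(examples, batch_size, seed=None):
    """Shuffle the examples and yield them in fixed-size chunks."""
    items = list(examples)
    random.Random(seed).shuffle(items)
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]


for batch in minibatches(range(10), batch_size=4, seed=0):
    # A real training loop would update the model once per batch here.
    print(len(batch))
# 4
# 4
# 2
```

Feeding the trainer a few examples at a time keeps memory use bounded regardless of the total number of training sentences.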
BiLSTM CRF NER
• Wraps Anago BiLSTMCRF.
• Improvements in this fork:
• Works against latest release (1.0.5) of Anago.
• No more intermittent failures due to time step mismatches.
Elmo NER
• Wraps Anago ELModel.
• New in this fork, available in current (dev) version of Anago.
• Needs (mandatory) base embedding for ELMo preprocessor.
Ensemble NER
• Max Voting
• Improvements in this fork:
  • Unifies Max Voting and Weighted Max Voting NERs into single model.
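A sketch of how max voting and weighted max voting can share one code path: plain max voting is the special case where every model has weight 1. The function name vote and its signature are assumptions for illustration, not the EnsembleNER API.

```python
from collections import Counter


def vote(predictions, weights=None):
    """Combine per-model predictions token by token.

    predictions holds one list of tagged sentences per model.
    """
    if weights is None:
        weights = [1.0] * len(predictions)  # unweighted max voting
    ensemble = []
    for sentence_preds in zip(*predictions):      # same sentence, each model
        tags = []
        for token_tags in zip(*sentence_preds):   # same token, each model
            scores = Counter()
            for tag, weight in zip(token_tags, weights):
                scores[tag] += weight
            tags.append(max(scores, key=scores.get))
        ensemble.append(tags)
    return ensemble


model_a = [["B-PER", "I-PER", "O"]]
model_b = [["B-PER", "O", "O"]]
model_c = [["O", "I-PER", "O"]]
print(vote([model_a, model_b, model_c]))
# [['B-PER', 'I-PER', 'O']]
print(vote([model_a, model_b, model_c], weights=[1, 1, 3]))
# [['O', 'I-PER', 'O']]
```

The second call shows a limitation of voting per token: the result can contain an I- tag with no preceding B- tag, so a real ensemble may need a repair step afterwards.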
Results (OOTB)
• Comparison across models
  • ELMo-based CRF has best performance.
  • SpaCy and BiLSTM have comparable performance, but CRF is competitive.
  • Model-based NERs outperform gazetteers.
  • F1-scores range from 0.65 to 0.80
• Comparison across entity types
  • Some correlation observed between data volume and F1-scores for other models.
  • F1-scores range from 0.61 to 0.81
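For reference, one common way to compute an F1-score for NER is at the entity level, where a prediction counts only if both the span and the label match exactly. The slides do not say which variant was used for the numbers above, so the sketch below is an assumption about the metric, with illustrative function names.

```python
def extract_spans(tags):
    """Return the set of (start, end, label) entity spans in one BIO sequence."""
    spans, start, label = set(), None, None
    for i, tag in enumerate(list(tags) + ["O"]):  # sentinel closes a trailing span
        inside = tag.startswith("I-") and tag[2:] == label
        if start is not None and not inside:
            spans.add((start, i, label))
            start, label = None, None
        if tag.startswith("B-"):
            start, label = i, tag[2:]
    return spans


def f1_score(y_true, y_pred):
    """Micro-averaged entity-level F1 over all sentences."""
    tp = fp = fn = 0
    for true_tags, pred_tags in zip(y_true, y_pred):
        gold, pred = extract_spans(true_tags), extract_spans(pred_tags)
        tp += len(gold & pred)
        fp += len(pred - gold)
        fn += len(gold - pred)
    if tp == 0:
        return 0.0
    precision, recall = tp / (tp + fp), tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


y_true = [["B-PER", "I-PER", "O", "O", "B-LOC", "I-LOC", "O", "O"]]
y_pred = [["B-PER", "I-PER", "O", "O", "B-LOC", "O", "O", "O"]]
print(f1_score(y_true, y_pred))
# 0.5
```

In the example the PER span matches exactly, while the truncated LOC span counts as both a false positive and a false negative, giving precision and recall of 0.5 each.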
Future Work
• Current API is only superficially Scikit-Learn like; convert models to fully conform to the Scikit-Learn Classifier API.
• Eliminate serialization issues reported by joblib.Parallel.
• Eliminate EnsembleNER in favor of Scikit-Learn’s VotingClassifier.
• Leverage Scikit-Learn’s Model Selection classes (RandomizedSearchCV and GridSearchCV).
• Add FLAIR and BERT based NER to supported model collection.
• BRAT annotation adapter.
Thank you
https://github.com/sujitpal/nerds
sujit.pal@elsevier.com

More Related Content

What's hot

BERT: Bidirectional Encoder Representations from Transformers
BERT: Bidirectional Encoder Representations from TransformersBERT: Bidirectional Encoder Representations from Transformers
BERT: Bidirectional Encoder Representations from Transformers
Liangqun Lu
 
Deep Learning for Natural Language Processing: Word Embeddings
Deep Learning for Natural Language Processing: Word EmbeddingsDeep Learning for Natural Language Processing: Word Embeddings
Deep Learning for Natural Language Processing: Word Embeddings
Roelof Pieters
 
Deep learning for real life applications
Deep learning for real life applicationsDeep learning for real life applications
Deep learning for real life applications
Anas Arram, Ph.D
 
NAMED ENTITY RECOGNITION
NAMED ENTITY RECOGNITIONNAMED ENTITY RECOGNITION
NAMED ENTITY RECOGNITION
live_and_let_live
 
Word2Vec
Word2VecWord2Vec
Word2Vec
hyunyoung Lee
 
Autoencoders
AutoencodersAutoencoders
Autoencoders
CloudxLab
 
Deep Natural Language Processing for Search and Recommender Systems
Deep Natural Language Processing for Search and Recommender SystemsDeep Natural Language Processing for Search and Recommender Systems
Deep Natural Language Processing for Search and Recommender Systems
Huiji Gao
 
NLP State of the Art | BERT
NLP State of the Art | BERTNLP State of the Art | BERT
NLP State of the Art | BERT
shaurya uppal
 
Large Language Models - From RNN to BERT
Large Language Models - From RNN to BERTLarge Language Models - From RNN to BERT
Large Language Models - From RNN to BERT
ATPowr
 
An introduction to the Transformers architecture and BERT
An introduction to the Transformers architecture and BERTAn introduction to the Transformers architecture and BERT
An introduction to the Transformers architecture and BERT
Suman Debnath
 
Introduction to PyTorch
Introduction to PyTorchIntroduction to PyTorch
Introduction to PyTorch
Jun Young Park
 
Natural Language Processing with Python
Natural Language Processing with PythonNatural Language Processing with Python
Natural Language Processing with Python
Benjamin Bengfort
 
Lecture 4: Transformers (Full Stack Deep Learning - Spring 2021)
Lecture 4: Transformers (Full Stack Deep Learning - Spring 2021)Lecture 4: Transformers (Full Stack Deep Learning - Spring 2021)
Lecture 4: Transformers (Full Stack Deep Learning - Spring 2021)
Sergey Karayev
 
Deep learning for NLP and Transformer
 Deep learning for NLP  and Transformer Deep learning for NLP  and Transformer
Deep learning for NLP and Transformer
Arvind Devaraj
 
Python Seaborn Data Visualization
Python Seaborn Data Visualization Python Seaborn Data Visualization
Python Seaborn Data Visualization
Sourabh Sahu
 
Recurrent neural networks rnn
Recurrent neural networks   rnnRecurrent neural networks   rnn
Recurrent neural networks rnn
Kuppusamy P
 
Bert
BertBert
Neural Learning to Rank
Neural Learning to RankNeural Learning to Rank
Neural Learning to Rank
Bhaskar Mitra
 
Recurrent Neural Networks, LSTM and GRU
Recurrent Neural Networks, LSTM and GRURecurrent Neural Networks, LSTM and GRU
Recurrent Neural Networks, LSTM and GRU
ananth
 
Tutorial on word2vec
Tutorial on word2vecTutorial on word2vec
Tutorial on word2vec
Leiden University
 

What's hot (20)

BERT: Bidirectional Encoder Representations from Transformers
BERT: Bidirectional Encoder Representations from TransformersBERT: Bidirectional Encoder Representations from Transformers
BERT: Bidirectional Encoder Representations from Transformers
 
Deep Learning for Natural Language Processing: Word Embeddings
Deep Learning for Natural Language Processing: Word EmbeddingsDeep Learning for Natural Language Processing: Word Embeddings
Deep Learning for Natural Language Processing: Word Embeddings
 
Deep learning for real life applications
Deep learning for real life applicationsDeep learning for real life applications
Deep learning for real life applications
 
NAMED ENTITY RECOGNITION
NAMED ENTITY RECOGNITIONNAMED ENTITY RECOGNITION
NAMED ENTITY RECOGNITION
 
Word2Vec
Word2VecWord2Vec
Word2Vec
 
Autoencoders
AutoencodersAutoencoders
Autoencoders
 
Deep Natural Language Processing for Search and Recommender Systems
Deep Natural Language Processing for Search and Recommender SystemsDeep Natural Language Processing for Search and Recommender Systems
Deep Natural Language Processing for Search and Recommender Systems
 
NLP State of the Art | BERT
NLP State of the Art | BERTNLP State of the Art | BERT
NLP State of the Art | BERT
 
Large Language Models - From RNN to BERT
Large Language Models - From RNN to BERTLarge Language Models - From RNN to BERT
Large Language Models - From RNN to BERT
 
An introduction to the Transformers architecture and BERT
An introduction to the Transformers architecture and BERTAn introduction to the Transformers architecture and BERT
An introduction to the Transformers architecture and BERT
 
Introduction to PyTorch
Introduction to PyTorchIntroduction to PyTorch
Introduction to PyTorch
 
Natural Language Processing with Python
Natural Language Processing with PythonNatural Language Processing with Python
Natural Language Processing with Python
 
Lecture 4: Transformers (Full Stack Deep Learning - Spring 2021)
Lecture 4: Transformers (Full Stack Deep Learning - Spring 2021)Lecture 4: Transformers (Full Stack Deep Learning - Spring 2021)
Lecture 4: Transformers (Full Stack Deep Learning - Spring 2021)
 
Deep learning for NLP and Transformer
 Deep learning for NLP  and Transformer Deep learning for NLP  and Transformer
Deep learning for NLP and Transformer
 
Python Seaborn Data Visualization
Python Seaborn Data Visualization Python Seaborn Data Visualization
Python Seaborn Data Visualization
 
Recurrent neural networks rnn
Recurrent neural networks   rnnRecurrent neural networks   rnn
Recurrent neural networks rnn
 
Bert
BertBert
Bert
 
Neural Learning to Rank
Neural Learning to RankNeural Learning to Rank
Neural Learning to Rank
 
Recurrent Neural Networks, LSTM and GRU
Recurrent Neural Networks, LSTM and GRURecurrent Neural Networks, LSTM and GRU
Recurrent Neural Networks, LSTM and GRU
 
Tutorial on word2vec
Tutorial on word2vecTutorial on word2vec
Tutorial on word2vec
 

Similar to Building Named Entity Recognition Models Efficiently using NERDS

SoDA v2 - Named Entity Recognition from streaming text
SoDA v2 - Named Entity Recognition from streaming textSoDA v2 - Named Entity Recognition from streaming text
SoDA v2 - Named Entity Recognition from streaming text
Sujit Pal
 
Using Static Binary Analysis To Find Vulnerabilities And Backdoors in Firmware
Using Static Binary Analysis To Find Vulnerabilities And Backdoors in FirmwareUsing Static Binary Analysis To Find Vulnerabilities And Backdoors in Firmware
Using Static Binary Analysis To Find Vulnerabilities And Backdoors in Firmware
Lastline, Inc.
 
Евгений Бобров "Powered by OSS. Масштабируемая потоковая обработка и анализ б...
Евгений Бобров "Powered by OSS. Масштабируемая потоковая обработка и анализ б...Евгений Бобров "Powered by OSS. Масштабируемая потоковая обработка и анализ б...
Евгений Бобров "Powered by OSS. Масштабируемая потоковая обработка и анализ б...
Fwdays
 
An Introduction to NLP4L
An Introduction to NLP4LAn Introduction to NLP4L
An Introduction to NLP4L
Koji Sekiguchi
 
Grammarly AI-NLP Club #6 - Sequence Tagging using Neural Networks - Artem Che...
Grammarly AI-NLP Club #6 - Sequence Tagging using Neural Networks - Artem Che...Grammarly AI-NLP Club #6 - Sequence Tagging using Neural Networks - Artem Che...
Grammarly AI-NLP Club #6 - Sequence Tagging using Neural Networks - Artem Che...
Grammarly
 
Elasticsearch Basics
Elasticsearch BasicsElasticsearch Basics
Elasticsearch Basics
Shifa Khan
 
ESWC SS 2012 - Wednesday Tutorial Barry Norton: Building (Production) Semanti...
ESWC SS 2012 - Wednesday Tutorial Barry Norton: Building (Production) Semanti...ESWC SS 2012 - Wednesday Tutorial Barry Norton: Building (Production) Semanti...
ESWC SS 2012 - Wednesday Tutorial Barry Norton: Building (Production) Semanti...eswcsummerschool
 
Hunting for anglerfish in datalakes
Hunting for anglerfish in datalakesHunting for anglerfish in datalakes
Hunting for anglerfish in datalakes
Dominic Egger
 
Nltk natural language toolkit overview and application @ PyCon.tw 2012
Nltk  natural language toolkit overview and application @ PyCon.tw 2012Nltk  natural language toolkit overview and application @ PyCon.tw 2012
Nltk natural language toolkit overview and application @ PyCon.tw 2012
Jimmy Lai
 
groovy & grails - lecture 1
groovy & grails - lecture 1groovy & grails - lecture 1
groovy & grails - lecture 1
Alexandre Masselot
 
Deep Learning for NLP: An Introduction to Neural Word Embeddings
Deep Learning for NLP: An Introduction to Neural Word EmbeddingsDeep Learning for NLP: An Introduction to Neural Word Embeddings
Deep Learning for NLP: An Introduction to Neural Word Embeddings
Roelof Pieters
 
Are High Level Programming Languages for Multicore and Safety Critical Conver...
Are High Level Programming Languages for Multicore and Safety Critical Conver...Are High Level Programming Languages for Multicore and Safety Critical Conver...
Are High Level Programming Languages for Multicore and Safety Critical Conver...
InfinIT - Innovationsnetværket for it
 
Non equilibrium Molecular Simulations of Polymers under Flow Saving Energy th...
Non equilibrium Molecular Simulations of Polymers under Flow Saving Energy th...Non equilibrium Molecular Simulations of Polymers under Flow Saving Energy th...
Non equilibrium Molecular Simulations of Polymers under Flow Saving Energy th...
ORAU
 
Next-generation sequencing data format and visualization with ngs.plot 2015
Next-generation sequencing data format and visualization with ngs.plot 2015Next-generation sequencing data format and visualization with ngs.plot 2015
Next-generation sequencing data format and visualization with ngs.plot 2015
Li Shen
 
Anton Dorfman - Reversing data formats what data can reveal
Anton Dorfman - Reversing data formats what data can revealAnton Dorfman - Reversing data formats what data can reveal
Anton Dorfman - Reversing data formats what data can revealDefconRussia
 
CoreML for NLP (Melb Cocoaheads 08/02/2018)
CoreML for NLP (Melb Cocoaheads 08/02/2018)CoreML for NLP (Melb Cocoaheads 08/02/2018)
CoreML for NLP (Melb Cocoaheads 08/02/2018)
Hon Weng Chong
 
Why you should care about data layout in the file system with Cheng Lian and ...
Why you should care about data layout in the file system with Cheng Lian and ...Why you should care about data layout in the file system with Cheng Lian and ...
Why you should care about data layout in the file system with Cheng Lian and ...
Databricks
 
Galene - LinkedIn's Search Architecture: Presented by Diego Buthay & Sriram S...
Galene - LinkedIn's Search Architecture: Presented by Diego Buthay & Sriram S...Galene - LinkedIn's Search Architecture: Presented by Diego Buthay & Sriram S...
Galene - LinkedIn's Search Architecture: Presented by Diego Buthay & Sriram S...
Lucidworks
 
The openCypher Project - An Open Graph Query Language
The openCypher Project - An Open Graph Query LanguageThe openCypher Project - An Open Graph Query Language
The openCypher Project - An Open Graph Query Language
Neo4j
 
NLP and Deep Learning for non_experts
NLP and Deep Learning for non_expertsNLP and Deep Learning for non_experts
NLP and Deep Learning for non_experts
Sanghamitra Deb
 

Similar to Building Named Entity Recognition Models Efficiently using NERDS (20)

SoDA v2 - Named Entity Recognition from streaming text
SoDA v2 - Named Entity Recognition from streaming textSoDA v2 - Named Entity Recognition from streaming text
SoDA v2 - Named Entity Recognition from streaming text
 
Using Static Binary Analysis To Find Vulnerabilities And Backdoors in Firmware
Using Static Binary Analysis To Find Vulnerabilities And Backdoors in FirmwareUsing Static Binary Analysis To Find Vulnerabilities And Backdoors in Firmware
Using Static Binary Analysis To Find Vulnerabilities And Backdoors in Firmware
 
Евгений Бобров "Powered by OSS. Масштабируемая потоковая обработка и анализ б...
Евгений Бобров "Powered by OSS. Масштабируемая потоковая обработка и анализ б...Евгений Бобров "Powered by OSS. Масштабируемая потоковая обработка и анализ б...
Евгений Бобров "Powered by OSS. Масштабируемая потоковая обработка и анализ б...
 
An Introduction to NLP4L
An Introduction to NLP4LAn Introduction to NLP4L
An Introduction to NLP4L
 
Grammarly AI-NLP Club #6 - Sequence Tagging using Neural Networks - Artem Che...
Grammarly AI-NLP Club #6 - Sequence Tagging using Neural Networks - Artem Che...Grammarly AI-NLP Club #6 - Sequence Tagging using Neural Networks - Artem Che...
Grammarly AI-NLP Club #6 - Sequence Tagging using Neural Networks - Artem Che...
 
Elasticsearch Basics
Elasticsearch BasicsElasticsearch Basics
Elasticsearch Basics
 
ESWC SS 2012 - Wednesday Tutorial Barry Norton: Building (Production) Semanti...
ESWC SS 2012 - Wednesday Tutorial Barry Norton: Building (Production) Semanti...ESWC SS 2012 - Wednesday Tutorial Barry Norton: Building (Production) Semanti...
ESWC SS 2012 - Wednesday Tutorial Barry Norton: Building (Production) Semanti...
 
Hunting for anglerfish in datalakes
Hunting for anglerfish in datalakesHunting for anglerfish in datalakes
Hunting for anglerfish in datalakes
 
Nltk natural language toolkit overview and application @ PyCon.tw 2012
Nltk  natural language toolkit overview and application @ PyCon.tw 2012Nltk  natural language toolkit overview and application @ PyCon.tw 2012
Nltk natural language toolkit overview and application @ PyCon.tw 2012
 
groovy & grails - lecture 1
groovy & grails - lecture 1groovy & grails - lecture 1
groovy & grails - lecture 1
 
Deep Learning for NLP: An Introduction to Neural Word Embeddings
Deep Learning for NLP: An Introduction to Neural Word EmbeddingsDeep Learning for NLP: An Introduction to Neural Word Embeddings
Deep Learning for NLP: An Introduction to Neural Word Embeddings
 
Are High Level Programming Languages for Multicore and Safety Critical Conver...
Are High Level Programming Languages for Multicore and Safety Critical Conver...Are High Level Programming Languages for Multicore and Safety Critical Conver...
Are High Level Programming Languages for Multicore and Safety Critical Conver...
 
Non equilibrium Molecular Simulations of Polymers under Flow Saving Energy th...
Non equilibrium Molecular Simulations of Polymers under Flow Saving Energy th...Non equilibrium Molecular Simulations of Polymers under Flow Saving Energy th...
Non equilibrium Molecular Simulations of Polymers under Flow Saving Energy th...
 
Next-generation sequencing data format and visualization with ngs.plot 2015
Next-generation sequencing data format and visualization with ngs.plot 2015Next-generation sequencing data format and visualization with ngs.plot 2015
Next-generation sequencing data format and visualization with ngs.plot 2015
 
Anton Dorfman - Reversing data formats what data can reveal
Anton Dorfman - Reversing data formats what data can revealAnton Dorfman - Reversing data formats what data can reveal
Anton Dorfman - Reversing data formats what data can reveal
 
CoreML for NLP (Melb Cocoaheads 08/02/2018)
CoreML for NLP (Melb Cocoaheads 08/02/2018)CoreML for NLP (Melb Cocoaheads 08/02/2018)
CoreML for NLP (Melb Cocoaheads 08/02/2018)
 
Why you should care about data layout in the file system with Cheng Lian and ...
Why you should care about data layout in the file system with Cheng Lian and ...Why you should care about data layout in the file system with Cheng Lian and ...
Why you should care about data layout in the file system with Cheng Lian and ...
 
Galene - LinkedIn's Search Architecture: Presented by Diego Buthay & Sriram S...
Galene - LinkedIn's Search Architecture: Presented by Diego Buthay & Sriram S...Galene - LinkedIn's Search Architecture: Presented by Diego Buthay & Sriram S...
Galene - LinkedIn's Search Architecture: Presented by Diego Buthay & Sriram S...
 
The openCypher Project - An Open Graph Query Language
The openCypher Project - An Open Graph Query LanguageThe openCypher Project - An Open Graph Query Language
The openCypher Project - An Open Graph Query Language
 
NLP and Deep Learning for non_experts
NLP and Deep Learning for non_expertsNLP and Deep Learning for non_experts
NLP and Deep Learning for non_experts
 

More from Sujit Pal

Supporting Concept Search using a Clinical Healthcare Knowledge Graph
Supporting Concept Search using a Clinical Healthcare Knowledge GraphSupporting Concept Search using a Clinical Healthcare Knowledge Graph
Supporting Concept Search using a Clinical Healthcare Knowledge Graph
Sujit Pal
 
Google AI Hackathon: LLM based Evaluator for RAG
Google AI Hackathon: LLM based Evaluator for RAGGoogle AI Hackathon: LLM based Evaluator for RAG
Google AI Hackathon: LLM based Evaluator for RAG
Sujit Pal
 
Building Learning to Rank (LTR) search reranking models using Large Language ...
Building Learning to Rank (LTR) search reranking models using Large Language ...Building Learning to Rank (LTR) search reranking models using Large Language ...
Building Learning to Rank (LTR) search reranking models using Large Language ...
Sujit Pal
 
Cheap Trick for Question Answering
Cheap Trick for Question AnsweringCheap Trick for Question Answering
Cheap Trick for Question Answering
Sujit Pal
 
Searching Across Images and Test
Searching Across Images and TestSearching Across Images and Test
Searching Across Images and Test
Sujit Pal
 
Learning a Joint Embedding Representation for Image Search using Self-supervi...
Learning a Joint Embedding Representation for Image Search using Self-supervi...Learning a Joint Embedding Representation for Image Search using Self-supervi...
Learning a Joint Embedding Representation for Image Search using Self-supervi...
Sujit Pal
 
The power of community: training a Transformer Language Model on a shoestring
The power of community: training a Transformer Language Model on a shoestringThe power of community: training a Transformer Language Model on a shoestring
The power of community: training a Transformer Language Model on a shoestring
Sujit Pal
 
Backprop Visualization
Backprop VisualizationBackprop Visualization
Backprop Visualization
Sujit Pal
 
Accelerating NLP with Dask and Saturn Cloud
Accelerating NLP with Dask and Saturn CloudAccelerating NLP with Dask and Saturn Cloud
Accelerating NLP with Dask and Saturn Cloud
Sujit Pal
 
Accelerating NLP with Dask on Saturn Cloud: A case study with CORD-19
Accelerating NLP with Dask on Saturn Cloud: A case study with CORD-19Accelerating NLP with Dask on Saturn Cloud: A case study with CORD-19
Accelerating NLP with Dask on Saturn Cloud: A case study with CORD-19
Sujit Pal
 
Leslie Smith's Papers discussion for DL Journal Club
Leslie Smith's Papers discussion for DL Journal ClubLeslie Smith's Papers discussion for DL Journal Club
Leslie Smith's Papers discussion for DL Journal Club
Sujit Pal
 
Using Graph and Transformer Embeddings for Vector Based Retrieval
Using Graph and Transformer Embeddings for Vector Based RetrievalUsing Graph and Transformer Embeddings for Vector Based Retrieval
Using Graph and Transformer Embeddings for Vector Based Retrieval
Sujit Pal
 
Transformer Mods for Document Length Inputs
Transformer Mods for Document Length InputsTransformer Mods for Document Length Inputs
Transformer Mods for Document Length Inputs
Sujit Pal
 
Question Answering as Search - the Anserini Pipeline and Other Stories
Question Answering as Search - the Anserini Pipeline and Other StoriesQuestion Answering as Search - the Anserini Pipeline and Other Stories
Question Answering as Search - the Anserini Pipeline and Other Stories
Sujit Pal
 
Graph Techniques for Natural Language Processing
Graph Techniques for Natural Language ProcessingGraph Techniques for Natural Language Processing
Graph Techniques for Natural Language Processing
Sujit Pal
 
Learning to Rank Presentation (v2) at LexisNexis Search Guild
Learning to Rank Presentation (v2) at LexisNexis Search GuildLearning to Rank Presentation (v2) at LexisNexis Search Guild
Learning to Rank Presentation (v2) at LexisNexis Search Guild
Sujit Pal
 
Search summit-2018-ltr-presentation
Search summit-2018-ltr-presentationSearch summit-2018-ltr-presentation
Search summit-2018-ltr-presentation
Sujit Pal
 
Search summit-2018-content-engineering-slides
Search summit-2018-content-engineering-slidesSearch summit-2018-content-engineering-slides
Search summit-2018-content-engineering-slides
Sujit Pal
 
Evolving a Medical Image Similarity Search
Evolving a Medical Image Similarity SearchEvolving a Medical Image Similarity Search
Evolving a Medical Image Similarity Search
Sujit Pal
 
Embed, Encode, Attend, Predict – applying the 4 step NLP recipe for text clas...
Embed, Encode, Attend, Predict – applying the 4 step NLP recipe for text clas...Embed, Encode, Attend, Predict – applying the 4 step NLP recipe for text clas...
Embed, Encode, Attend, Predict – applying the 4 step NLP recipe for text clas...
Sujit Pal
 

Building Named Entity Recognition Models Efficiently using NERDS

  • 1. Building Named Entity Recognition Models efficiently using NERDS. Sujit Pal, Elsevier Labs, December 2019.
  • 2. About me
      • Work at Elsevier Labs.
      • (Mostly self-taught) data scientist.
      • Mostly work with Deep Learning, Machine Learning, Natural Language Processing, and Search.
      • Got interested in Named Entity Recognition (NER) and NERDS as part of Search and Knowledge Graph development.
      • I am NOT the author or maintainer of NERDS!
          • Originally built by Panagiotis Eustratiadis.
          • See CONTRIBUTORS.md for list of contributors.
          • Open sourced by Elsevier July 3, 2018.
  • 3. Agenda
      • What can NER do for you?
      • Evolution of NER techniques.
      • NERDS Architecture.
      • NERDS Usage.
      • Future Work.
  • 5. What can NER do for you?
      • In general…
          • Foundational task for NLP pipelines.
          • Good NERs available OOB for “standard” named entities.
          • Topic Modeling, Co-reference Resolution, etc.
      • Information Retrieval (IR)
          • Chunk Entities into meaningful multi-word phrases.
          • Understanding query intent.
      • Automated Knowledge Graph Construction (AKBC)
          • NER extracts entities from incoming text.
          • Relationship Extraction extracts relationships between entity pairs.
          • Entity Relationship triple inserted into Knowledge Graph.
      • ConceptSearch!
  • 7. Evolution of NER Techniques
      • Traditional
          • Rules
          • Regular Expressions
          • Gazetteers
      • Statistical
          • Word-based models: PMI, log-likelihood.
          • Sequence models: Conditional Random Fields.
      • Neural
          • Bi-LSTM
          • Bi-LSTM+CRF
          • Transformer-based models
  • 8. Input Format – BIO Tagging
      • BIO: Begin, In, Out.
      • Example: Barack/B-PER Obama/I-PER is/O 44th/O United/B-LOC States/I-LOC President/O ./O
      • BILOU: a tagging variant with two extra tags:
          • U: Unit token (for single-token entities).
          • L: Last token in sequence, e.g. Barack/B-PER Obama/L-PER.
      • Column form, one token and tag per line: Barack B-PER, Obama I-PER, is O, 44th O, United B-LOC, States I-LOC, President O, . O
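The relationship between the two tagging schemes can be made concrete with a small converter. This is a minimal sketch, not part of NERDS; the function name `bio_to_bilou` is hypothetical, and it assumes well-formed BIO input for a single sentence.

```python
def bio_to_bilou(tags):
    """Convert one sentence of BIO tags to BILOU tags."""
    bilou = []
    for i, tag in enumerate(tags):
        if tag == "O":
            bilou.append(tag)
            continue
        prefix, label = tag.split("-", 1)
        # The entity continues only if the next tag is I- with the same label.
        next_tag = tags[i + 1] if i + 1 < len(tags) else "O"
        continues = next_tag == "I-" + label
        if prefix == "B":
            # A B- tag with no continuation is a single-token entity (U).
            bilou.append(("B-" if continues else "U-") + label)
        else:
            # An I- tag with no continuation closes the entity (L).
            bilou.append(("I-" if continues else "L-") + label)
    return bilou


tags = ["B-PER", "I-PER", "O", "O", "B-LOC", "I-LOC", "O", "O"]
print(bio_to_bilou(tags))
# ['B-PER', 'L-PER', 'O', 'O', 'B-LOC', 'L-LOC', 'O', 'O']
```

The conversion only needs a one-token lookahead, which is why the two schemes carry the same information.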
  • 9. Gazetteer – Aho Corasick
      • Create in-memory data structure from dictionary.
      • Stream content against data structure.
      • Multiple matches with single pass.
      • [Figure: automaton with states 0 to 5, transitions on Barack, Obama, United, States, Airlines, and NOT(Barack, United), and outputs PER, LOC, ORG.]
      • Reference: Aho, A.V., and Corasick, M.J., 1975. Efficient String Matching: An Aid to Bibliographic Search.
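The three steps on this slide (build the structure, stream the content, collect all matches in one pass) can be sketched in pure Python. This is an illustrative token-level implementation, not the pyAhoCorasick library that NERDS wraps; the function names are hypothetical, and the "United Airlines" entry is an assumption read off the figure labels.

```python
from collections import deque


def build_automaton(gazetteer):
    """Build goto/fail/output tables from a {phrase: label} dictionary."""
    goto = [{}]    # goto[state][token] -> next state
    fail = [0]     # fail[state] -> fallback state on mismatch
    output = [[]]  # output[state] -> list of (phrase_length, label)
    for phrase, label in gazetteer.items():
        tokens = phrase.split()
        state = 0
        for token in tokens:
            if token not in goto[state]:
                goto.append({})
                fail.append(0)
                output.append([])
                goto[state][token] = len(goto) - 1
            state = goto[state][token]
        output[state].append((len(tokens), label))
    # Breadth-first pass to fill in the failure links.
    queue = deque(goto[0].values())
    while queue:
        state = queue.popleft()
        for token, nxt in goto[state].items():
            queue.append(nxt)
            fallback = fail[state]
            while fallback and token not in goto[fallback]:
                fallback = fail[fallback]
            fail[nxt] = goto[fallback].get(token, 0)
            output[nxt] = output[nxt] + output[fail[nxt]]
    return goto, fail, output


def find_entities(tokens, automaton):
    """Stream tokens through the automaton; return (start, end, label) token spans."""
    goto, fail, output = automaton
    state, matches = 0, []
    for i, token in enumerate(tokens):
        while state and token not in goto[state]:
            state = fail[state]
        state = goto[state].get(token, 0)
        for length, label in output[state]:
            matches.append((i - length + 1, i + 1, label))
    return matches


gazetteer = {"Barack Obama": "PER", "United States": "LOC", "United Airlines": "ORG"}
automaton = build_automaton(gazetteer)
tokens = "Barack Obama is 44th United States President .".split()
print(find_entities(tokens, automaton))
# [(0, 2, 'PER'), (4, 6, 'LOC')]
```

Spans are token indices with an exclusive end. The failure links are what allow every dictionary entry to be matched in a single left-to-right pass over the text.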
  • 10. Sequence Modeling - CRF
      • Sequence version of logistic regression.
      • Computes optimum labeling l = (y_0, …, y_n) over entire sentence s.
      • Build multiple feature functions f_j on each token, each returning a real value in the range 0..1. Function parameters:
          • sentence s with tokens (x_0, …, x_n): a feature can use any token, the entire sentence, or functions computed over the sentence (POS),
          • current position i,
          • previous and next labels y_{i-1} and y_{i+1}.
      • Optimum labeling computed from the weighted sum of the feature functions over the sentence; probability computed using softmax.
      • Weights w_j learned using gradient descent.
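The slide's equation did not survive extraction. As a hedged reconstruction, the usual linear-chain CRF formulation is sketched below. It assumes m feature functions f_j with weights w_j, and it conditions each feature on the current label y_i and the previous label y_{i-1}, which is the common textbook form and may differ from the slide's own equation.

```latex
\begin{align*}
\mathrm{score}(l \mid s) &= \sum_{j=1}^{m} \sum_{i=0}^{n} w_j \, f_j(s, i, y_{i-1}, y_i) \\
p(l \mid s) &= \frac{\exp\big(\mathrm{score}(l \mid s)\big)}{\sum_{l'} \exp\big(\mathrm{score}(l' \mid s)\big)} \\
l^{*} &= \arg\max_{l} \; p(l \mid s)
\end{align*}
```

The denominator sums over every possible labeling l' of the sentence, which is the softmax mentioned on the slide. In practice the sum and the argmax are computed with dynamic programming rather than by enumeration.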
  • 11. Neural Model - BiLSTM
      • Input is sequence of tokens, output is sequence of BIO tags.
      • Weights trained end-to-end, no feature engineering needed.
      • Bidirectional LSTM gets signal from neighboring words on both sides.
      • [Figure: Bi-LSTM mapping the tokens "Barack Obama is 44th United States President ." to the tags B-PER I-PER O O B-LOC I-LOC O O.]
  • 12. Neural Model – BiLSTM-CRF
      • Same as previous model, with additional CRF layer.
      • No feature engineering for CRF, unlike CRF-only NER model.
      • Pre-trained embeddings observed to improve performance.
      • [Figure: Bi-LSTM followed by a CRF layer, mapping the tokens "Barack Obama is 44th United States President ." to the tags B-PER I-PER O O B-LOC I-LOC O O.]
  • 13. Neural Model – adding char embeddings
      • Concatenate char embedding + word embedding and feed to Bi-LSTM-CRF.
      • All weights learned end-to-end.
      • Handles rare / unknown words; exploits signal in prefix/suffix.
      • [Figure: word embeddings and char LSTM/CNN output are concatenated and fed to the Bi-LSTM-CRF, which tags "Barack Obama is 44th United States President ." as B-PER I-PER O O B-LOC I-LOC O O.]
  • 14. Neural Model – ELMo preprocessing
      • [Figure: contextualized word embeddings and char LSTM/CNN output are concatenated and fed to the Bi-LSTM-CRF, which tags "Barack Obama is 44th United States President ." as B-PER I-PER O O B-LOC I-LOC O O.]
  • 15. Neural Model – Transformer based
      • BERT = Bidirectional Encoder Representations from Transformers.
      • Use as a source of embeddings, similar to ELMo, in standard BiLSTM + CRF models, OR
      • Fine-tune LM-backed NERs such as HuggingFace’s BertForTokenClassification.
      • [Figure: the tokens "[CLS] Barack Obama is 44th United States President ." tagged B-PER I-PER O O B-LOC I-LOC O O.]
  • 16. More Info on NER Techniques
      • High level overview of NER in a series of blog posts by Tobias Sterbak (https://bit.ly/2pNdgPG).
      • Traditional NER techniques covered in paper by Rahul Sharnagat (2014), Named Entity Recognition: A Literature Survey (https://bit.ly/2NRaCAg).
      • Introduction to Neural Models in paper by Ronan Collobert and Jason Weston (2008), A Unified Architecture for Natural Language Processing: Deep Neural Networks with Multitask Learning (https://bit.ly/32rRYnO).
      • Others (more modern papers) mentioned in slides.
  • 18. NERDS Overview
      • Framework that provides easy to use NER capabilities to Data Scientists.
      • Wraps various popular third party NER models.
      • Extendable, new third party NER tools can be added as needed.
      • Software Engineering tooling to boost Data Science productivity.
      • Looking for support, bug reports, contributions, and ideas.
  • 19. Unification through I/O Format
      • Wrapped models: pyAhoCorasick, CRFSuite, SpaCy NER, Anago BiLSTM.
      • Common format: AnnotatedDocument(doc: Document("Barack Obama is 44th United States President ."), annotations: [Annotation(start_offset: 0, end_offset: 12, text: "Barack Obama", label: "PER"), Annotation(start_offset: 21, end_offset: 34, text: "United States", label: "LOC")])
  • 20. Benefits of Unification
      • Consistent API: all models are subclasses of NERModel.
      • Data prep done once per project and reused across multiple models.
      • Reusable Training and Evaluation code.
      • Familiar Scikit-Learn like API, and access to Scikit-Learn utility functions.
      • Duck-typing allows us to build Ensembles of NER.
      • Easy to benchmark NER label data.
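A short sketch illustrates why a shared fit/predict interface makes training and evaluation code reusable. Everything here is hypothetical: `NERModelSketch`, `MajorityTagNER`, and `evaluate` are stand-ins invented for illustration, not the actual NERDS classes or their signatures.

```python
from collections import Counter


class NERModelSketch:
    """Hypothetical base class: every model exposes fit(X, y) and predict(X)."""

    def fit(self, X, y):
        raise NotImplementedError

    def predict(self, X):
        raise NotImplementedError


class MajorityTagNER(NERModelSketch):
    """Toy model: tags each token with the tag it carried most often in training."""

    def fit(self, X, y):
        counts = {}
        for tokens, tags in zip(X, y):
            for token, tag in zip(tokens, tags):
                counts.setdefault(token, Counter())[tag] += 1
        self.tag_for_ = {tok: c.most_common(1)[0][0] for tok, c in counts.items()}
        return self

    def predict(self, X):
        # Tokens never seen in training fall back to the "O" tag.
        return [[self.tag_for_.get(tok, "O") for tok in tokens] for tokens in X]


def evaluate(model, X_train, y_train, X_test, y_test):
    """Token accuracy for any object that has fit and predict (duck typing)."""
    predictions = model.fit(X_train, y_train).predict(X_test)
    pairs = [(p, t) for ps, ts in zip(predictions, y_test) for p, t in zip(ps, ts)]
    return sum(p == t for p, t in pairs) / len(pairs)


X = [["Barack", "Obama", "is", "44th", "United", "States", "President", "."]]
y = [["B-PER", "I-PER", "O", "O", "B-LOC", "I-LOC", "O", "O"]]
print(evaluate(MajorityTagNER(), X, y, X, y))
```

Because `evaluate` only relies on `fit` and `predict`, the same helper would work unchanged for any other model that follows the same convention.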
  • 21. Can we do better?
      • Data: [["Barack", "Obama", "is", "44th", "United", "States", "President", "."]]
      • Labels and Predictions: [["B-PER", "I-PER", "O", "O", "B-LOC", "I-LOC", "O", "O"]]
      • [Figure: DictionaryNER, SpacyNER, CrfNER, and BiLstmCrfNER around the shared token/tag format, with two I/O Convert steps.]
  • 22. ELMo NER Model from Anago
      • Data: [["Barack", "Obama", "is", "44th", "United", "States", "President", "."]]
      • Labels and Predictions: [["B-PER", "I-PER", "O", "O", "B-LOC", "I-LOC", "O", "O"]]
      • [Figure: DictionaryNER, CrfNER, SpacyNER, BiLstmCrfNER, and the new ElmoNER around the shared token/tag format, with two I/O Convert steps.]
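An I/O conversion step between the token/tag format and span-style annotations might look like the sketch below. The function name `bio_to_annotations` and the tuple layout are hypothetical, not the NERDS converter. It assumes tokens are joined by single spaces and that offsets are 0-based with an exclusive end.

```python
def bio_to_annotations(tokens, tags):
    """Turn parallel token/BIO-tag lists into (start, end, text, label) tuples."""
    spans = []
    offset = 0
    current = None  # [start, end, label] of the entity being built
    for token, tag in zip(tokens, tags):
        start, end = offset, offset + len(token)
        if tag.startswith("I-") and current and current[2] == tag[2:]:
            current[1] = end  # extend the open entity
        else:
            if current:
                spans.append(current)
            # Any non-O tag here opens a new entity (a stray I- is treated as B-).
            current = [start, end, tag[2:]] if tag != "O" else None
        offset = end + 1  # skip the joining space
    if current:
        spans.append(current)
    text = " ".join(tokens)
    return [(s, e, text[s:e], label) for s, e, label in spans]


tokens = ["Barack", "Obama", "is", "44th", "United", "States", "President", "."]
tags = ["B-PER", "I-PER", "O", "O", "B-LOC", "I-LOC", "O", "O"]
for annotation in bio_to_annotations(tokens, tags):
    print(annotation)
```

Slicing the joined text with the computed offsets is a cheap consistency check: each returned text must equal the entity's tokens joined by spaces.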
  • 24. Dataset
      • Bio Entity recognition task from BioNLP 2004.
      • Training and Test sets provided in BIO format.
          • 511,097 training examples.
          • 104,895 test examples.
      • Entity Distribution (training set)
          • 25,307 DNA
          • 2,481 RNA
          • 11,217 cell_line
          • 15,466 cell_type
          • 55,117 protein
  • 25. Dictionary NER
      • Wraps pyAhoCorasick Automaton.
      • Improvements in this fork:
          • Supports dictionary loading as well as fit(X, y) like other NER models.
          • Handles multiple entity classes.
  • 27. CRF NER
      • Wraps sklearn_crfsuite.CRF.
      • Improvements in this fork:
          • Removes NLTK dependency, replaces with SpaCy.
          • Allows non-default features to be passed in.
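CRFsuite-style models take one feature dictionary per token, so "non-default features" amount to a custom function that builds those dictionaries. The sketch below shows what such a function might look like. The function names and the chosen features are hypothetical, and how NERDS actually accepts custom features is not shown in the slides.

```python
def token_features(tokens, i):
    """Hypothetical feature dict for token i of a sentence."""
    token = tokens[i]
    features = {
        "bias": 1.0,
        "lower": token.lower(),
        "suffix3": token[-3:],
        "is_title": token.istitle(),
        "is_digit": token.isdigit(),
    }
    # Context features: neighboring tokens, or sentence boundary markers.
    if i > 0:
        features["prev_lower"] = tokens[i - 1].lower()
    else:
        features["BOS"] = True
    if i < len(tokens) - 1:
        features["next_lower"] = tokens[i + 1].lower()
    else:
        features["EOS"] = True
    return features


def sentence_features(tokens):
    """One feature dict per token, in sentence order."""
    return [token_features(tokens, i) for i in range(len(tokens))]


tokens = ["Barack", "Obama", "is", "44th", "United", "States", "President", "."]
features = sentence_features(tokens)
print(features[0])
```

Features of this kind (casing, suffixes, neighbors) are the feature engineering that the neural models on the earlier slides avoid.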
  • 29. SpaCy NER
      • Wraps NER provided by SpaCy toolkit.
      • Improvements in this fork:
          • More robust to large data sizes, uses mini-batches for training.
  • 31. BiLSTM CRF NER
      • Wraps Anago BiLSTMCRF.
      • Improvements in this fork:
          • Works against latest release (1.0.5) of Anago.
          • No more intermittent failures due to time step mismatches.
  • 33. Elmo NER
      • Wraps Anago ELModel.
      • New in this fork, available in current (dev) version of Anago.
      • Needs (mandatory) base embedding for ELMo preprocessor.
  • 35. Ensemble NER
      • Max Voting.
      • Improvements in this fork:
          • Unifies Max Voting and Weighted Max Voting NERs into single model.
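Max voting and weighted max voting can share one code path if unweighted voting is treated as the case where every model has weight 1. The sketch below illustrates that idea for a single sentence. The function `vote` and the three example prediction lists are hypothetical, not the NERDS ensemble implementation or real model output.

```python
from collections import defaultdict


def vote(predictions, weights=None):
    """Combine per-model tag sequences for one sentence by (weighted) max voting."""
    weights = weights or [1.0] * len(predictions)
    combined = []
    for token_tags in zip(*predictions):
        scores = defaultdict(float)
        for tag, weight in zip(token_tags, weights):
            scores[tag] += weight
        # max() keeps the first tag that reaches the top score,
        # so ties are resolved in favor of earlier models.
        combined.append(max(scores, key=scores.get))
    return combined


crf_tags = ["B-PER", "I-PER", "O", "O", "B-LOC", "I-LOC", "O", "O"]
spacy_tags = ["B-PER", "I-PER", "O", "O", "B-LOC", "O", "O", "O"]
dict_tags = ["B-PER", "O", "O", "O", "B-LOC", "I-LOC", "O", "O"]

print(vote([crf_tags, spacy_tags, dict_tags]))
print(vote([crf_tags, spacy_tags, dict_tags], weights=[1.0, 1.0, 3.0]))
```

Voting token by token can produce tag sequences that are not valid BIO (for example an I- tag after O), so a real ensemble may need a repair step afterwards.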
  • 37. Results (OOTB)
      • Comparison across models
          • ELMo based CRF has best performance.
          • SpaCy and BiLSTM have comparable performance, but CRF is competitive.
          • Model based NERs outperform gazetteers.
          • F1-scores range from 0.65 to 0.80.
      • Comparison across entity types
          • Some correlation observed between data volume and F1-scores for other models.
          • F1-scores range from 0.61 to 0.81.
  • 39. Future Work
      • Current API is only superficially Scikit-Learn like; convert models to fully conform to the Scikit-Learn Classifier API.
      • Eliminate serialization issues reported by joblib.Parallel.
      • Eliminate EnsembleNER in favor of Scikit-Learn’s VotingClassifier.
      • Leverage Scikit-Learn’s Model Selection classes (RandomizedSearchCV and GridSearchCV).
      • Add FLAIR and BERT based NER to supported model collection.
      • BRAT annotation adapter.