Introduction to seq2seq (sequence-to-sequence) and RNN, by Hye-min Ahn
These are my slides introducing the sequence-to-sequence model and the Recurrent Neural Network (RNN) to my laboratory colleagues.
Hyemin Ahn, @CPSLAB, Seoul National University (SNU)
Word embedding, vector space model, language modelling, neural language model, Word2Vec, GloVe, fastText, ELMo, BERT, DistilBERT, RoBERTa, SBERT, Transformer, attention
This is material I prepared for our lab study group on the "Transformer", which underlies recent NLP x deep learning research. I have tried to cite references accurately, but please point out any errors.
The document discusses parts-of-speech (POS) tagging. It defines POS tagging as labeling each word in a sentence with its appropriate part of speech. It provides an example tagged sentence and discusses the challenges of POS tagging, including ambiguity and open/closed word classes. It also discusses common tag sets and stochastic POS tagging using hidden Markov models.
The Transformer is an established architecture in natural language processing that combines a self-attention framework with a deep learning approach.
This presentation was delivered under the mentorship of Mr. Mukunthan Tharmakulasingam (University of Surrey, UK), as part of the ScholarX program of the Sustainable Education Foundation.
Natural language processing and transformer models, by Ding Li
The document discusses several approaches for text classification using machine learning algorithms:
1. Count the frequency of individual words in tweets and sum the counts per tweet to create feature vectors for classification models such as regression. However, this loses some word-context information.
2. Use Bayes' rule and calculate word probabilities conditioned on class to perform naive Bayes classification; Laplacian smoothing handles zero probabilities (see the sketch after this list).
3. Incorporate word n-grams and context by calculating word probabilities within n-gram contexts rather than independently. This captures more linguistic information than the first two approaches.
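To make approach 2 concrete, here is a minimal Python sketch of naive Bayes classification with Laplacian (add-one) smoothing; the tiny corpus and labels are invented for illustration and are not from the document.

```python
# Naive Bayes tweet classification with Laplacian smoothing (toy sketch).
import math
from collections import Counter

train = [("great game today", "pos"),
         ("happy with the win", "pos"),
         ("sad loss again", "neg"),
         ("terrible sad game", "neg")]

counts = {"pos": Counter(), "neg": Counter()}
for text, label in train:
    counts[label].update(text.split())

vocab = {w for c in counts.values() for w in c}

def word_prob(word, label, alpha=1.0):
    """P(word | label) with add-alpha (Laplacian) smoothing."""
    c = counts[label]
    return (c[word] + alpha) / (sum(c.values()) + alpha * len(vocab))

def classify(text):
    # Sum log word probabilities per class (uniform priors for simplicity).
    scores = {label: sum(math.log(word_prob(w, label)) for w in text.split())
              for label in counts}
    return max(scores, key=scores.get)

print(classify("sad game"))  # -> neg
```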
Introduction to natural language processing, by Minh Pham
This document provides an introduction to natural language processing (NLP). It discusses what NLP is, why NLP is a difficult problem, the history of NLP, fundamental NLP tasks like word segmentation, part-of-speech tagging, syntactic analysis and semantic analysis, and applications of NLP like information retrieval, question answering, text summarization and machine translation. The document aims to give readers an overview of the key concepts and challenges in the field of natural language processing.
Financial Question Answering with BERT Language Models, by Bithiah Yuan
The document presents research on using pre-trained BERT language models for financial question answering (QA). It proposes several BERT models for financial QA, including further pre-training BERT on financial text and transferring a BERT model pre-trained on a large general QA task. Experiments found that transferring a model pre-trained on a much larger general QA task achieved the best performance, outperforming further pre-training on financial data.
BERT is a deeply bidirectional, unsupervised language representation model pre-trained using only plain text. It is the first model to use a bidirectional Transformer for pre-training. BERT learns representations from both left and right contexts within text, unlike previous models like ELMo which use independently trained left-to-right and right-to-left LSTMs. BERT was pre-trained on two large text corpora using masked language modeling and next sentence prediction tasks. It establishes new state-of-the-art results on a wide range of natural language understanding benchmarks.
The document discusses the BERT model for natural language processing. It begins with an introduction to BERT and how it achieved state-of-the-art results on 11 NLP tasks in 2018. The document then covers related work on language representation models including ELMo and GPT. It describes the key aspects of the BERT model, including its bidirectional Transformer architecture, pre-training using masked language modeling and next sentence prediction, and fine-tuning for downstream tasks. Experimental results are presented showing BERT outperforming previous models on the GLUE benchmark, SQuAD 1.1, SQuAD 2.0, and SWAG. Ablation studies examine the importance of the pre-training tasks and the effect of model size.
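As a concrete illustration of the masked-language-modeling objective described above, here is a short sketch using the Hugging Face transformers library (an assumed toolkit; neither summarized document prescribes one):

```python
# Probe BERT's masked-language-model pre-training objective: the model
# predicts the token hidden behind [MASK] from both left and right context.
from transformers import pipeline

fill = pipeline("fill-mask", model="bert-base-uncased")
for candidate in fill("The capital of France is [MASK]."):
    print(candidate["token_str"], round(candidate["score"], 3))
```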
This document provides an overview of natural language processing (NLP). It discusses topics like natural language understanding, text categorization, syntactic analysis including parsing and part-of-speech tagging, semantic analysis, and pragmatic analysis. It also covers corpus-based statistical approaches to NLP, measuring performance, and supervised learning methods. The document outlines challenges in NLP like ambiguity and knowledge representation.
The document discusses the theory-of-computation topics of undecidability and recursive and non-recursive languages. It defines recursive, recursively enumerable (RE), and non-RE languages, and provides examples. Recursive languages are decided by a Turing machine that halts on all inputs. For RE languages, a Turing machine accepts every string in the language but may not halt on strings not in the language. Non-RE languages have no Turing machine to enumerate them. The document also discusses Turing machine encodings, universal Turing machines, and reductions between decision problems.
Introduction to natural language processing (NLP), by Alia Hamwi
The document provides an introduction to natural language processing (NLP). It defines NLP as a field of artificial intelligence devoted to creating computers that can use natural language as input and output. Some key NLP applications mentioned include data analysis of user-generated content, conversational agents, translation, classification, information retrieval, and summarization. The document also discusses various linguistic levels of analysis like phonology, morphology, syntax, and semantics that involve ambiguity challenges. Common NLP tasks like part-of-speech tagging, named entity recognition, parsing, and information extraction are described. Finally, the document outlines the typical steps in an NLP pipeline including data collection, text cleaning, preprocessing, feature engineering, modeling and evaluation.
This document discusses attention mechanisms in deep learning models. It covers attention in sequence models like recurrent neural networks (RNNs) and neural machine translation. It also discusses attention in convolutional neural network (CNN) based models, including spatial transformer networks which allow spatial transformations of feature maps. The document notes that spatial transformer networks have achieved state-of-the-art results on image classification tasks and fine-grained visual recognition. It provides an overview of the localisation network, parameterised sampling grid, and differentiable image sampling components of spatial transformer networks.
This lecture provides students with an introduction to natural language processing, with a specific focus on the basics of two applications: vector semantics and text classification.
(Lecture at the QUARTZ PhD Winter School, http://www.quartz-itn.eu/training/winter-school/, in Padua, Italy, on February 12, 2018.)
This Edureka Recurrent Neural Networks tutorial will help you understand why we need Recurrent Neural Networks (RNNs) and what exactly they are. It also explains a few issues with training a Recurrent Neural Network and how to overcome those challenges using LSTMs. The last section includes an LSTM use case: predicting the next word in a sample short story (a minimal sketch of that idea follows the topic list below).
Below are the topics covered in this tutorial:
1. Why Not Feedforward Networks?
2. What Are Recurrent Neural Networks?
3. Training A Recurrent Neural Network
4. Issues With Recurrent Neural Networks - Vanishing And Exploding Gradient
5. Long Short-Term Memory Networks (LSTMs)
6. LSTM Use-Case
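A minimal sketch of topic 6, the next-word LSTM, written in PyTorch (an assumed framework; this is not the tutorial's own code, and the short story is invented):

```python
# Train a tiny word-level LSTM on a short story, then predict the word
# that follows a 3-word context.
import torch
import torch.nn as nn

story = "long ago the mice had a meeting to decide how to outwit the cat".split()
vocab = sorted(set(story))
idx = {w: i for i, w in enumerate(vocab)}

# Build (3-word context) -> (next word) training pairs.
X = torch.tensor([[idx[w] for w in story[i:i + 3]] for i in range(len(story) - 3)])
y = torch.tensor([idx[story[i + 3]] for i in range(len(story) - 3)])

class NextWordLSTM(nn.Module):
    def __init__(self, vocab_size, dim=32):
        super().__init__()
        self.emb = nn.Embedding(vocab_size, dim)
        self.lstm = nn.LSTM(dim, dim, batch_first=True)
        self.out = nn.Linear(dim, vocab_size)
    def forward(self, x):
        h, _ = self.lstm(self.emb(x))
        return self.out(h[:, -1])   # logits for the word after the context

model = NextWordLSTM(len(vocab))
opt = torch.optim.Adam(model.parameters(), lr=0.01)
loss_fn = nn.CrossEntropyLoss()
for _ in range(200):
    opt.zero_grad()
    loss = loss_fn(model(X), y)
    loss.backward()
    opt.step()

context = torch.tensor([[idx["to"], idx["outwit"], idx["the"]]])
print(vocab[model(context).argmax().item()])   # ideally: "cat"
```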
‘Big models’: the success and pitfalls of Transformer models in natural langu... (Leiden University)
Abstract: Large Language Models receive a lot of attention in the media these days. We have all experienced that generative language models of the GPT family are very fluent and can convincingly answer complex questions. But they also have their limitations and pitfalls. In this presentation I will introduce Transformer-based language models, explain the relation between BERT, GPT, and the 130 thousand other models available on https://huggingface.co. I will discuss their use and applications and why they are so powerful. Then I will point out challenges and pitfalls of Large Language Models and the consequences for our daily work and education.
Parts-of-speech can be divided into closed classes and open classes. Closed classes have a fixed set of members like prepositions, while open classes like nouns and verbs are continually changing with new words being created. Parts-of-speech tagging is the process of assigning a part-of-speech tag to each word using statistical models trained on tagged corpora. Hidden Markov Models are commonly used, where the goal is to find the most probable tag sequence given an input word sequence.
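The "most probable tag sequence" search is typically done with the Viterbi algorithm; here is a toy Python sketch in which all transition and emission probabilities are invented for illustration:

```python
# Toy Viterbi decoder for HMM POS tagging: find the most probable tag
# sequence given transition and emission probabilities.
import math

tags = ["NOUN", "VERB"]
trans = {("<s>", "NOUN"): 0.7, ("<s>", "VERB"): 0.3,
         ("NOUN", "NOUN"): 0.3, ("NOUN", "VERB"): 0.7,
         ("VERB", "NOUN"): 0.6, ("VERB", "VERB"): 0.4}
emit = {("NOUN", "dogs"): 0.6, ("VERB", "dogs"): 0.1,
        ("NOUN", "bark"): 0.2, ("VERB", "bark"): 0.7}

def viterbi(words):
    # best[t] = (log prob of the best path ending in tag t, that path)
    best = {t: (math.log(trans[("<s>", t)] * emit[(t, words[0])]), [t])
            for t in tags}
    for w in words[1:]:
        best = {t: max(
            (lp + math.log(trans[(prev, t)] * emit[(t, w)]), path + [t])
            for prev, (lp, path) in best.items())
            for t in tags}
    return max(best.values())[1]

print(viterbi(["dogs", "bark"]))   # -> ['NOUN', 'VERB']
```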
A Review of Deep Contextualized Word Representations (Peters+, 2018), by Shuntaro Yada
A brief review of the paper:
Peters, M. E., Neumann, M., Iyyer, M., Gardner, M., Clark, C., Lee, K., & Zettlemoyer, L. (2018). Deep contextualized word representations. In NAACL-HLT (pp. 2227–2237).
The document provides an introduction to word embeddings and two related techniques: Word2Vec and Word Mover's Distance. Word2Vec is an algorithm that produces word embeddings by training a neural network on a large corpus of text, with the goal of producing dense vector representations of words that encode semantic relationships. Word Mover's Distance is a method for calculating the semantic distance between documents based on the embedded word vectors, allowing comparison of documents with different words but similar meanings. The document explains these techniques and provides examples of their applications and properties.
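A sketch of both techniques using the gensim library (an assumption; the document names the methods, not a toolkit). Word Mover's Distance additionally needs an optimal-transport backend (POT or pyemd, depending on the gensim version):

```python
# Train tiny Word2Vec embeddings, then compare documents with WMD.
from gensim.models import Word2Vec

sentences = [["obama", "speaks", "to", "the", "media", "in", "illinois"],
             ["the", "president", "greets", "the", "press", "in", "chicago"],
             ["the", "band", "played", "loud", "rock", "music"]]

model = Word2Vec(sentences, vector_size=50, window=3, min_count=1, epochs=50)

# Word Mover's Distance: minimal "travel cost" between the embedded words
# of two documents; smaller means semantically closer. With a corpus this
# tiny the vectors are noisy, so treat this purely as API illustration.
print(model.wv.wmdistance(sentences[0], sentences[1]))
print(model.wv.wmdistance(sentences[0], sentences[2]))
```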
This document discusses natural language processing (NLP) and feature extraction. It explains that NLP can be used for applications like search, translation, and question answering. The document then discusses extracting features from text like paragraphs, sentences, words, parts of speech, entities, sentiment, topics, and assertions. Specific features discussed in more detail include frequency, relationships between words, language features, supervised machine learning, classifiers, encoding words, word vectors, and parse trees. Tools mentioned for NLP include Google Cloud NLP, Spacy, OpenNLP, and Stanford Core NLP.
Natural language processing (NLP) is introduced, including its definition, common steps like morphological analysis and syntactic analysis, and applications like information extraction and machine translation. Statistical NLP aims to perform statistical inference for NLP tasks. Real-world applications of NLP are discussed, such as automatic summarization, information retrieval, question answering and speech recognition. A demo of a free NLP application is presented at the end.
The document discusses attention mechanisms for encoder-decoder neural networks. It describes traditional encoder-decoder models that compress all input information into a fixed vector, which cannot encode long sentences. Attention mechanisms allow the decoder to access the entire encoded input sequence and assign weights to input elements based on their relevance to predicting the output. The core attention model uses an alignment function to calculate energy scores between the input and output, a distribution function to calculate attention weights from the energy scores, and a weighted sum to compute the context vector used by the decoder. Various alignment functions are discussed, including dot product, additive, and deep attention.
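The core computation described above (alignment energies, a softmax distribution, and a weighted-sum context vector) fits in a few lines of NumPy; this sketch uses dot-product alignment:

```python
# Dot-product attention: alignment -> softmax distribution -> context.
import numpy as np

def attention_context(query, encoder_states):
    """query: (d,); encoder_states: (T, d) -> context vector (d,)."""
    energies = encoder_states @ query              # alignment (dot product)
    weights = np.exp(energies - energies.max())    # softmax distribution
    weights /= weights.sum()
    return weights @ encoder_states                # weighted sum = context

T, d = 5, 4
rng = np.random.default_rng(0)
states = rng.normal(size=(T, d))     # encoded input sequence
query = rng.normal(size=d)           # current decoder state
print(attention_context(query, states))
```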
BERT is a pre-trained language representation model that uses the Transformer architecture. It is pre-trained using two unsupervised tasks: masked language modeling and next sentence prediction. BERT can then be fine-tuned on downstream NLP tasks like question answering and text classification. When fine-tuned on SQuAD, BERT achieved state-of-the-art results by using the output hidden states to predict the start and end positions of answers within paragraphs. Later work like RoBERTa and ALBERT improved on BERT by modifying pre-training procedures and model architectures.
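For the extractive question answering described above, a hedged sketch using the Hugging Face pipeline API with a publicly available SQuAD-fine-tuned checkpoint (not the paper's own code):

```python
# A BERT-style model fine-tuned on SQuAD predicts answer start/end
# positions within a paragraph.
from transformers import pipeline

qa = pipeline("question-answering",
              model="distilbert-base-cased-distilled-squad")
result = qa(question="What is BERT pre-trained with?",
            context="BERT is pre-trained using masked language modeling "
                    "and next sentence prediction on large text corpora.")
print(result["answer"], result["start"], result["end"], result["score"])
```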
Breaking Through The Challenges of Scalable Deep Learning for Video Analytics, by Jason Anderson
Meetup Link: https://www.meetup.com/Cognitive-Computing-Enthusiasts/events/250444108/
Recording Link: https://www.youtube.com/watch?v=4uXg1KTXdQc
When developing a machine learning system, the possibilities are limitless. However, with the recent explosion of Big Data and AI, there are more options than ever to filter through: which technologies to select, which model topologies to build, and which infrastructure to use for deployment, to name a few. We have explored these options for our faceted refinement system for video content (consisting of 100K+ videos), along with their many roadblocks. Three primary areas of focus are natural language processing, video frame sampling, and infrastructure deployment.
A Strong Object Recognition Using LBP, LTP and RLBP, by Rikki Wright
This document discusses the evolution of object-oriented technology and languages. It notes that many object-oriented languages have emerged but companies commonly use open source OO languages like Java, C++, C# and Visual Basic due to their low or no licensing costs. These languages also have readily available libraries and development resources. The history of object-oriented concepts is traced back to Simula 67 and Smalltalk in the 1960s-70s, which introduced key ideas like classes, objects, inheritance and polymorphism. Exponential growth has occurred as more systems adopt object-oriented technologies.
Big Data and Natural Language Processing, by Michel Bruley
Natural Language Processing (NLP) is the branch of computer science focused on developing systems that allow computers to communicate with people using everyday language.
Natural Language Processing - Basics / Non-Technical, by Dhruv Gohil
This document provides an overview of natural language processing (NLP) and discusses several NLP applications. It introduces NLP and how it helps computers understand human language through examples like Apple's Siri and Google Now. It then summarizes popular NLP toolkits and describes applications including text summarization, information extraction, sentiment analysis, and dialog systems. The document concludes by discussing NLP system development, testing, and evaluation.
In 2010 we had the idea of having multiple graduation projects with common themes. The themes selected for that year were "Arabic NLP" and "Pen computing". This presentation outlined the two themes and suggested several project ideas for them (and some GP ideas not related to the two themes).
Conversational AI with Rasa - PyData Workshop, by Tom Bocklisch
Workshop on building a simple chatbot with Rasa NLU and Core. Additional resources can be found in the repository: https://github.com/tmbo/rasa-demo-pydata18/edit/master/README.md
Conversational AI agents have become mainstream today, first because of significant advancements in the methods required to build accurate models, such as machine learning and deep learning, and second because they are seen as a natural fit in a wide range of domains, such as healthcare, e-commerce, customer service, tourism, and education, that rely heavily on natural language conversations in day-to-day operations. This rapid increase in demand has been matched by an equally rapid rate of research and development, with new products being introduced on a daily basis.
This document summarizes Suneel Marthi's presentation on large scale natural language processing. It discusses how natural language processing deals with processing and analyzing large amounts of human language data using computers. It provides an overview of Apache OpenNLP and Apache Flink, two open source projects for natural language processing. It also discusses how models for tasks like part-of-speech tagging and named entity recognition can be trained for different languages and integrated into data pipelines for large scale processing using these frameworks.
Natural Language Processing (NLP) practitioners often have to deal with analyzing large corpora of unstructured documents and this is often a tedious process. Python tools like NLTK do not scale to large production data sets and cannot be plugged into a distributed scalable framework like Apache Spark or Apache Flink.
The Apache OpenNLP library is a popular machine-learning-based toolkit for processing unstructured text. It combines a permissive licence, an easy-to-use API, and a set of components that are highly customizable and trainable to achieve very high accuracy on a particular dataset. Built-in evaluation makes it possible to measure and tune OpenNLP's performance for the documents that need to be processed.
From sentence detection and tokenization to parsing and named entity finding, Apache OpenNLP has the tools to address all tasks in a natural language processing workflow. It applies machine learning algorithms such as perceptron and maximum entropy, combined with tools such as word2vec, to achieve state-of-the-art results. In this talk, we'll see a demo of large-scale named entity extraction and text classification using the various Apache OpenNLP components wrapped into an Apache Flink stream-processing pipeline and as an Apache NiFi processor.
NLP practitioners will come away from this talk with a better understanding of how the various Apache OpenNLP components can help in processing large reams of unstructured data using a highly scalable and distributed framework like Apache Spark/Apache Flink/Apache NiFi.
Student X needs to transform 200 data files into plain text files but does not know how to write a program to automate this task. Student Y inherited a C program from another student but views it as a "black box" and wants to avoid changing or extending its implementation. The document argues that students need to learn how to read, understand, test, and modify programs, rather than viewing them as impenetrable boxes. It advocates teaching students modern scripting languages that are interpreted and easier to work with interactively in order to improve programming skills and encourage experimentation with ideas.
This document provides a summary of Rangarajan Chari's background and experience as a data scientist and machine learning specialist with skills in neural networks, natural language processing, and big data technologies. Chari has worked on projects involving text classification, face recognition, and troubleshooting techniques for vehicles. The summary also lists education including a PhD program in artificial intelligence and master's degrees in computer science, math, and physics.
This document provides an overview of various applications of natural language processing including machine translation, sentiment analysis, question answering, text entailment, discourse processing, dialog systems, and conversational agents. It also discusses case studies on the working of Google Translate and IBM Watson. The content is presented over multiple slides covering rule-based and statistical machine translation techniques, challenges in statistical machine translation, sentiment analysis applications, question answering datasets and systems, definitions of text entailment and discourse processing, conversational agents, and natural language generation. Example questions are also provided for an end-semester exam on these NLP applications topics.
Sudipta Mukherjee has over 18 years of experience as a software developer and leader with expertise in machine learning, compilers, and functional programming. They have authored 6 books on programming topics and regularly present at international conferences. Their skills include C#, F#, Python, machine learning, domain-specific languages, and data analytics.
Deprecating the state machine: building conversational AI with the Rasa stack, by Justina Petraitytė
Rasa NLU & Rasa Core are the leading open source libraries for building machine learning-based chatbots and voice assistants. In this live-coding workshop, you will learn the fundamentals of conversational AI and how to build your own using the Rasa Stack.
Deprecating the state machine: building conversational AI with the Rasa stack... (PyData)
Rasa NLU & Rasa Core are the leading open source libraries for building machine learning-based chatbots and voice assistants. In this live-coding workshop you will learn the fundamentals of conversational AI and how to build your own using the Rasa Stack.
How can text-mining leverage developments in Deep Learning? Presentation at ... (jcscholtes)
How can text-mining leverage developments in Deep Learning?
Text mining focuses primarily on extracting complex patterns from unstructured electronic data sets and applying machine learning for document classification. During the last decade, a generation of efficient and successful algorithms has been developed, using bag-of-words models to represent document content and statistical and geometric machine learning algorithms such as Conditional Random Fields and Support Vector Machines (a bag-of-words classification sketch follows this paragraph). These algorithms require relatively little training data and are fast on modern hardware. However, performance seems to be stuck around 90% F1.
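A minimal scikit-learn sketch of that bag-of-words plus Support Vector Machine pipeline (the library choice and the toy data are assumptions; the abstract names only the techniques):

```python
# Bag-of-words document classification with a linear SVM (toy sketch).
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

texts = ["invoice for services rendered", "meeting agenda attached",
         "payment due on invoice", "agenda for the quarterly meeting"]
labels = ["finance", "admin", "finance", "admin"]

clf = make_pipeline(CountVectorizer(), LinearSVC())
clf.fit(texts, labels)
print(clf.predict(["please pay this invoice"]))  # -> ['finance']
```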
In computer vision, deep learning has shown great success, and the 90% barrier has been broken in many applications. In addition, deep learning shows new successes in transfer learning and self-learning, such as reinforcement learning. Dedicated hardware helped us overcome computational challenges, and methods such as training-data augmentation addressed the need for unrealistically large data sets.
So it would make sense to apply deep learning to textual data as well. But how do we represent textual data? There are many different methods for word embeddings, and as many deep learning architectures. Training-data augmentation, transfer learning, and reinforcement learning are not fully defined for textual data.
Evolving as a professional software developer, by Anton Kirillov
This is the second edition of my keynote "On Being a Professional Software Developer", with slide comments (in Russian) that contain the main ideas of the keynote.
I hope the slides could be used as a standalone reading material.
Similar to Recent Advances in Natural Language Processing
Creating an AI Startup: What You Need to Know, by Seth Grimes
Seth Grimes presented "Creating an AI Startup: What You Need to Know" at a May 20, 2021 Launch Annapolis + Maryland AI (https://www.meetup.com/MarylandAI) program, focusing on opportunities and resources for Maryland tech entrepreneurs.
Efficient Deep Learning in Natural Language Processing Production, with Moshe... (Seth Grimes)
Moshe Wasserblat, Intel AI, presents on Efficient Deep Learning in Natural Language Processing Production to an online NLP meetup audience, August 3, 2020. Visit https://www.meetup.com/NY-NLP for the New York NLP meetup.
From Customer Emotions to Actionable Insights, with Peter Dorrington (Seth Grimes)
From Customer Emotions to Actionable Insights -- A presentation by Peter Dorrington, founder, XMplify Consulting, at the 2020 CX Emotion conference (https://cx-emotion.com), July 22, 2020.
Intro to Deep Learning for Medical Image Analysis, with Dan Lee from Dentuit AI (Seth Grimes)
Dan Lee from Dentuit AI presented an Intro to Deep Learning for Medical Image Analysis at the Maryland AI meetup (https://www.meetup.com/Maryland-AI), May 27, 2020. Visit https://www.youtube.com/watch?v=xl8i7CGDQi0 for video.
Emotion AI refers to a set of technologies -- natural language processing, voice tech, facial coding, neuroscience, and behavioral analytics -- applied to interactions to extract, convey, and induce emotion. Emotion AI is a presentation by Seth Grimes at AI for Human Language, March 5, 2020 in Tel Aviv.
Seth Grimes discusses text analytics market trends. Text analytics applies natural language processing to extract business insights from text sources. It has been part of business intelligence, data science, and analytics for over 15 years. While the vendor landscape is fragmented with no clear leader, visual analytics platforms and customer experience platforms like Medallia and InMoment have seen increased market activity and private investment. Users are advised to start small, test multiple tools, and focus on use cases and business benefits over accuracy when evaluating text analytics solutions.
Text analytics involves applying natural language processing techniques like named entity recognition, sentiment analysis, and topic modeling to extract insights from text data sources. It is used for applications like customer experience, market research, and competitive intelligence. The presentation provided an overview of text analytics approaches and tools, highlighting how it is part of business intelligence and data science solutions. Examples of early natural language processing work from the 1950s were also discussed.
Our FinTech Future – AI’s Opportunities and Challenges? (Seth Grimes)
"Our FinTech Future – AI’s Opportunities and Challenges?" is a presentation by Jim Kyung-Soo Liew, Ph.D. to the Artificial Intelligence Maryland (MD-AI) meetup (https://www.meetup.com/Maryland-AI/), November 20, 2019. Dr. Liew is Co-Founder of SoKat.co and Associate Professor at Johns Hopkins Carey Business School.
Preposition Semantics: Challenges in Comprehensive Corpus Annotation and Auto... (Seth Grimes)
The document summarizes Nathan Schneider's presentation on preposition semantics. It discusses challenges in annotating prepositions in corpora and approaches to their semantic description and disambiguation. It presents Schneider's work on developing a unified semantic scheme for prepositions and possessives consisting of 50 semantic classes applied to a corpus of English web reviews. Inter-annotator agreement for the new corpus was 78%. Models for preposition disambiguation were evaluated, with the feature-rich linear model achieving the highest accuracy of 80%.
The Ins and Outs of Preposition Semantics: Challenges in Comprehensive Corpu... (Seth Grimes)
Presentation by Nathan Schneider, Georgetown University, to the Washington DC Natural Language Processing meetup, October 14, 2019, https://www.meetup.com/DC-NLP/events/264894589/.
Nick Schmidt of BLDS, LLC presented to the Maryland AI meetup, June 4, 2019 (https://www.meetup.com/Maryland-AI). Nick discusses ideas of fairness and how they apply to machine learning. He explores recent academic work on identifying and mitigating bias, and how his work in lending and employment can be applied to other industries. Nick explains how to measure whether an algorithm is fair and also demonstrates the techniques that model builders can use to ameliorate bias when it is found.
This document discusses content AI technologies and their applications. It provides an overview of key content AI areas like images, speech, video, tagging, information extraction, classification, process automation, machine reading, question answering, and machine translation. The document also discusses challenges around trust in AI, including algorithmic bias, privacy concerns, and the need for explainability of AI systems and their results. It provides examples of how AI systems can exhibit unintended bias if not developed carefully, as well as perspectives on responsible development and application of AI technologies.
Text Analytics Market Insights: What's Working and What's NextSeth Grimes
Text analytics software and business processes apply natural language processing to extract business insights from text sources like social media, online content, and enterprise data. The document discusses what is currently working well in text analytics, such as its application in conversation, customer experience, finance, healthcare, and media, as well as its use of techniques like bag-of-words modeling and entity extraction. The document also outlines emerging areas for text analytics, such as analysis of narrative, argumentation, integration of multiple data sources and languages, and understanding of affect and emotion.
Three types of social listening are identified: strategic (market research and consumer insights), reactive (customer engagement), and retroactive (customer experience). Sentiment analysis identifies positive and negative opinions, emotions, and evaluations from various data types including text, images, videos, and digital metrics. Analyzing social sentiment can provide insights into what people are saying about topics, products, and competitors over time; who the opinion leaders are; how sentiment propagates; and how sentiment correlates with events and may predict impacts. Both qualitative and quantitative data from various sources can be analyzed for insights.
Seth Grimes gave a presentation on text analytics at IIeX in Atlanta on June 16, 2015. The presentation discussed the history of text analytics from early computers that could process documents in the 1950s to recent advancements in analyzing social media, online reviews, and other unstructured text data sources. Grimes also covered current and future trends in text analytics, including the growth of social media and big data, new machine learning and language processing techniques, and an increasing need for multi-lingual support.
Seth Grimes of Alta Plana Corporation gave a presentation on social sentiment analysis in social data. He discussed how sentiment can be extracted from various types of social data, including profiles, connections, content, and actions. Grimes also explained different methods for measuring and modeling sentiment, and how sentiment analysis can help businesses understand what people are saying about topics, products, and competitors.
5. Disclaimer
I use A LOT of commercial product materials in the slides that follow. These are illustrations and not recommendations, and I have no financial interest in the companies (unless disclosed).
6. Natural Language Processing
Natural Language Understanding (NLU)
• OCR, language detection, tokenization, parsing
• Information extraction: parts of speech, chunks, entities, aspects, topics/themes, relations, attributes, events, intent …
• Speech processing: verbal and nonverbal
Natural Language Generation (NLG)
NLU + NLG together, for example:
• Summarization
• Machine translation
• Conversational interfaces
• Question answering
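Several of the NLU steps listed above (tokenization, parsing, parts of speech, chunks, entities) can be demonstrated with spaCy; this sketch assumes the small English model is installed via `python -m spacy download en_core_web_sm`:

```python
# A spaCy pipeline run covering several NLU steps from the slide above.
import spacy

nlp = spacy.load("en_core_web_sm")
doc = nlp("Apple is opening a new office in Annapolis next spring.")

print([t.text for t in doc])                        # tokenization
print([(t.text, t.pos_, t.dep_) for t in doc])      # POS tags + parse
print(list(doc.noun_chunks))                        # chunks
print([(e.text, e.label_) for e in doc.ents])       # entities
```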
12. “Statistical information derived from word frequency and distribution is used by the machine to compute a relative measure of significance, first for individual words and then for sentences. Sentences scoring highest in significance are extracted and printed out to become the auto-abstract.” -- H.P. Luhn, The Automatic Creation of Literature Abstracts, IBM Journal, 1958.
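Luhn's idea still fits in a few lines of Python; this is an illustrative re-creation, not his exact algorithm:

```python
# Luhn-style extractive summarization: score words by frequency, score
# sentences by their significant words, extract the top sentences.
from collections import Counter

STOPWORDS = {"the", "a", "is", "of", "and", "to", "with", "becomes"}

def luhn_abstract(text, n_sentences=1):
    sentences = [s.strip() for s in text.split(".") if s.strip()]
    words = [w.lower() for s in sentences for w in s.split()
             if w.lower() not in STOPWORDS]
    freq = Counter(words)
    def score(sentence):
        toks = [w.lower() for w in sentence.split() if w.lower() not in STOPWORDS]
        return sum(freq[w] for w in toks) / max(len(toks), 1)
    ranked = sorted(sentences, key=score, reverse=True)
    return ". ".join(ranked[:n_sentences]) + "."

text = ("The machine computes word frequency. Frequency gives each word a "
        "significance measure. Sentences with significant words become the "
        "abstract. Everything else is omitted.")
print(luhn_abstract(text))
```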
17. Word2Vec: Key Concepts
Continuous bag-of-words (CBOW) predicts a word from a window of surrounding words. Skip-gram uses a word to predict a window of surrounding words.
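A conceptual sketch of how the two objectives differ in the training pairs they read off the same window (plain Python, not the actual Word2Vec implementation):

```python
# Enumerate the (context, center) pairs a window produces, then show how
# CBOW and skip-gram use them in opposite directions.
def training_pairs(tokens, window=2):
    for i, center in enumerate(tokens):
        context = [tokens[j]
                   for j in range(max(0, i - window),
                                  min(len(tokens), i + window + 1))
                   if j != i]
        yield context, center

tokens = "the cat sat on the mat".split()
for context, center in training_pairs(tokens):
    print(f"CBOW:      {context} -> {center}")   # context predicts the word
    print(f"skip-gram: {center} -> {context}")   # the word predicts its context
```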
37. Amazon Comprehend Medical
https://aws.amazon.com/comprehend/medical/
“With a simple API call to Amazon Comprehend Medical you can quickly and accurately extract information such as medical conditions, medications, dosages, tests, treatments and procedures, and protected health information while retaining the context of the information. Amazon Comprehend Medical can identify the relationships among the extracted information to help you build applications for use cases like population health analytics, clinical trial management, pharmacovigilance, and summarization. You can also use Amazon Comprehend Medical to link the extracted information to medical ontologies...”
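That "simple API call" looks roughly like this with boto3, AWS's Python SDK (a sketch; it requires AWS credentials with Comprehend Medical access, and the note text is invented):

```python
# Extract medical entities (conditions, medications, dosages, ...) from
# free text with Amazon Comprehend Medical.
import boto3

client = boto3.client("comprehendmedical", region_name="us-east-1")
result = client.detect_entities_v2(
    Text="Patient reports taking 40 mg of propranolol daily for hypertension."
)
for entity in result["Entities"]:
    print(entity["Category"], entity["Type"], entity["Text"])
```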