Beyond the $!mb0ls
A 30-minute Overview of Natural Language Processing
Mengsay Loem
Tokyo Institute of Technology
2023/06/17
NLP-based Applications/Services
https://twitter.com/EconomyApp/status/1622029832099082241
What is NLP? Natural Language Processing
• Processing of natural languages used by humans
• e.g., English, Japanese, Khmer, Chinese …
• Making natural language understandable to machines
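[Figure: Natural Language Processing deals with natural languages, as opposed to artificial languages]
• Natural Language: English, Japanese (日本語), Khmer (ភាសាខ្មែរ)
• Artificial Language: C/C++, Java, Python, R, Lojban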
Some Tasks in NLP
• Machine translation
Input: Recent advances in deep learning-based optimization and computational hardware have greatly facilitated progress in natural language processing
Output: 近年のディープラーニングに基づく最適化と計算機ハードウェアの進歩は、自然言語処理の進展を大きく促している
• Summarization
Output: 自然言語処理の進展促す=ディープラーニング最適化と計算機ハードウェア ("Progress in NLP driven by deep learning optimization and computing hardware")
• Proofreading / Error correction
Input: Recent advances in dept learning-based optimization and computational hardware has greatly facilitated progress in naturail language processing
(errors to correct: "dept", "has", "naturail")
• Information retrieval
• Document classification/clustering
• Dialogue system
Difficulties in NLP
Back to the Basics of Language
What are Languages for?
• Tool for Communication
• Tool for Thought
• Tool for Record
Some Features of Languages
• Arbitrariness: the connection between a word and its meaning is often arbitrary
• Sociolinguistics: many aspects of language usage are based on social customs and can't always be logically explained
• Evolving: language changes over time and varies across regions and cultures
• Networked: language reflects complex relationships between entities and concepts
• Ambiguous: a single expression can have multiple meanings based on context
→ Computers must have vast knowledge of languages
→ Computers must be flexible in interpreting meaning in text
Some Difficulties in NLP
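• Ambiguity in Semantics
• Different expressions, same meaning: sleep = 寝る (Japanese for "sleep")
• Different expressions, same meaning: Natural Language Processing = NLP
• Same expression, different meanings: "… machine learning …" vs. "… car machine …"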
• Dealing with hierarchical sequential data
• character → word → phrase → sentence → paragraph → text
• What does language comprehension entail for a machine?
• To a machine, input text is just a sequence of SYMBOLs
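To make the last point concrete, here is a minimal sketch of what a program actually receives when given text: a sequence of integer code points. The helper name `to_symbols` is hypothetical and only for illustration; the two example words are the ones from the slide.

```python
def to_symbols(text):
    """Return the Unicode code points a program receives for `text`."""
    return [ord(ch) for ch in text]

english = to_symbols("sleep")
japanese = to_symbols("寝る")

print(english)   # [115, 108, 101, 101, 112]
print(japanese)  # two unrelated integers

# Both words mean the same thing to a human, yet they share no symbol at all.
print(set(english) & set(japanese))  # set()
```

Nothing in these numbers says that the two sequences are related, which is why meaning has to be learned or encoded separately.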
Approaches to NLP
From Rule-based to Deep Learning-based
Before vs. After Deep Learning
[Figure: two pipelines; "Training Data" appears in both panels]
• Traditional NLP: Input text → POS Tagging → Syntactic Parsing → Predicate-Argument Recognition → Application System → Output
• Deep Learning-based NLP: Input text → Neural Network → Output
NLP’s Methods in a Century
• 1940s~1960s
• First Computer: ENIAC (1946)
• Translating Machine Project (1952)
• Limit of Computing Performance
• 1960s~1990s
• Digital text data; Brown Corpus (1967)
• MEDLINE Database Service (1971)
• Manual-Rule-based Text Analysis
• 1990s~2010s (※1990: WWW, 1998: Google)
• Large-scale Corpora + Machine Learning
• Translation by Analogy (Kyoto University, 1981)
• Statistical Machine Translation (1980s)
• CALO (origin of Siri), Watson
• 2010s
• Mainly neural networks
• Word2Vec (Google, 2013)
• Neural Machine Translation (2014)
• Transformer (Google, 2017)
• Pre-trained Language Models: BERT (2018), GPT-2 (2019), BART (2019)
• 2020s
• Large-scale Pre-trained Language Models (LLMs)
• GPT-3 (2020), T5 (2020), ChatGPT (2022), GPT-4 (2023)
Neural Network + NLP
Basics of Neural Networks and How They Work in NLP
Basic structure of Neural Network
• Linear and Non-linear Transformation
• Matrix Multiplication+Non-linear Activation Function
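• $y = f\left(\sum_i w_i x_i\right)$
[Figure: a single neuron with inputs $x_1, x_2, x_3, x_4$ and weights 0.3, −0.5, 0.1, 0.4; the weighted sum $\sum$ passes through the sigmoid $f(z) = \frac{1}{1 + e^{-z}}$, giving the output 0.76]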
• Input: adjectives and nouns in a movie review
• Output: 0 (negative) / 1 (positive)
• Such a wonderful movie. → 1 (positive)
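[Figure: a neuron whose inputs are the words movie, boring, wonderful, time, with weights 0.3, −2.5, 3.1, 0.1; the weighted sum $\sum$ passes through the sigmoid $\frac{1}{1 + e^{-z}}$, giving the output 0.97]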
• Boring time of the year. → 0 (negative)
[Figure: the same neuron (movie 0.3, boring −2.5, wonderful 3.1, time 0.1) with sigmoid $\frac{1}{1 + e^{-z}}$, now giving the output 0.08]
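The two movie-review examples above can be reproduced with a few lines of Python. This is a minimal sketch, assuming each input is 1 when the word occurs in the review and 0 otherwise (the slides do not state the input encoding, but this assumption reproduces the outputs shown); `WEIGHTS` and `predict` are hypothetical names.

```python
import math

# Weights from the figure: one weight per input word.
WEIGHTS = {"movie": 0.3, "boring": -2.5, "wonderful": 3.1, "time": 0.1}

def sigmoid(z):
    """Non-linear activation: squashes any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def predict(words):
    """y = f(sum_i w_i x_i), with x_i = 1 if word i is present, else 0."""
    z = sum(WEIGHTS.get(word, 0.0) for word in set(words))
    return sigmoid(z)

print(round(predict(["wonderful", "movie"]), 2))  # 0.97 -> positive
print(round(predict(["boring", "time"]), 2))      # 0.08 -> negative
```

Words without a weight (such as "such" or "year") contribute nothing here, which matches the slide's restriction to adjectives and nouns.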
How to represent these words?
Neural Network for NLP
• How to represent input word/sentence/text as a vector ?
• Simple solution: one-hot vector for word level
• Bag-of-words: each element represents occurrence frequency
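[Figure: with the vocabulary (movie, great, like, …, love, I), "like" is the one-hot vector (0, 0, 1, …, 0, 0) and "movie" is (1, 0, 0, …, 0, 0); the bag-of-words vector of "I like movie" is (1, 0, 1, …, 0, 1). Vector length = vocabulary size (10K - 100K)]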
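The two representations above can be sketched as follows. The five-word vocabulary is a toy stand-in for the slide's vocabulary (with the elided middle entries left out); `VOCAB`, `one_hot`, and `bag_of_words` are hypothetical names.

```python
# Toy vocabulary; a real one would have 10K - 100K entries.
VOCAB = ["movie", "great", "like", "love", "I"]
INDEX = {word: i for i, word in enumerate(VOCAB)}

def one_hot(word):
    """Vector with a single 1 at the word's vocabulary index."""
    vec = [0] * len(VOCAB)
    vec[INDEX[word]] = 1
    return vec

def bag_of_words(tokens):
    """Vector whose i-th element counts occurrences of vocabulary word i."""
    vec = [0] * len(VOCAB)
    for token in tokens:
        vec[INDEX[token]] += 1
    return vec

print(one_hot("like"))                       # [0, 0, 1, 0, 0]
print(one_hot("movie"))                      # [1, 0, 0, 0, 0]
print(bag_of_words(["I", "like", "movie"]))  # [1, 0, 1, 0, 1]
```

The bag-of-words vector is simply the sum of the one-hot vectors of the words in the text.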
• Problems:
• Cannot deal with synonymy
• Cannot deal with differences in word order
• Distributional hypothesis: words that occur in the same contexts tend to be semantically similar
[Figure: the one-hot vectors like = (1, 0, 0, …, 0, 0) and love = (0, 0, 0, …, 1, 0) have no non-zero element in common]
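Both problems listed above are easy to demonstrate. The sketch below uses a hypothetical five-word vocabulary and the example sentences "dog bites man" / "man bites dog", which are not from the slides.

```python
def bag_of_words(tokens, vocab):
    """Count how often each vocabulary word occurs in `tokens`."""
    return [tokens.count(word) for word in vocab]

def dot(u, v):
    """Inner product, used here as a crude similarity score."""
    return sum(a * b for a, b in zip(u, v))

vocab = ["dog", "bites", "man", "like", "love"]

# Problem 1: synonymy. Vectors of two different words never overlap,
# so "like" and "love" look as unrelated as any other pair.
like = bag_of_words(["like"], vocab)
love = bag_of_words(["love"], vocab)
print(dot(like, love))  # 0

# Problem 2: word order. Both sentences get exactly the same vector.
a = bag_of_words("dog bites man".split(), vocab)
b = bag_of_words("man bites dog".split(), vocab)
print(a == b)  # True
```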
Embedding Representation
• Represent a word by a real number vector
• Share features of each word in embedding space (vector space)
• Represent similar words with similar vectors
• (use lower dimension compared to Bag-of-words)
[Figure: dense vectors (100 - 1K dimensions) defined by a neural network]
• book = (0.12, −1.90, …, 0.55, 1.37)
• dictionary = (0.17, −1.80, …, 0.52, 1.57)
• phone = (0.97, 0.10, …, 1.63, −0.11)
• book and dictionary: similar; book and phone: not similar
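"Similar vectors" is usually made precise with cosine similarity. The sketch below applies it to the example vectors above, using only the four components that are visible on the slide (the elided middle components are left out, so the exact values are illustrative only).

```python
import math

def cosine(u, v):
    """Cosine of the angle between u and v: 1 = same direction, 0 = orthogonal."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

# Visible components only; the "…" entries of the slide are omitted.
book = [0.12, -1.90, 0.55, 1.37]
dictionary = [0.17, -1.80, 0.52, 1.57]
phone = [0.97, 0.10, 1.63, -0.11]

print(cosine(book, dictionary) > cosine(book, phone))  # True
```

Unlike one-hot vectors, whose similarity is always zero for different words, dense vectors can place related words close together.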
word2vec
• Learn to represent a word by an embedding vector
• CBoW, skip-gram
• Example:
• Start with randomly initialized embedding vectors
• Learn to predict a target word given its contexts
• Maximize target word’s probability
…… he ate an apple yesterday with ……
$w_{t-2}$ $w_{t-1}$ $w_t$ $w_{t+1}$ $w_{t+2}$
$\sum_{t=1}^{T} \sum_{-c \le j \le c,\, j \ne 0} \log p(w_t \mid w_{t+j})$
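The training loop described above can be sketched in pure Python. This is only an illustration under stated assumptions, not the actual word2vec implementation: it models $p(w_t \mid w_{t+j})$ with a full softmax over a six-word vocabulary and one context word at a time, whereas real word2vec uses tricks such as negative sampling or hierarchical softmax to scale. All names (`context_pairs`, `sgd_epoch`, `emb_in`, `emb_out`) and hyperparameters are hypothetical.

```python
import math
import random

def context_pairs(tokens, c):
    """All (target w_t, context w_{t+j}) pairs with -c <= j <= c, j != 0."""
    pairs = []
    for t, target in enumerate(tokens):
        for j in range(-c, c + 1):
            if j != 0 and 0 <= t + j < len(tokens):
                pairs.append((target, tokens[t + j]))
    return pairs

def softmax(scores):
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def objective(pairs, index, emb_in, emb_out):
    """Sum of log p(w_t | w_{t+j}) over all pairs."""
    total = 0.0
    for target, context in pairs:
        probs = softmax([dot(u, emb_in[index[context]]) for u in emb_out])
        total += math.log(probs[index[target]])
    return total

def sgd_epoch(pairs, index, emb_in, emb_out, lr):
    """One pass of stochastic gradient ascent on the objective."""
    for target, context in pairs:
        v = emb_in[index[context]]
        probs = softmax([dot(u, v) for u in emb_out])
        grad_v = [0.0] * len(v)
        for k, u in enumerate(emb_out):
            # d log p(target) / d score_k = 1[k == target] - p_k
            g = (1.0 if k == index[target] else 0.0) - probs[k]
            for d in range(len(v)):
                grad_v[d] += g * u[d]
                u[d] += lr * g * v[d]
        for d in range(len(v)):
            v[d] += lr * grad_v[d]

random.seed(0)
tokens = "he ate an apple yesterday with".split()
index = {word: i for i, word in enumerate(sorted(set(tokens)))}
dim = 8
# Start with randomly initialized embedding vectors.
emb_in = [[random.uniform(-0.5, 0.5) for _ in range(dim)] for _ in index]
emb_out = [[random.uniform(-0.5, 0.5) for _ in range(dim)] for _ in index]

pairs = context_pairs(tokens, c=2)
before = objective(pairs, index, emb_in, emb_out)
for _ in range(100):
    sgd_epoch(pairs, index, emb_in, emb_out, lr=0.1)
after = objective(pairs, index, emb_in, emb_out)
print(before < after)  # training should raise the target words' log-probability
```

After training, the rows of `emb_in` play the role of the learned word embeddings.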
Neural Network for NLP
• Problems of one-hot / bag-of-words vectors, and how to address them:
• Cannot deal with synonymy
→ Embedding representation
[Figure: like = (1.67, 0.45, −1.80, 0.12) and love = (1.57, 0.65, −1.60, 0.15) get similar vectors]
• Cannot deal with differences in word order
→ Revise the network architecture
Dealing with Sequential Data
• Recurrent Neural Network
• Related architectures
• Long Short-Term Memory (LSTM)
• Gated Recurrent Unit (GRU)
• Transformer
• Attention mechanism
• Basic Neural Network architecture (Feed-Forward Network)
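To illustrate why a recurrent architecture can deal with word order, here is a minimal sketch of a plain (Elman-style) recurrent network without bias terms, $h_t = \tanh(W_x x_t + W_h h_{t-1})$. The weights are random and untrained, and the example sentences are hypothetical; the point is only that the final state depends on the order of the inputs.

```python
import math
import random

def rnn_final_state(inputs, w_x, w_h):
    """Run h_t = tanh(W_x x_t + W_h h_{t-1}) over a sequence; return the last h."""
    h = [0.0] * len(w_h)
    for x in inputs:
        h = [
            math.tanh(
                sum(w_x[i][j] * x[j] for j in range(len(x)))
                + sum(w_h[i][k] * h[k] for k in range(len(h)))
            )
            for i in range(len(h))
        ]
    return h

random.seed(0)
vocab = ["dog", "bites", "man"]
one_hot = {
    word: [1.0 if i == j else 0.0 for j in range(len(vocab))]
    for i, word in enumerate(vocab)
}
hidden = 4
w_x = [[random.uniform(-1, 1) for _ in vocab] for _ in range(hidden)]
w_h = [[random.uniform(-1, 1) for _ in range(hidden)] for _ in range(hidden)]

a = rnn_final_state([one_hot[w] for w in "dog bites man".split()], w_x, w_h)
b = rnn_final_state([one_hot[w] for w in "man bites dog".split()], w_x, w_h)
print(a != b)  # the same words in a different order give a different state
```

A bag-of-words vector would be identical for these two sentences, while the recurrent state is not.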
Solving NLP Tasks: Examples
• Task:
• Machine Translation (MT) : English → Khmer
• Summarization : Document → Summary
• Grammatical Error Correction
• Model:
• Encoder-Decoder
• Data
• Parallel data: {(input, output)}
[Figure: Input sequence → Encoder → Hidden Representation → Decoder → Output sequence; the encoder and decoder can be an RNN, LSTM, Transformer, …]
LLMs Era of NLP
What is happening with Large-scale Language Models?
LLMs Revolution
https://github.com/Mooler0410/LLMsPracticalGuide
Solving NLP Tasks with LLMs
• Pre-train & Fine-tune Paradigm
• Pre-train a language model with large corpora
• Fine-tune pre-trained model on specific task with (small) data sets
• Prompt-based Method
• Pre-train a language model with large corpora
• Ask the model to solve various tasks with prompts written in natural language
Pre-train & Fine-tune Paradigm
• Train a Language Model with (very very) large data sets
• Fine-tune pre-trained model on (specific) target tasks
• Document Classification, Machine Translation, Summarization…
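(Example pre-training text, translated from Japanese) Machine learning is a computer algorithm that improves automatically by learning from experience, or the research field that studies such algorithms [1][2], and is regarded as a kind of artificial intelligence. It learns from data called "training data" or "learning data" and uses what it has learned to carry out some task. For example, it can learn from past spam emails as training data and then carry out the task of spam filtering.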
Machine learning (ML) is the study of
computer algorithms that can improve
automatically through experience and by the
use of data.[1] It is seen as a part of artificial
intelligence. Machine learning algorithms build
a model based on sample data, known as
training data, in order to make predictions or
decisions without being explicitly programmed
to do so.[2] Machine learning algorithms are
used in a wide variety of applications, such as
in medicine, email filtering, speech recognition,
and computer vision, where it is difficult or
unfeasible to develop conventional algorithms
to perform the needed tasks.[3]
Fine-tuning
Recent advances in deep learning-based optimization and
computational hardware have greatly facilitated progress in
natural language processing
近年のディープラーニングに基づく最適化と計算機ハードウェ
アの進歩は、⾃然⾔語処理の進展を⼤きく促している
Pre-training
• GPT-n: I have a → pen
• BERT: I [MASK] a pen → have
• BART: I [MASK] pen a → I have a pen
→ Pre-trained Model
• Why pre-train as a language model?
• Large-scale training data is available!
• Neural models are data-hungry!
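The three pre-training examples can be generated from raw text alone, which is why no manual labels are needed. Below is a minimal sketch of how such (input, target) pairs might be built from the sentence "I have a pen"; the function names are hypothetical, and the BART-style corruption is heavily simplified compared with the noising schemes used in practice.

```python
MASK = "[MASK]"

def next_token_examples(tokens):
    """GPT-style: predict each token from the tokens before it."""
    return [(tokens[:i], tokens[i]) for i in range(1, len(tokens))]

def masked_token_example(tokens, position):
    """BERT-style: hide one token and predict it from both sides."""
    masked = tokens[:position] + [MASK] + tokens[position + 1:]
    return masked, tokens[position]

def denoising_example(tokens, position, order):
    """BART-style (simplified): mask a token, reorder, reconstruct the original."""
    masked, _ = masked_token_example(tokens, position)
    corrupted = [masked[i] for i in order]
    return corrupted, tokens

tokens = "I have a pen".split()
print(next_token_examples(tokens)[-1])
# (['I', 'have', 'a'], 'pen')
print(masked_token_example(tokens, 1))
# (['I', '[MASK]', 'a', 'pen'], 'have')
print(denoising_example(tokens, 1, [0, 1, 3, 2]))
# (['I', '[MASK]', 'pen', 'a'], ['I', 'have', 'a', 'pen'])
```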
Prompt-based Methods
• ChatGPT/GPT-x
How Does ChatGPT Work?
From next-word prediction to human-instructed training
(Probabilistic) Language Model
• Models that assign probabilities to sequences of words
• To evaluate text (sequence) generated by a system
• $P(\text{I, like, watching, movie}) > P(\text{I, eat, watch, movie})$
• Predict next coming word
• Many NLP tasks can be formulated as language modeling
[Figure: next-word prediction for "The capital of Japan is": Beijing 0.20, Seattle 0.05, Tokyo 0.75]
$P(\text{Tokyo} \mid \text{The, capital, of, Japan, is}) = 0.75$
$P(\text{The, capital, of, Japan, is, Tokyo, ja, 日本, の, 首都, は, A})$
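A very small count-based model is enough to show how a language model scores sequences. The sketch below is a bigram model with add-one smoothing, which approximates the chain rule $P(w_1, \dots, w_n) = \prod_i P(w_i \mid w_1, \dots, w_{i-1})$ by conditioning on the previous word only. The four-sentence corpus is hypothetical and chosen purely for illustration; neural language models replace the counts with a learned network.

```python
from collections import Counter

# Toy corpus; hypothetical, for illustration only.
CORPUS = [
    "I like watching movie",
    "I like movie",
    "I eat rice",
    "we watch movie",
]

bigrams, contexts, vocab = Counter(), Counter(), {"<s>"}
for sentence in CORPUS:
    tokens = ["<s>"] + sentence.split()
    vocab.update(tokens)
    for prev, word in zip(tokens, tokens[1:]):
        bigrams[(prev, word)] += 1
        contexts[prev] += 1

def next_word_prob(word, prev):
    """P(word | prev), estimated from counts with add-one smoothing."""
    return (bigrams[(prev, word)] + 1) / (contexts[prev] + len(vocab))

def sequence_prob(sentence):
    """Chain rule with a one-word history: product of next-word probabilities."""
    tokens = ["<s>"] + sentence.split()
    prob = 1.0
    for prev, word in zip(tokens, tokens[1:]):
        prob *= next_word_prob(word, prev)
    return prob

print(sequence_prob("I like watching movie") > sequence_prob("I eat watch movie"))  # True
```

The same `next_word_prob` function also ranks candidate next words, which is the "predict next coming word" use mentioned above.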
Training with Human Feedback
L Ouyang, J Wu, X Jiang, D Almeida, et al. 2022. Training Language Models to Follow Instructions with Human Feedback. arXiv:2203.02155.
Improve LLMs with Discussions
• Solving NLP Problems through Human-System
Collaboration: A Discussion-based Approach
• Kaneko et al. 2023
Beyond Performance
Risks and Problems to Solve in LLMs Era
Threats of LLMs
• Turning point in a wide range of fields
• including search engines, finance, advertising, education, and law.
• There's been an explosive increase in services incorporating LLMs.
• Work time in jobs such as translators, investigators, writers, etc. is being shortened
(Eloundou+2023).
• Hallucinations
• Even when unsure, they can confidently state falsehoods with no basis in fact.
• Bias
• They learn and amplify societal biases related to gender, race, etc.
• They can be adjusted to respond in ways that benefit specific individuals
or groups.
• Personal information exposure
• Misuse
- T Eloundou, S Manning, P Mishkin, D Rock. 2023. GPTs are GPTs: An Early Look at the Labor Market Impact Potential of Large
Language Models. arXiv:2303.10130.
- Naoaki Okazaki. 2023. The Wonders and Threats of Large Language Models (in Japanese). https://speakerdeck.com/chokkan/20230327_riken_llm
Data and GPUs
are All You Need!
Summary
• NLP aims to make human language understandable to machines.
• NLP faces many difficulties due to the characteristics of languages, such as ambiguity, complex network structures, etc.
• Deep learning-based methods have driven impressive progress in NLP over the past decade.
• A new era of NLP has arrived with large-scale language models.
• There are still many problems to deal with in NLP
• Low-resource languages
• Bias, Hallucination, etc.

"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
 
Understanding the Laravel MVC Architecture
Understanding the Laravel MVC ArchitectureUnderstanding the Laravel MVC Architecture
Understanding the Laravel MVC Architecture
 
Powerpoint exploring the locations used in television show Time Clash
Powerpoint exploring the locations used in television show Time ClashPowerpoint exploring the locations used in television show Time Clash
Powerpoint exploring the locations used in television show Time Clash
 

Beyond the Symbols: A 30-minute Overview of NLP

  • 1. Beyond the $!mb0ls: A 30-minute Overview of Natural Language Processing. Mengsay Loem, Tokyo Institute of Technology, 2023/06/17
  • 3. What is NLP? Natural Language Processing • Processing of natural languages used by humans • e.g., English, Japanese, Khmer, Chinese … • Making natural language understandable to machines • Natural languages: English, 日本語 (Japanese), ភាសាខ្មែរ (Khmer) • Artificial languages: C/C++, Java, Python, R, Lojban
  • 4. Some Tasks in NLP • Machine translation • Information retrieval • Proofreading/error correction • Document classification/clustering • Dialogue system • Summarization • Example (translation): "Recent advances in deep learning-based optimization and computational hardware have greatly facilitated progress in natural language processing" → 「近年のディープラーニングに基づく最適化と計算機ハードウェアの進歩は、自然言語処理の進展を大きく促している」 • Example (summarization): 「自然言語処理の進展促す=ディープラーニング最適化と計算機ハードウェア」 ("Progress in NLP is driven by deep learning optimization and computer hardware") • Example (proofreading/error correction), erroneous input: "Recent advances in dept learning-based optimization and computational hardware has greatly facilitated progress in naturail language processing"
  • 5. Difficulties in NLP: Back to the Basics of Language
  • 6. What are Languages for? • Tool for communication • Tool for thought • Tool for record
  • 7. Some Features of Languages • Arbitrariness: the connection between a word and its meaning is often arbitrary • Sociolinguistics: many aspects of language usage are based on social customs and can't always be logically explained • Evolving: language changes over time and varies across regions and cultures • Networked: language reflects complex relationships between entities and concepts • Ambiguous: a single expression can have multiple meanings depending on context
  • 8. Some Features of Languages • Arbitrariness: the connection between a word and its meaning is often arbitrary • Sociolinguistics: many aspects of language usage are based on social customs and can't always be logically explained • Evolving: language changes over time and varies across regions and cultures • Networked: language reflects complex relationships between entities and concepts • Ambiguous: a single expression can have multiple meanings depending on context • Consequences: computers must have vast knowledge of languages; computers must be flexible in interpreting meaning in text
  • 9. Some Difficulties in NLP • Ambiguity in Semantics • sleep = 寝る • Natural Language Processing = NLP • … machine learning… ; … car machine … • Dealing with hierarchical sequential data • character → word → phrase → sentence → paragraph → text • What does language comprehension entail for a machine? • To a machine, input text is just a sequence of SYMBOLs
  • 10. Approaches to NLP: From Rule-based to Deep Learning-based
  • 11. Before vs. After Deep Learning • Traditional NLP: Input text → POS Tagging → Syntactic Parsing → Predicate-Argument Recognition → Application System → Output; uses Training Data • Deep Learning-based NLP: Input text → Neural Network → Output; uses Training Data
  • 12. NLP's Methods in a Century • 1940s~1960s • First Computer: ENIAC (1946) • Translating Machine Project (1952) • Limited computing performance • 1960s~1990s • Digital text data; Brown Corpus (1967) • MEDLINE Database Service (1971) • Manual rule-based text analysis • 1990s~2010s (note: 1990: WWW, 1998: Google) • Large-scale Corpora + Machine Learning • Translation by Analogy (Kyoto University, 1981) • Statistical Machine Translation (1980s) • CALO (origin of Siri), Watson • 2010s • Mainly Neural Networks • Word2Vec (Google, 2013) • Neural Machine Translation (2014) • Transformer (Google, 2017) • Pre-trained Language Models: BERT (2018), GPT-2 (2019), BART (2019) • 2020s • Large-scale Pre-trained Language Models (LLMs) • GPT-3 (2020), T5 (2020), ChatGPT (2022), GPT-4 (2023)
  • 13. Neural Network + NLP: Basics of Neural Networks and How They Work in NLP
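  • 14. Basic structure of Neural Network • Linear and Non-linear Transformation • Matrix Multiplication + Non-linear Activation Function • $y = f\left(\sum_i w_i x_i\right)$ • Figure: four inputs $x_i$ with weights 0.3, −0.5, 0.1, 0.4 are summed ($\sum$) and passed through the activation $\frac{1}{1 + e^{-x}}$, giving output 0.76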
  • 15. Basic structure of Neural Network • Linear and Non-linear Transformation • Matrix Multiplication + Non-linear Activation Function • $y = f\left(\sum_i w_i x_i\right)$ • Input: adjective and noun in a movie review • Output: 0 (negative) / 1 (positive) • Such a wonderful movie. → 1 (positive) • Figure: word weights movie 0.3, boring −2.5, wonderful 3.1, time 0.1; the weighted sum ($\sum$) passed through $\frac{1}{1 + e^{-x}}$ gives 0.97
  • 16. Basic structure of Neural Network • Linear and Non-linear Transformation • Matrix Multiplication + Non-linear Activation Function • $y = f\left(\sum_i w_i x_i\right)$ • Input: adjective and noun in a movie review • Output: 0 (negative) / 1 (positive) • Boring time of the year. → 0 (negative) • Figure: word weights movie 0.3, boring −2.5, wonderful 3.1, time 0.1; the weighted sum ($\sum$) passed through $\frac{1}{1 + e^{-x}}$ gives 0.08
  • 17. Basic structure of Neural Network • Linear and Non-linear Transformation • Matrix Multiplication + Non-linear Activation Function • $y = f\left(\sum_i w_i x_i\right)$ • Input: adjective and noun in a movie review • Output: 0 (negative) / 1 (positive) • Boring time of the year. → 0 (negative) • Figure: word weights movie 0.3, boring −2.5, wonderful 3.1, time 0.1; the weighted sum ($\sum$) passed through $\frac{1}{1 + e^{-x}}$ gives 0.08 • How to represent these words?
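The single-neuron example on slides 15 to 17 can be reproduced in a few lines. This is a minimal sketch, not the speaker's code: the weights are the ones shown in the figure, while the tokenization (lowercasing, stripping punctuation) and the choice of 1 for a word that occurs are assumptions made for illustration.

```python
import math

# Word weights read off the figure on slides 15-17.
WEIGHTS = {"movie": 0.3, "boring": -2.5, "wonderful": 3.1, "time": 0.1}


def sigmoid(z: float) -> float:
    """Non-linear activation 1 / (1 + e^(-z))."""
    return 1.0 / (1.0 + math.exp(-z))


def predict(review: str) -> float:
    """Return a score in (0, 1); closer to 1 means positive."""
    # Assumption: x_i = 1 if word i occurs in the review, else 0.
    tokens = {tok.strip(".,!?").lower() for tok in review.split()}
    z = sum(w for word, w in WEIGHTS.items() if word in tokens)  # linear part
    return sigmoid(z)                                            # non-linear part


positive = predict("Such a wonderful movie.")
negative = predict("Boring time of the year.")
print(round(positive, 2), round(negative, 2))  # prints: 0.97 0.08
```

The two scores match the figure: 3.1 + 0.3 = 3.4 goes through the sigmoid to about 0.97, and −2.5 + 0.1 = −2.4 goes to about 0.08.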
  • 18. Neural Network for NLP • How to represent input word/sentence/text as a vector? • Simple solution: one-hot vector for word level • Bag-of-words: each element represents occurrence frequency • Figure: over the vocabulary (movie, great, like, ⋮, love, I), with vocabulary size 10K - 100K: one-hot "like" = (0, 0, 1, ⋮, 0, 0); one-hot "movie" = (1, 0, 0, ⋮, 0, 0); bag-of-words "I like movie" = (1, 0, 1, ⋮, 0, 1)
  • 19. Neural Network for NLP • How to represent input word/sentence/text as a vector? • Simple solution: one-hot vector for word level • Bag-of-words: each element represents occurrence frequency • Problems: • Cannot deal with synonymy • Cannot deal with differences in word order • Distributional hypothesis: words in the same contexts tend to be semantically similar • Figure: one-hot "like" = (1, 0, 0, ⋮, 0, 0); one-hot "love" = (0, 0, 0, ⋮, 1, 0)
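The two problems listed on slide 19 are easy to see in code. The sketch below assumes a five-word toy vocabulary (real vocabularies have 10K to 100K entries, as the slide notes) and whitespace tokenization; both are illustrative choices, not part of the talk.

```python
from collections import Counter

# Toy vocabulary, assumed for illustration only.
VOCAB = ["movie", "great", "like", "love", "I"]


def one_hot(word):
    """Vector with a single 1 at the word's vocabulary index."""
    return [1 if v == word else 0 for v in VOCAB]


def bag_of_words(sentence):
    """Vector of occurrence counts, one element per vocabulary word."""
    counts = Counter(sentence.split())
    return [counts[v] for v in VOCAB]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


like_vec, love_vec = one_hot("like"), one_hot("love")
bow_a = bag_of_words("I like movie")
bow_b = bag_of_words("movie like I")

# Synonymy: "like" and "love" share no dimension, so their dot product is 0.
print(like_vec, love_vec, dot(like_vec, love_vec))
# Word order: both sentences map to the same count vector.
print(bow_a, bow_a == bow_b)
```

One-hot vectors of any two different words are orthogonal, so the representation carries no notion of similarity, and bag-of-words discards order entirely.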
  • 20. Embedding Representation • Represent a word by a real number vector • Share features of each word in embedding space (vector space) • Represent similar words with similar vectors • (use lower dimension compared to Bag-of-words) • Dense vectors of 100 - 1K dimensions, defined by a neural network • Figure: book = (0.12, −1.90, ⋮, 0.55, 1.37) and dictionary = (0.17, −1.80, ⋮, 0.52, 1.57) are similar; phone = (0.97, 0.10, ⋮, 1.63, −0.11) is not similar
  • 21. word2vec • Learn to represent a word by an embedding vector • CBoW, skip-gram • Example: • Start with randomly initialized embedding vectors • Learn to predict a target word given its contexts • Maximize target word's probability • Figure: "…… he ate an apple yesterday with ……" with positions $w_{t-2}, w_{t-1}, w_t, w_{t+1}, w_{t+2}$ • Objective: $\sum_{t=1}^{T} \sum_{-c \le j \le c,\, j \ne 0} \log p(w_t \mid w_{t+j})$
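To make "predict a target word given its contexts" concrete, the sketch below only builds the (context, target) training pairs that the objective on slide 21 sums over; it does not train any embeddings. The window size `c = 2` is an assumption (the slide does not state a value), and `cbow_pairs` is a hypothetical helper name.

```python
def cbow_pairs(tokens, c=2):
    """For every position t, pair the words within distance c (j != 0) with w_t."""
    pairs = []
    for t, target in enumerate(tokens):
        context = [
            tokens[t + j]
            for j in range(-c, c + 1)
            if j != 0 and 0 <= t + j < len(tokens)  # stay inside the sentence
        ]
        pairs.append((context, target))
    return pairs


# Words taken from the slide's example sentence.
sentence = "he ate an apple yesterday with".split()
pairs = cbow_pairs(sentence, c=2)
for context, target in pairs:
    print(context, "->", target)
```

Training then adjusts the embedding vectors so that each target word becomes more probable given its context words.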
  • 22. Neural Network for NLP • How to represent input word/sentence/text as a vector? • Simple solution: one-hot vector for word level • Bag-of-words: each element represents occurrence frequency • Problems: • Cannot deal with synonymy → Embedding representation • Cannot deal with differences in word order • Distributional hypothesis: words in the same contexts tend to be semantically similar • Figure: like = (1.67, 0.45, −1.80, 0.12); love = (1.57, 0.65, −1.60, 0.15)
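One common way to check that "similar words get similar vectors" is cosine similarity. The sketch below applies it to the two four-dimensional vectors shown on slide 22; choosing cosine similarity as the measure is an assumption here, since the slide does not name one.

```python
import math


def cosine(a, b):
    """Cosine of the angle between two vectors (1.0 means same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Embedding vectors as shown in the figure on slide 22.
like_emb = [1.67, 0.45, -1.80, 0.12]
love_emb = [1.57, 0.65, -1.60, 0.15]

emb_sim = cosine(like_emb, love_emb)
print(emb_sim)  # close to 1: the two vectors point in nearly the same direction
```

With one-hot vectors the same measure is always 0 for two different words, which is why embeddings address the synonymy problem.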
  • 23. Neural Network for NLP • How to represent input word/sentence/text as a vector? • Simple solution: one-hot vector for word level • Bag-of-words: each element represents occurrence frequency • Problems: • Cannot deal with synonymy → Embedding representation • Cannot deal with differences in word order → revise network architecture • Distributional hypothesis: words in the same contexts tend to be semantically similar
  • 24. Dealing with Sequential Data • Recurrent Neural Network • Related architectures • Long Short-Term Memory (LSTM) • Gated Recurrent Unit (GRU) • Transformer • Attention mechanism • Basic Neural Network architecture (Feed-Forward Network)
  • 25. Solving NLP Tasks: Examples • Task: • Machine Translation (MT): English → Khmer • Summarization: Document → Summary • Grammatical Error Correction • Model: • Encoder-Decoder • Data: • Parallel data {(input, output)} • Figure: Input sequence → Encoder → Hidden Representation → Decoder → Output sequence (Encoder/Decoder: RNN, LSTM, Transformer,…)
  • 26. The LLM Era of NLP: What is happening with Large-scale Language Models?
  • 28. Solving NLP Tasks with LLMs • Pre-train & Fine-tune Paradigm • Pre-train a language model with large corpora • Fine-tune the pre-trained model on a specific task with (small) data sets • Prompt-based Method • Pre-train a language model with large corpora • Ask the model to solve various tasks with prompts written in natural language
  • 29. Pre-train & Fine-tune Paradigm • Train a Language Model with (very very) large data sets • Fine-tune pre-trained model on (specific) target tasks • Document Classification, Machine Translation, Summarization… • Pre-training data (example, Japanese): 「機械学習(きかいがくしゅう、英: machine learning)とは、経験からの学習により自動で改善するコンピューターアルゴリズムもしくはその研究領域で[1][2]、人工知能の一種であるとみなされている。「訓練データ」もしくは「学習データ」と呼ばれるデータを使って学習し、学習結果を使って何らかのタスクをこなす。例えば過去のスパムメールを訓練データとして用いて学習し、スパムフィルタリングというタスクをこなす、といった事が可能となる。」 • Pre-training data (example, English): "Machine learning (ML) is the study of computer algorithms that can improve automatically through experience and by the use of data.[1] It is seen as a part of artificial intelligence. Machine learning algorithms build a model based on sample data, known as training data, in order to make predictions or decisions without being explicitly programmed to do so.[2] Machine learning algorithms are used in a wide variety of applications, such as in medicine, email filtering, speech recognition, and computer vision, where it is difficult or unfeasible to develop conventional algorithms to perform the needed tasks.[3]" • Pre-training objectives: GPT-n: "I have a" → "pen"; BERT: "I [MASK] a pen" → "have"; BART: "I [MASK] pen a" → "I have a pen" • Fine-tuning data (example): "Recent advances in deep learning-based optimization and computational hardware have greatly facilitated progress in natural language processing" → 「近年のディープラーニングに基づく最適化と計算機ハードウェアの進歩は、自然言語処理の進展を大きく促している」
  • 30. Pre-trained Model • Why pre-train a Language Model? • Large-scale training data is available! • Neural models are data hungry! • Pre-training objectives: GPT-n: "I have a" → "pen"; BERT: "I [MASK] a pen" → "have"; BART: "I [MASK] pen a" → "I have a pen"
  • 32. How does ChatGPT work? From next-word prediction to human-instructed training
  • 33. (Probabilistic) Language Model • Models that assign probabilities to sequences of words • To evaluate text (sequence) generated by a system • $P(\text{I, like, watching, movie}) > P(\text{I, eat, watch, movie})$ • Predict the next word • Example: "The capital of Japan is" → Beijing 0.20, Seattle 0.05, Tokyo 0.75, so $P(\text{Tokyo} \mid \text{The, capital, of, Japan, is}) = 0.75$ • Many NLP tasks can be formulated as language modeling, e.g. $P(\text{The, capital, of, Japan, is, Tokyo, ja, 日本, の, 首都, は, …})$
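Both uses of a language model on slide 33, picking the next word and scoring a whole sequence, can be sketched with a count-based bigram model. This is only an illustration under stated assumptions: the three-sentence corpus is hypothetical, the bigram approximation and add-one smoothing are choices made here for brevity, and none of it reflects how GPT-style models are actually built.

```python
import math
from collections import Counter, defaultdict

# Next-word distribution taken from the slide's figure.
next_word_dist = {"Beijing": 0.20, "Seattle": 0.05, "Tokyo": 0.75}
prediction = max(next_word_dist, key=next_word_dist.get)  # most probable word

# Hypothetical toy corpus, assumed for illustration only.
CORPUS = ["I like watching movie", "I like watching TV", "I eat rice"]


def train_bigram(corpus):
    """Count how often each word follows each previous word."""
    counts = defaultdict(Counter)
    for sentence in corpus:
        tokens = ["<s>"] + sentence.split()
        for prev, nxt in zip(tokens, tokens[1:]):
            counts[prev][nxt] += 1
    return counts


def log_prob(counts, sentence, vocab_size, alpha=1.0):
    """Chain rule with a bigram approximation and add-alpha smoothing."""
    tokens = ["<s>"] + sentence.split()
    total = 0.0
    for prev, nxt in zip(tokens, tokens[1:]):
        seen = sum(counts[prev].values())
        total += math.log((counts[prev][nxt] + alpha) / (seen + alpha * vocab_size))
    return total


counts = train_bigram(CORPUS)
vocab_size = len({w for s in CORPUS for w in s.split()})
good = log_prob(counts, "I like watching movie", vocab_size)
bad = log_prob(counts, "I eat watch movie", vocab_size)
print(prediction, good > bad)  # prints: Tokyo True
```

The fluent sentence receives the higher log-probability, which is the sense in which a language model can evaluate text generated by a system.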
  • 34. Training with Human Feedback • L Ouyang, J Wu, X Jiang, D Almeida, et al. 2022. Training Language Models to Follow Instructions with Human Feedback. arXiv:2203.02155.
  • 35. Improve LLMs with Discussions • Solving NLP Problems through Human-System Collaboration: A Discussion-based Approach • Kaneko et al. 2023
  • 36. Beyond Performance: Risks and Problems to Solve in the LLM Era
  • 37. Threats of LLMs • Turning point in a wide range of fields • including search engines, finance, advertising, education, and law. • There has been an explosive increase in services incorporating LLMs. • Jobs such as translators, investigators, writers, etc. are being shortened (Eloundou+2023). • Hallucinations • Even when unsure, they can calmly lie, responding without basis in fact. • Bias • They learn and amplify societal biases related to gender, race, etc. • They can be adjusted to respond in ways that benefit specific individuals or groups. • Personal information exposure • Misuse • References: T Eloundou, S Manning, P Mishkin, D Rock. 2023. GPTs are GPTs: An Early Look at the Labor Market Impact Potential of Large Language Models. arXiv:2303.10130. Naoaki Okazaki (岡崎 直観). 2023. 大規模言語モデルの驚異と脅威 (The Wonders and Threats of Large Language Models). https://speakerdeck.com/chokkan/20230327_riken_llm
  • 40. Data and GPUs are All You Need!
  • 41. Summary • NLP aims to make human language understandable to machines. • NLP faces many difficulties due to the characteristics of languages, such as ambiguity, complex network structures, etc. • Deep learning-based methods have pushed NLP to impressive achievements over recent decades. • A new era of NLP has arrived with large-scale language models. • There are still many problems to deal with in NLP • Low-resource languages • Bias, Hallucination, etc.