AI411: NLP (Natural Language Processing)
Lecture 1: Introduction
Fall 2024
Dr. Ensaf Hussein
Associate Professor, Artificial Intelligence,
School of Information Technology and Computer Science,
Nile University.
In this course we will learn:
1. The foundations of effective modern methods for deep learning applied to NLP
▪ Basics first, then key methods used in NLP: Word vectors, feed-forward
networks, recurrent networks, attention, encoder-decoder models,
transformers, etc.
2. A big picture understanding of human languages and the difficulties in
understanding and producing them via computers
3. An understanding of and ability to build systems for some of the major
problems in NLP:
▪ Word meaning, machine translation, summarization, question answering
Grading policy
• Coursework:
• Lecture Quizzes: 10%
• Lab Quizzes: 10%
• Assignments: 10%
• Project: 20%
• Midterm: 20%
• Final Exam: 30%
• Students with less than 30% on the final exam will get an F in the course.
• Students must attend 75% of lectures and labs to be admitted to the final exam.
Today’s Agenda
• What is NLP?
• History of NLP
• NLP Tasks
– NLU Applications
– NLG Applications
• What Is Language?
– Building Blocks of language
– Why is Language challenging?
• ML, DL, and NLP: An Overview
• Approaches to NLP
– Heuristics-Based NLP
– Machine Learning for NLP
– Deep Learning for NLP
What is Natural Language Processing (NLP)?
• NLP deals with analyzing, understanding, and generating
human language through computational models.
• While humans communicate using natural language, machines
operate on structured data.
• NLP acts as the intermediary, converting unstructured human
language into structured data that machines can interpret.
History of NLP
• NLP has been through (at least) 3 major eras:
▪ 1950s-1980s: Linguistic Methods and Handwritten Rules
▪ 1980s-2013: Corpus/Statistical Methods
▪ 2013-Now: Deep Learning
• Lucky you! You’re right near the start of a paradigm shift!
1950s - 1980s: Linguistics / Rule Systems
• NLP systems focus on:
▪ Linguistics: grammar rules, sentence structure parsing, etc.
▪ Handwritten Rules: huge sets of logical (if/else) statements
▪ Ontologies: manually created (domain-specific!) knowledge bases to augment the rules above
• Problems:
▪ Too complex to maintain
▪ Can’t scale!
▪ Can’t generalize!
Eliza: 1966
• ELIZA is a simple pattern-based system that uses pattern matching to recognize phrases like “You are X” and translate them into suitable outputs like “What makes you think I am X?”.
• ELIZA doesn’t actually need to know anything to mimic a Rogerian psychotherapist.
• Modern conversational agents are much more than a diversion; they can answer questions, book flights, or find restaurants, functions for which they rely on a much more sophisticated understanding of the user’s intent.
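ELIZA’s behavior can be approximated with a few regular-expression rules. The patterns and canned responses below are hypothetical illustrations, not ELIZA’s actual script:

```python
import re

# A minimal ELIZA-style rule table (hypothetical patterns for illustration).
RULES = [
    (re.compile(r"\byou are (.+)", re.IGNORECASE), "What makes you think I am {0}?"),
    (re.compile(r"\bi feel (.+)", re.IGNORECASE), "Why do you feel {0}?"),
]

def respond(utterance: str) -> str:
    """Return a reflected response for the first matching pattern."""
    for pattern, template in RULES:
        match = pattern.search(utterance)
        if match:
            # Echo the captured phrase back, minus trailing punctuation.
            return template.format(match.group(1).rstrip(".!?"))
    return "Please tell me more."  # fallback when no rule matches

print(respond("You are very smart."))  # What makes you think I am very smart?
```

Note that the system has no understanding at all: it only reshuffles the user’s own words.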
1980s - 2013: Corpus / Statistical Methods
• NLP starts using Machine Learning methods
• Use statistical learning over huge datasets of unstructured text
▪ Corpus: a collection of text documents
▪ e.g. Supervised Learning: Machine Translation
▪ e.g. Unsupervised Learning: Deriving Word "Meanings" (vectors)
2013 - Now: Deep Learning
• Deep Learning made its name with images first
• 2013: Deep Learning has major NLP breakthroughs
▪ Researchers use a neural network to win the Large Scale Visual Recognition Challenge (LSVRC)
▪ This state-of-the-art approach beat other ML approaches with half their error rate (26% vs 16%)
• Very useful for unified processing of Language + Images
The Role of Language in NLP
Language is a complex structure made up of different
components, such as phonemes, morphemes, syntax, and
context. Understanding these components is essential for
building NLP systems that can process and interpret human
language effectively.
Phonemes
• The smallest unit of sound in a language.
• Though they carry no meaning by
themselves, they form the basis
for speech recognition systems.
• For instance, "p" in "pat" and "b"
in "bat" are distinct phonemes in
English.
Morphemes
• The smallest unit of meaning in a
language, often seen as prefixes,
suffixes, or roots.
• For example, in the word
"unbreakable," "un-", "break,"
and "-able" are all morphemes.
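A (very) naive morpheme splitter can be sketched by stripping known affixes. The prefix and suffix lists below are a tiny hypothetical sample; real morphological analyzers use dictionaries and language-specific rules:

```python
# Toy affix-stripping sketch (illustrative only; fails on many real words).
PREFIXES = ["un", "re", "dis"]
SUFFIXES = ["able", "ness", "ing", "ed"]

def split_morphemes(word: str) -> list[str]:
    """Split a word into prefix, root, and suffix where known affixes match."""
    parts = []
    for p in PREFIXES:
        # Require a few remaining letters so we don't strip the whole word.
        if word.startswith(p) and len(word) > len(p) + 2:
            parts.append(p + "-")
            word = word[len(p):]
            break
    suffix = None
    for s in SUFFIXES:
        if word.endswith(s) and len(word) > len(s) + 2:
            suffix = "-" + s
            word = word[:-len(s)]
            break
    parts.append(word)
    if suffix:
        parts.append(suffix)
    return parts

print(split_morphemes("unbreakable"))  # ['un-', 'break', '-able']
```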
Syntax
• The set of rules that govern how
sentences are structured in a
language.
• It dictates how words are combined
to form grammatically correct
sentences.
• Parsing techniques in NLP rely on
syntax to understand sentence
structure.
Context
• Context is crucial in determining the meaning of a sentence,
especially when words have multiple meanings.
• For example, the word "bank" could refer to a financial
institution or the side of a river.
• Context helps resolve such ambiguities.
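One crude way to use context is to compare the sentence against cue words for each sense. The senses and cue lists below are hypothetical illustrations, not a real lexical resource:

```python
# Toy word-sense disambiguation for "bank": pick the sense whose cue words
# overlap most with the sentence (cue lists are hypothetical).
SENSE_CUES = {
    "financial institution": {"money", "deposit", "loan", "account"},
    "river side": {"river", "water", "fishing", "shore"},
}

def disambiguate_bank(sentence: str) -> str:
    words = set(sentence.lower().split())
    # Choose the sense sharing the most cue words with the sentence.
    return max(SENSE_CUES, key=lambda sense: len(SENSE_CUES[sense] & words))

print(disambiguate_bank("he sat on the bank of the river fishing"))  # river side
```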
Levels of NLP

[Figure: a pyramid of language levels, with Phonemes and Morphemes at the base (fed by speech and text input), then Syntax and Semantics, and Discourse at the top. Early rule engines worked at the lower levels, corpus methods in the middle, and modern deep learning is about at the discourse level.]
NLP: Speech vs. Text
• Natural Language can refer to Text or Speech
• Goal of both is the same: translate raw data (text or speech) into underlying concepts (NLU), then possibly into the other form (NLG)

[Figure: text and speech each map into a shared concept space via NLU, and NLG maps concepts back out; text-to-speech and speech-to-text convert between the two surface forms.]
NLU vs. NLG Applications
NLU:
• ML on Text (Classification, Regression, Clustering)
• Document Recommendation
• Language Identification
• Natural Language Search
• Sentiment Analysis
• Text Summarization
• Extracting Word/Document Meaning (vectors)
• Relationship Extraction
• Topic Modeling
• …and more!
NLG:
• Image Captioning
• (Better) Text Summarization
• Machine Translation
• Question Answering / Chatbots
• …so much more
• Notice NLU is almost a prerequisite for NLG
NLU Applications
• Document Classification: sorting “documents” (discrete collections of text) into categories
▪ Example: classify movie reviews as positive vs. negative
• Document Recommendation: choosing the most relevant document based on some information
▪ Example: show the most relevant webpages based on a query to a search engine
• Topic Modeling: breaking a set of documents into topics at the word level
▪ Example: find documents belonging to a certain topic
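Document classification like the movie-review example can be sketched with a tiny multinomial Naive Bayes classifier. The training reviews below are made-up toy data, not a real dataset:

```python
import math
from collections import Counter, defaultdict

# Toy training set of made-up movie reviews (hypothetical data).
TRAIN = [
    ("a wonderful and moving film", "pos"),
    ("great acting and a great story", "pos"),
    ("boring plot and terrible acting", "neg"),
    ("a dull and boring film", "neg"),
]

def train_nb(examples):
    """Count word frequencies per class for multinomial Naive Bayes."""
    word_counts = defaultdict(Counter)
    class_counts = Counter()
    vocab = set()
    for text, label in examples:
        words = text.split()
        class_counts[label] += 1
        word_counts[label].update(words)
        vocab.update(words)
    return word_counts, class_counts, vocab

def classify(text, word_counts, class_counts, vocab):
    """Pick the class maximizing log P(class) + sum of log P(word | class),
    using add-one (Laplace) smoothing so unseen words don't zero out a class."""
    total = sum(class_counts.values())
    best_label, best_score = None, float("-inf")
    for label in class_counts:
        score = math.log(class_counts[label] / total)
        denom = sum(word_counts[label].values()) + len(vocab)
        for w in text.split():
            score += math.log((word_counts[label][w] + 1) / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label

model = train_nb(TRAIN)
print(classify("a great film", *model))  # pos
```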
NLG: Machine Translation
• Automatically translate text between languages
Example from Google®’s machine translation system (2016)
Source: https://ai.googleblog.com/2016/09/a-neural-network-for-machine.html
NLG: Text Summarization
• Automatically generate text summaries of documents
▪ Example: generate headlines of news articles
Source: https://ai.googleblog.com/2016/08/text-summarization-with-tensorflow.html
Part-of-Speech Tagging

Some/determiner questioned/verb(past) if/prep. Tim/proper Cook/proper ’s/poss. first/adj. product/noun would/modal be/verb a/det. breakaway/adjective hit/noun for/prep. Apple/proper ./punc.
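A toy part-of-speech tagger can be sketched as a dictionary lookup with a capitalization fallback. The lexicon below is a small hypothetical sample; real taggers use statistical or neural models to resolve ambiguity from context:

```python
# Minimal lookup tagger (illustrative only). The lexicon is a tiny
# hypothetical sample covering just the example sentence.
LEXICON = {
    "some": "determiner", "questioned": "verb", "if": "prep.",
    "would": "modal", "be": "verb", "a": "det.",
    "breakaway": "adjective", "hit": "noun", "for": "prep.",
}

def tag_sentence(sentence: str) -> list[tuple[str, str]]:
    """Tag each word via lookup; unknown capitalized words default to
    proper nouns, other unknown words to plain nouns."""
    tagged = []
    for word in sentence.split():
        tag = LEXICON.get(word.lower())
        if tag is None:
            tag = "proper" if word[0].isupper() else "noun"
        tagged.append((word, tag))
    return tagged

print(tag_sentence("Some questioned if Tim Cook would be a hit for Apple"))
```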
Syntactic Parsing

[Figure: a parse tree built step-by-step over “Cook’s first product may not be a breakaway hit”: noun phrases (NP) “Cook’s first product” and “a breakaway hit”, a verb phrase (VP) “may not be a breakaway hit”, combined into a full sentence (S).]
Named Entity Recognition

Some questioned if [Tim Cook]PERSON’s first product would be a breakaway hit for [Apple]ORGANIZATION.
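A crude stand-in for NER is to look for runs of capitalized words. Note the false positive on the sentence-initial “Some”, which is one reason real NER systems use trained sequence models rather than surface patterns:

```python
import re

# Toy "NER" heuristic: find runs of capitalized words (illustration only;
# it cannot tell PERSON from ORGANIZATION and misfires at sentence starts).
def find_entities(text: str) -> list[str]:
    return re.findall(r"[A-Z][a-z]+(?:\s[A-Z][a-z]+)*", text)

sentence = "Some questioned if Tim Cook's first product would be a breakaway hit for Apple."
print(find_entities(sentence))  # ['Some', 'Tim Cook', 'Apple']
```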
Entity Linking

Some questioned if Tim Cook’s first product would be a breakaway hit for Apple.
(Mentions such as “Tim Cook” and “Apple” are linked to entries in a knowledge base.)
Coreference Resolution

Some questioned if Tim Cook’s first product would be a breakaway hit for Apple. It’s the company’s first new device since he became CEO.

Here “the company” refers back to Apple and “he” to Tim Cook; working out what “It” refers to is harder (which device?).
Reading Comprehension

• Once there was a boy named Fritz who loved to draw. He drew everything. In the morning, he drew a picture of his cereal with milk. His papa said, “Don’t draw your cereal. Eat it!”
• After school, Fritz drew a picture of his bicycle. His uncle said, “Don’t draw your bicycle. Ride it!”
• …
• What did Fritz draw first?
• A) the toothpaste
• B) his mama
• C) cereal and milk
• D) his bicycle
Sentence Similarity

Input pairs → output similarity score:
• “Other ways are needed.” / “We must find other ways.” → 4.4
• “I absolutely do believe there was an iceberg in those waters.” / “I don't believe there was any iceberg at all anywhere near the Titanic.” → 1.2
• “Pakistan bomb victims’ families end protest” / “Pakistan bomb victims to be buried after protest ends” → 2.6
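A simple baseline for sentence similarity is cosine similarity over bag-of-words counts (modern systems instead score pairs with learned sentence embeddings):

```python
import math
import re
from collections import Counter

# Bag-of-words cosine similarity: a crude baseline that only sees word
# overlap, not meaning, so paraphrases with different words score low.
def cosine_sim(a: str, b: str) -> float:
    va = Counter(re.findall(r"\w+", a.lower()))
    vb = Counter(re.findall(r"\w+", b.lower()))
    dot = sum(va[w] * vb[w] for w in va)
    norm_a = math.sqrt(sum(c * c for c in va.values()))
    norm_b = math.sqrt(sum(c * c for c in vb.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

print(cosine_sim("Other ways are needed.", "We must find other ways."))
```

The two example sentences share only “other” and “ways”, so the score lands around 0.45 despite their near-identical meaning.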
Word Prediction

he bent down and searched the large container, trying to find anything else hidden in it other than the _____

With more context, the prediction becomes easier:

he turned to one of the cops beside him. “search the entire coffin.” the man nodded and bustled forward towards the coffin. he bent down and searched the large container, trying to find anything else hidden in it other than the _____
Language Models
• Amodel like the one in the previous examples is called
language model.
• A language model is trained on a huge amount of texts, trying
to learn the distribution of the words in the language.
• This can be done by training the model to predict the words
that follow a sentence, or trying to recover masked words from
their surrounding context.
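The word-distribution idea can be illustrated with a toy bigram model that just counts adjacent-word pairs in a tiny made-up corpus (real language models are neural networks trained on billions of words):

```python
from collections import Counter, defaultdict

# Tiny hypothetical corpus, tokenized by whitespace.
corpus = ("he searched the container . he searched the coffin . "
          "he opened the coffin").split()

# Count how often each word follows each other word (bigram counts).
bigrams = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    bigrams[prev][nxt] += 1

def predict_next(word: str) -> str:
    """Return the word most often observed after `word` in the corpus."""
    return bigrams[word].most_common(1)[0][0]

print(predict_next("the"))  # coffin (seen twice after "the", vs. "container" once)
```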
GPT-3
• This model is a transformer neural network with 175 billion parameters, requiring approximately 800 GB of storage.
• The training data consists of hundreds of billions of words of text from around the Internet. It was trained by OpenAI, which hosts it and makes it available as a service.
Challenges in NLP
Despite NLP’s success, the field is fraught with several challenges due to the inherent complexity of human language:
1. Ambiguity
Human language is inherently ambiguous. A single sentence can have
multiple meanings depending on the context.
For example, “I made her duck” could mean either preparing a meal or causing
someone to crouch.
2. Common Knowledge
Humans use common knowledge—unstated facts and assumptions—
in conversations. Machines, however, struggle to understand this
unless explicitly programmed.
3. Creativity
Language often includes metaphors, idioms, and sarcasm,
making it difficult for machines to interpret literal versus
figurative meanings.
4. Diversity of Languages
The grammatical and syntactical rules of languages vary greatly,
making it challenging to create universal NLP systems.
Machine Learning, DeepLearning, and NLP: An Overview
• NLP has evolved from heuristic, rule-based systems to approaches
leveraging machine learning (ML) and deep learning (DL).
• Early approaches relied on hand-crafted rules, which ML models such as Naive Bayes and SVMs later displaced.
• Deep learning, especially with architectures like RNNs, LSTMs,
CNNs, and Transformers (e.g., BERT), has significantly enhanced
NLP capabilities.
Approaches to NLP
There are three main approaches:
• Heuristics-Based NLP: Early systems based on rules and domain-specific
knowledge (e.g., lexicons, regex).
• Machine Learning for NLP: Supervised and unsupervised learning
techniques, such as Naive Bayes, SVMs, and HMMs, have been applied
to NLP tasks.
• Deep Learning for NLP: Advanced models like LSTMs and Transformers
dominate the field but still face challenges like overfitting, domain
adaptation, and interpretability.
Wrapping Up …
•What is NLP?
• History of NLP
• NLP Tasks
– NLU Applications
– NLG Applications
• What Is Language?
– Building Blocks of language
– Why is Language challenging?
• ML, DL, and NLP: An Overview
• Approaches to NLP
– Heuristics-Based NLP
– Machine Learning for NLP
– Deep Learning for NLP