SlideShare a Scribd company logo
1 of 64
Download to read offline
From Natural Language Processing
to Artificial Intelligence
Jonathan Mugan, PhD
@jmugan
Data Day Texas
January 14, 2017
“ It is not my aim to surprise or shock you – but the simplest way I
can summarize is to say that there are now in the world machines
that think, that learn and that create. Moreover, their ability to do
these things is going to increase rapidly until – in a visible future –
the range of problems they can handle will be coextensive with the
range to which the human mind has been applied. ”
“ It is not my aim to surprise or shock you – but the simplest way I
can summarize is to say that there are now in the world machines
that think, that learn and that create. Moreover, their ability to do
these things is going to increase rapidly until – in a visible future –
the range of problems they can handle will be coextensive with the
range to which the human mind has been applied. ”
Herbert Simon, 1957
“ It is not my aim to surprise or shock you – but the simplest way I
can summarize is to say that there are now in the world machines
that think, that learn and that create. Moreover, their ability to do
these things is going to increase rapidly until – in a visible future –
the range of problems they can handle will be coextensive with the
range to which the human mind has been applied. ”
Herbert Simon, 1957
So what is going on here?
There is no actual
understanding.
AI has gotten smarter
• especially with deep
learning, but
• computers can’t read or
converse intelligently
This is disappointing because we want to
• Interact with our world using natural language
• Current chatbots are an embarrassment
• Have computers read all those documents out there
• So they can retrieve the best ones, answer our
questions, and summarize what is new
To understand language, computers need to
understand the world
Why can you pull a wagon with a string but not push it? -- Minsky
Why is it unusual that a gymnast competed with one leg? -- Schank
Why does it only rain outside?
If a book is on a table, and you push the table, what happens?
If Bob went to the junkyard, is he at the airport?
They need to be able to answer questions like:
Grounded Understanding
• We understand language in a way that is grounded in sensation and action.
sensation representation action
• When someone says “chicken,” we map that to our experience with chickens.
• We understand each other because we have had similar experiences.
• This is the kind of understanding that computers need.
See: Benjamin Bergen, Steven Pinker,
Mark Johnson, Jerome Feldman, and
Murray Shanahan
“chicken”
sensation representation action
Two paths to meaning:
sensation representation action
meaningless
tokens
manual
representations
world models
merged
representations
word2vec
seq2seq
question
answering
external world
training
symbolic
path
sub-symbolic
path
sensation representation action
meaningless
tokens
manual
representations
world models
merged
representations
word2vec
seq2seq
question
answering
external world
training
symbolic
path
sub-symbolic
path
Bag-of-words representation
“The aardvark ate the zoo.” = [1,0,1, ..., 0, 1]
We can do a little better and count how often the words occur.
tf: term frequency, how often does the word occur.
“The aardvark ate the aardvark by the zoo.” = [2,0,1, ..., 0, 1]
Treat words as arbitrary symbols and look at their frequencies.
“dog bit man” will be the same as “man bit dog”
Consider a vocabulary of 50,000 words where:
• “aardvark” is position 0
• “ate” is position 2
• “zoo” is position 49,999
A bag-of-words can be a vector with 50,000 dimensions.
Give rare words a boost
We can get fancier and say that rare words are more important than
common words for characterizing documents.
Multiply each entry by a measure of how common it is in the corpus.
idf: inverse document frequency
idf( term, document) = log(num. documents / num. with term)
Called a vector space model. You can throw these
vectors into any classifier, or find similar documents
based on similar vectors.
10 documents, only 1 has “aardvark” and 5 have “zoo” and 5 have “ate”
tf-idf: tf * idf
“The aardvark ate the aardvark by the zoo.” =[4.6,0,0.7, ..., 0, 0.7]
Topic Modeling (LDA)
Latent Dirichlet Allocation
• You pick the number of topics
• Each topic is a distribution over words
• Each document is a distribution over topics
Easy to do in gensim
https://radimrehurek.com/gensim/models/ldamodel.html
LDA of my tweets shown in pyLDAvis
I sometimes
tweet about
movies.
https://github.com/bmabey/pyLDAvis
Sentiment analysis: How the author feels about the text
Sentiment dictionary: word list with
sentiment values associated with all words
that might indicate sentiment
...
happy: +2
...
joyful: +2
...
pain: -3
painful: -3
...
We can do sentiment analysis using labeled data and meaningless
tokens, with supervised learning over tf-idf.
We can also do sentiment analysis by
adding the first hint of meaning: some
words are positive and some words are
negative.
One such word list is VADER https://github.com/cjhutto/vaderSentiment
“I went to the junkyard and was happy to see joyful people.”
sensation representation action
meaningless
tokens
manual
representations
world models
merged
representations
word2vec
seq2seq
question
answering
external world
training
symbolic
path
sub-symbolic
path
Manually Constructing Representations
We tell the computer what
things mean by manually
specifying relationships between
symbols
1. Stores meaning using
predefined relationships
2. Maps multiple ways of
writing something to the
same representation
Allows us to code
what the machine
should do for a
relatively small
number of
representations
Manually Constructing Representations
vehicle
Honda
Civic
tires
is_ahas
Who might be in the market
for tires?
Allows us to code
what the machine
should do for a
relatively small
number of
representations
We tell the computer what
things mean by manually
specifying relationships between
symbols
1. Stores meaning using
predefined relationships
2. Maps multiple ways of
writing something to the
same representation
Manually Constructing Representations
vehicle
“... the car ...”
“... an automobile ...”
“... my truck ...”
Honda
Civic
“... cruising in my civic ...”
tires
is_ahas
Who might be in the market
for tires?
Allows us to code
what the machine
should do for a
relatively small
number of
representations
We tell the computer what
things mean by manually
specifying relationships between
symbols
1. Stores meaning using
predefined relationships
2. Maps multiple ways of
writing something to the
same representation
WordNet
auto,
automobile,
motorcar
WordNet Search http://wordnetweb.princeton.edu/perl/webwn
ambulance
hyponym
(subordinate)
sports_car,
sport_car
hyponym
(subordinate)
motor_vehicle,
automotive_vechicle
hypernym
(superordinate)
automobile_horn,
car_horn,
motor_horn, horn,
hooter
meronym
(has-part)
• Organizes sets of words into synsets, which are meanings
• A word can be a member of more than one synset, such as “bank”
[synsets shown as boxes]
WordNet
auto,
automobile,
machine,
motorcar
WordNet Search http://wordnetweb.princeton.edu/perl/webwn
ambulance
hyponym
(subordinate)
sports_car,
sport_car
hyponym
(subordinate)
motor_vehicle,
automotive_vechicle
hypernym
(superordinate)
automobile_horn,
car_horn,
motor_horn, horn,
hooter
meronym
(has-part)
Other relations:
• has-instance
• car : Honda Civic
• has-member
• team : stringer (like “first stringer”)
• antonym
• ... a few more
• Organizes sets of words into synsets, which are meanings
• A word can be a member of more than one synset, such as “bank”
FrameNet
• More integrative than WordNet: represents situations
• One example is a child’s birthday party, another is Commerce_buy
• situations have slots (roles) that are filled
• Frames are triggered by keywords in text (more or less)
FrameNet: https://framenet.icsi.berkeley.edu/fndrupal/IntroPage
Commerce_buy: https://framenet2.icsi.berkeley.edu/fnReports/data/frameIndex.xml?frame=Commerce_buy
Commerce_buy triggered by the words: buy,
buyer, client, purchase, or purchaser
Roles: Buyer, Goods,
Money, Place (where
bought), ...
Commerce_buy
Getting
Inherits from
Commerce_buy indicates a change of possession, but we need
a world model to actually change a state.
ConceptNet
Bread
ToasterAtLocation
Automobile
RelatedTo
• Provides commonsense linkages between words
• My experience: too shallow and haphazard to be useful
Might be good for a conversation bot:
“You know what you normally find by a
toaster? Bread.”
ConceptNet: http://conceptnet.io/
YAGO: Yet Another Great Ontology
• Built on WordNet and DBpedia
• http://www.mpi-inf.mpg.de/departments/databases-and-
information-systems/research/yago-naga/yago/
• DBpedia has a machine readable page for each Wikipedia
page
• Used by IBM Watson to play Jeopardy!
• Big on named entities, like entertainers
• Browse
• https://gate.d5.mpi-inf.mpg.de/webyago3spotlx/Browser
SUMO: Suggested Upper Merged Ontology
There is also YAGO-SUMO that merges the low-
level organization of SUMO with the instance
information of YAGO.http://people.mpi-
inf.mpg.de/~gdemelo/yagosumo/
Deep: organizes concepts down to the lowest level
http://www.adampease.org/OP/
Example: cooking is a type of making that is a type
of intentional process that is a type of process that
is a physical thing that is an entity.
Image Schemas
Humans use image schemas comprehend spatial
arrangements and movements in space [Mandler, 2004]
Examples of image schemas include path, containment,
blockage, and attraction [Johnson, 1987]
Abstract concepts such as romantic relationships and
arguments are represented as metaphors to this kind of
experience [Lakoff and Johnson, 1980]
Image schemas are representations of human experience
that are common across cultures [Feldman, 2006]
Semantic Web (Linked Data)
• Broad, but not
organized or deep
• Way too complicated
• May eventually be
streamlined (e.g.
JSON-LD), and it could
be very cool if it gets
linked with deeper,
better organized data
• Tools to map text:
• FRED
• DBpedia Spotlight
• Pikes
Semantic Web Layer Cake Spaghetti Monster
sensation representation action
meaningless
tokens
manual
representations
world models
merged
representations
word2vec
seq2seq
question
answering
external world
training
symbolic
path
sub-symbolic
path
World models
Computers need causal models of how the world works
and how we interact with it.
• People don’t say everything to get a message across,
just what is not covered by our shared conception
• Most efficient way to encode our shared conception is
a model
Models express how the world changes based on events
• Recall the Commerce_buy frame
• Afterward, one person has more money and another
person has less
• Read such inferences right off the model
Dimensions of models
• Probabilistic
• Deterministic compared with stochastic
• E.g., logic compared with probabilistic programming
• Factor state
• whole states compared with using variables
• E.g., finite automata compared with dynamic Bayesian networks
• Relational
• Propositional logic compared with first-order logic
• E.g., Bayesian networks compared with Markov logic networks
• Concurrent
• Model one thing compared with multiple things
• E.g., finite automata compared with Petri Nets
• Temporal
• Static compared with dynamic
• E.g., Bayesian networks compared with dynamic Bayesian
networks
sensation representation action
meaningless
tokens
manual
representations
world models
merged
representations
word2vec
seq2seq
question
answering
external world
training
symbolic
path
sub-symbolic
path
Merge representations with models
Explain your conception of
conductivity?
Electrons are small spheres, and
electricity is small spheres going
through a tube. Conductivity is how
little blockage is in the tube.
Cyc has a model that uses
representations, but it is not clear if
logic is sufficiently supple.
Why does it only rain outside?
A roof blocks the path of things
from above.
The meaning of the word
“chicken” is everything explicitly
stated in the representation and
everything that can be inferred
from the world model.
“chicken”
The final step on this path is to
create a robust model around
rich representations.
sensation representation action
meaningless
tokens
manual
representations
world models
merged
representations
word2vec
seq2seq
question
answering
external world
training
symbolic
path
sub-symbolic
path
sensation representation action
meaningless
tokens
manual
representations
world models
merged
representations
word2vec
seq2seq
question
answering
external world
training
symbolic
path
sub-symbolic
path
sensation representation action
meaningless
tokens
manual
representations
world models
merged
representations
symbolic
path
sub-symbolic
path
Neural networks (deep learning)
Great place to start is the
landmark publication:
Parallel Distributed Processing,
Vols. 1 and 2, Rumelhart and
McClelland, 1987
sensation representation action
meaningless
tokens
manual
representations
world models
merged
representations
word2vec
seq2seq
question
answering
external world
training
symbolic
path
sub-symbolic
path
word2vec
The word2vec model learns a vector for each
word in the vocabulary.
The number of dimensions for each word vector is
the same and is usually around 300.
Unlike the tf-idf vectors, word vectors are dense,
meaning that most values are not 0.
word2vec
1. Initialize each word with a random vector
2. For each word w1 in the set of documents:
3. For each word w2 around w1:
4. Move vectors for w1 and w2 closer
together and move all others and w1
farther apart
5. Goto 2 if not done
• Skip-gram model [Mikolov et al., 2013]
• Note: there are really two vectors per word, because you don’t want a word to
be likely to be around itself, see Goldberg and
Levyhttps://arxiv.org/pdf/1402.3722v1.pdf
• First saw that double-for-loop explanation from Christopher Moody
word2vec meaning
“You shall know a word by the company it keeps.” J. R. Firth [1957]
The quote we often see:
This seems at least kind of true.
• Vectors have internal structure [Mikolov et al., 2013]
• Italy – Rome = France – Paris
• King – Queen = Man – Woman
But ... words aren’t grounded in experience; they are only
grounded in being around other words.
Can also do word2vec on ConceptNet, see https://arxiv.org/pdf/1612.03975v1.pdf
sensation representation action
meaningless
tokens
manual
representations
world models
merged
representations
word2vec
seq2seq
question
answering
external world
training
symbolic
path
sub-symbolic
path
seq2seq model
The seq2seq (sequence-to-sequence) model can encode
sequences of tokens, such as sentences, into single
vectors.
It can then decode these vectors into other sequences
of tokens.
Both the encoding and decoding are done using
recurrent neural networks (RNNs).
One obvious application for this is machine translation.
For example, where the source sentences are English
and the target sentences are Spanish.
Encoding sentence meaning into a vector
h0
The
“The patient fell.”
Encoding sentence meaning into a vector
h0
The
h1
patient
“The patient fell.”
Encoding sentence meaning into a vector
h0
The
h1
patient
h2
fell
“The patient fell.”
Encoding sentence meaning into a vector
Like a hidden Markov model, but doesn’t make the Markov
assumption and benefits from a vector representation.
h0
The
h1
patient
h2
fell
h3
.
“The patient fell.”
Decoding sentence meaning
El
h3
Machine translation, or structure learning more generally.
Decoding sentence meaning
El
h3 h4
Machine translation, or structure learning more generally.
Decoding sentence meaning
El
h3
paciente
h4
Machine translation, or structure learning more generally.
Decoding sentence meaning
Machine translation, or structure learning more generally.
El
h3
paciente
h4
cayó
h5
.
h5
[Cho et al., 2014]
It keeps generating until it generates a stop symbol. Note that the lengths don’t
need to be the same. It could generate the correct “se cayó.”
• Treats this task like it is devoid of meaning.
• Great that this can work on just about any kind of seq2seq
problem, but this generality highlights its limitation for use
as language understanding. No Chomsky universal grammar.
Generating image captions
Convolutional
neural network
Sad
h0
wet
h1
kid
h2
.
h3
[Karpathy and Fei-Fei, 2015]
[Vinyals et al., 2015]
Attention [Bahdanau et al., 2014]
El
h3
paciente
h4
cayó
h5
.
h5
h0
The
h1
patient
h2
fell
h3
.
sensation representation action
meaningless
tokens
manual
representations
world models
merged
representations
word2vec
seq2seq
question
answering
external world
training
symbolic
path
sub-symbolic
path
Deep learning and question answering
RNNs answer questions.
What is the translation of this
phrase to French?
What is the next word?
Attention is useful for question
answering.
This can be generalized to which facts
the learner should pay attention to
when answering questions.
Deep learning and question answering
Bob went home.
Tim went to the junkyard.
Bob picked up the jar.
Bob went to town.
Where is the jar? A: town
• Memory Networks [Weston et al.,
2014]
• Updates memory vectors based on
a question and finds the best one to
give the output.
The office is north of the yard.
The bath is north of the office.
The yard is west of the kitchen.
How do you go from the office to
the kitchen? A: south, east
• Neural Reasoner [Peng et al.,
2015]
• Encodes the question and facts in
many layers, and the final layer is
put through a function that gives
the answer.
Deep learning and question answering
The network is learning linkages between sequences
of symbols, but these kinds of stories do not have
sufficiently rich linkages to our world.
sensation representation action
meaningless
tokens
manual
representations
world models
merged
representations
word2vec
seq2seq
question
answering
external world
training
symbolic
path
sub-symbolic
path
External world training
If we want to talk to machines
1. We need to train them in an environment as much like our
own as possible
2. Can’t just be dialog!
Harnad [1990] http://users.ecs.soton.ac.uk/harnad/Papers/Harnad/harnad90.sgproblem.html
To understand “chicken” we need the machine
to have had as much experience with chickens
as possible.
When we say “chicken” we don’t just mean the
bird, we mean everything one can do with it
and everything it represents in our culture.“chicken”
There has been work in this direction
Industry
• OpenAI
• Universe: train on screens with VNC
• Now with Grand Theft Auto!
https://openai.com/blog/GTA-V-plus-Universe/
• Google
• Mikolov et al., A Roadmap towards
Machine Intelligence. They define an
artificial environment.
https://arxiv.org/pdf/1511.08130v2.pdf
• Facebook
• Weston, memory networks to dialogs
https://arxiv.org/pdf/1604.06045v7.pdf
• Kiela et al., Virtual Embodiment: A
Scalable Long-Term Strategy for
Artificial Intelligence Res. Advocate
using video games “with a purpose.”
https://arxiv.org/pdf/1610.07432v1.pdf
Academia
• Ray Mooney
• Maps text to situations
http://videolectures.net/aaai2013_moo
ney_language_learning/
• Luc Steels
• Robots come up with
vocabulary and simple grammar
• Narasimhan et al.
• Train a neural net to play text-
based adventure games
https://arxiv.org/pdf/1506.08941v2.pdf
iCub
But we need more training centered in our world
Maybe if Amazon Alexa had a camera and
rotating head?
How far could we get without the benefit of a
teacher?
• Could we use eye gaze as a cue? [Yu and
Ballard, 2010]
sensation representation action
meaningless
tokens
manual
representations
world models
merged
representations
word2vec
seq2seq
question
answering
external world
training
symbolic
path
sub-symbolic
path
sensation representation action
meaningless
tokens
manual
representations
world models
merged
representations
word2vec
seq2seq
question
answering
external world
training
symbolic
path
sub-symbolic
path
sensation representation action
symbolic
path
sub-symbolic
path
Building world
models based
on large, deep,
and organized
representations
Training large neural
networks in an
environment with similar
objects, relationships, and
dynamics as our own
Two paths from NLP to AI
sensation representation action
Final thought
Open problem: What is the simplest commercially viable
task that requires commonsense knowledge and
reasoning?
Thanks for listening
Jonathan Mugan
@jmugan
www.deepgrammar.com

More Related Content

What's hot

Natural Language Processing
Natural Language ProcessingNatural Language Processing
Natural Language ProcessingIla Group
 
Natural Language Processing
Natural Language ProcessingNatural Language Processing
Natural Language ProcessingToine Bogers
 
Natural language procssing
Natural language procssing Natural language procssing
Natural language procssing Rajnish Raj
 
Natural Language Processing and Search Intent Understanding C3 Conductor 2019...
Natural Language Processing and Search Intent Understanding C3 Conductor 2019...Natural Language Processing and Search Intent Understanding C3 Conductor 2019...
Natural Language Processing and Search Intent Understanding C3 Conductor 2019...Dawn Anderson MSc DigM
 
HarambeeNet: Data by the people, for the people
HarambeeNet: Data by the people, for the peopleHarambeeNet: Data by the people, for the people
HarambeeNet: Data by the people, for the peopleMichael Bernstein
 
Big Data and Natural Language Processing
Big Data and Natural Language ProcessingBig Data and Natural Language Processing
Big Data and Natural Language ProcessingMichel Bruley
 
Natural Language Processing
Natural Language ProcessingNatural Language Processing
Natural Language Processingpunedevscom
 
Grammarly AI-NLP Club #2 - Recent advances in applied chatbot technology - Jo...
Grammarly AI-NLP Club #2 - Recent advances in applied chatbot technology - Jo...Grammarly AI-NLP Club #2 - Recent advances in applied chatbot technology - Jo...
Grammarly AI-NLP Club #2 - Recent advances in applied chatbot technology - Jo...Grammarly
 
Dilek Hakkani-Tur at AI Frontiers: Conversational machines: Deep Learning for...
Dilek Hakkani-Tur at AI Frontiers: Conversational machines: Deep Learning for...Dilek Hakkani-Tur at AI Frontiers: Conversational machines: Deep Learning for...
Dilek Hakkani-Tur at AI Frontiers: Conversational machines: Deep Learning for...AI Frontiers
 
Google BERT - SMX London 2020 Virtual Conference
Google BERT - SMX London 2020 Virtual ConferenceGoogle BERT - SMX London 2020 Virtual Conference
Google BERT - SMX London 2020 Virtual ConferenceDawn Anderson MSc DigM
 
2017 Tutorial - Deep Learning for Dialogue Systems
2017 Tutorial - Deep Learning for Dialogue Systems2017 Tutorial - Deep Learning for Dialogue Systems
2017 Tutorial - Deep Learning for Dialogue SystemsMLReview
 
NLP 101 + Chatbots
NLP 101 + ChatbotsNLP 101 + Chatbots
NLP 101 + ChatbotsChris Shei
 
Google BERT - What SEOs and Marketers Need to Know
Google BERT - What SEOs and Marketers Need to KnowGoogle BERT - What SEOs and Marketers Need to Know
Google BERT - What SEOs and Marketers Need to KnowDawn Anderson MSc DigM
 
A Panorama of Natural Language Processing
A Panorama of Natural Language ProcessingA Panorama of Natural Language Processing
A Panorama of Natural Language ProcessingTed Xiao
 
Natural Language Processing (NLP)
Natural Language Processing (NLP)Natural Language Processing (NLP)
Natural Language Processing (NLP)Yuriy Guts
 
2019 Tech SEO Boost Dawn Anderson Contextual Recommender Search
2019 Tech SEO Boost Dawn Anderson Contextual Recommender Search2019 Tech SEO Boost Dawn Anderson Contextual Recommender Search
2019 Tech SEO Boost Dawn Anderson Contextual Recommender SearchDawn Anderson MSc DigM
 
Pycon ke word vectors
Pycon ke   word vectorsPycon ke   word vectors
Pycon ke word vectorsOsebe Sammi
 
DataFest 2017. Introduction to Natural Language Processing by Rudolf Eremyan
DataFest 2017. Introduction to Natural Language Processing by Rudolf EremyanDataFest 2017. Introduction to Natural Language Processing by Rudolf Eremyan
DataFest 2017. Introduction to Natural Language Processing by Rudolf Eremyanrudolf eremyan
 

What's hot (20)

Natural Language Processing
Natural Language ProcessingNatural Language Processing
Natural Language Processing
 
Intro to nlp
Intro to nlpIntro to nlp
Intro to nlp
 
Natural Language Processing
Natural Language ProcessingNatural Language Processing
Natural Language Processing
 
Natural language procssing
Natural language procssing Natural language procssing
Natural language procssing
 
Natural Language Processing and Search Intent Understanding C3 Conductor 2019...
Natural Language Processing and Search Intent Understanding C3 Conductor 2019...Natural Language Processing and Search Intent Understanding C3 Conductor 2019...
Natural Language Processing and Search Intent Understanding C3 Conductor 2019...
 
HarambeeNet: Data by the people, for the people
HarambeeNet: Data by the people, for the peopleHarambeeNet: Data by the people, for the people
HarambeeNet: Data by the people, for the people
 
Big Data and Natural Language Processing
Big Data and Natural Language ProcessingBig Data and Natural Language Processing
Big Data and Natural Language Processing
 
Natural Language Processing
Natural Language ProcessingNatural Language Processing
Natural Language Processing
 
Grammarly AI-NLP Club #2 - Recent advances in applied chatbot technology - Jo...
Grammarly AI-NLP Club #2 - Recent advances in applied chatbot technology - Jo...Grammarly AI-NLP Club #2 - Recent advances in applied chatbot technology - Jo...
Grammarly AI-NLP Club #2 - Recent advances in applied chatbot technology - Jo...
 
Blenderbot
BlenderbotBlenderbot
Blenderbot
 
Dilek Hakkani-Tur at AI Frontiers: Conversational machines: Deep Learning for...
Dilek Hakkani-Tur at AI Frontiers: Conversational machines: Deep Learning for...Dilek Hakkani-Tur at AI Frontiers: Conversational machines: Deep Learning for...
Dilek Hakkani-Tur at AI Frontiers: Conversational machines: Deep Learning for...
 
Google BERT - SMX London 2020 Virtual Conference
Google BERT - SMX London 2020 Virtual ConferenceGoogle BERT - SMX London 2020 Virtual Conference
Google BERT - SMX London 2020 Virtual Conference
 
2017 Tutorial - Deep Learning for Dialogue Systems
2017 Tutorial - Deep Learning for Dialogue Systems2017 Tutorial - Deep Learning for Dialogue Systems
2017 Tutorial - Deep Learning for Dialogue Systems
 
NLP 101 + Chatbots
NLP 101 + ChatbotsNLP 101 + Chatbots
NLP 101 + Chatbots
 
Google BERT - What SEOs and Marketers Need to Know
Google BERT - What SEOs and Marketers Need to KnowGoogle BERT - What SEOs and Marketers Need to Know
Google BERT - What SEOs and Marketers Need to Know
 
A Panorama of Natural Language Processing
A Panorama of Natural Language ProcessingA Panorama of Natural Language Processing
A Panorama of Natural Language Processing
 
Natural Language Processing (NLP)
Natural Language Processing (NLP)Natural Language Processing (NLP)
Natural Language Processing (NLP)
 
2019 Tech SEO Boost Dawn Anderson Contextual Recommender Search
2019 Tech SEO Boost Dawn Anderson Contextual Recommender Search2019 Tech SEO Boost Dawn Anderson Contextual Recommender Search
2019 Tech SEO Boost Dawn Anderson Contextual Recommender Search
 
Pycon ke word vectors
Pycon ke   word vectorsPycon ke   word vectors
Pycon ke word vectors
 
DataFest 2017. Introduction to Natural Language Processing by Rudolf Eremyan
DataFest 2017. Introduction to Natural Language Processing by Rudolf EremyanDataFest 2017. Introduction to Natural Language Processing by Rudolf Eremyan
DataFest 2017. Introduction to Natural Language Processing by Rudolf Eremyan
 

Similar to From Natural Language Processing to Artificial Intelligence

My Best PPT
My Best PPTMy Best PPT
My Best PPTczczczxc
 
Large Components in the Rearview Mirror
Large Components in the Rearview MirrorLarge Components in the Rearview Mirror
Large Components in the Rearview MirrorMichelle Brush
 
Dialogare con agenti artificiali
Dialogare con agenti artificiali  Dialogare con agenti artificiali
Dialogare con agenti artificiali Agnese Augello
 
Web & Social Media Analystics - Workshop Semantica
Web & Social Media Analystics - Workshop SemanticaWeb & Social Media Analystics - Workshop Semantica
Web & Social Media Analystics - Workshop SemanticaRoberto Cirillo
 
Best Practices for Designing High-Fidelity Voice Experiences
Best Practices for Designing High-Fidelity Voice ExperiencesBest Practices for Designing High-Fidelity Voice Experiences
Best Practices for Designing High-Fidelity Voice ExperiencesPullString
 
Tutorial on Creative Twitterbots
Tutorial on Creative TwitterbotsTutorial on Creative Twitterbots
Tutorial on Creative TwitterbotsTony Veale
 
Chatbots in 2017 -- Ithaca Talk Dec 6
Chatbots in 2017 -- Ithaca Talk Dec 6Chatbots in 2017 -- Ithaca Talk Dec 6
Chatbots in 2017 -- Ithaca Talk Dec 6Paul Houle
 
Malaysian Higher Ed-UN Learning
Malaysian Higher Ed-UN LearningMalaysian Higher Ed-UN Learning
Malaysian Higher Ed-UN LearningWayne Hodgins
 
Different Kinds Of Essay. 8 Types of Essays in College: All You Need to Know ...
Different Kinds Of Essay. 8 Types of Essays in College: All You Need to Know ...Different Kinds Of Essay. 8 Types of Essays in College: All You Need to Know ...
Different Kinds Of Essay. 8 Types of Essays in College: All You Need to Know ...Sara Carter
 
Introduction to chat bot
Introduction to chat botIntroduction to chat bot
Introduction to chat botmohamed ali
 
Reframing the Net
Reframing the NetReframing the Net
Reframing the NetDoc Searls
 
Midwest km pugh conversational ai and ai for conversation 190809
Midwest km pugh conversational ai and ai for conversation 190809Midwest km pugh conversational ai and ai for conversation 190809
Midwest km pugh conversational ai and ai for conversation 190809Katrina (Kate) Pugh
 
Hpai class 12 - potpourri & perception - 032620 actual
Hpai   class 12 - potpourri & perception - 032620 actualHpai   class 12 - potpourri & perception - 032620 actual
Hpai class 12 - potpourri & perception - 032620 actualmelendez321
 
#1NWebinar: Digital on the Runway
#1NWebinar: Digital on the Runway#1NWebinar: Digital on the Runway
#1NWebinar: Digital on the RunwayOne North
 

Similar to From Natural Language Processing to Artificial Intelligence (20)

My Best PPT
My Best PPTMy Best PPT
My Best PPT
 
Large Components in the Rearview Mirror
Large Components in the Rearview MirrorLarge Components in the Rearview Mirror
Large Components in the Rearview Mirror
 
Dialogare con agenti artificiali
Dialogare con agenti artificiali  Dialogare con agenti artificiali
Dialogare con agenti artificiali
 
Query Understanding
Query UnderstandingQuery Understanding
Query Understanding
 
Web & Social Media Analystics - Workshop Semantica
Web & Social Media Analystics - Workshop SemanticaWeb & Social Media Analystics - Workshop Semantica
Web & Social Media Analystics - Workshop Semantica
 
Module2
Module2Module2
Module2
 
Best Practices for Designing High-Fidelity Voice Experiences
Best Practices for Designing High-Fidelity Voice ExperiencesBest Practices for Designing High-Fidelity Voice Experiences
Best Practices for Designing High-Fidelity Voice Experiences
 
Artificial intelligence
Artificial intelligenceArtificial intelligence
Artificial intelligence
 
Flow based-1994
Flow based-1994Flow based-1994
Flow based-1994
 
Tutorial on Creative Twitterbots
Tutorial on Creative TwitterbotsTutorial on Creative Twitterbots
Tutorial on Creative Twitterbots
 
Chatbots in 2017 -- Ithaca Talk Dec 6
Chatbots in 2017 -- Ithaca Talk Dec 6Chatbots in 2017 -- Ithaca Talk Dec 6
Chatbots in 2017 -- Ithaca Talk Dec 6
 
Malaysian Higher Ed-UN Learning
Malaysian Higher Ed-UN LearningMalaysian Higher Ed-UN Learning
Malaysian Higher Ed-UN Learning
 
Different Kinds Of Essay. 8 Types of Essays in College: All You Need to Know ...
Different Kinds Of Essay. 8 Types of Essays in College: All You Need to Know ...Different Kinds Of Essay. 8 Types of Essays in College: All You Need to Know ...
Different Kinds Of Essay. 8 Types of Essays in College: All You Need to Know ...
 
Essay Pythagoras
Essay PythagorasEssay Pythagoras
Essay Pythagoras
 
Introduction to chat bot
Introduction to chat botIntroduction to chat bot
Introduction to chat bot
 
Reframing the Net
Reframing the NetReframing the Net
Reframing the Net
 
Web 2.0 Tools Go Elementary
Web 2.0 Tools Go ElementaryWeb 2.0 Tools Go Elementary
Web 2.0 Tools Go Elementary
 
Midwest km pugh conversational ai and ai for conversation 190809
Midwest km pugh conversational ai and ai for conversation 190809Midwest km pugh conversational ai and ai for conversation 190809
Midwest km pugh conversational ai and ai for conversation 190809
 
Hpai class 12 - potpourri & perception - 032620 actual
Hpai   class 12 - potpourri & perception - 032620 actualHpai   class 12 - potpourri & perception - 032620 actual
Hpai class 12 - potpourri & perception - 032620 actual
 
#1NWebinar: Digital on the Runway
#1NWebinar: Digital on the Runway#1NWebinar: Digital on the Runway
#1NWebinar: Digital on the Runway
 

Recently uploaded

Decoding Patterns: Customer Churn Prediction Data Analysis Project
Decoding Patterns: Customer Churn Prediction Data Analysis ProjectDecoding Patterns: Customer Churn Prediction Data Analysis Project
Decoding Patterns: Customer Churn Prediction Data Analysis ProjectBoston Institute of Analytics
 
English-8-Q4-W3-Synthesizing-Essential-Information-From-Various-Sources-1.pdf
English-8-Q4-W3-Synthesizing-Essential-Information-From-Various-Sources-1.pdfEnglish-8-Q4-W3-Synthesizing-Essential-Information-From-Various-Sources-1.pdf
English-8-Q4-W3-Synthesizing-Essential-Information-From-Various-Sources-1.pdfblazblazml
 
Decoding Movie Sentiments: Analyzing Reviews with Data Analysis model
Decoding Movie Sentiments: Analyzing Reviews with Data Analysis modelDecoding Movie Sentiments: Analyzing Reviews with Data Analysis model
Decoding Movie Sentiments: Analyzing Reviews with Data Analysis modelBoston Institute of Analytics
 
IBEF report on the Insurance market in India
IBEF report on the Insurance market in IndiaIBEF report on the Insurance market in India
IBEF report on the Insurance market in IndiaManalVerma4
 
Rithik Kumar Singh codealpha pythohn.pdf
Rithik Kumar Singh codealpha pythohn.pdfRithik Kumar Singh codealpha pythohn.pdf
Rithik Kumar Singh codealpha pythohn.pdfrahulyadav957181
 
Digital Indonesia Report 2024 by We Are Social .pdf
Digital Indonesia Report 2024 by We Are Social .pdfDigital Indonesia Report 2024 by We Are Social .pdf
Digital Indonesia Report 2024 by We Are Social .pdfNicoChristianSunaryo
 
Data Analysis Project Presentation: Unveiling Your Ideal Customer, Bank Custo...
Data Analysis Project Presentation: Unveiling Your Ideal Customer, Bank Custo...Data Analysis Project Presentation: Unveiling Your Ideal Customer, Bank Custo...
Data Analysis Project Presentation: Unveiling Your Ideal Customer, Bank Custo...Boston Institute of Analytics
 
6 Tips for Interpretable Topic Models _ by Nicha Ruchirawat _ Towards Data Sc...
6 Tips for Interpretable Topic Models _ by Nicha Ruchirawat _ Towards Data Sc...6 Tips for Interpretable Topic Models _ by Nicha Ruchirawat _ Towards Data Sc...
6 Tips for Interpretable Topic Models _ by Nicha Ruchirawat _ Towards Data Sc...Dr Arash Najmaei ( Phd., MBA, BSc)
 
Non Text Magic Studio Magic Design for Presentations L&P.pdf
Non Text Magic Studio Magic Design for Presentations L&P.pdfNon Text Magic Studio Magic Design for Presentations L&P.pdf
Non Text Magic Studio Magic Design for Presentations L&P.pdfPratikPatil591646
 
World Economic Forum Metaverse Ecosystem By Utpal Chakraborty.pdf
World Economic Forum Metaverse Ecosystem By Utpal Chakraborty.pdfWorld Economic Forum Metaverse Ecosystem By Utpal Chakraborty.pdf
World Economic Forum Metaverse Ecosystem By Utpal Chakraborty.pdfsimulationsindia
 
why-transparency-and-traceability-are-essential-for-sustainable-supply-chains...
why-transparency-and-traceability-are-essential-for-sustainable-supply-chains...why-transparency-and-traceability-are-essential-for-sustainable-supply-chains...
why-transparency-and-traceability-are-essential-for-sustainable-supply-chains...Jack Cole
 
What To Do For World Nature Conservation Day by Slidesgo.pptx
What To Do For World Nature Conservation Day by Slidesgo.pptxWhat To Do For World Nature Conservation Day by Slidesgo.pptx
What To Do For World Nature Conservation Day by Slidesgo.pptxSimranPal17
 
Role of Consumer Insights in business transformation
Role of Consumer Insights in business transformationRole of Consumer Insights in business transformation
Role of Consumer Insights in business transformationAnnie Melnic
 
Digital Marketing Plan, how digital marketing works
Digital Marketing Plan, how digital marketing worksDigital Marketing Plan, how digital marketing works
Digital Marketing Plan, how digital marketing worksdeepakthakur548787
 
knowledge representation in artificial intelligence
knowledge representation in artificial intelligenceknowledge representation in artificial intelligence
knowledge representation in artificial intelligencePriyadharshiniG41
 
Data Factory in Microsoft Fabric (MsBIP #82)
Data Factory in Microsoft Fabric (MsBIP #82)Data Factory in Microsoft Fabric (MsBIP #82)
Data Factory in Microsoft Fabric (MsBIP #82)Cathrine Wilhelmsen
 
Principles and Practices of Data Visualization
Principles and Practices of Data VisualizationPrinciples and Practices of Data Visualization
Principles and Practices of Data VisualizationKianJazayeri1
 
Decoding the Heart: Student Presentation on Heart Attack Prediction with Data...
Decoding the Heart: Student Presentation on Heart Attack Prediction with Data...Decoding the Heart: Student Presentation on Heart Attack Prediction with Data...
Decoding the Heart: Student Presentation on Heart Attack Prediction with Data...Boston Institute of Analytics
 

Recently uploaded (20)

Decoding Patterns: Customer Churn Prediction Data Analysis Project
Decoding Patterns: Customer Churn Prediction Data Analysis ProjectDecoding Patterns: Customer Churn Prediction Data Analysis Project
Decoding Patterns: Customer Churn Prediction Data Analysis Project
 
English-8-Q4-W3-Synthesizing-Essential-Information-From-Various-Sources-1.pdf
English-8-Q4-W3-Synthesizing-Essential-Information-From-Various-Sources-1.pdfEnglish-8-Q4-W3-Synthesizing-Essential-Information-From-Various-Sources-1.pdf
English-8-Q4-W3-Synthesizing-Essential-Information-From-Various-Sources-1.pdf
 
Data Analysis Project: Stroke Prediction
Data Analysis Project: Stroke PredictionData Analysis Project: Stroke Prediction
Data Analysis Project: Stroke Prediction
 
Insurance Churn Prediction Data Analysis Project
Insurance Churn Prediction Data Analysis ProjectInsurance Churn Prediction Data Analysis Project
Insurance Churn Prediction Data Analysis Project
 
Decoding Movie Sentiments: Analyzing Reviews with Data Analysis model
Decoding Movie Sentiments: Analyzing Reviews with Data Analysis modelDecoding Movie Sentiments: Analyzing Reviews with Data Analysis model
Decoding Movie Sentiments: Analyzing Reviews with Data Analysis model
 
IBEF report on the Insurance market in India
IBEF report on the Insurance market in IndiaIBEF report on the Insurance market in India
IBEF report on the Insurance market in India
 
Rithik Kumar Singh codealpha pythohn.pdf
Rithik Kumar Singh codealpha pythohn.pdfRithik Kumar Singh codealpha pythohn.pdf
Rithik Kumar Singh codealpha pythohn.pdf
 
Digital Indonesia Report 2024 by We Are Social .pdf
Digital Indonesia Report 2024 by We Are Social .pdfDigital Indonesia Report 2024 by We Are Social .pdf
Digital Indonesia Report 2024 by We Are Social .pdf
 
Data Analysis Project Presentation: Unveiling Your Ideal Customer, Bank Custo...
Data Analysis Project Presentation: Unveiling Your Ideal Customer, Bank Custo...Data Analysis Project Presentation: Unveiling Your Ideal Customer, Bank Custo...
Data Analysis Project Presentation: Unveiling Your Ideal Customer, Bank Custo...
 
6 Tips for Interpretable Topic Models _ by Nicha Ruchirawat _ Towards Data Sc...
6 Tips for Interpretable Topic Models _ by Nicha Ruchirawat _ Towards Data Sc...6 Tips for Interpretable Topic Models _ by Nicha Ruchirawat _ Towards Data Sc...
6 Tips for Interpretable Topic Models _ by Nicha Ruchirawat _ Towards Data Sc...
 
Non Text Magic Studio Magic Design for Presentations L&P.pdf
Non Text Magic Studio Magic Design for Presentations L&P.pdfNon Text Magic Studio Magic Design for Presentations L&P.pdf
Non Text Magic Studio Magic Design for Presentations L&P.pdf
 
World Economic Forum Metaverse Ecosystem By Utpal Chakraborty.pdf
World Economic Forum Metaverse Ecosystem By Utpal Chakraborty.pdfWorld Economic Forum Metaverse Ecosystem By Utpal Chakraborty.pdf
World Economic Forum Metaverse Ecosystem By Utpal Chakraborty.pdf
 
why-transparency-and-traceability-are-essential-for-sustainable-supply-chains...
why-transparency-and-traceability-are-essential-for-sustainable-supply-chains...why-transparency-and-traceability-are-essential-for-sustainable-supply-chains...
why-transparency-and-traceability-are-essential-for-sustainable-supply-chains...
 
What To Do For World Nature Conservation Day by Slidesgo.pptx
What To Do For World Nature Conservation Day by Slidesgo.pptxWhat To Do For World Nature Conservation Day by Slidesgo.pptx
What To Do For World Nature Conservation Day by Slidesgo.pptx
 
Role of Consumer Insights in business transformation
Role of Consumer Insights in business transformationRole of Consumer Insights in business transformation
Role of Consumer Insights in business transformation
 
Digital Marketing Plan, how digital marketing works
Digital Marketing Plan, how digital marketing worksDigital Marketing Plan, how digital marketing works
Digital Marketing Plan, how digital marketing works
 
knowledge representation in artificial intelligence
knowledge representation in artificial intelligenceknowledge representation in artificial intelligence
knowledge representation in artificial intelligence
 
Data Factory in Microsoft Fabric (MsBIP #82)
Data Factory in Microsoft Fabric (MsBIP #82)Data Factory in Microsoft Fabric (MsBIP #82)
Data Factory in Microsoft Fabric (MsBIP #82)
 
Principles and Practices of Data Visualization
Principles and Practices of Data VisualizationPrinciples and Practices of Data Visualization
Principles and Practices of Data Visualization
 
Decoding the Heart: Student Presentation on Heart Attack Prediction with Data...
Decoding the Heart: Student Presentation on Heart Attack Prediction with Data...Decoding the Heart: Student Presentation on Heart Attack Prediction with Data...
Decoding the Heart: Student Presentation on Heart Attack Prediction with Data...
 

From Natural Language Processing to Artificial Intelligence

  • 1. From Natural Language Processing to Artificial Intelligence Jonathan Mugan, PhD @jmugan Data Day Texas January 14, 2017
  • 2. “ It is not my aim to surprise or shock you – but the simplest way I can summarize is to say that there are now in the world machines that think, that learn and that create. Moreover, their ability to do these things is going to increase rapidly until – in a visible future – the range of problems they can handle will be coextensive with the range to which the human mind has been applied. ”
  • 3. “ It is not my aim to surprise or shock you – but the simplest way I can summarize is to say that there are now in the world machines that think, that learn and that create. Moreover, their ability to do these things is going to increase rapidly until – in a visible future – the range of problems they can handle will be coextensive with the range to which the human mind has been applied. ” Herbert Simon, 1957
  • 4. “ It is not my aim to surprise or shock you – but the simplest way I can summarize is to say that there are now in the world machines that think, that learn and that create. Moreover, their ability to do these things is going to increase rapidly until – in a visible future – the range of problems they can handle will be coextensive with the range to which the human mind has been applied. ” Herbert Simon, 1957 So what is going on here? There is no actual understanding. AI has gotten smarter • especially with deep learning, but • computers can’t read or converse intelligently
  • 5. This is disappointing because we want to • Interact with our world using natural language • Current chatbots are an embarrassment • Have computers read all those documents out there • So they can retrieve the best ones, answer our questions, and summarize what is new
  • 6. To understand language, computers need to understand the world Why can you pull a wagon with a string but not push it? -- Minsky Why is it unusual that a gymnast competed with one leg? -- Schank Why does it only rain outside? If a book is on a table, and you push the table, what happens? If Bob went to the junkyard, is he at the airport? They need to be able to answer questions like:
  • 7. Grounded Understanding • We understand language in a way that is grounded in sensation and action. sensation representation action • When someone says “chicken,” we map that to our experience with chickens. • We understand each other because we have had similar experiences. • This is the kind of understanding that computers need. See: Benjamin Bergen, Steven Pinker, Mark Johnson, Jerome Feldman, and Murray Shanahan “chicken”
  • 9. sensation representation action meaningless tokens manual representations world models merged representations word2vec seq2seq question answering external world training symbolic path sub-symbolic path
  • 10. sensation representation action meaningless tokens manual representations world models merged representations word2vec seq2seq question answering external world training symbolic path sub-symbolic path
  • 11. Bag-of-words representation “The aardvark ate the zoo.” = [1,0,1, ..., 0, 1] We can do a little better and count how often the words occur. tf: term frequency, how often does the word occur. “The aardvark ate the aardvark by the zoo.” = [2,0,1, ..., 0, 1] Treat words as arbitrary symbols and look at their frequencies. “dog bit man” will be the same as “man bit dog” Consider a vocabulary of 50,000 words where: • “aardvark” is position 0 • “ate” is position 2 • “zoo” is position 49,999 A bag-of-words can be a vector with 50,000 dimensions.
  • 12. Give rare words a boost We can get fancier and say that rare words are more important than common words for characterizing documents. Multiply each entry by a measure of how common it is in the corpus. idf: inverse document frequency idf( term, document) = log(num. documents / num. with term) Called a vector space model. You can throw these vectors into any classifier, or find similar documents based on similar vectors. 10 documents, only 1 has “aardvark” and 5 have “zoo” and 5 have “ate” tf-idf: tf * idf “The aardvark ate the aardvark by the zoo.” =[4.6,0,0.7, ..., 0, 0.7]
  • 13. Topic Modeling (LDA) Latent Dirichlet Allocation • You pick the number of topics • Each topic is a distribution over words • Each document is a distribution over topics Easy to do in gensim https://radimrehurek.com/gensim/models/ldamodel.html
  • 14. LDA of my tweets shown in pyLDAvis I sometimes tweet about movies. https://github.com/bmabey/pyLDAvis
  • 15. Sentiment analysis: How the author feels about the text Sentiment dictionary: word list with sentiment values associated with all words that might indicate sentiment ... happy: +2 ... joyful: +2 ... pain: -3 painful: -3 ... We can do sentiment analysis using labeled data and meaningless tokens, with supervised learning over tf-idf. We can also do sentiment analysis by adding the first hint of meaning: some words are positive and some words are negative. One such word list is VADER https://github.com/cjhutto/vaderSentiment “I went to the junkyard and was happy to see joyful people.”
  • 16. sensation representation action meaningless tokens manual representations world models merged representations word2vec seq2seq question answering external world training symbolic path sub-symbolic path
  • 17. Manually Constructing Representations We tell the computer what things mean by manually specifying relationships between symbols 1. Stores meaning using predefined relationships 2. Maps multiple ways of writing something to the same representation Allows us to code what the machine should do for a relatively small number of representations
  • 18. Manually Constructing Representations vehicle Honda Civic tires is_ahas Who might be in the market for tires? Allows us to code what the machine should do for a relatively small number of representations We tell the computer what things mean by manually specifying relationships between symbols 1. Stores meaning using predefined relationships 2. Maps multiple ways of writing something to the same representation
  • 19. Manually Constructing Representations vehicle “... the car ...” “... an automobile ...” “... my truck ...” Honda Civic “... cruising in my civic ...” tires is_ahas Who might be in the market for tires? Allows us to code what the machine should do for a relatively small number of representations We tell the computer what things mean by manually specifying relationships between symbols 1. Stores meaning using predefined relationships 2. Maps multiple ways of writing something to the same representation
  • 20. WordNet auto, automobile, motorcar WordNet Search http://wordnetweb.princeton.edu/perl/webwn ambulance hyponym (subordinate) sports_car, sport_car hyponym (subordinate) motor_vehicle, automotive_vechicle hypernym (superordinate) automobile_horn, car_horn, motor_horn, horn, hooter meronym (has-part) • Organizes sets of words into synsets, which are meanings • A word can be a member of more than one synset, such as “bank” [synsets shown as boxes]
  • 21. WordNet auto, automobile, machine, motorcar WordNet Search http://wordnetweb.princeton.edu/perl/webwn ambulance hyponym (subordinate) sports_car, sport_car hyponym (subordinate) motor_vehicle, automotive_vechicle hypernym (superordinate) automobile_horn, car_horn, motor_horn, horn, hooter meronym (has-part) Other relations: • has-instance • car : Honda Civic • has-member • team : stringer (like “first stringer”) • antonym • ... a few more • Organizes sets of words into synsets, which are meanings • A word can be a member of more than one synset, such as “bank”
  • 22. FrameNet • More integrative than WordNet: represents situations • One example is a child’s birthday party, another is Commerce_buy • situations have slots (roles) that are filled • Frames are triggered by keywords in text (more or less) FrameNet: https://framenet.icsi.berkeley.edu/fndrupal/IntroPage Commerce_buy: https://framenet2.icsi.berkeley.edu/fnReports/data/frameIndex.xml?frame=Commerce_buy Commerce_buy triggered by the words: buy, buyer, client, purchase, or purchaser Roles: Buyer, Goods, Money, Place (where bought), ... Commerce_buy Getting Inherits from Commerce_buy indicates a change of possession, but we need a world model to actually change a state.
  • 23. ConceptNet Bread ToasterAtLocation Automobile RelatedTo • Provides commonsense linkages between words • My experience: too shallow and haphazard to be useful Might be good for a conversation bot: “You know what you normally find by a toaster? Bread.” ConceptNet: http://conceptnet.io/
  • 24. YAGO: Yet Another Great Ontology • Built on WordNet and DBpedia • http://www.mpi-inf.mpg.de/departments/databases-and- information-systems/research/yago-naga/yago/ • DBpedia has a machine readable page for each Wikipedia page • Used by IBM Watson to play Jeopardy! • Big on named entities, like entertainers • Browse • https://gate.d5.mpi-inf.mpg.de/webyago3spotlx/Browser
  • 25. SUMO: Suggested Upper Merged Ontology There is also YAGO-SUMO that merges the low- level organization of SUMO with the instance information of YAGO.http://people.mpi- inf.mpg.de/~gdemelo/yagosumo/ Deep: organizes concepts down to the lowest level http://www.adampease.org/OP/ Example: cooking is a type of making that is a type of intentional process that is a type of process that is a physical thing that is an entity.
  • 26. Image Schemas Humans use image schemas comprehend spatial arrangements and movements in space [Mandler, 2004] Examples of image schemas include path, containment, blockage, and attraction [Johnson, 1987] Abstract concepts such as romantic relationships and arguments are represented as metaphors to this kind of experience [Lakoff and Johnson, 1980] Image schemas are representations of human experience that are common across cultures [Feldman, 2006]
  • 27. Semantic Web (Linked Data) • Broad, but not organized or deep • Way too complicated • May eventually be streamlined (e.g. JSON-LD), and it could be very cool if it gets linked with deeper, better organized data • Tools to map text: • FRED • DBpedia Spotlight • Pikes Semantic Web Layer Cake Spaghetti Monster
  • 28. sensation representation action meaningless tokens manual representations world models merged representations word2vec seq2seq question answering external world training symbolic path sub-symbolic path
  • 29. World models Computers need causal models of how the world works and how we interact with it. • People don’t say everything to get a message across, just what is not covered by our shared conception • Most efficient way to encode our shared conception is a model Models express how the world changes based on events • Recall the Commerce_buy frame • Afterward, one person has more money and another person has less • Read such inferences right off the model
  • 30. Dimensions of models • Probabilistic • Deterministic compared with stochastic • E.g., logic compared with probabilistic programming • Factor state • whole states compared with using variables • E.g., finite automata compared with dynamic Bayesian networks • Relational • Propositional logic compared with first-order logic • E.g., Bayesian networks compared with Markov logic networks • Concurrent • Model one thing compared with multiple things • E.g., finite automata compared with Petri Nets • Temporal • Static compared with dynamic • E.g., Bayesian networks compared with dynamic Bayesian networks
  • 31. sensation representation action meaningless tokens manual representations world models merged representations word2vec seq2seq question answering external world training symbolic path sub-symbolic path
  • 32. Merge representations with models Explain your conception of conductivity? Electrons are small spheres, and electricity is small spheres going through a tube. Conductivity is how little blockage is in the tube. Cyc has a model that uses representations, but it is not clear if logic is sufficiently supple. Why does it only rain outside? A roof blocks the path of things from above. The meaning of the word “chicken” is everything explicitly stated in the representation and everything that can be inferred from the world model. “chicken” The final step on this path is to create a robust model around rich representations.
  • 33. sensation representation action meaningless tokens manual representations world models merged representations word2vec seq2seq question answering external world training symbolic path sub-symbolic path
  • 34. sensation representation action meaningless tokens manual representations world models merged representations word2vec seq2seq question answering external world training symbolic path sub-symbolic path
  • 35. sensation representation action meaningless tokens manual representations world models merged representations symbolic path sub-symbolic path Neural networks (deep learning) Great place to start is the landmark publication: Parallel Distributed Processing, Vols. 1 and 2, Rumelhart and McClelland, 1987
  • 36. sensation representation action meaningless tokens manual representations world models merged representations word2vec seq2seq question answering external world training symbolic path sub-symbolic path
  • 37. word2vec The word2vec model learns a vector for each word in the vocabulary. The number of dimensions for each word vector is the same and is usually around 300. Unlike the tf-idf vectors, word vectors are dense, meaning that most values are not 0.
  • 38. word2vec 1. Initialize each word with a random vector 2. For each word w1 in the set of documents: 3. For each word w2 around w1: 4. Move vectors for w1 and w2 closer together and move all others and w1 farther apart 5. Goto 2 if not done • Skip-gram model [Mikolov et al., 2013] • Note: there are really two vectors per word, because you don’t want a word to be likely to be around itself, see Goldberg and Levyhttps://arxiv.org/pdf/1402.3722v1.pdf • First saw that double-for-loop explanation from Christopher Moody
  • 39. word2vec meaning “You shall know a word by the company it keeps.” J. R. Firth [1957] The quote we often see: This seems at least kind of true. • Vectors have internal structure [Mikolov et al., 2013] • Italy – Rome = France – Paris • King – Queen = Man – Woman But ... words aren’t grounded in experience; they are only grounded in being around other words. Can also do word2vec on ConceptNet, see https://arxiv.org/pdf/1612.03975v1.pdf
  • 40. sensation representation action meaningless tokens manual representations world models merged representations word2vec seq2seq question answering external world training symbolic path sub-symbolic path
  • 41. seq2seq model The seq2seq (sequence-to-sequence) model can encode sequences of tokens, such as sentences, into single vectors. It can then decode these vectors into other sequences of tokens. Both the encoding and decoding are done using recurrent neural networks (RNNs). One obvious application for this is machine translation. For example, where the source sentences are English and the target sentences are Spanish.
  • 42. Encoding sentence meaning into a vector h0 The “The patient fell.”
  • 43. Encoding sentence meaning into a vector h0 The h1 patient “The patient fell.”
  • 44. Encoding sentence meaning into a vector h0 The h1 patient h2 fell “The patient fell.”
  • 45. Encoding sentence meaning into a vector Like a hidden Markov model, but doesn’t make the Markov assumption and benefits from a vector representation. h0 The h1 patient h2 fell h3 . “The patient fell.”
  • 46. Decoding sentence meaning El h3 Machine translation, or structure learning more generally.
  • 47. Decoding sentence meaning El h3 h4 Machine translation, or structure learning more generally.
  • 48. Decoding sentence meaning El h3 paciente h4 Machine translation, or structure learning more generally.
  • 49. Decoding sentence meaning Machine translation, or structure learning more generally. El h3 paciente h4 cayó h5 . h5 [Cho et al., 2014] It keeps generating until it generates a stop symbol. Note that the lengths don’t need to be the same. It could generate the correct “se cayó.” • Treats this task like it is devoid of meaning. • Great that this can work on just about any kind of seq2seq problem, but this generality highlights its limitation for use as language understanding. No Chomsky universal grammar.
  • 50. Generating image captions Convolutional neural network Sad h0 wet h1 kid h2 . h3 [Karpathy and Fei-Fei, 2015] [Vinyals et al., 2015]
  • 51. Attention [Bahdanau et al., 2014] El h3 paciente h4 cayó h5 . h5 h0 The h1 patient h2 fell h3 .
  • 52. sensation representation action meaningless tokens manual representations world models merged representations word2vec seq2seq question answering external world training symbolic path sub-symbolic path
  • 53. Deep learning and question answering RNNs answer questions. What is the translation of this phrase to French? What is the next word? Attention is useful for question answering. This can be generalized to which facts the learner should pay attention to when answering questions.
  • 54. Deep learning and question answering Bob went home. Tim went to the junkyard. Bob picked up the jar. Bob went to town. Where is the jar? A: town • Memory Networks [Weston et al., 2014] • Updates memory vectors based on a question and finds the best one to give the output. The office is north of the yard. The bath is north of the office. The yard is west of the kitchen. How do you go from the office to the kitchen? A: south, east • Neural Reasoner [Peng et al., 2015] • Encodes the question and facts in many layers, and the final layer is put through a function that gives the answer.
  • 55. Deep learning and question answering The network is learning linkages between sequences of symbols, but these kinds of stories do not have sufficiently rich linkages to our world.
  • 56. sensation representation action meaningless tokens manual representations world models merged representations word2vec seq2seq question answering external world training symbolic path sub-symbolic path
  • 57. External world training If we want to talk to machines 1. We need to train them in an environment as much like our own as possible 2. Can’t just be dialog! Harnad [1990] http://users.ecs.soton.ac.uk/harnad/Papers/Harnad/harnad90.sgproblem.html To understand “chicken” we need the machine to have had as much experience with chickens as possible. When we say “chicken” we don’t just mean the bird, we mean everything one can do with it and everything it represents in our culture.“chicken”
  • 58. There has been work in this direction Industry • OpenAI • Universe: train on screens with VNC • Now with Grand Theft Auto! https://openai.com/blog/GTA-V-plus-Universe/ • Google • Mikolov et al., A Roadmap towards Machine Intelligence. They define an artificial environment. https://arxiv.org/pdf/1511.08130v2.pdf • Facebook • Weston, memory networks to dialogs https://arxiv.org/pdf/1604.06045v7.pdf • Kiela et al., Virtual Embodiment: A Scalable Long-Term Strategy for Artificial Intelligence Res. Advocate using video games “with a purpose.” https://arxiv.org/pdf/1610.07432v1.pdf Academia • Ray Mooney • Maps text to situations http://videolectures.net/aaai2013_moo ney_language_learning/ • Luc Steels • Robots come up with vocabulary and simple grammar • Narasimhan et al. • Train a neural net to play text- based adventure games https://arxiv.org/pdf/1506.08941v2.pdf iCub
  • 59. But we need more training centered in our world Maybe if Amazon Alexa had a camera and rotating head? How far could we get without the benefit of a teacher? • Could we use eye gaze as a cue? [Yu and Ballard, 2010]
  • 60. sensation representation action meaningless tokens manual representations world models merged representations word2vec seq2seq question answering external world training symbolic path sub-symbolic path
  • 61. sensation representation action meaningless tokens manual representations world models merged representations word2vec seq2seq question answering external world training symbolic path sub-symbolic path
  • 62. sensation representation action symbolic path sub-symbolic path Building world models based on large, deep, and organized representations Training large neural networks in an environment with similar objects, relationships, and dynamics as our own Two paths from NLP to AI
  • 63. sensation representation action Final thought Open problem: What is the simplest commercially viable task that requires commonsense knowledge and reasoning?
  • 64. Thanks for listening Jonathan Mugan @jmugan www.deepgrammar.com

Editor's Notes

  1. AI has gotten a lot smarter, in recent years, especially with the benefits of deep learning, but natural language understanding is still lacking.
  2. Computers must understand our world to understand language.
  3. We don’t understand our biological structure, so we can’t just copy it in software, and we don’t know how to implement an alternative structure with equivalent capabilities, so we employ a bunch of parlor tricks to get us close to what we want. This is the field of natural language processing.
  4. For sub-symbolic PDP original book and pep at 25.
  5. For sub-symbolic pep original book and pep at 25.
  6. For sub-symbolic pep original book and pep at 25.
  7. There is some indirect meaning based on how people use symbols together. In real writing you have a thesis, a central thing to to say. A predicate.
  8. There is some indirect meaning based on how people use symbols together. In real writing you have a thesis, a central thing to to say. A predicate.
  9. For sub-symbolic pep original book and pep at 25.
  10. A set of meanings where each one is a set of sense entities called synsets.
  11. A set of meanings where each one is a set of sense entities called synsets.
  12. Frames are more integrative; they represent more kinds of relationships between concepts, and those relationships are situation specific, so in a sense the representation is richer. However, Commerce_buy doesn’t say that one person owns something and now has less money (confirm this).
  13. For sub-symbolic PDP original book and pep at 25.
  14. Should I say they are causal models? We want causal because we want to know how the system would behave in new situations. Causality is what happens if there is an exogenous change. Compare I flipped the light switch. the light will go on. 2. I made the rooster crow. the sun will not rise.
  15. For sub-symbolic pep original book and pep at 25.
  16. For sub-symbolic PDP original book and PDP at 25.
  17. Should “happy” and “sad” have similar vectors when they are, in a sense, opposite?