Categorical Evaluation for
Advanced Distributional
Semantic Models
An Undergraduate Dissertation
by Reid Kilgore
Agenda
• Background
• Syntactic breakdown of model competency
• Using new models to examine competency
improvements
• Analysis and new approaches
Word Representation
We need some way to allow NLP systems to leverage
language information
• Tokenization
• Distributional Semantics
Distributional Semantics
“You shall know a word by the company it keeps” - Firth
Premise:
• Words that appear in similar contexts tend to be more
similar than words that do not
Idea:
• We can define a word by analyzing the frequency
with which every word occurs with every other word
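The idea above can be sketched as a simple co-occurrence counter. This is a minimal illustration, not the dissertation's method: the function name `cooccurrence_counts`, the whitespace tokenization, and the two-sentence corpus are all assumptions made for the example.

```python
from collections import Counter, defaultdict

def cooccurrence_counts(sentences, window=2):
    """Count how often each word appears within `window` tokens of every other word."""
    counts = defaultdict(Counter)
    for sentence in sentences:
        tokens = sentence.lower().split()  # naive whitespace tokenization (assumption)
        for i, word in enumerate(tokens):
            lo = max(0, i - window)
            hi = min(len(tokens), i + window + 1)
            for j in range(lo, hi):
                if j != i:
                    counts[word][tokens[j]] += 1
    return counts

# Tiny invented corpus, for illustration only.
corpus = [
    "i am giving a presentation right now",
    "i am giving a talk right now",
]
counts = cooccurrence_counts(corpus, window=2)
print(counts["giving"].most_common())
```

Each row of `counts` is a sparse description of a word by its neighbors; "presentation" and "talk" end up with identical rows here because they occur in the same contexts.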
Distributional Models
• Brown Clustering
• Grouping words by the terms most likely to have
come before
• Word Embeddings
• Mapping words to high-dimensional vectors
Language Modeling
Sentence:
I’m giving a presentation right now
Window = 2
giving a talk right now
giving a lecture right now
giving a speech right now
Neural Network Models
• A neural network is a model built from artificial
neurons
• Here it is trained on raw text, so it needs no
labeled data (unsupervised learning)
Neural Network Language Models generally try to
determine the likelihood of a word relative to its
context, or vice versa
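To make "a word relative to its context" concrete, the sketch below lists the (word, context word) pairs that a window-based model would see. The helper name `skipgram_pairs` is hypothetical and the example only generates training pairs; it does not train a network.

```python
def skipgram_pairs(tokens, window=2):
    """Yield (target, context) pairs for every token and its neighbors within `window`."""
    for i, target in enumerate(tokens):
        for j in range(max(0, i - window), min(len(tokens), i + window + 1)):
            if j != i:
                yield target, tokens[j]

tokens = "I'm giving a presentation right now".split()
pairs = list(skipgram_pairs(tokens, window=2))

# Context words seen for "presentation" with Window = 2.
presentation_context = [c for t, c in pairs if t == "presentation"]
print(presentation_context)
```

A model that predicts the context from the word, or the word from the context, is trained on exactly these pairs.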
Word2Vec
• By far the most influential modern Neural Network
Language Model
• Significantly more efficient than previous models
Vector Offset
vector(a’) - vector(a)
We can extract the relationship between two word
vectors by taking their offset.
vector(goats) - vector(goat) => relationship(plural)
vector(dog) + relationship(plural) => vector(dogs)
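A toy version of the offset arithmetic, using hand-made 3-dimensional vectors. The numbers are invented so that the plural offset is exact; real embeddings only satisfy this approximately.

```python
# Toy vectors, invented for illustration (not taken from a trained model).
vectors = {
    "goat":  [1.0, 0.0, 0.5],
    "goats": [1.0, 1.0, 0.5],
    "dog":   [0.0, 0.0, 2.0],
    "dogs":  [0.0, 1.0, 2.0],
}

def subtract(u, v):
    return [x - y for x, y in zip(u, v)]

def add(u, v):
    return [x + y for x, y in zip(u, v)]

# vector(goats) - vector(goat) => relationship(plural)
plural = subtract(vectors["goats"], vectors["goat"])

# vector(dog) + relationship(plural) => vector(dogs)
predicted = add(vectors["dog"], plural)
print(predicted == vectors["dogs"])
```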
Analogies
a is to a’ as b is to b’
Run is to Runs as Walk is to Walks
Amazing is to Amazingly as Great is to Greatly
Atlanta is to Georgia as Tampa is to Florida
Analogy Tests
[Figure: an analogy posed with emoji, in the form "a is to a’ as b is to ⁇"]
Analogy Tests
a is to a’ as b is to b’
vector(a’) - vector(a) => relationship(a’:a)
relationship(a’:a) + vector(b) => c
We search for the vector most similar to c and check
if it represents b’.
Combined: vector(a’) - vector(a) + vector(b) => c
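The search step can be sketched as a nearest-neighbor lookup by cosine similarity. Two assumptions go beyond the slides: cosine is used as the similarity measure, and the three query words are excluded from the candidates (a common convention). The vectors are toy values invented for the example.

```python
import math

def cosine(u, v):
    dot = sum(x * y for x, y in zip(u, v))
    norm = math.sqrt(sum(x * x for x in u)) * math.sqrt(sum(y * y for y in v))
    return dot / norm if norm else 0.0

def solve_analogy(vectors, a, a_prime, b):
    """Return the word whose vector is most similar to vector(a') - vector(a) + vector(b)."""
    c = [ap - x + y for ap, x, y in zip(vectors[a_prime], vectors[a], vectors[b])]
    # Assumption: the query words themselves are not allowed as answers.
    candidates = [w for w in vectors if w not in {a, a_prime, b}]
    return max(candidates, key=lambda w: cosine(vectors[w], c))

# Toy vectors, invented for illustration only.
vectors = {
    "run":   [1.0, 0.0, 0.1],
    "runs":  [1.0, 1.0, 0.1],
    "walk":  [0.1, 0.0, 1.0],
    "walks": [0.1, 1.0, 1.0],
    "great": [0.5, 0.0, -1.0],
}
answer = solve_analogy(vectors, "run", "runs", "walk")
print(answer)
```

The analogy counts as recognized when `answer` equals b’, here "walks".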
Analogy Tests
We can evaluate a model by measuring how many
analogies it completes correctly
Analogy Tests
[Figure: emoji examples of a’ - a + b => c, one marked correct (✅) and three marked incorrect (❌)]
Analogy Test Approach
• Train EmoryNLP Word2Vec on Wikipedia 2015 and
NYTimes
• We take the most widely used analogy test set and
break the results down by linguistic category
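A per-category breakdown might be computed as below. This is a sketch under assumptions: the question tuples, the `predict` callback, and the toy predictor are hypothetical stand-ins, not the EmoryNLP tooling or the real test file format.

```python
from collections import defaultdict

def accuracy_by_category(questions, predict):
    """questions: iterable of (category, a, a_prime, b, b_prime) tuples."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for category, a, a_prime, b, b_prime in questions:
        total[category] += 1
        if predict(a, a_prime, b) == b_prime:
            correct[category] += 1
    return {category: correct[category] / total[category] for category in total}

def toy_predict(a, a_prime, b):
    """Hypothetical stand-in for a trained model: always appends 's'."""
    return b + "s"

# Invented questions, for illustration only.
questions = [
    ("Plural", "goat", "goats", "dog", "dogs"),
    ("Plural", "car", "cars", "city", "cities"),
    ("Gender", "king", "queen", "man", "woman"),
]
scores = accuracy_by_category(questions, toy_predict)
print(scores)
```

Reporting one score per category, instead of a single overall number, is what exposes where a model is strong or weak.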
Analogy Test Categories
• Lexical
Common Capital Countries: England is to London
World Capitals: Nigeria is to Abuja
City in State: Los Angeles is to California
Currency: England is to Pound
Gender: King is to Queen
Opposite: Hot is to Cold
Nationality Adjective: American is to America
• Grammatical
Adjective to Adverb: Amazing is to Amazingly
Comparative: Warm is to Warmer
Superlative: Loud is to Loudest
Present Participle: Coding is to Code
Past Tense: Dancing is to Danced
Plural: City is to Cities
Plural Verb: Describe is to Describes
Distributed Representations of Words and Phrases and their Compositionality - Mikolov et al., 2013.
Tests available at code.google.com/p/word2vec/source/browse/trunk/questions-words.txt
Word2Vec Evaluation
[Figures not preserved in this transcript]
Word2Vec Analysis
Very high grammatical competency, very low lexical
competency
• Adjective to Adverb is the only grammatical category
built on a derivational morpheme
Idea: could we modify the training process to
improve weak model attributes?
Contexts
Recall:
• The Distributional Hypothesis posits that full co-
occurrence information would capture all
information about a language
It is possible that existing methods aren’t selecting
the best possible information.
Contexts
What if we used linguistic structure to select
contexts?
• Dependency Structure
• Predicate Argument Structure
Dependency Structure
• Linguistic units are connected by directed links called
dependencies
He opened the bottle with an opener for her.
[Figure: dependency tree rooted at "opened", with the nodes he, bottle, the, with, an, opener, for, her]
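One way to hold such a tree in code is as a list of (dependent, head) arcs. The arcs below are an assumed parse of the example sentence, written by hand for illustration; they are not output from a parser and the slide's exact tree is not recoverable here.

```python
from collections import defaultdict

tokens = "He opened the bottle with an opener for her".split()

# (dependent index, head index) arcs. Assumed parse, for illustration only.
arcs = [(0, 1), (2, 3), (3, 1), (4, 1), (6, 4), (5, 6), (7, 1), (8, 7)]

def build_children(arcs):
    """Map each head index to the sorted list of its dependents."""
    children = defaultdict(list)
    for dependent, head in sorted(arcs):
        children[head].append(dependent)
    return children

def find_root(arcs, n_tokens):
    """The root is the only token that never appears as a dependent."""
    dependents = {d for d, _ in arcs}
    return next(i for i in range(n_tokens) if i not in dependents)

children = build_children(arcs)
root = find_root(arcs, len(tokens))
root_dependents = [tokens[c] for c in children[root]]
print(tokens[root], root_dependents)
```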
Predicate Argument Structure
• Predicate argument structure is concerned with the
arguments accepted by verbs and predicates in a
sentence
• Ex: “open” might accept an “opener”, “thing opened”,
“instrument” and “benefactive”
Predicate Argument Structure
He opened the bottle with an opener for her.
Arg0: Opener (He)
Arg1: Thing opened (the bottle)
Arg2: Instrument (an opener)
Arg3: Benefactive (her)
Previous Models
Word2Vec
• Only adjacency contexts, very little categorical
analysis
Levy and Goldberg
• Used some basic dependency context information
• Explored few context types and did not do a
thorough categorical analysis
New Models
Sentence: He wants to throw her the bright red frisbee in the park
[Figure: dependency tree. wants -> He, throw; throw -> to, her, frisbee, in; frisbee -> the, bright, red; in -> park; park -> the]
Dep1: Dependency Children
Training Word: throw
Dep1 Context: to her frisbee in
Word2Vec Context: He wants to her the bright red frisbee
Dep2: Dependency Grandchildren
Training Word: throw
Dep2 Context: to her the bright red frisbee in park
Word2Vec Context: He wants to her the bright red frisbee
Dep2h: Dependency Grandchildren, Head
Training Word: throw
Dep2h Context: wants to her the bright red frisbee in park
Word2Vec Context: He wants to her the bright red frisbee
Sib1: Nearest Siblings
Training Word: frisbee
Sib1 Context: her in
Word2Vec Context: throw her the bright red in the park
Sib2: Nearest Two Siblings
Training Word: frisbee
Sib2 Context: to her in
Word2Vec Context: throw her the bright red in the park
Sib1Dep1: Dependency Children, Nearest Siblings
Training Word: frisbee
Sib1Dep1 Context: her the bright red in
Word2Vec Context: throw her the bright red in the park
Sib2Dep1: Dependency Children, Nearest Two Siblings
Training Word: frisbee
Sib2Dep1 Context: to her the bright red in
Word2Vec Context: throw her the bright red in the park
Sib1Dep2: Dependency Grandchildren, Nearest Siblings
Training Word: throw
Sib1Dep2 Context: He to her the bright red frisbee in park
Word2Vec Context: He wants to her the bright red frisbee
All Siblings: All Nodes Sharing The Same Head
Training Word: frisbee
All Siblings Context: to her in
Word2Vec Context: throw her the bright red in the park
Srl1: Semantic Role Head
Training Word: He
Srl1 Context: wants throw
Word2Vec Context: He wants to throw her the bright red
Semantic roles: verb: wants, wanter: He; verb: throw, thrower: He
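The dependency and sibling contexts can be sketched as small tree queries. This is an illustrative reconstruction, not the dissertation's code: the `HEADS` array is an assumed parse chosen to be consistent with the contexts listed on the slides, and the function names and the `dep`/`sib`/`head` parameters are hypothetical.

```python
SENTENCE = "He wants to throw her the bright red frisbee in the park".split()

# HEADS[i] is the index of token i's head; None marks the root.
# Assumed parse, chosen to match the example contexts on the slides.
HEADS = [1, None, 3, 1, 3, 8, 8, 8, 3, 3, 11, 9]

def children(i):
    return [j for j, h in enumerate(HEADS) if h == i]

def descendants(i, depth):
    """Indices of nodes at most `depth` dependency links below token i."""
    found, frontier = [], [i]
    for _ in range(depth):
        frontier = [c for node in frontier for c in children(node)]
        found.extend(frontier)
    return found

def nearest_siblings(i, k):
    """Up to k siblings (nodes sharing token i's head) on each side of token i, k >= 1."""
    if HEADS[i] is None:
        return []
    siblings = [j for j in children(HEADS[i]) if j != i]
    left = [j for j in siblings if j < i]
    right = [j for j in siblings if j > i]
    return left[-k:] + right[:k]

def context(i, dep=0, sib=0, head=False):
    """Context words for token i, returned in sentence order."""
    picked = set(descendants(i, dep))
    if sib:
        picked.update(nearest_siblings(i, sib))
    if head and HEADS[i] is not None:
        picked.add(HEADS[i])
    return " ".join(SENTENCE[j] for j in sorted(picked))

throw, frisbee = 3, 8
print("Dep1    :", context(throw, dep=1))
print("Dep2h   :", context(throw, dep=2, head=True))
print("Sib1Dep1:", context(frisbee, dep=1, sib=1))
```

Passing a large `sib` value, such as `len(SENTENCE)`, returns all nodes sharing the same head.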
Overall Precision
[Figure not preserved in this transcript]
Syntactic Breakdown
[Figures not preserved in this transcript]
Categorical Breakdown: Lexical
[Figures not preserved in this transcript]
Categorical Breakdown: Grammatical
[Figures not preserved in this transcript]
Context Analysis
Why are these models doing so well lexically?
• Idea: the data outside of the Word2Vec context
window is providing most of the improvement.
Sentence: He wants to throw her the bright red frisbee in the park
Training Word: throw
Dep2 (Dependency Grandchildren) Context: to her the bright red frisbee in park
Word2Vec Context: He wants to her the bright red frisbee
New Information: in park
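The amount of new information in a context can be measured as the positions it covers that the Word2Vec window does not. In this sketch the window size of 5 is an assumption that happens to reproduce the Word2Vec context shown for "throw", and the Dep2 positions are copied by hand from the example.

```python
sentence = "He wants to throw her the bright red frisbee in the park".split()
throw = 3
window = 5  # assumption: consistent with the Word2Vec context shown for "throw"

# Positions inside the linear Word2Vec window around the training word.
w2v_positions = {
    j for j in range(len(sentence))
    if j != throw and abs(j - throw) <= window
}

# Positions of the Dep2 context for "throw", taken from the example.
dep2_positions = {2, 4, 5, 6, 7, 8, 9, 11}

w2v_context = [sentence[j] for j in sorted(w2v_positions)]
new_information = [sentence[j] for j in sorted(dep2_positions - w2v_positions)]
new_fraction = len(new_information) / len(dep2_positions)
print(new_information, new_fraction)
```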
Context Analysis
Some of the biggest improvements are in models that
aren’t getting nearly as much new information.
• This suggests that the benefit comes from the context
selection process itself
Rank Analysis
We find the rank score of a model by taking the
following:
[Formula and figure not preserved in this transcript]
Rank Sum Analysis
We find the rank sum score of a model by taking the
sum of all the model’s categorical scores
[Figure not preserved in this transcript]
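Read literally, the rank sum is a sum over one model's per-category scores. The sketch below assumes the categorical scores are already available as numbers; how each one is computed is not recoverable from this transcript, and the model names and values are placeholders, not results from the dissertation.

```python
def rank_sum(scores_by_model):
    """scores_by_model: {model: {category: score}} -> {model: summed score}."""
    return {
        model: sum(category_scores.values())
        for model, category_scores in scores_by_model.items()
    }

# Placeholder numbers, invented for illustration only.
toy_scores = {
    "model_a": {"Plural": 3, "Gender": 1, "Currency": 2},
    "model_b": {"Plural": 1, "Gender": 3, "Currency": 3},
}
totals = rank_sum(toy_scores)
best = max(totals, key=totals.get)
print(totals, best)
```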
Ensemble Models
It seems difficult to construct models that are at least
the sum of their parts.
• What would the ideal composite model look like?
• Can we achieve the same result in a different way?
Ensemble Models
Idea: Specific models have specific competencies.
We can build a class that chooses what model to use
based on the current task.
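Such a class could look like the sketch below. Everything in it is hypothetical: the class name `CategoryEnsemble`, the routing table, and the two toy functions that stand in for trained models.

```python
class CategoryEnsemble:
    """Routes each analogy question to the model assigned to its category."""

    def __init__(self, models, best_model_for, default):
        self.models = models                  # name -> callable(a, a_prime, b)
        self.best_model_for = best_model_for  # category -> model name
        self.default = default                # model name used for unknown categories

    def predict(self, category, a, a_prime, b):
        name = self.best_model_for.get(category, self.default)
        return self.models[name](a, a_prime, b)

def adjacency_model(a, a_prime, b):
    """Toy stand-in for a model that is strong on plurals."""
    return b + "s"

def dependency_model(a, a_prime, b):
    """Toy stand-in for a model that is strong on capitals."""
    return {"Nigeria": "Abuja"}.get(b, "?")

ensemble = CategoryEnsemble(
    models={"adjacency": adjacency_model, "dependency": dependency_model},
    best_model_for={"Plural": "adjacency", "World Capitals": "dependency"},
    default="adjacency",
)
print(ensemble.predict("World Capitals", "England", "London", "Nigeria"))
print(ensemble.predict("Plural", "goat", "goats", "dog"))
```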
Ensemble Selection
We start by building an ensemble with only one
model, then two models and so on
Ensemble Selection
• Sum of Categorical Competencies
• Rank Score Selection
• Maximum Information Selection
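The slides do not define these selection strategies in text, so the following is only one possible reading of Maximum Information Selection: grow the ensemble one model at a time, always adding the model that answers the most questions not yet covered. The function name, the tie-breaking rule, and the toy data are assumptions.

```python
def greedy_max_information(correct_by_model, size):
    """Greedily pick up to `size` models, each adding the most newly covered questions.

    correct_by_model: {model name: set of question ids the model answers correctly}
    """
    chosen, covered = [], set()
    remaining = dict(correct_by_model)
    while remaining and len(chosen) < size:
        # Ties are broken alphabetically (an arbitrary choice for this sketch).
        name = max(sorted(remaining), key=lambda m: len(remaining[m] - covered))
        covered |= remaining.pop(name)
        chosen.append(name)
    return chosen, covered

# Invented question ids, for illustration only.
correct = {
    "w2v":  {1, 2, 3, 4},
    "dep1": {3, 4, 5},
    "sib1": {5, 6, 7},
}
chosen, covered = greedy_max_information(correct, size=2)
print(chosen, sorted(covered))
```

In this toy case "sib1" is picked second even though it answers fewer questions than "w2v", because it contributes the most questions that are not already covered.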
Ensemble Selection
[Figure: Sum of Categorical Competencies and Rank Score Selection, each shown as a ranked list of models (1, 2, 3, ...)]
Ensemble Selection
[Figure: Maximum Information Selection, adding one model at a time (1, 2, 3, 4)]
Ensemble Results
[Figures not preserved in this transcript; Word2Vec appears as a label in the final chart]
Ensemble Analysis
• With this method we show a theoretical upper
bound for the information learned by these models
• This can be rapidly adapted to other problems,
allowing NLP systems to categorically select
models based on their specific competencies
Conclusion
• The meaning extracted from each context word is not
uniform
• Context selection strongly affects linguistic competencies
• Adjacency contexts are uniquely proficient at extracting
inflectional morpheme information
• Dependency contexts are significantly better at learning
lexical information
• Contexts are not compositional: combining contexts does not
reliably combine their strengths
• Selecting models by category can be highly effective
Future Work
• Vector space analysis
• Additional models and linguistic context building
Questions

 
Attention is All You Need for AMR Parsing
Attention is All You Need for AMR ParsingAttention is All You Need for AMR Parsing
Attention is All You Need for AMR Parsing
 
Graph-to-Text Generation and its Applications to Dialogue
Graph-to-Text Generation and its Applications to DialogueGraph-to-Text Generation and its Applications to Dialogue
Graph-to-Text Generation and its Applications to Dialogue
 
Real-time Coreference Resolution for Dialogue Understanding
Real-time Coreference Resolution for Dialogue UnderstandingReal-time Coreference Resolution for Dialogue Understanding
Real-time Coreference Resolution for Dialogue Understanding
 
Topological Sort
Topological SortTopological Sort
Topological Sort
 
Tries - Put
Tries - PutTries - Put
Tries - Put
 
Multi-modal Embedding Learning for Early Detection of Alzheimer's Disease
Multi-modal Embedding Learning for Early Detection of Alzheimer's DiseaseMulti-modal Embedding Learning for Early Detection of Alzheimer's Disease
Multi-modal Embedding Learning for Early Detection of Alzheimer's Disease
 
Building Widely-Interpretable Semantic Networks for Dialogue Contexts
Building Widely-Interpretable Semantic Networks for Dialogue ContextsBuilding Widely-Interpretable Semantic Networks for Dialogue Contexts
Building Widely-Interpretable Semantic Networks for Dialogue Contexts
 
How to make Emora talk about Sports Intelligently
How to make Emora talk about Sports IntelligentlyHow to make Emora talk about Sports Intelligently
How to make Emora talk about Sports Intelligently
 

Recently uploaded

Portal Kombat : extension du réseau de propagande russe
Portal Kombat : extension du réseau de propagande russePortal Kombat : extension du réseau de propagande russe
Portal Kombat : extension du réseau de propagande russe中 央社
 
Harnessing Passkeys in the Battle Against AI-Powered Cyber Threats.pptx
Harnessing Passkeys in the Battle Against AI-Powered Cyber Threats.pptxHarnessing Passkeys in the Battle Against AI-Powered Cyber Threats.pptx
Harnessing Passkeys in the Battle Against AI-Powered Cyber Threats.pptxFIDO Alliance
 
Syngulon - Selection technology May 2024.pdf
Syngulon - Selection technology May 2024.pdfSyngulon - Selection technology May 2024.pdf
Syngulon - Selection technology May 2024.pdfSyngulon
 
Long journey of Ruby Standard library at RubyKaigi 2024
Long journey of Ruby Standard library at RubyKaigi 2024Long journey of Ruby Standard library at RubyKaigi 2024
Long journey of Ruby Standard library at RubyKaigi 2024Hiroshi SHIBATA
 
Google I/O Extended 2024 Warsaw
Google I/O Extended 2024 WarsawGoogle I/O Extended 2024 Warsaw
Google I/O Extended 2024 WarsawGDSC PJATK
 
AI mind or machine power point presentation
AI mind or machine power point presentationAI mind or machine power point presentation
AI mind or machine power point presentationyogeshlabana357357
 
Top 10 CodeIgniter Development Companies
Top 10 CodeIgniter Development CompaniesTop 10 CodeIgniter Development Companies
Top 10 CodeIgniter Development CompaniesTopCSSGallery
 
The Zero-ETL Approach: Enhancing Data Agility and Insight
The Zero-ETL Approach: Enhancing Data Agility and InsightThe Zero-ETL Approach: Enhancing Data Agility and Insight
The Zero-ETL Approach: Enhancing Data Agility and InsightSafe Software
 
Introduction to FIDO Authentication and Passkeys.pptx
Introduction to FIDO Authentication and Passkeys.pptxIntroduction to FIDO Authentication and Passkeys.pptx
Introduction to FIDO Authentication and Passkeys.pptxFIDO Alliance
 
Event-Driven Architecture Masterclass: Challenges in Stream Processing
Event-Driven Architecture Masterclass: Challenges in Stream ProcessingEvent-Driven Architecture Masterclass: Challenges in Stream Processing
Event-Driven Architecture Masterclass: Challenges in Stream ProcessingScyllaDB
 
WebRTC and SIP not just audio and video @ OpenSIPS 2024
WebRTC and SIP not just audio and video @ OpenSIPS 2024WebRTC and SIP not just audio and video @ OpenSIPS 2024
WebRTC and SIP not just audio and video @ OpenSIPS 2024Lorenzo Miniero
 
ADP Passwordless Journey Case Study.pptx
ADP Passwordless Journey Case Study.pptxADP Passwordless Journey Case Study.pptx
ADP Passwordless Journey Case Study.pptxFIDO Alliance
 
Observability Concepts EVERY Developer Should Know (DevOpsDays Seattle)
Observability Concepts EVERY Developer Should Know (DevOpsDays Seattle)Observability Concepts EVERY Developer Should Know (DevOpsDays Seattle)
Observability Concepts EVERY Developer Should Know (DevOpsDays Seattle)Paige Cruz
 
Working together SRE & Platform Engineering
Working together SRE & Platform EngineeringWorking together SRE & Platform Engineering
Working together SRE & Platform EngineeringMarcus Vechiato
 
TopCryptoSupers 12thReport OrionX May2024
TopCryptoSupers 12thReport OrionX May2024TopCryptoSupers 12thReport OrionX May2024
TopCryptoSupers 12thReport OrionX May2024Stephen Perrenod
 
How to Check CNIC Information Online with Pakdata cf
How to Check CNIC Information Online with Pakdata cfHow to Check CNIC Information Online with Pakdata cf
How to Check CNIC Information Online with Pakdata cfdanishmna97
 
TrustArc Webinar - Unified Trust Center for Privacy, Security, Compliance, an...
TrustArc Webinar - Unified Trust Center for Privacy, Security, Compliance, an...TrustArc Webinar - Unified Trust Center for Privacy, Security, Compliance, an...
TrustArc Webinar - Unified Trust Center for Privacy, Security, Compliance, an...TrustArc
 
Vector Search @ sw2con for slideshare.pptx
Vector Search @ sw2con for slideshare.pptxVector Search @ sw2con for slideshare.pptx
Vector Search @ sw2con for slideshare.pptxjbellis
 
Microsoft CSP Briefing Pre-Engagement - Questionnaire
Microsoft CSP Briefing Pre-Engagement - QuestionnaireMicrosoft CSP Briefing Pre-Engagement - Questionnaire
Microsoft CSP Briefing Pre-Engagement - QuestionnaireExakis Nelite
 
TEST BANK For, Information Technology Project Management 9th Edition Kathy Sc...
TEST BANK For, Information Technology Project Management 9th Edition Kathy Sc...TEST BANK For, Information Technology Project Management 9th Edition Kathy Sc...
TEST BANK For, Information Technology Project Management 9th Edition Kathy Sc...marcuskenyatta275
 

Recently uploaded (20)

Portal Kombat : extension du réseau de propagande russe
Portal Kombat : extension du réseau de propagande russePortal Kombat : extension du réseau de propagande russe
Portal Kombat : extension du réseau de propagande russe
 
Harnessing Passkeys in the Battle Against AI-Powered Cyber Threats.pptx
Harnessing Passkeys in the Battle Against AI-Powered Cyber Threats.pptxHarnessing Passkeys in the Battle Against AI-Powered Cyber Threats.pptx
Harnessing Passkeys in the Battle Against AI-Powered Cyber Threats.pptx
 
Syngulon - Selection technology May 2024.pdf
Syngulon - Selection technology May 2024.pdfSyngulon - Selection technology May 2024.pdf
Syngulon - Selection technology May 2024.pdf
 
Long journey of Ruby Standard library at RubyKaigi 2024
Long journey of Ruby Standard library at RubyKaigi 2024Long journey of Ruby Standard library at RubyKaigi 2024
Long journey of Ruby Standard library at RubyKaigi 2024
 
Google I/O Extended 2024 Warsaw
Google I/O Extended 2024 WarsawGoogle I/O Extended 2024 Warsaw
Google I/O Extended 2024 Warsaw
 
AI mind or machine power point presentation
AI mind or machine power point presentationAI mind or machine power point presentation
AI mind or machine power point presentation
 
Top 10 CodeIgniter Development Companies
Top 10 CodeIgniter Development CompaniesTop 10 CodeIgniter Development Companies
Top 10 CodeIgniter Development Companies
 
The Zero-ETL Approach: Enhancing Data Agility and Insight
The Zero-ETL Approach: Enhancing Data Agility and InsightThe Zero-ETL Approach: Enhancing Data Agility and Insight
The Zero-ETL Approach: Enhancing Data Agility and Insight
 
Introduction to FIDO Authentication and Passkeys.pptx
Introduction to FIDO Authentication and Passkeys.pptxIntroduction to FIDO Authentication and Passkeys.pptx
Introduction to FIDO Authentication and Passkeys.pptx
 
Event-Driven Architecture Masterclass: Challenges in Stream Processing
Event-Driven Architecture Masterclass: Challenges in Stream ProcessingEvent-Driven Architecture Masterclass: Challenges in Stream Processing
Event-Driven Architecture Masterclass: Challenges in Stream Processing
 
WebRTC and SIP not just audio and video @ OpenSIPS 2024
WebRTC and SIP not just audio and video @ OpenSIPS 2024WebRTC and SIP not just audio and video @ OpenSIPS 2024
WebRTC and SIP not just audio and video @ OpenSIPS 2024
 
ADP Passwordless Journey Case Study.pptx
ADP Passwordless Journey Case Study.pptxADP Passwordless Journey Case Study.pptx
ADP Passwordless Journey Case Study.pptx
 
Observability Concepts EVERY Developer Should Know (DevOpsDays Seattle)
Observability Concepts EVERY Developer Should Know (DevOpsDays Seattle)Observability Concepts EVERY Developer Should Know (DevOpsDays Seattle)
Observability Concepts EVERY Developer Should Know (DevOpsDays Seattle)
 
Working together SRE & Platform Engineering
Working together SRE & Platform EngineeringWorking together SRE & Platform Engineering
Working together SRE & Platform Engineering
 
TopCryptoSupers 12thReport OrionX May2024
TopCryptoSupers 12thReport OrionX May2024TopCryptoSupers 12thReport OrionX May2024
TopCryptoSupers 12thReport OrionX May2024
 
How to Check CNIC Information Online with Pakdata cf
How to Check CNIC Information Online with Pakdata cfHow to Check CNIC Information Online with Pakdata cf
How to Check CNIC Information Online with Pakdata cf
 
TrustArc Webinar - Unified Trust Center for Privacy, Security, Compliance, an...
TrustArc Webinar - Unified Trust Center for Privacy, Security, Compliance, an...TrustArc Webinar - Unified Trust Center for Privacy, Security, Compliance, an...
TrustArc Webinar - Unified Trust Center for Privacy, Security, Compliance, an...
 
Vector Search @ sw2con for slideshare.pptx
Vector Search @ sw2con for slideshare.pptxVector Search @ sw2con for slideshare.pptx
Vector Search @ sw2con for slideshare.pptx
 
Microsoft CSP Briefing Pre-Engagement - Questionnaire
Microsoft CSP Briefing Pre-Engagement - QuestionnaireMicrosoft CSP Briefing Pre-Engagement - Questionnaire
Microsoft CSP Briefing Pre-Engagement - Questionnaire
 
TEST BANK For, Information Technology Project Management 9th Edition Kathy Sc...
TEST BANK For, Information Technology Project Management 9th Edition Kathy Sc...TEST BANK For, Information Technology Project Management 9th Edition Kathy Sc...
TEST BANK For, Information Technology Project Management 9th Edition Kathy Sc...
 

Categorical Evaluation for Advanced Distributional Semantic Models

  • 1. Categorical Evaluation for Advanced Distributional Semantic Models An Undergraduate Dissertation by Reid Kilgore
  • 2. Agenda • Background • Syntactic breakdown of model competency • Using new models to examine competency improvements • Analysis and new approaches
  • 3. Word Representation We need some way to allow NLP systems to leverage language information • Tokenization • Distributional Semantics
  • 4. Distributional Semantics “A word can be known by the company it keeps” - Firth Premise: • Words that appear together are more similar than words that do not appear together Idea: • We can define a word by analyzing the frequency with which every word occurs with every other word
  • 5. Distributional Models • Brown Clustering • Grouping words by the terms most likely to have come before • Word Embeddings • Mapping words to high-dimensional vectors
  • 6-8. Language Modeling Sentence: I’m giving a presentation right now. Window = 2: giving a talk right now / giving a lecture right now / giving a speech right now
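The window idea on the Language Modeling slides can be sketched in a few lines. This is a minimal illustration, assuming whitespace tokenization and a symmetric window; the function name `window_context` is hypothetical and does not come from the slides.

```python
def window_context(tokens, index, window=2):
    """Return the tokens within `window` positions of tokens[index], excluding it."""
    left = tokens[max(0, index - window):index]
    right = tokens[index + 1:index + 1 + window]
    return left + right


# Assumption: plain whitespace tokenization of the example sentence.
sentence = "I'm giving a presentation right now".split()
target = sentence.index("presentation")
context = window_context(sentence, target, window=2)
print(context)  # ['giving', 'a', 'right', 'now']
```

With a window of 2, "talk", "lecture" and "speech" would all see the same context as "presentation", which is why a model trained on such contexts tends to place them close together.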
  • 9. Neural Network Models • A Neural Network is a model built with the use of synthetic neurons • This is an unsupervised machine learning technique Neural Network Language Models generally try to determine the likelihood of a word relative to its context, or vice versa
  • 10. Word2Vec • By far the most influential modern Neural Network Language Model • Significantly more efficient than previous models
  • 11-12. Vector Offset vector(a’) - vector(a) We can extract the relationship between two word vectors by taking their offset. vector(goats) - vector(goat) => relationship(plural); vector(dog) + relationship(plural) => vector(dogs)
  • 13. Analogies a is to a’ as b is to b’ Run is to Runs as Walk is to Walks Amazing is to Amazingly as Great is to Greatly Atlanta is to Georgia as Tampa is to Florida
  • 14. Analogy Tests [Figure: an analogy question illustrated with emoji, in the form "a is to a’ as b is to ?"]
  • 16-18. Analogy Tests a is to a’ as b is to b’. vector(a’) - vector(a) => relationship(a’:a); relationship(a’:a) + vector(b) => c. We search for the vector most similar to c and check if it represents b’. Combined: vector(a’) - vector(a) + vector(b) => c
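The analogy test above can be sketched with toy vectors. The three-dimensional vectors below are invented for illustration only and are not taken from any trained model. Cosine similarity as the similarity measure, and excluding the three query words from the candidates, are assumptions (both are common practice, but the slides do not specify them).

```python
import math

# Toy vectors, invented for illustration: the third dimension stands in for "plural".
VECTORS = {
    "goat":  [1.0, 0.0, 0.0],
    "goats": [1.0, 0.0, 1.0],
    "dog":   [0.0, 1.0, 0.0],
    "dogs":  [0.0, 1.0, 1.0],
    "cat":   [1.0, 1.0, 0.0],
}


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(y * y for y in v))
    return dot / (norm_u * norm_v)


def solve_analogy(a, a_prime, b, vectors):
    """Return the word whose vector is most similar to vector(a') - vector(a) + vector(b)."""
    c = [ap - av + bv for ap, av, bv in zip(vectors[a_prime], vectors[a], vectors[b])]
    # Assumption: the query words themselves are not allowed as answers.
    candidates = [w for w in vectors if w not in (a, a_prime, b)]
    return max(candidates, key=lambda w: cosine(vectors[w], c))


prediction = solve_analogy("goat", "goats", "dog", VECTORS)
print(prediction)  # dogs
```

The test counts as passed when the returned word equals b’, here "dogs".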
  • 19. Analogy Tests We can evaluate a model by evaluating how many analogies it can recognize
  • 21-23. Analogy Tests [Figure: emoji illustrations of vector-offset analogy tests; one candidate answer is marked correct (✅) and the other candidates are marked incorrect (❌)]
  • 24. Analogy Test Approach • Train EmoryNLP Word2Vec on Wikipedia 2015 and NYTimes • We take the predominantly used test set and analyze based on linguistic categories
  • 25-39. Analogy Test Categories Test set from Distributed Representations of Words and Phrases and their Compositionality - Mikolov et al., 2013. Tests available at code.google.com/p/word2vec/source/browse/trunk/questions-words.txt
      • Lexical
          Common Capital Countries: England is to London
          World Capitals: Nigeria is to Abuja
          City in State: Los Angeles is to California
          Currency: England is to Pound
          Gender: King is to Queen
          Opposite: Hot is to Cold
          Nationality Adjective: American is to America
      • Grammatical
          Adjective to Adverb: Amazing is to Amazingly
          Comparative: Warm is to Warmer
          Superlative: Loud is to Loudest
          Present Participle: Coding is to Code
          Past Tense: Dancing is to Danced
          Plural: City is to Cities
          Plural Verb: Describe is to Describes
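Scoring a model per category, rather than with one overall number, can be sketched as follows. This assumes the test file uses lines beginning with ":" to open a category, followed by four-word analogy lines; the sample text is illustrative, and `toy_solver` is a hypothetical stand-in for a real model, not the method used in the dissertation.

```python
from collections import defaultdict

# Illustrative sample in the assumed file layout (": category" headers, then "a a' b b'" lines).
SAMPLE = """\
: capital-common-countries
Athens Greece Baghdad Iraq
: gram8-plural
banana bananas bird birds
banana bananas car cars
"""


def parse_analogies(text):
    """Group four-word analogy lines under the most recent ': category' header."""
    tests = defaultdict(list)
    category = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(":"):
            category = line[1:].strip()
        else:
            a, a_prime, b, b_prime = line.split()
            tests[category].append((a, a_prime, b, b_prime))
    return dict(tests)


def accuracy_by_category(tests, solve):
    """Fraction of analogies in each category for which solve(a, a', b) returns b'."""
    return {
        category: sum(solve(a, ap, b) == bp for a, ap, b, bp in items) / len(items)
        for category, items in tests.items()
    }


def toy_solver(a, a_prime, b):
    """Hypothetical stand-in for a model: it only knows how to append 's'."""
    return b + "s"


scores = accuracy_by_category(parse_analogies(SAMPLE), toy_solver)
print(scores)  # {'capital-common-countries': 0.0, 'gram8-plural': 1.0}
```

The toy solver is perfect on plurals and useless on capitals, which is exactly the kind of difference a single aggregate accuracy would hide.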
  • 42. Word2Vec Analysis Very high grammatical competency, very low lexical • Adverb to adjective - only derivational morpheme Idea: could we modify the training process to improve weak model attributes?
  • 43. Contexts Recall: • The Distributional Hypothesis posits that full co-occurrence information would capture all information about a language It is possible that existing methods aren’t selecting the best possible information.
  • 44. Contexts What if we used linguistic structure to select contexts? • Dependency Structure • Predicate Argument Structure
  • 45. Dependency Structure • Linguistic units are connected by directed links called dependencies Example: He opened the bottle with an opener for her. [Figure: dependency tree of the example sentence, rooted at "opened"]
  • 46. Predicate Argument Structure • Predicate argument structure is concerned with the arguments accepted by verbs and predicates in a sentence • Ex: “open” might accept an “opener”, “thing opened”, “instrument” and “benefactive”
  • 47-51. Predicate Argument Structure Arg0: Opener; Arg1: Thing opened; Arg2: Instrument; Arg3: Benefactive. Example: He opened the bottle with an opener for her. [Figure: dependency tree of the example sentence, rooted at "opened"]
  • 52. Previous Models Word2Vec • Only adjacent contexts, very little categorical analysis Levy and Goldberg • Used some basic dependency context information • Did not explore much information, didn't do thorough categorical analysis
  • 54-55. Sentence: He wants to throw her the bright red frisbee in the park. Training Word: throw. [Figure: dependency tree of the sentence, with each word's distance from the training word, from -3 to 8]
  • 56. Word2Vec Context (Training Word: throw): He wants to her the bright red frisbee
  • 57-62. Dep1: Dependency Children (Training Word: throw). Dep1 Context: to her frisbee in. Word2Vec Context: He wants to her the bright red frisbee
  • 63. Dep2: Dependency Grandchildren (Training Word: throw). Dep2 Context: to her the bright red frisbee in park. Word2Vec Context: He wants to her the bright red frisbee
  • 64. Dep2h: Dependency Grandchildren, Head (Training Word: throw). Dep2h Context: wants to her the bright red frisbee in park. Word2Vec Context: He wants to her the bright red frisbee
  • 65. Sib1: Nearest Siblings (Training Word: frisbee). Sib1 Context: her in. Word2Vec Context: throw her the bright red in the park
  • 66. Sib2: Nearest Two Siblings (Training Word: frisbee). Sib2 Context: to her in. Word2Vec Context: throw her the bright red in the park
  • 67. Sib1Dep1: Dependency Children, Nearest Siblings (Training Word: frisbee). Sib1Dep1 Context: her the bright red in. Word2Vec Context: throw her the bright red in the park
  • 68. Sib2Dep1: Dependency Children, Nearest Two Siblings (Training Word: frisbee). Sib2Dep1 Context: to her the bright red in. Word2Vec Context: throw her the bright red in the park
  • 69. Sib1Dep2: Dependency Grandchildren, Nearest Siblings (Training Word: throw). Sib1Dep2 Context: He to her the bright red frisbee in park. Word2Vec Context: He wants to her the bright red frisbee
  • 70. All Siblings: All Nodes Sharing The Same Head (Training Word: frisbee). All Siblings Context: to her in. Word2Vec Context: throw her the bright red in the park
  • 71. Srl1: Semantic Role Head (Training Word: He). Srl1 Context: wants throw. Word2Vec Context: He wants to throw her the bright red. Roles: verb: wants, wanter: He; verb: throw, thrower: He
  • 82. Context Analysis Why are these models doing so well lexically? • Idea: the data outside of the Word2Vec context window is providing most of the improvement.
  • 83-84. Context Analysis • Idea: the data outside of the Word2Vec context window is providing most of the improvement. Example (Dep2: Dependency Grandchildren, Training Word: throw): Sentence: He wants to throw her the bright red frisbee in the park. Dep2 Context: to her the bright red frisbee in park. Word2Vec Context: He wants to her the bright red frisbee. [Figure: the part of the Dep2 context labeled "New Information"]
  • 88. Context Analysis Some of the biggest improvements are in models that aren’t getting nearly as much new information. • This indicates that the benefit is from the context selection process
  • 89-90. Rank Analysis We find the rank score of a model by taking the following: [equation shown on slide]
  • 92-93. Rank Sum Analysis We find the rank sum score of a model by taking the sum of all the model’s categorical scores
  • 95. Ensemble Models It seems difficult to construct models that are at least the sum of their parts. • What would the ideal composite model look like? • Can we achieve the same result in a different way?
  • 96. Ensemble Models Idea: Specific models have specific competencies. We can build a class that chooses what model to use based on the current task.
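A class that chooses a model by task category might look like the sketch below. The class name, method names and all scores are hypothetical; the numbers are invented placeholders and are not results from the dissertation.

```python
class CategoricalEnsemble:
    """Picks, for each analogy category, the model with the highest score."""

    def __init__(self, scores):
        # scores: {model_name: {category: accuracy}}
        self.best = {}
        for model, by_category in scores.items():
            for category, score in by_category.items():
                if category not in self.best or score > self.best[category][1]:
                    self.best[category] = (model, score)

    def choose(self, category):
        """Name of the model to use for this category."""
        return self.best[category][0]

    def ensemble_score(self):
        """Total of the best per-category scores across the ensemble."""
        return sum(score for _, score in self.best.values())


# Invented placeholder scores, for illustration only.
SCORES = {
    "Word2Vec": {"Plural": 0.75, "Currency": 0.125},
    "Dep2": {"Plural": 0.5, "Currency": 0.25},
}

ensemble = CategoricalEnsemble(SCORES)
print(ensemble.choose("Plural"))    # Word2Vec
print(ensemble.choose("Currency"))  # Dep2
print(ensemble.ensemble_score())    # 1.0
```

Because each category is served by its strongest model, the ensemble's total can never fall below that of any single member.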
  • 97. Ensemble Selection We start by building an ensemble with only one model, then two models and so on
  • 98. Ensemble Selection • Sum of Categorical Competencies • Rank Score Selection • Maximum Information Selection
  • 99. Ensemble Selection [Figure: two panels, "Sum of Categorical Competencies" and "Rank Score Selection", each numbered 1, 2, 3, 4, 5, 6, 7, ...]
  • 111. Ensemble Analysis • With this method we show a theoretical upper bound for the information learned by these models • This can be rapidly adapted to other problems, allowing NLP systems to categorically select models based on their specific competencies
  • 112. Conclusion • The meaning extracted from each context word is not uniform • Context selection massively impacts linguistic competencies • Adjacency contexts are uniquely proficient at extracting inflectional morpheme information • Dependency contexts are significantly better at learning lexical information • Contexts are not compositional • Categorically selecting models can be incredibly effective
  • 113. Future Work • Vector space analysis • Additional models and linguistic context building