Introduction to Distributional
Semantics
André Freitas
Insight Centre for Data Analytics
Insight Workshop on Distributional Semantics
Galway, 2014
Based on the Great ESSLLI Tutorial from Evert & Lenci
Outline
 Contemporary Semantics
 Distributional Semantics
 Compositional-Distributional Semantics
 Take-away message
Contemporary
Semantics
Shift in the Semantics Landscape
Corroboration
Praxis
Scientific / Formal
Philosophical
Semantics as a
complex phenomenon
Semantics for a Complex World
• Most semantic models have dealt with particular types of
constructions, and have been carried out under very simplifying
assumptions, in true lab conditions.
• If these idealizations are removed it is not clear at all that modern
semantics can give a full account of all but the simplest
models/statements.
Sahlgren, 2013
Formal World Real World
Baroni et al., 2012
What is Distributional
Semantics?
Meaning
 Word meaning is usually represented in terms of some formal,
symbolic structure, either external or internal to the word
 External structure
- Associations between different concepts
 Internal structure
- Feature (property, attribute) lists
 The semantic properties of a word are derived from the formal
structure of its representation
- e.g. Inference algorithm, etc.
Semantics = Meaning representation model (data) +
inference model
Formal Representation of Meaning
 Modelling fine-grained lexical inferences
Formal Representation of Meaning
(Problems)
 Different meanings
- bat (animal), bat (artefact)
 Meaning variation in context
- clever politician, clever tycoon
 Meaning evolution
 Ambiguity, vagueness, inconsistency
Word meaning acquisition
Lack of flexibility
Scalability
Distributional Hypothesis
“Words occurring in similar (linguistic) contexts tend
to be semantically similar”
 He filled the wampimuk with the substance, passed it
around and we all drunk some
 We found a little, hairy wampimuk sleeping behind the
tree
Weak and Strong DH (Lenci, 2008)
 Weak DH:
- Word meaning is reflected in linguistic distributions
- By inspecting a sufficiently large number of distributional
contexts we may have a useful surrogate representation of
meaning.
 Strong DH:
- A cognitive hypothesis about the form and origin of semantic
representations
Contextual Representation
 Abstract structure that accumulates encounters with the words
in various (linguistic) contexts.
 For our purposes …
- Context is equated with linguistic context
Distributional Semantic Models (DSMs)
“The dog barked in the park. The owner of the dog put him on the
leash since he barked.”
bark
dog
park
leash
contexts = nouns and verbs in the same
sentence
bark : 2
park : 1
leash : 1
owner : 1
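The counts above can be reproduced with a short sketch. It assumes the two sentences have already been lemmatized and reduced to content words by hand (a real pipeline would use a tagger and lemmatizer), and it leaves out "put" to mirror the slide; the function name is illustrative.

```python
from collections import Counter

# Hand-lemmatized content words per sentence (an assumption, see above).
sentences = [
    ["dog", "bark", "park"],
    ["owner", "dog", "leash", "bark"],
]

def cooccurrence_counts(target, sentences):
    """Count the words that share a sentence with each occurrence of `target`."""
    counts = Counter()
    for tokens in sentences:
        occurrences = tokens.count(target)
        if occurrences == 0:
            continue
        for word in tokens:
            if word != target:
                counts[word] += occurrences
    return counts

dog_contexts = cooccurrence_counts("dog", sentences)
print(dict(dog_contexts))  # {'bark': 2, 'park': 1, 'owner': 1, 'leash': 1}
```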
Distributional Semantic Models (DSMs)
distributional matrix = targets x contexts
contexts
targets
Vector Space Model (VSM)
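A minimal sketch of turning per-target counts into a targets x contexts matrix. Only the "dog" row follows the example sentences; the "cat" and "car" rows are made-up values for illustration, and `build_matrix` is a hypothetical helper name.

```python
# Co-occurrence counts per target word (cat and car rows are hypothetical).
counts = {
    "dog": {"bark": 2, "park": 1, "leash": 1, "owner": 1},
    "cat": {"park": 1, "owner": 1},
    "car": {"park": 1, "owner": 2},
}

def build_matrix(counts):
    """Return (targets, contexts, matrix) with one row per target word."""
    targets = sorted(counts)
    contexts = sorted({c for row in counts.values() for c in row})
    matrix = [[counts[t].get(c, 0) for c in contexts] for t in targets]
    return targets, contexts, matrix

targets, contexts, matrix = build_matrix(counts)
for target, row in zip(targets, matrix):
    print(target, row)
```

Each row is the distributional vector of one target; each column is one context dimension.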
Semantic Similarity & Relatedness
[Figure: vector space plot with the labels car, dog, cat, bark, run and leash; θ marks an angle between vectors]
Semantic Similarity & Relatedness
 Semantic similarity - two words sharing a high number of
salient features (attributes)
- synonymy (car/automobile)
- hyperonymy (car/vehicle)
- co-hyponymy (car/van/truck)
 Semantic relatedness (Budanitsky & Hirst 2006) - two words
semantically associated without being necessarily similar
- function (car/drive)
- meronymy (car/tyre)
- location (car/road)
- attribute (car/fast)
Distributional Semantic Models (DSMs)
 Computational models that build contextual semantic representations
from corpus data
 Semantic context is represented by a vector
 Vectors are obtained through the statistical analysis of the linguistic
contexts of a word
 Salience of contexts (cf. context weighting scheme)
 Semantic similarity/relatedness as the core operation over the model
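As a sketch of the core operation, the cosine of the angle θ between two context vectors is one common similarity measure. The vectors below are made-up counts over the contexts (bark, run, leash), not corpus data, and `neighbours` is an illustrative helper.

```python
import math

def cosine(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)

# Hypothetical counts over the context dimensions (bark, run, leash).
vectors = {
    "dog": [5, 3, 4],
    "cat": [0, 3, 1],
    "car": [0, 4, 0],
}

def neighbours(word, vectors):
    """Rank all other words by cosine similarity to `word`, highest first."""
    scores = [(other, cosine(vectors[word], vec))
              for other, vec in vectors.items() if other != word]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)

print(neighbours("dog", vectors))
```

With these toy values "cat" ranks above "car" as a neighbour of "dog".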
DSMs as Commonsense Reasoning
Commonsense is here
[Figure: vector space plot with the labels car, dog, cat, bark, run and leash; θ marks an angle between vectors]
DSMs as Commonsense Reasoning
DSMs as Commonsense Reasoning
[Figure: vector space plot with the labels car, dog, cat, bark, run and leash; θ marks an angle between vectors]
...
vs.
Semantic best-effort
Demonstration (EasyESA)
http://treo.deri.ie/easyesa/
Applications
 Applications
- Semantic search
- Question answering
- Approximate semantic inference
- Word sense disambiguation
- Paraphrase detection
- Text entailment
- Semantic anomaly detection
...
Alternative Names for DSMs
 Corpus-based semantics
 Statistical semantics
 Geometrical models of meaning
 Vector semantics
 Word (semantic) space models
Definition of DSMs
Building a DSM
 Pre-process a corpus (target, context)
 Count the target-context co-occurrences
 Weight the contexts (optional)
 Build the distributional matrix
 Reduce the matrix dimensions (optional)
 Parameters
- Corpus
- Context type
- Weighting scheme
- Similarity measure
- Number of dimensions
 A parameter configuration determines the DSM: (LSA, ESA, …)
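The steps and parameters above can be sketched as one small function driven by a configuration object. This is an illustration under stated assumptions, not any of the named models: `DSMConfig` and `build_dsm` are hypothetical names, the context type is fixed to a symmetric word window, log weighting is taken as log(1 + count), and the optional dimensionality reduction step is omitted.

```python
import math
from collections import Counter, defaultdict
from dataclasses import dataclass

@dataclass
class DSMConfig:
    """Hypothetical parameter bundle; field names are illustrative."""
    window_size: int = 2    # context type: symmetric word window
    weighting: str = "log"  # "raw" or "log"

def build_dsm(sentences, config):
    # Count target-context co-occurrences in pre-tokenized sentences.
    counts = defaultdict(Counter)
    for tokens in sentences:
        for i, target in enumerate(tokens):
            start = max(0, i - config.window_size)
            stop = min(len(tokens), i + config.window_size + 1)
            for j in range(start, stop):
                if j != i:
                    counts[target][tokens[j]] += 1
    # Choose the weighting scheme.
    if config.weighting == "log":
        weight = lambda c: math.log(1 + c)
    elif config.weighting == "raw":
        weight = float
    else:
        raise ValueError(f"unknown weighting: {config.weighting}")
    # Build the distributional matrix as one weighted row per target.
    contexts = sorted({c for row in counts.values() for c in row})
    vectors = {t: [weight(row[c]) for c in contexts] for t, row in counts.items()}
    return contexts, vectors

sentences = [["dog", "bark", "park"], ["owner", "walk", "dog"]]
contexts, vectors = build_dsm(sentences, DSMConfig(window_size=1, weighting="raw"))
print(contexts)
print(vectors["dog"])
```

Changing `window_size` or `weighting` yields a different model from the same corpus, which is the sense in which a parameter configuration determines the DSM.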
Parameters
 Corpus pre-processing
- Stemming/lemmatization
- POS tagging
- Syntactic Dependencies
 Context
- Document
- Paragraph
- Passage
- Word windows
- Words
- Linguistic features
- Linguistic patterns
- Verbs : contexts nouns
- Verbs : contexts adverbs
- etc.
- Size
- Shape
Context
Engineering
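A sketch of word-window contexts with the two window parameters listed above. "Shape" is read here as the weighting profile across the window (flat versus decaying with distance); that reading, the function name and the triangular formula are assumptions for illustration.

```python
def window_contexts(tokens, index, size=2, shape="rectangular"):
    """Return (context word, weight) pairs around tokens[index]."""
    weighted = []
    for offset in range(-size, size + 1):
        position = index + offset
        if offset == 0 or position < 0 or position >= len(tokens):
            continue
        if shape == "rectangular":
            weight = 1.0  # every position in the window counts equally
        elif shape == "triangular":
            weight = (size - abs(offset) + 1) / size  # closer words weigh more
        else:
            raise ValueError(f"unknown window shape: {shape}")
        weighted.append((tokens[position], weight))
    return weighted

tokens = "the owner of the dog put him on the leash".split()
i = tokens.index("dog")
print(window_contexts(tokens, i, size=2))
print(window_contexts(tokens, i, size=2, shape="triangular"))
```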
Effect of Parameters
Context Weighting
 Smoothing frequency differences: from raw counts to
log-frequency.
 Association measures (Evert 2005) are used to give more
weight to contexts that are more significantly associated with a
target word
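A minimal sketch of both weighting ideas. Log weighting is taken as log(1 + count), and positive pointwise mutual information (PPMI) stands in as one widely used association measure; the slide does not say which measure is meant, so that choice and the toy matrix are assumptions.

```python
import math

def log_weight(count):
    """Dampen raw frequency differences; log(1 + count) keeps zero at zero."""
    return math.log(1 + count)

def ppmi(matrix):
    """Positive PMI weighting of a targets x contexts count matrix."""
    total = sum(sum(row) for row in matrix)
    row_sums = [sum(row) for row in matrix]
    col_sums = [sum(col) for col in zip(*matrix)]
    weighted = []
    for i, row in enumerate(matrix):
        new_row = []
        for j, count in enumerate(row):
            if count == 0:
                new_row.append(0.0)
                continue
            # PMI = log2( P(target, context) / (P(target) * P(context)) )
            pmi = math.log2(count * total / (row_sums[i] * col_sums[j]))
            new_row.append(max(0.0, pmi))
        weighted.append(new_row)
    return weighted

# Hypothetical counts: rows = (dog, car), columns = (bark, road).
counts = [[3, 1], [1, 3]]
weights = ppmi(counts)
print(weights)
```

Contexts that co-occur with a target more often than chance would predict get a positive weight; the rest are clipped to zero.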
Context Weighting
Measures
Kiela & Clark, 2014
Similarity Measures
Kiela & Clark, 2014
What is the best parameter configuration?
 The best parameter configuration depends on the task.
 Systematic exploration of the parameters
DSM Instances
 Latent Semantic Analysis (Landauer & Dumais 1996)
 Hyperspace Analogue to Language (Lund & Burgess 1996)
 Infomap NLP (Widdows 2004)
 Random Indexing (Karlgren & Sahlgren 2001)
 Dependency Vectors (Padó & Lapata 2007)
 Explicit Semantic Analysis (Gabrilovich & Markovitch, 2008)
 Distributional Memory (Baroni & Lenci 2009)
Compositional
Semantics
Paraphrase Detection
I find it rather odd that people are already trying to tie the
Commission's hands in relation to the proposal for a
directive, while at the same calling on it to present a Green
Paper on the current situation with regard to optional and
supplementary health insurance schemes.
I find it a little strange to now obliging the Commission to
a motion for a resolution and to ask him at the same time
to draw up a Green Paper on the current state of voluntary
insurance and supplementary sickness insurance.
=?
Compositional Semantics
 Can we extend DS to account for the meaning of phrases
and sentences?
 Compositionality: The meaning of a complex expression
is a function of the meaning of its constituent parts.
Compositional Semantics
Words in which the meaning is
directly determined by their
distributional behaviour (e.g.,
nouns).
Words that act as functions
transforming the distributional
profile of other words (e.g., verbs,
adjectives, …).
Compositional Semantics
Mixture Function
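The slide's own formula is not preserved in this transcript. As a hedged illustration, mixture-style composition is commonly realized as a weighted sum or a component-wise product of the two word vectors; the function names, weights and vectors below are assumptions.

```python
def additive(u, v, alpha=1.0, beta=1.0):
    """Weighted additive mixture: alpha * u + beta * v."""
    return [alpha * a + beta * b for a, b in zip(u, v)]

def multiplicative(u, v):
    """Component-wise multiplicative mixture: u * v."""
    return [a * b for a, b in zip(u, v)]

# Hypothetical distributional vectors for two words.
clever = [1, 2, 0]
politician = [3, 4, 5]

print(additive(clever, politician))        # [4.0, 6.0, 5.0]
print(multiplicative(clever, politician))  # [3, 8, 0]
```

Both functions return a vector in the same space as the inputs, so the composed phrase can be compared with single words using the same similarity measure.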
Compositional Semantics
 Take the syntactic structure to constitute the backbone
guiding the assembly of the semantic representations of
phrases.
(CHASE × cats) × dogs
- CHASE: 3rd order tensor
- cats, dogs: vectors
Baroni et al., 2012
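A small sketch of the arithmetic behind (CHASE × cats) × dogs: contracting a 3rd order tensor with a vector gives a matrix, and contracting that matrix with a second vector gives a vector. The tensor values, the 2-dimensional space and the choice of which index each argument contracts are made up for illustration.

```python
def contract_last(tensor, vector):
    """Contract the last index of a nested-list tensor with a vector."""
    if isinstance(tensor[0], list):
        return [contract_last(sub, vector) for sub in tensor]
    return sum(t * v for t, v in zip(tensor, vector))

# Hypothetical 2 x 2 x 2 tensor for the transitive verb, indexed [i][j][k].
CHASE = [
    [[1, 0], [0, 1]],
    [[0, 2], [1, 0]],
]
cats = [1, 2]
dogs = [3, 1]

chase_cats = contract_last(CHASE, cats)    # matrix: [[1, 2], [4, 1]]
sentence = contract_last(chase_cats, dogs)  # vector: [5, 13]
print(chase_cats, sentence)
```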
Formal Model
 Distributional Semantics & Category Theory
Take-away message
 Low acquisition effort
 Simple way to build a commonsense KB
 Semantic approximation as a built-in construct
 Semantic best-effort
 Simple to use
 DSMs are evolving fast (compositional and formal grounding)
 Distributional semantics brings a promising approach for
building semantic models that work in the real world
Great Introductory References
 Evert & Lenci ESSLLI Tutorial on Distributional
Semantics, 2009. (many slides were taken or adapted
from this great tutorial).
 Turney & Pantel, From Frequency to Meaning: Vector
Space Models of Semantics, 2010.
 Baroni et al., Frege in Space: A Program for
Compositional Distributional Semantics, 2012.
 Kiela & Clark: A Systematic Study of Semantic Vector
Space Model Parameters, 2014.

Introduction to Distributional Semantics

  • 1. Introduction to Distributional Semantics André Freitas Insight Centre for Data Analytics Insight Workshop on Distributional Semantics Galway, 2014 Based on the Great ESSLLI Tutorial from Evert & Lenci
  • 2. Outline  Contemporary Semantics  Distributional Semantics  Compositional-Distributional Semantics  Take-away message
  • 4. Shift in the Semantics Landscape Corroboration PraxisScientific / FormalPhilosophical Semantics as a complex phenomena
  • 5. Semantics for a Complex World • Most semantic models have dealt with particular types of constructions, and have been carried out under very simplifying assumptions, in true lab conditions. • If these idealizations are removed it is not clear at all that modern semantics can give a full account of all but the simplest models/statements. Sahlgren, 2013 Formal World Real World Baroni et al., 2012
  • 7. Meaning  Word meaning is usually represented in terms of some formal, symbolic structure, either external or internal to the word  External structure - Associations between different concepts  Internal structure - Feature (property, attribute) lists  The semantic properties of a word are derived from the formal structure of its representation - e.g. Inference algorithm, etc. Semantics = Meaning representation model (data) + inference model
  • 8. Formal Representation of Meaning  Modelling fine-grained lexical inferences
  • 9. Formal Representation of Meaning (Problems)  Different meanings - bat (animal), bat (artefact)  Meaning variation in context - clever politician, clever tycoon  Meaning evolution  Ambiguity, vagueness, inconsistency Word meaning acquisition Lack of flexibility Scalability
  • 10. Distributional Hypothesis “Words occurring in similar (linguistic) contexts tend to be semantically similar”  He filled the wampimuk with the substance, passed it around and we all drunk some  We found a little, hairy wampimuk sleeping behind the tree
  • 11. Weak and Strong DH (Lenci, 2008)  Weak DH: - Word meaning is reflected in linguistic distributions - By inspecting a sufficiently large number of distributional contexts we may have a useful surrogate representation of meaning.  Strong DH: - A cognitive hypothesis about the form and origin of semantic representations
  • 12. Contextual Representation  Abstract structure that accumulates encounters with the words in various (linguistic) contexts.  For our purposes … - Context is equated with linguistic context
  • 13. Distributional Semantic Models (DSMs) “The dog barked in the park. The owner of the dog put him on the leash since he barked.”
  • 14. Distributional Semantic Models (DSMs) “The dog barked in the park. The owner of the dog put him on the leash since he barked.” contexts = nouns and verbs in the same sentence
  • 15. Distributional Semantic Models (DSMs) “The dog barked in the park. The owner of the dog put him on the leash since he barked.” bark dog park leash contexts = nouns and verbs in the same sentence bark : 2 park : 1 leash : 1 owner : 1
  • 16. Distributional Semantic Models (DSMs) distributional matrix = targets x contexts contexts targets Vector Space Model (VSM)
  • 17. Semantic Similarity & Relatedness θ car dog cat bark run leash
  • 18. Semantic Similarity & Relatedness  Semantic similarity - two words sharing a high number of salient - features (attributes) - synonymy (car/automobile) - hyperonymy (car/vehicle) - co-hyponymy (car/van/truck)  Semantic relatedness (Budanitsky & Hirst 2006) - two words semantically associated without being necessarily similar - function (car/drive) - meronymy (car/tyre) - location (car/road) - attribute (car/fast)
  • 19. Distributional Semantic Models (DSMs)  Computational models that build contextual semantic representations from corpus data  Semantic context is represented by a vector  Vectors are obtained through the statistical analysis of the linguistic contexts of a word  Salience of contexts (cf. context weighting scheme)  Semantic similarity/relatedness as the core operation over the model
  • 20. DSMs as Commonsense Reasoning Commonsense is here [Figure: vector space plot with the labels car, dog, cat, bark, run, leash; θ marks the angle between two vectors]
  • 21. DSMs as Commonsense Reasoning
  • 22. DSMs as Commonsense Reasoning [Figure: vector space plot with the labels car, dog, cat, bark, run, leash; θ marks the angle between two vectors] ... vs. Semantic best-effort
  • 24. Applications  Applications - Semantic search - Question answering - Approximate semantic inference - Word sense disambiguation - Paraphrase detection - Text entailment - Semantic anomaly detection ...
  • 25. Alternative Names for DSMs  Corpus-based semantics  Statistical semantics  Geometrical models of meaning  Vector semantics  Word (semantic) space models
  • 27. Building a DSM  Pre-process a corpus (target, context)  Count the target-context co-occurrences  Weight the contexts (optional)  Build the distributional matrix  Reduce the matrix dimensions (optional)  Parameters - Corpus - Context type - Weighting scheme - Similarity measure - Number of dimensions  A parameter configuration determines the DSM: (LSA, ESA, …)
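The counting and matrix-building steps can be sketched as follows. This is a minimal illustration under stated assumptions: the corpus is a hypothetical, already lemmatised token list, the context type is a symmetric word window, and `build_dsm` and its `window` parameter are names invented for this example. Weighting and dimensionality reduction are left out.

```python
from collections import Counter, defaultdict

def build_dsm(tokens, targets, window=2):
    """Count target-context co-occurrences within +/- `window` tokens
    and return (context vocabulary, targets x contexts matrix)."""
    counts = defaultdict(Counter)
    for i, word in enumerate(tokens):
        if word not in targets:
            continue
        lo, hi = max(0, i - window), min(len(tokens), i + window + 1)
        for j in range(lo, hi):
            if j != i:
                counts[word][tokens[j]] += 1
    # Columns: every context word seen with at least one target.
    contexts = sorted({c for row in counts.values() for c in row})
    # Rows: one per target, in the order given by the caller.
    matrix = [[counts[t][c] for c in contexts] for t in targets]
    return contexts, matrix

# Hypothetical pre-processed corpus (content lemmas only).
tokens = ["dog", "bark", "park", "owner", "dog", "put", "leash", "bark"]
contexts, matrix = build_dsm(tokens, ["dog", "park"], window=1)
print(contexts)
print(matrix)
```

Changing `window`, the target list or the pre-processing yields a different matrix, which is the sense in which a parameter configuration determines the DSM.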
  • 28. Parameters  Corpus pre-processing - Stemming/lemmatization - POS tagging - Syntactic Dependencies  Context - Document - Paragraph - Passage - Word windows - Words - Linguistic features - Linguistic patterns - Verbs : contexts nouns - Verbs : contexts adverbs - etc. - Size - Shape Context Engineering
  • 30. Context Weighting  Smoothing frequency differences: from raw counts to log-frequency.  Association measures (Evert 2005) give more weight to contexts that are more significantly associated with a target word
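Both weighting ideas can be illustrated in a few lines. The sketch below uses log(1 + count) for log-frequency smoothing and positive pointwise mutual information (PPMI) as one commonly used association measure; PPMI is chosen here for illustration and is not necessarily the measure the slides have in mind. The function names and the toy counts are assumptions.

```python
import math

def log_frequency(n):
    """Dampen raw counts: log(1 + n), so a zero count stays zero."""
    return math.log(1 + n)

def ppmi(counts):
    """Positive PMI for a dict-of-dicts table counts[target][context]."""
    total = sum(sum(row.values()) for row in counts.values())
    target_totals = {t: sum(row.values()) for t, row in counts.items()}
    context_totals = {}
    for row in counts.values():
        for c, n in row.items():
            context_totals[c] = context_totals.get(c, 0) + n
    weighted = {}
    for t, row in counts.items():
        weighted[t] = {}
        for c, n in row.items():
            if n == 0:
                continue  # unseen pairs get no weight
            # PMI = log2( P(t, c) / (P(t) * P(c)) )
            pmi = math.log2(n * total / (target_totals[t] * context_totals[c]))
            weighted[t][c] = max(0.0, pmi)  # clip negative associations to 0
    return weighted

# Made-up counts: "run" is frequent everywhere, "bark" is specific to "dog".
counts = {"dog": {"bark": 2, "run": 2}, "car": {"run": 4}}
weights = ppmi(counts)
print(weights)
```

In this toy table "bark" receives a higher weight for "dog" than the more frequent but less specific context "run".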
  • 33. What is the best parameter configuration?  The best parameter configuration depends on the task.  Systematic exploration of the parameters
  • 34. DSM Instances  Latent Semantic Analysis (Landauer & Dumais 1996)  Hyperspace Analogue to Language (Lund & Burgess 1996)  Infomap NLP (Widdows 2004)  Random Indexing (Karlgren & Sahlgren 2001)  Dependency Vectors (Padó & Lapata 2007)  Explicit Semantic Analysis (Gabrilovich & Markovitch, 2008)  Distributional Memory (Baroni & Lenci 2009)
  • 36. Paraphrase Detection I find it rather odd that people are already trying to tie the Commission's hands in relation to the proposal for a directive, while at the same time calling on it to present a Green Paper on the current situation with regard to optional and supplementary health insurance schemes. I find it a little strange to now obliging the Commission to a motion for a resolution and to ask him at the same time to draw up a Green Paper on the current state of voluntary insurance and supplementary sickness insurance. =?
  • 37. Compositional Semantics  Can we extend DS to account for the meaning of phrases and sentences?  Compositionality: The meaning of a complex expression is a function of the meaning of its constituent parts.
  • 38. Compositional Semantics Words in which the meaning is directly determined by their distributional behaviour (e.g., nouns). Words that act as functions transforming the distributional profile of other words (e.g., verbs, adjectives, …).
  • 40. Compositional Semantics  Take the syntactic structure to constitute the backbone guiding the assembly of the semantic representations of phrases. Example: (CHASE × cats) × dogs, where CHASE is a 3rd-order tensor, cats and dogs are vectors, and (CHASE × cats) is the intermediate result (Baroni et al., 2012)
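The composition (CHASE × cats) × dogs can be sketched as two contractions: the 3rd-order tensor applied to the object vector gives a matrix, and that matrix applied to the subject vector gives the sentence vector. The values, the 2-dimensional space and the choice of which tensor index is contracted first are assumptions made for illustration only; the helper names are hypothetical.

```python
def tensor_times_vector(tensor, vec):
    """Contract the last index of a 3rd-order tensor with a vector,
    giving a matrix."""
    return [[sum(t * v for t, v in zip(fibre, vec)) for fibre in slab]
            for slab in tensor]

def matrix_times_vector(matrix, vec):
    """Ordinary matrix-vector product, giving a vector."""
    return [sum(m * v for m, v in zip(row, vec)) for row in matrix]

# Toy 2-dimensional semantic space with made-up numbers.
CHASE = [[[1, 0], [0, 1]],
         [[0, 2], [1, 0]]]   # transitive verb: 3rd-order tensor
cats = [1, 2]                # object noun: vector
dogs = [3, 1]                # subject noun: vector

chase_cats = tensor_times_vector(CHASE, cats)       # verb phrase: matrix
dogs_chase_cats = matrix_times_vector(chase_cats, dogs)  # sentence: vector
print(chase_cats, dogs_chase_cats)
```

The order of each intermediate result mirrors the syntactic structure: each argument the verb consumes lowers the order of the representation by one.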
  • 41. Formal Model  Distributional Semantics & Category Theory
  • 42. Take-away message  Low acquisition effort  Simple way to build a commonsense KB  Semantic approximation as a built-in construct  Semantic best-effort  Simple to use  DSMs are evolving fast (compositional and formal grounding)  Distributional semantics brings a promising approach for building semantic models that work in the real world
  • 43. Great Introductory References  Evert & Lenci, ESSLLI Tutorial on Distributional Semantics, 2009 (many slides were taken or adapted from this great tutorial).  Turney & Pantel, From Frequency to Meaning: Vector Space Models of Semantics, 2010.  Baroni et al., Frege in Space: A Program for Compositional Distributional Semantics, 2012.  Kiela & Clark, A Systematic Study of Semantic Vector Space Model Parameters, 2014.