A Semantic Relatedness Measure Based on Co-occurrence Network and Graph Kernel
Tae-Gil Noh (노태길), Kyungpook National University, tailblues@me.com
January 20, 2011, Workshop Celebrating the Hosting of ACL
Overview
- A new semantic relatedness measure for words and phrases, built from co-occurrence observations on a raw corpus
- Improves on the vector space model by using network representations
- Similarity is measured in a kernel space: co-occurrence observations are compared with a kernel (an R-convolution kernel, specifically a graph kernel)
Introduction: Semantic relatedness measure
- Measures the semantic distance between two terms or phrases; also known as semantic similarity or semantic distance
- A tool that can be used in various NLP situations. Examples:
  - Which is semantically closer to "orange juice": 음료수 (drinks) or 향신료 (spice)?
  - Which sense describes the term better in each context? 1) "Apple launched a new device ..." 2) "Apple is my favorite fruit, second only to ..." A) A company famous for its iPhone and iPod. B) A fruit with lots of vitamin C and a shiny red or green skin.
Semantic Relatedness, with Lexical Resources
- Based on lexical resources: WordNet, thesauri, ontologies, ...
- Pros: reliable data produced by lexicographers; detailed relationships between lexical entries
- Cons: built by humans, so expensive; not readily available for minor languages; there are always new or unlisted entries
Semantic Relatedness, with Corpus
- Relatedness is computed from observations on an unlabeled corpus, yielding a numerical value
- Various methods: co-occurrence vectors, pointwise mutual information (PMI), rank reduction (LSA, random projection, ESA), topic models (PLSA, LDA, CTM)
Semantic Relatedness, with Corpus
- Corpus-based methods generally assume two terms are "semantically close" if they:
  - occur in similar documents (share similar distributions across documents), or
  - co-occur with similar terms (share common co-occurring terms)
- Occurrences and co-occurrences are generally expressed as vectors: the vectors themselves are used as representations, or are refined by mathematical/statistical methods (weighting schemes, rank reduction, higher-order vectors, random projection, topic estimation, etc.)
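As a concrete illustration of the vector route, a minimal sketch (function names and the tiny corpus are invented for this example) that builds co-occurrence vectors and compares them with cosine similarity:

```python
from collections import Counter
from math import sqrt

def cooc_vector(target, sentences, window=5):
    """Count terms that co-occur with `target` within a token window."""
    vec = Counter()
    for sent in sentences:
        toks = sent.lower().split()
        for i, tok in enumerate(toks):
            if tok != target:
                continue
            lo, hi = max(0, i - window), min(len(toks), i + window + 1)
            for j in range(lo, hi):
                if j != i:
                    vec[toks[j]] += 1
    return vec

def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(w * b.get(t, 0) for t, w in a.items())
    na = sqrt(sum(v * v for v in a.values()))
    nb = sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0
```

Two terms then count as related when their co-occurrence vectors point in similar directions.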
Generating “Semantic Space”
Motivation
- A network has "more structure": a network of terms can be seen as a relaxation of the bag-of-words (independence) assumption
- Previous work showed that the structure of a co-occurrence network can be used to induce word senses [Véronis 2005], but the network itself had never been used as a representation
- What if co-occurrence vectors are replaced by co-occurrence networks? What tools are needed? Can we gain some performance improvements?
An example of capturing co-occurrences, as a vector or as a network. Five sentences containing "disc":
- A data disc can contain anything; system files, ...
- Eject the system disc by pressing ...
- This is their best concert on disc.
- On the double disc soundtrack, the orchestra have ...
- Disc of the year & best orchestra winner is announced by ...
Per-occurrence co-occurrence sets: disc-{data, system, files}, disc-{system}, disc-{concert}, disc-{soundtrack, orchestra}, disc-{year, orchestra, winner}.
As a vector over (concert, data, files, orchestra, soundtrack, system, winner, year): (1, 1, 1, 2, 1, 2, 1, 1).
As a network: the same terms become nodes linked to "disc" (and to each other when they co-occur in the same context), with the counts as edge weights.
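The merging step can be sketched as follows (a hypothetical helper, fed the five "disc" contexts from the example, target term included):

```python
from collections import defaultdict
from itertools import combinations

def build_network(contexts):
    """Merge per-occurrence co-occurrence sets into one weighted network:
    nodes are terms, and an edge weight counts how many contexts
    contained both endpoint terms."""
    adj = defaultdict(lambda: defaultdict(int))
    for ctx in contexts:
        for u, v in combinations(sorted(ctx), 2):
            adj[u][v] += 1
            adj[v][u] += 1
    return adj

# The five "disc" contexts from the example above.
disc_contexts = [
    {"disc", "data", "system", "files"},
    {"disc", "system"},
    {"disc", "concert"},
    {"disc", "soundtrack", "orchestra"},
    {"disc", "year", "orchestra", "winner"},
]
net = build_network(disc_contexts)
```

Unlike the flat vector, the network also records that, e.g., soundtrack and orchestra occurred together, not only that each occurred with "disc".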
Replacing co-occurrence vectors with co-occurrence networks
- Vector representation: the co-occurrence patterns of A and B become vectors, compared with similarity/distance functions of the vector domain, such as CosSim(A,B), L1dist(A,B), EucDist(A,B)
- Network representation: the same co-occurrence patterns become networks, compared with a network similarity measure NetworkSIM(A,B)
Using co-occurrence networks as a direct representation
- Evaluate the gains on several NLP tasks, comparing performance against vector-based baselines and unsupervised state-of-the-art systems
- Tasks: synonym finding (TOEFL synonym test set), word sense disambiguation (general and biomedical domains), annotation translation (automatic translation of Flickr tags)
Two basic issues of using network representations
- Expressing phrases: how do we compose the representation of a phrase? In vector semantic spaces, vector summation/multiplication is used to represent phrases; equivalent network operations must be defined.
- Comparing two networks: given two network representations, how can their similarity be calculated? A network similarity measure equivalent to cosine similarity is needed.
An example WSD setup, from [Wilks, 1990] & [Schütze, 1998]; sometimes called a modified Lesk algorithm.
- WordNet senses (synsets) for "disc": sense 1 = {disk, diskette, magnetic disc}, sense 2 = {phonograph, record, sound recording}
- Context 1: "Microsoft will replace your disc, if it's within ..." → {Microsoft, disc}; context 2: "Previn and the LSO on the front of any disc were ..." → {LSO, disc}
- Each sense is represented by summing its member-term vectors (Vsense1 = v_disk + v_diskette + v_magnetic disc; Vsense2 = v_phonograph + v_record + v_sound recording), and each context by summing its term vectors; the sense whose vector forms the smallest angle θ with the context vector is chosen
The two issues in the WSD setup, network case: the context networks of {Microsoft, disc} and {LSO, disc} must be composed from single-term networks (issue 1, the (+) operation), and then compared against the sense networks of {disk, diskette, magnetic disc} and {phonograph, record, sound recording} (issue 2, the similarity function).
Issue #1: Network operators
- Generate context (multi-term) networks from single-term networks
- Two network operations: network union (equivalent to vector summation) and network intersection (similar to vector multiplication in effect)
- Both are defined as matrix operations, since networks are represented as adjacency matrices
(Figure: example networks A and B as weighted adjacency matrices, and their union and intersection.)
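Since networks are stored as adjacency matrices, both operators reduce to elementwise matrix operations; a minimal numpy sketch with toy edge weights (the vocabulary and weights are invented for illustration):

```python
import numpy as np

vocab = ["disc", "system", "files", "concert"]  # shared node ordering

def to_matrix(edges, vocab):
    """Symmetric adjacency matrix from a weighted edge dict."""
    idx = {t: i for i, t in enumerate(vocab)}
    A = np.zeros((len(vocab), len(vocab)))
    for (u, v), w in edges.items():
        A[idx[u], idx[v]] = A[idx[v], idx[u]] = w
    return A

A = to_matrix({("disc", "system"): 2, ("disc", "files"): 1}, vocab)
B = to_matrix({("disc", "system"): 1, ("disc", "concert"): 1}, vocab)

union = A + B         # like vector summation: edge weights add
intersection = A * B  # elementwise: only edges present in both survive
```

Union accumulates evidence from all member terms; intersection keeps only the co-occurrence structure the member terms share.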
(Figures: the network of "disc", the network of "disc & LSO", and the network of "disc & Microsoft".)
Issue #2: Similarity measure for networks
- Cosine similarity is a normalized dot product; the network analogue is a graph kernel, a dot product over two graph structures
- Graph kernels have been used in biomedical domains to compare proteins and genes
- A graph kernel is an R-convolution kernel, a systematic way to define kernels over structures; in language processing, tree kernels are the most widely used R-convolution kernels
Random walk graph kernel
- The most widely used graph kernel: it compares two graphs by counting their common random walks
- The result is a dot product in an infinitely high-dimensional space, where each dimension corresponds to one possible random walk
- It suffers from "tottering", a well-known problem in which the kernel's effectiveness is severely limited by counting cycles again and again
- I propose an efficient acyclic version that can be used when all node labels are unique, as in co-occurrence networks
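For intuition, a minimal sketch of a geometric random-walk kernel under the unique-label assumption (node labels unique and shared between the two graphs, so the direct-product graph's adjacency reduces to the elementwise product); `lam` and `max_len` are illustrative parameters, and the proposed tottering-free acyclic variant is not reproduced here:

```python
import numpy as np

def walk_kernel(A1, A2, lam=0.1, max_len=10):
    """Sum weighted counts of walks common to both graphs.
    A1, A2: adjacency matrices over the same labelled node ordering."""
    Ax = A1 * A2                 # direct-product adjacency (unique labels)
    K, P = 0.0, np.eye(Ax.shape[0])
    for _ in range(max_len):     # truncated geometric series over walk length
        P = lam * (P @ Ax)
        K += P.sum()
    return K
```

The decay factor `lam` down-weights long walks so the series converges; graphs sharing no edges get kernel value zero.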
Network similarity by comparing every possible walk
Walks to steps, steps to nodes & edges: graphs decompose into walks, walks into steps, and steps into node/edge comparisons; this is an R-convolution kernel [Haussler, 1999].
Simplest possible sub-kernels
- Node kernel: delta kernel (exact match)
- Edge kernel: Brownian bridge kernel
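These two sub-kernels might be sketched as follows (the Brownian bridge constant `c` is an illustrative choice):

```python
def node_kernel(l1, l2):
    """Delta kernel: 1 iff the node labels match exactly."""
    return 1.0 if l1 == l2 else 0.0

def edge_kernel(w1, w2, c=2.0):
    """Brownian bridge kernel: maximal for equal edge weights,
    decaying linearly to zero once they differ by c or more."""
    return max(0.0, c - abs(w1 - w2))
```

Swapping in other sub-kernels (e.g. a dictionary-backed node kernel, as used later for tag translation) changes how steps are matched without touching the rest of the kernel.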
Two issues solved in the previous WSD setup: the terms t1 ... tn observed in the context of the target term are composed into a context network via (1) network operations (union), and compared with the networks of candidate senses (built by union & intersection) via (2) the network kernel (similarity function).
Synonym Test
- Finding the synonym among given candidates, e.g. grin: {exercise, rest, joke, smile}
- Select the candidate most similar to the target in terms of the normalized dot product
- Test set: TOEFL synonym test set (Landauer 1997); training corpus: British National Corpus (BNC-XML)
(Results are reported under two weighting settings: raw frequency and PMI weighting (positive PMI).)
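The positive-PMI weighting can be sketched as follows (the counts in the test are hypothetical):

```python
from math import log2

def ppmi_weight(n_xy, n_x, n_y, n_total):
    """Positive pointwise mutual information of a co-occurrence pair:
    max(0, log2(P(x, y) / (P(x) * P(y)))), estimated from counts."""
    if n_xy == 0:
        return 0.0
    return max(0.0, log2((n_xy * n_total) / (n_x * n_y)))
```

Negative associations are clipped to zero, so frequent but uninformative co-occurrences stop dominating the raw-frequency weights.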
Synonym Test Summary
- Under the same conditions (same corpus, same sampling method) across various parameters, the network representation performs about 3 points better on average
- The difference is statistically insignificant: there are only 80 tests in this test set
- Network similarity is less sensitive to the context window size
Word Sense Disambiguation
- Sense disambiguation example for the term "disc":
  - Phrase 1) "Previn and the LSO on the front of any disc was ..."
  - Phrase 2) "Microsoft will replace your disc, if it's within ..."
  - Sense candidates: 1) disc as "phonograph, record, recording"; 2) disc as "magnetic disc"
- The task is to assign a sense from the candidates; again, we select the most similar sense candidate in terms of the kernel similarity value
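The selection step might be sketched as follows, with cosine over sparse vectors standing in for the kernel similarity (the sense labels and counts here are invented):

```python
from math import sqrt

def cosine(a, b):
    """Cosine similarity between two sparse weight vectors."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    na = sqrt(sum(v * v for v in a.values()))
    nb = sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def disambiguate(context, senses):
    """Return the label of the sense representation most similar
    to the context representation."""
    return max(senses, key=lambda s: cosine(context, senses[s]))
```

In the network version, the same selection loop runs with network representations and the graph kernel in place of vectors and cosine.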
Word Sense Disambiguation: General Domain
- Test set: SensEval-3 lexical sample data; sense candidates: WordNet senses; corpus: BNC-XML
- Sense expressions: synset union/intersection; context expressions: union of phrase terms
- 4+ point performance gain, statistically significant; the network version is comparable to state-of-the-art unsupervised WSD (results are compared against both supervised and unsupervised systems)
Word Sense Disambiguation: Biomedical Domain (WSD accuracy on a biomedical test set)
- Test set: extended NLM dataset; corpus: PubMed open subset; sense candidates from the UMLS Metathesaurus (2.4 senses per term on average)
- Same representations for senses and contexts as before
- Outperformed the baseline vector method by nearly 10+ points
Flickr tag translation
- Translation disambiguation: finding the proper translation for a given term, e.g. {spring, field, flowers} with candidates {spring the season = (봄, Frühjahr), spring as a mechanical device = (스프링, Sprungfeder), hot/water springs = (샘, Brunnen), ...}
- Experiments on MIRFLICKR-25000: translating the English tags of images number 1 to 1000 from English to German
- Baseline method (state-of-the-art): a coherence (mutual information) based method that selects the translation candidates that co-occur most in the target-language corpus
Tag translation example
- wood: Holz (wood as material), Wald (forest); desk: Schalter (a counter), Schreibtisch (a desk for reading/writing), Tisch (a table)
- {wood, desk} expands to the candidate pairs {Holz, Schalter}, {Holz, Schreibtisch}, {Holz, Tisch}, {Wald, Schalter}, {Wald, Schreibtisch}, {Wald, Tisch}; the task is to pick the right pair, e.g. {Holz, Schreibtisch}
Tag translation
- The candidates are not senses but target-language networks, so the node labels are incompatible: target network nodes have German labels
- Solved by adopting a node kernel backed by a machine-readable dictionary
Tag translation result
- Targets: the 3696 tags listed in the dictionary, among 5899 unique tags; 965 of the 3696 had only a single translation
- Outperformed the coherence-based translation by nearly 5%
Summary
- Network as semantic representation: co-occurrence networks replace co-occurrence vectors
- Performance gains: in several NLP tasks that need a semantic relatedness measure, the network-based representations consistently outperformed equivalent vector representations
- The co-occurrence network and its associated kernel can be used in applications that use co-occurrence vectors and cosine similarity; language resources can be adopted into the kernel with minimal impact, by modifying the sub-kernels
- One notable shortcoming: the kernel computation is much slower than a cosine similarity calculation
Please remember this, even if you forget everything else!
- (Not true) "Data mined from a corpus should be represented as vectors." There are well-established mathematical methods, such as the R-convolution kernel, for comparing data captured in the form of structures.
- (Not true) "Kernels are for kernel machines." A kernel is just a dot product in a higher-dimensional space, computed without explicitly generating that space (the kernel trick). Kernels are essential in kernel machines (e.g. SVMs), but a kernel can be useful simply as a dot product itself.

Transcript: Selling digital books in 2024: Insights from industry leaders - T...Transcript: Selling digital books in 2024: Insights from industry leaders - T...
Transcript: Selling digital books in 2024: Insights from industry leaders - T...
 
How world-class product teams are winning in the AI era by CEO and Founder, P...
How world-class product teams are winning in the AI era by CEO and Founder, P...How world-class product teams are winning in the AI era by CEO and Founder, P...
How world-class product teams are winning in the AI era by CEO and Founder, P...
 
ODC, Data Fabric and Architecture User Group
ODC, Data Fabric and Architecture User GroupODC, Data Fabric and Architecture User Group
ODC, Data Fabric and Architecture User Group
 
Designing Great Products: The Power of Design and Leadership by Chief Designe...
Designing Great Products: The Power of Design and Leadership by Chief Designe...Designing Great Products: The Power of Design and Leadership by Chief Designe...
Designing Great Products: The Power of Design and Leadership by Chief Designe...
 
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...
 
UiPath Test Automation using UiPath Test Suite series, part 4
UiPath Test Automation using UiPath Test Suite series, part 4UiPath Test Automation using UiPath Test Suite series, part 4
UiPath Test Automation using UiPath Test Suite series, part 4
 
Epistemic Interaction - tuning interfaces to provide information for AI support
Epistemic Interaction - tuning interfaces to provide information for AI supportEpistemic Interaction - tuning interfaces to provide information for AI support
Epistemic Interaction - tuning interfaces to provide information for AI support
 
GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...
GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...
GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...
 
FIDO Alliance Osaka Seminar: Passkeys at Amazon.pdf
FIDO Alliance Osaka Seminar: Passkeys at Amazon.pdfFIDO Alliance Osaka Seminar: Passkeys at Amazon.pdf
FIDO Alliance Osaka Seminar: Passkeys at Amazon.pdf
 
From Daily Decisions to Bottom Line: Connecting Product Work to Revenue by VP...
From Daily Decisions to Bottom Line: Connecting Product Work to Revenue by VP...From Daily Decisions to Bottom Line: Connecting Product Work to Revenue by VP...
From Daily Decisions to Bottom Line: Connecting Product Work to Revenue by VP...
 
Connector Corner: Automate dynamic content and events by pushing a button
Connector Corner: Automate dynamic content and events by pushing a buttonConnector Corner: Automate dynamic content and events by pushing a button
Connector Corner: Automate dynamic content and events by pushing a button
 
From Siloed Products to Connected Ecosystem: Building a Sustainable and Scala...
From Siloed Products to Connected Ecosystem: Building a Sustainable and Scala...From Siloed Products to Connected Ecosystem: Building a Sustainable and Scala...
From Siloed Products to Connected Ecosystem: Building a Sustainable and Scala...
 
IOS-PENTESTING-BEGINNERS-PRACTICAL-GUIDE-.pptx
IOS-PENTESTING-BEGINNERS-PRACTICAL-GUIDE-.pptxIOS-PENTESTING-BEGINNERS-PRACTICAL-GUIDE-.pptx
IOS-PENTESTING-BEGINNERS-PRACTICAL-GUIDE-.pptx
 
Smart TV Buyer Insights Survey 2024 by 91mobiles.pdf
Smart TV Buyer Insights Survey 2024 by 91mobiles.pdfSmart TV Buyer Insights Survey 2024 by 91mobiles.pdf
Smart TV Buyer Insights Survey 2024 by 91mobiles.pdf
 
Slack (or Teams) Automation for Bonterra Impact Management (fka Social Soluti...
Slack (or Teams) Automation for Bonterra Impact Management (fka Social Soluti...Slack (or Teams) Automation for Bonterra Impact Management (fka Social Soluti...
Slack (or Teams) Automation for Bonterra Impact Management (fka Social Soluti...
 
DevOps and Testing slides at DASA Connect
DevOps and Testing slides at DASA ConnectDevOps and Testing slides at DASA Connect
DevOps and Testing slides at DASA Connect
 
"Impact of front-end architecture on development cost", Viktor Turskyi
"Impact of front-end architecture on development cost", Viktor Turskyi"Impact of front-end architecture on development cost", Viktor Turskyi
"Impact of front-end architecture on development cost", Viktor Turskyi
 

Tg noh jeju_workshop

  • 1. A Semantic Relatedness Measure Based on Co-occurrence Network and Graph Kernel Kyungpook National University 노태길 (Tae-Gil Noh) tailblues@me.com January 20, 2011, ACL Hosting Commemoration Workshop
  • 2. Overview A new semantic relatedness measure from co-occurrence observations on a raw corpus, for words and phrases. Improving the vector space model with network representations. Similarity measured in a kernel space: co-occurrence observations compared by a kernel (R-convolution kernel and graph kernel).
  • 3. Introduction: Semantic relatedness measure Measuring the semantic distance between two terms/phrases. Also known as semantic similarity or semantic distance. A tool that can be used in various NLP situations. Examples: Which is semantically closer to “orange juice”? 음료수 (Drinks) or 향신료 (Spice)? Which sense describes the term better in context? 1) “Apple launched a new device …”, 2) “Apple is my favorite fruit, second only to …” A) A company famous for its iPhone and iPod. B) A fruit with lots of Vitamin C, shiny red or green skin, …
  • 4. Semantic Relatedness, with Lexical Resources By lexical resources: WordNet, thesauri, ontologies, … Pros: Reliable data generated by lexicographers. Detailed relationships between lexical entries. Cons: Generated by humans, so high cost. Not readily available for minor languages. There are always new / unlisted entries.
  • 5. Semantic Relatedness, with Corpus Semantic relatedness based on corpus: by observations on an unlabeled corpus, measuring relatedness as a numerical value. Various methods: Co-occurrence vectors, Mutual information (PMI), Rank reduction (LSA, random projection, ESA), Topic models (PLSA, LDA, CTM)
  • 6. Semantic Relatedness, with Corpus Corpus-based methods: general approaches. Two terms are “semantically close” if they “occur in similar documents” (share similar distributions among documents) or “co-occur with similar terms” (share common co-occurring terms). Occurrences / co-occurrences are generally expressed as vectors; the vectors themselves are used as representations, or refined by mathematical/statistical methods: weighting schemes, rank reduction, higher-order vectors, random projection, topic estimation, etc.
  • 8. Motivation A network has “more structure”: a network of terms can be seen as a relaxation of the “bag-of-words” (independence) assumption. Previous work showed that the structure of a co-occurrence network can be used to induce senses [Veronis 2005]. However, the network itself has never been used as a representation before. What if co-occurrence vectors are replaced by co-occurrence networks? What tools are needed? Can we gain some performance improvements?
  • 9. An example of capturing co-occurrences, as a vector or as a network. Five contexts of “disc”: “A data disc can contain anything; system files, …” → disc-{data, system, files}; “Eject the system disc by pressing …” → disc-{system}; “This is their best concert on disc.” → disc-{concert}; “On the double disc soundtrack, the orchestra have …” → disc-{soundtrack, orchestra}; “Disc of the year & best orchestra winner is announced by …” → disc-{year, orchestra, winner}. As a vector over (concert, data, files, orchestra, soundtrack, system, winner, year): (1, 1, 1, 2, 1, 2, 1, 1). As a network: the same terms become nodes around “disc”, with edge weights counting how often two terms co-occur in the same context. [Figure: the resulting vector and network.]
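The construction on this slide can be sketched in a few lines. This is a minimal illustration, not the exact procedure used in the talk: each context set contributes to node counts, and each pair of terms co-occurring in the same context increments an edge weight.

```python
from collections import defaultdict
from itertools import combinations

def cooccurrence_network(contexts):
    """Build a weighted co-occurrence network from per-context term sets.

    Nodes count how many contexts a term appears in; an edge (u, v)
    counts how many contexts contain both u and v.
    """
    nodes = defaultdict(int)   # term -> occurrence count
    edges = defaultdict(int)   # sorted (u, v) pair -> co-occurrence count
    for ctx in contexts:
        terms = sorted(set(ctx))
        for t in terms:
            nodes[t] += 1
        for u, v in combinations(terms, 2):
            edges[(u, v)] += 1
    return dict(nodes), dict(edges)

# The five "disc" contexts from the slide:
contexts = [
    {"data", "system", "files"},
    {"system"},
    {"concert"},
    {"soundtrack", "orchestra"},
    {"year", "orchestra", "winner"},
]
nodes, edges = cooccurrence_network(contexts)
```

The node counts reproduce the slide's vector (e.g. `orchestra` and `system` get 2), while the edges carry the extra structure the vector discards.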
  • 10. Replacing co-occurrence vectors with co-occurrence networks. Vector representation: the co-occurrence patterns of A and B become vectors, compared with similarity/distance functions of the vector domain (CosSim(A,B), L1dist(A,B), EucDist(A,B)). Network representation: the co-occurrence patterns of A and B become networks, compared with a network similarity measure (NetworkSIM(A,B)).
  • 11. Using the co-occurrence network as a direct representation. Evaluating gains on some NLP tasks: compare the performance with vector-based baselines and unsupervised state-of-the-art methods. Tasks: Synonym finding (TOEFL synonym test set), Word sense disambiguation (general domain & biomedical domain), Annotation translation (automatic translation of FLICKR tags)
  • 12. Two basic issues of using network representations. Expressing phrases: how to compose an expression of a phrase? In vector semantic spaces, vector summation/multiplication is used to represent phrases; equivalent network operations must be defined. Comparing two networks: given two network representations, how can their similarity be calculated? A network similarity measure equivalent to cosine similarity is needed.
  • 13. An example WSD setup from [Wilks, 1990] & [Schütze, 1998], sometimes called a modified Lesk algorithm. WordNet senses (synsets) of “disc”: sense 1 = {disk, diskette, magnetic disc}, sense 2 = {phonograph, record, sound recording}. Context 1: “Microsoft will replace your disc, if it’s within …” → {Microsoft, disc}; context 2: “Previn and the LSO on the front of any disc were …” → {LSO, disc}. Each sense is represented by summing its synset-term vectors (Vsense1 = vdisk + vdiskette + vmagnetic disc; Vsense2 = vphonograph + vrecord + vsound recording), each context by summing its term vectors, and the sense whose vector is closest to the context vector (smallest angle θ) is chosen.
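The vector-side baseline on this slide can be sketched as follows. The 3-dimensional vectors below are toy data invented for illustration (real co-occurrence vectors would be high-dimensional); the sense names and terms loosely follow the slide's "disc" example.

```python
import numpy as np

def cosine(a, b):
    """Cosine similarity; 0 for a zero vector."""
    na, nb = np.linalg.norm(a), np.linalg.norm(b)
    return 0.0 if na == 0.0 or nb == 0.0 else float(a @ b) / (na * nb)

def disambiguate(context_terms, senses, vec):
    """Modified-Lesk-style WSD: sum term vectors for the context and for
    each synset, then pick the sense closest to the context by cosine."""
    v_ctx = sum(vec[t] for t in context_terms)
    return max(senses, key=lambda s: cosine(v_ctx, sum(vec[t] for t in senses[s])))

# Toy 3-d "co-occurrence" vectors (dimensions roughly: music, computing, business)
vec = {
    "Microsoft":     np.array([0.0, 5.0, 3.0]),
    "LSO":           np.array([6.0, 0.0, 1.0]),
    "diskette":      np.array([0.0, 4.0, 1.0]),
    "magnetic_disc": np.array([0.0, 5.0, 0.0]),
    "phonograph":    np.array([5.0, 0.0, 0.0]),
    "record":        np.array([4.0, 1.0, 1.0]),
}
senses = {"storage":   ["diskette", "magnetic_disc"],
          "recording": ["phonograph", "record"]}
```

With these toy vectors, a {Microsoft, …} context selects the storage sense and an {LSO, …} context the recording sense, mirroring the slide.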
  • 14. The two issues in the WSD setup (network case): (1) building multi-term networks, e.g. the network of {Microsoft, disc} from the networks of “Microsoft” and “disc” (+), the network of {LSO, disc} likewise, and the sense networks from {disk, diskette, magnetic disc} and {phonograph, record, sound recording}; (2) comparing a context network against each sense network. [Figure: network-based WSD setup.]
  • 15. Issue #1: Network operators. Generating context (multi-term) networks from single-term networks. Two network operations: Network union, equivalent to vector summation; Network intersection, similar to vector multiplication in effect. Both are defined as matrix operations, since networks are represented as adjacency matrices.
  • 16. Issue #1: Network operators. [Figure: two example networks A and B with weighted edges, and the results of their union and intersection.]
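Since networks over a shared node ordering are adjacency matrices, the two operators reduce to entrywise matrix operations. The exact definitions below (weight sum for union, entrywise minimum for intersection) are one plausible choice that matches the stated analogies; the slides define them as matrix operations without fixing the formulas here.

```python
import numpy as np

def network_union(A, B):
    """Merge two networks; adding edge weights mirrors vector summation."""
    return A + B

def network_intersection(A, B):
    """Keep only edges present in both networks, at the smaller weight
    (analogous in effect to elementwise vector multiplication)."""
    return np.minimum(A, B)

# Two toy 3-node networks over the same node ordering:
A = np.array([[0, 2, 1],
              [2, 0, 0],
              [1, 0, 0]])
B = np.array([[0, 1, 0],
              [1, 0, 3],
              [0, 3, 0]])
```

Here `network_union(A, B)` keeps every edge of either network, while `network_intersection(A, B)` retains only the shared edge between nodes 0 and 1.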
  • 17. [Figure: the network of “disc”, the network of “disc & LSO”, and the network of “disc & Microsoft”.]
  • 18. Issue #2: Similarity measure for networks. Cosine similarity is a normalized dot product. Graph kernel: a dot product of two graph structures. Graph kernels have been used in biomedical domains to compare proteins and genes. An R-convolution kernel is a way to systematically define kernels for structures; in language processing, tree kernels are the most widely used case of R-convolution kernels.
  • 19. Random walk graph kernel The most widely used graph kernel. It compares two graphs by measuring their common random walks numerically. The result is a dot product value in an infinitely high-dimensional space, where each dimension is one possible random walk. It has the “tottering” issue, a well-known problem: kernel effectiveness is severely limited by counting cycles again and again. I have proposed an efficient acyclic version, which can be used if all node labels are unique, as in co-occurrence networks.
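For orientation, here is a sketch of the standard geometric random-walk kernel (not the talk's acyclic variant), computed on the direct product graph: nodes are label-matching pairs, and walks in the product graph correspond to walks common to both inputs. The geometric series is truncated at a fixed length for simplicity.

```python
import numpy as np

def random_walk_kernel(A1, labels1, A2, labels2, lam=0.1, max_len=6):
    """Truncated geometric random-walk graph kernel (illustrative sketch).

    Builds the direct product graph over label-matching node pairs and
    sums lam^k times the number of length-k walks in it, k = 0..max_len.
    """
    pairs = [(i, j) for i, a in enumerate(labels1)
                    for j, b in enumerate(labels2) if a == b]
    n = len(pairs)
    W = np.zeros((n, n))  # product-graph adjacency
    for p, (i, j) in enumerate(pairs):
        for q, (u, v) in enumerate(pairs):
            W[p, q] = A1[i, u] * A2[j, v]  # step must exist in both graphs
    one = np.ones(n)
    total, Wk = 0.0, np.eye(n)
    for k in range(max_len + 1):
        total += (lam ** k) * (one @ Wk @ one)  # count length-k common walks
        Wk = Wk @ W
    return total

# A toy triangle network with unique labels, as in co-occurrence networks:
A = np.array([[0, 1, 1],
              [1, 0, 1],
              [1, 1, 0]], dtype=float)
labels = ["orchestra", "concert", "soundtrack"]
```

Comparing the triangle with itself yields a positive value (many shared walks), while two graphs with disjoint label sets share no walks at all. The cycles of the triangle are exactly what causes tottering, which the acyclic version avoids.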
  • 20. Network similarity by comparing all possible walks
  • 21. Decomposing walks into steps, and steps into nodes & edges: Graph → Walks → Steps → Node/Edge, via the R-convolution kernel [Haussler, 1999]
  • 22. Simplest possible sub-kernels. Node kernel: delta kernel (exact match). Edge kernel: Brownian bridge kernel.
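These two sub-kernels are tiny. A minimal sketch, with the Brownian bridge width `c` chosen here purely for illustration:

```python
def delta_kernel(a, b):
    """Node sub-kernel: 1 if the two node labels match exactly, else 0."""
    return 1.0 if a == b else 0.0

def brownian_bridge_kernel(x, y, c=2.0):
    """Edge sub-kernel on edge weights: decays linearly with the weight
    difference and is clipped at zero; c is a width parameter."""
    return max(0.0, c - abs(x - y))
```

Plugged into the walk decomposition above, the node kernel demands identical labels at each step, while the edge kernel rewards similar edge weights. Swapping in a different node kernel (e.g. one backed by a dictionary) is how language resources enter the framework.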
  • 23. The two issues solved, in the previous WSD setup: (1) network operations (union & intersection) build the networks of candidate senses and the network of the context terms t1, t2, …, tn observed around the target term; (2) the network kernel (similarity function) compares them. [Figure.]
  • 24. Synonym Test Finding the synonym among given candidates, e.g. grin: {exercise, rest, joke, smile}. Select the most similar candidate in terms of the normalized dot product. Test set: TOEFL synonym test set (Landauer 1997). Training corpus: British National Corpus (BNC-XML)
  • 25. [Figure: synonym-test results with raw frequency and with PMI weighting (positive PMI).]
  • 26. Synonym Test Summary Within the same conditions (same corpus, same sampling method) and across various parameters, the network performs about 3 points better on average, but the difference is statistically insignificant (only 80 items in this test set). Network similarity is less sensitive to the context window size.
  • 27. Word Sense Disambiguation WSD, sense disambiguation example: term “disc”. Phrase 1) “Previn and the LSO on the front of any disc was …” Phrase 2) “Microsoft will replace your disc, if it’s within …” Sense candidates: Sense 1) disc as “phonograph, record, recording”; Sense 2) disc as “magnetic disc”. The task is to assign a sense from the candidates; again, we select the most similar sense candidate in terms of the kernel similarity value.
  • 28. Word Sense Disambiguation General domain. Test set: SensEval-3 lexical sample data. Sense candidates: WordNet senses. Corpus: BNC-XML. Sense expressions: synset union/intersection. Context expressions: union of phrase terms. 4+ point performance gain, statistically significant. The network version is comparable to state-of-the-art unsupervised WSD. [Chart: accuracy against supervised and unsupervised systems.]
  • 29. Word Sense Disambiguation WSD accuracy on a biomedical WSD test set. Biomedical domain. Test set: extended NLM dataset. Corpus: PubMed open subset. Same representation for senses and context. Sense candidates from the UMLS Metathesaurus; the average number of senses was 2.4. Outperformed the baseline vector method by nearly 10 points.
  • 30. Flickr tag translation Tag translation = translation disambiguation: finding the proper translation for a given term. Spring, Field, Flowers → {spring the season = (봄, Frühjahr), spring as a mechanical device = (스프링, Sprungfeder), hot/water springs = (샘, Brunnen), …} Experiments on MIRFLICKR-25000: translating the English tags of images 1 to 1000 from English to German. Baseline method (state-of-the-art): a coherence (mutual information) based method that selects the translation candidates that co-occur most in the target-language corpus.
  • 31. Tag translation wood: Holz (wood as material), Wald (forest); desk: Schalter (a counter), Schreibtisch (a desk for reading/writing), Tisch (a table). {wood, desk} → candidate pairs {Holz, Schalter}, {Holz, Schreibtisch}, {Holz, Tisch}, {Wald, Schalter}, {Wald, Schreibtisch}, {Wald, Tisch}. [Figure: comparing the network of {wood, desk} against candidate target-language networks such as {Holz, Schreibtisch}, {Wald, Schreibtisch}, and {Holz, Tisch}.]
  • 32. Tag translation Candidates are not senses but target-language networks: incompatible node labels! Target network nodes have German labels. Solution: adopt a node kernel that uses a machine-readable dictionary.
  • 33. Tag translation result Targets: 3696 tags that are listed in the dictionary, among 5899 unique tags; 965 of the 3696 had only a single translation. Outperformed the coherence-based translation by nearly 5%.
  • 34. Summary Network as semantic representation: the co-occurrence network replaces co-occurrence vectors. Performance gains: in several NLP tasks that need semantic relatedness measures, the network-based representations consistently outperformed equivalent vector representations. The co-occurrence network and the associated kernel can be used in applications that use co-occurrence vectors and cosine similarity. Language resources can be adapted into the kernel with minimal impact, by modifying the sub-kernels. One notable shortcoming is that the kernel operation is much slower than the cosine similarity calculation.
  • 35. Please remember this, even if you forget everything else! (Not true) “Data mined from corpus should be represented as vectors.” There are well-established mathematical methods to compare data captured in the form of structures: the R-convolution kernel. (Not true) “Kernels are for kernel machines.” A kernel is just a dot product in a higher-dimensional space, computed without explicitly generating that high dimension (the kernel trick). Kernels are essential in kernel machines (e.g. SVMs), but a kernel can be useful just as a dot product itself.