Space-efficient Feature Maps for String Alignment Kernels

Yasuo Tabei (RIKEN-AIP)
Joint work with
Yoshihiro Yamanishi (Kyutech)
Rasmus Pagh (IT University of Copenhagen)
ICDM’19@Beijing, Nov. 10th, 2019
Kernel methods
• A kernel is an inner product in some feature space H:
  K(x, x′) = ⟨φ(x), φ(x′)⟩
• Intuitively, a kernel measures the similarity of x and x′
• x and x′ can be vectors, trees, or graphs
• In this talk, x and x′ are strings
• Kernels are useful for
  - Classification (SVM), regression, feature selection, two-sample problems, etc.
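The inner-product view of a kernel can be made concrete with the degree-2 polynomial kernel, whose explicit feature map is known in closed form. A minimal NumPy sketch (the function names are mine, not from the talk):

```python
import numpy as np

def poly2_kernel(x, y):
    """Degree-2 polynomial kernel: K(x, y) = (x . y)^2."""
    return np.dot(x, y) ** 2

def poly2_feature_map(x):
    """Explicit feature map phi with K(x, y) = <phi(x), phi(y)>:
    all pairwise products x_i * x_j."""
    return np.outer(x, x).ravel()

x = np.array([1.0, 2.0, 3.0])
y = np.array([0.5, -1.0, 2.0])

k = poly2_kernel(x, y)                                   # kernel value
ip = np.dot(poly2_feature_map(x), poly2_feature_map(y))  # inner product in H
assert np.isclose(k, ip)
```

The same identity is what lets kernel methods work implicitly in H without ever forming φ(x), as long as K is cheap to evaluate.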
String alignment kernels
• A typical string kernel uses substring (k-mer) features
• An alignment kernel instead uses a string alignment score (e.g., edit distance) as its similarity measure
• Wide variety of applications in string processing
  e.g., text classification, remote homology detection for proteins/DNA [BMC Bioinfo.'06], etc.
• Advantage: high prediction accuracy
• Drawback: large computational complexity
  - Quadratic time in the length of the strings (dynamic programming)
  - Quadratic time in the number of training examples
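The quadratic-time dynamic program behind edit distance can be sketched as follows (plain Levenshtein distance; the alignment kernels in the talk use more elaborate scoring):

```python
def edit_distance(s, t):
    """Standard DP for Levenshtein distance: O(|s| * |t|) time and space,
    which is the quadratic cost in string length the slide refers to."""
    m, n = len(s), len(t)
    d = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(m + 1):
        d[i][0] = i                      # delete all of s[:i]
    for j in range(n + 1):
        d[0][j] = j                      # insert all of t[:j]
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            cost = 0 if s[i - 1] == t[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,        # deletion
                          d[i][j - 1] + 1,        # insertion
                          d[i - 1][j - 1] + cost) # match / substitution
    return d[m][n]

print(edit_distance("ABRACADA", "ABRA"))  # -> 4
```

On top of this per-pair cost, a kernel SVM also needs all pairwise similarities over the training set, hence the second quadratic factor.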
Feature maps (FMs) for kernel approximations
[A. Rahimi and B. Recht, NIPS, 2007]
• FMs map a d-dimensional vector x ∈ R^d into a D-dimensional vector φ(x) ∈ R^D using O(d×D) memory and time
• The kernel k(x, y) is approximated by the inner product of the compact vectors: k(x, y) ≈ φ(x)・φ(y)
• The linear model f_l(x) = w・φ(x) therefore has approximately the same functionality as the nonlinear model f_n(x) = Σ_i α_i k(x, y_i)
• Advantage: enhances the scalability of kernel methods
[Figure: (i) input vectors → map → (ii) compact vectors → learn model weight w → (iii) linear model f_l(x) = w・φ(x)]
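As a concrete illustration of such feature maps, here is a minimal sketch of Rahimi and Recht's random Fourier features for the RBF kernel. Note the O(d×D) random matrix W, which is exactly what the rest of the talk sets out to shrink; parameter values are illustrative:

```python
import numpy as np

rng = np.random.default_rng(0)
d, D = 10, 5000          # input / output dimensions
gamma = 0.5              # RBF kernel: k(x, y) = exp(-gamma * ||x - y||^2)

# O(d x D) random projection: W ~ N(0, 2*gamma), b ~ Uniform[0, 2*pi]
W = rng.normal(0.0, np.sqrt(2 * gamma), size=(D, d))
b = rng.uniform(0.0, 2 * np.pi, size=D)

def phi(x):
    """Random Fourier feature map: E[<phi(x), phi(y)>] = k(x, y)."""
    return np.sqrt(2.0 / D) * np.cos(W @ x + b)

x = rng.normal(size=d)
y = rng.normal(size=d)
exact = np.exp(-gamma * np.sum((x - y) ** 2))
approx = phi(x) @ phi(y)
assert abs(exact - approx) < 0.1
```

A linear model trained on φ(x) then behaves approximately like a kernel SVM with the RBF kernel, but predicts in O(D) time per example.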
Existing feature maps (FMs)
• Several FMs with different input formats and kernel
similarities have been proposed
• No previous work has been able to approximate
alignment kernels
  Method                      Kernel        Input
  b-bit MinHash [Li'11]       Jaccard       Binary vector
  Tensor Sketching [Pham'13]  Polynomial    Real vector
  0-bit CWS [Li'15]           Min-Max       Real vector
  C-Hash [Mu'12]              Cosine        Real vector
  Random Feature [Rahimi'07]  RBF           Real vector
  RFM [Kar'12]                Dot product   Real vector
  PWLSGD [Maji'09]            Intersection  Real vector
Space-efficient feature maps
for string alignment kernels
• Basic idea: use two hash functions, (i) edit-sensitive parsing (ESP) and (ii) feature maps (FMs) for the Laplacian kernel
• Standard feature maps consume a large amount of memory: O(d×D) for input dimension d and output dimension D
• We present space-efficient FMs using O(d) memory
• High classification accuracy is achieved by training a linear SVM on the compact vectors
[Pipeline example]
 (i) ESP             (ii) FMs                  (iii) Learn linear model
 S1 = ABRACADA → x1 = (3, 1, 3) → z1 = (1.2, 0.1, 1, 2)
 S2 = ABRA     → x2 = (2, 4, 1) → z2 = (2, 1, 1.2, 3.4)
 S3 = ABRACA   → x3 = (5, 1, 2) → z3 = (-1.2, 0, 2.2, 3)    F(z_i) = w・z_i
 S4 = ATGCAGA  → x4 = (1, 0, 0) → z4 = (-3.2, 0, 2.2, 1)
 S5 = BARACR   → x5 = (2, 2, 1) → z5 = (2, 2, -1.2, 0)
Edit-sensitive parsing (ESP)
[G. Cormode and S. Muthukrishnan, 2007]
• Builds a single parse tree from the input string S, from the bottom (leaves) to the top (root)
• Pairs of the same symbols are assigned the same node label
• Can be used to map string S into an integer vector x
  - Each element of x counts the occurrences of one node label
• Approximates edit distance with moves (EDM) as the L1 distance between mapped vectors, i.e., EDM(Si, Sj) ≈ ||xi - xj||1
• Computation time is linear in the length of S
[Figure: parse tree over ABBAABABB with node labels A, B, X1, ..., X6; x = (4, 5, 2, 1, 1, 1, 1, 1)]
ESP for mapping strings into integer vectors
Step 1: Given S and S', build vectors V(S) and V(S'), each dimension of which counts the occurrences of one character:
         A  B
  S  = ABABABBAB → V(S)  = (4, 5)
  S' = ABBABAB   → V(S') = (3, 4)
Step 2: Assign each pair or triple of adjacent symbols to a non-terminal symbol; identical pairs get identical non-terminals
  [Figure: level-1 strings parsed into level-2 non-terminals]
Step 3: Count the occurrences of each node label and extend the vectors V(S) and V(S'):
          A  B  X1 X2 X3 X4
  V(S)  = (4, 5, 2, 1, 1, 0)
  V(S') = (5, 4, 2, 0, 0, 1)
Step 4: Replace each string by its sequence of level-2 node labels, then go to Step 2:
  S  = X1X2X3X1
  S' = X1X4X1
Feature maps for string alignment
kernels
• EDM is approximated by the L1 distance between mapped vectors: EDM(Si, Sj) ≈ ||xi - xj||1
• The alignment kernel is then defined as
  K(Si, Sj) = exp(-||xi - xj||1 / β)   (Laplacian kernel)
• Feature maps (FMs) can approximate the Laplacian kernel as
  exp(-||xi - xj||1 / β) ≈ <zi, zj>
• But standard FMs are space-inefficient: O(dD) memory for input dimension d and output dimension D
  - The Fastfood approach [ICML'13] approximates feature maps space-efficiently for RBF kernels, not Laplacian
[Figure: (i) ESP maps S1, S2, S3 to x1, x2, x3; (ii) FMs map them to z1, z2, z3; (iii) a linear model F(z_i) = w・z_i is learned]
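A sketch of such a feature map for the Laplacian kernel: its spectral density is the Cauchy distribution, so sampling projection directions from Cauchy(0, 1/β) yields random features whose inner product approximates exp(-||x - y||1 / β). Parameter values are illustrative, and the full O(dD) matrix is still materialized here; shrinking it is the next slide's point:

```python
import numpy as np

rng = np.random.default_rng(1)
d, D = 8, 20000
beta = 4.0               # k(x, y) = exp(-||x - y||_1 / beta)

# The spectral density of the Laplacian kernel is Cauchy, so sample
# W[i, j] ~ Cauchy(0, 1/beta) for the random feature construction.
W = rng.standard_cauchy(size=(D, d)) / beta
b = rng.uniform(0.0, 2 * np.pi, size=D)

def z(x):
    """Random feature map with E[<z(x), z(y)>] = exp(-||x - y||_1 / beta)."""
    return np.sqrt(2.0 / D) * np.cos(W @ x + b)

x = rng.integers(0, 4, size=d).astype(float)   # toy ESP count vectors
y = rng.integers(0, 4, size=d).astype(float)
exact = np.exp(-np.sum(np.abs(x - y)) / beta)
assert abs(exact - z(x) @ z(y)) < 0.1
```

Swapping the Gaussian sampling of RBF random features for Cauchy sampling is the only change needed to target L1 (rather than L2) distances, which is why the Laplacian kernel pairs naturally with ESP's EDM approximation.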
Space-efficient FMs (briefly)
• Basic idea: replace the D×d random matrix R of standard FMs with a t×d random matrix M
• Each entry R[i, j] is approximated by a degree-(t-1) polynomial whose coefficients are drawn from a t-wise independent family:
  R[i, j] ≈ M[1, j] + M[2, j]・i + ・・・ + M[t, j]・i^(t-1)
• Theoretical guarantee (concentration bound):
  Pr[|z(x)ᵀz(y) - k(x, y)| ≥ ε] ≤ 2/(ε²D)
[Figure: the D×d random matrix R of standard FMs needs O(D×d) memory; the t×d matrix M of space-efficient FMs needs only O(t×d) memory]
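One plausible instantiation of the construction above (not necessarily the paper's exact scheme): store only the t×d coefficient matrix M and recompute any entry R[i, j] on demand from a degree-(t-1) polynomial hash, mapped through the Cauchy inverse CDF so the reconstructed entries follow the Laplacian kernel's sampling distribution:

```python
import numpy as np

P = 2_147_483_647        # Mersenne prime for the polynomial hash
t, d, D = 4, 100, 512    # degree of independence t, input dim d, output dim D
beta = 4.0

rng = np.random.default_rng(2)
# O(t x d) coefficients replace the O(D x d) random matrix of standard FMs.
M = rng.integers(1, P, size=(t, d))

def R_entry(i, j):
    """Recompute R[i, j] on demand: evaluate a degree-(t-1) polynomial in
    the row index i with the t coefficients stored for column j (Horner's
    rule mod P), map the hash to (0, 1), then through the Cauchy inverse
    CDF so the entry behaves like a Cauchy(0, 1/beta) draw."""
    h = 0
    for c in M[:, j]:
        h = (h * (i + 1) + c) % P
    u = (h + 0.5) / P                          # pseudo-uniform in (0, 1)
    return np.tan(np.pi * (u - 0.5)) / beta    # Cauchy(0, 1/beta)

x = rng.normal(size=d)
# Project x to D dimensions without ever materializing the D x d matrix R.
proj = np.array([sum(R_entry(i, j) * x[j] for j in range(d)) for i in range(D)])
print(proj.shape)  # (512,)
```

The trade is memory for recomputation: each projection now costs O(t) extra arithmetic per entry, but with t a small constant the total space drops to O(td) = O(d), matching the slide's claim.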
Experiments
• 5 massive real-world string datasets
• Competitors
  - 5 SVMs with string kernels: LAK [Bioinfo'08], GAK [ICML'11], ESP+Kernel, CGK+Kernel, stk17 [NIPS'17]
  - FMs for alignment kernels: D2KE [KDD'19]
  - SFMEDM: proposed method
[Result figures: training time in seconds, memory usage, and classification accuracy in AUC score]
Summary
• Space-efficient feature maps for string alignment kernels
• Use two hash functions
– ESP: maps strings into integer vectors
– Feature maps: maps integer vectors into feature vectors
• Linear SVMs are trained on feature vectors
– Linear SVMs behave like non-linear SVMs with alignment kernels
• Advantage: highly scalable
• Code and datasets are available:
https://sites.google.com/view/alignmentkernels/home

Editor's Notes

  1. Thank you for your kind introduction. Today I'm going to talk about feature maps for string alignment kernels. Our method can solve large-scale machine learning problems on strings. This is joint work with Yoshihiro Yamanishi from the Kyushu Institute of Technology and Rasmus Pagh from the IT University of Copenhagen.
  2. First, I will present a brief introduction to kernel methods.
  3. A kernel method can approximate a non-linear function or decision boundary well with enough training data and can achieve high prediction accuracy.
  4. To solve the scalability issue of kernel methods, feature maps for kernel approximations were proposed by A. Rahimi and B. Recht at NIPS 2007. FMs map a d-dimensional vector x into a D-dimensional vector φ(x) ∈ R^D and approximate the kernel function k(x, y) by the inner product of the mapped vectors. Thus, a linear model has approximately the same functionality as a nonlinear model. The advantage is that this enhances the scalability of kernel methods.
  5. Several FMs with different input formats and kernel similarities have been proposed. No previous work has been able to approximate string alignment kernels.
  6. That's why we present large-scale string classification with space-efficient FMs for string alignment kernels. We use two hash functions: edit-sensitive parsing (ESP) and feature maps (FMs) for a Laplacian kernel. We also present space-efficient feature maps that reduce the space usage of FMs from O(dD) to O(d) for input dimension d, and achieve high classification accuracy by training a linear SVM on the mapped vectors.
  7. (Same notes as the previous slide.)
  8. The first result shows the training time of each method in seconds. Kernel methods could not finish within 48 hours on the large sports and compound datasets. Methods using feature hashing finished within 48 hours on all datasets; for example, our SFMEDM finished within 9 hours on the compound dataset with D = 16 thousand dimensions.
  9. The next result shows memory usage in megabytes. Kernel methods consumed large amounts of memory: 654 GB and 1.3 TB. In contrast, the space used by methods with FMs is at least one order of magnitude smaller than string kernels; our SFMEDM consumed less than 160 GB on the sports and compound datasets.
  10. The final figure shows that the