Kernel Methods and Relational Learning in Computational Biology
ir. Michiel Stock
Faculty of Bioscience Engineering
Ghent University
November 2014
KERMIT
Michiel Stock (KERMIT) Kernels for Computational Biology November 2014 1 / 36
Outline
1 Introduction
2 Kernel methods
Theoretical overview
Dealing with sequences
Dealing with graphs
Other kernels
3 Learning relations
Kronecker kernels
Conditional ranking
4 Predicting enzyme function
Defining the problem
Results
5 Conclusions
Introduction
Introductory example: drug design
Strategy for curing Alzheimer's disease
Find compounds with good ADMET properties that selectively bind cholinesterase and amyloid precursor protein
Introduction
Labels: known protein-ligand interactions

[Figure: bipartite graph linking proteins (A-G) to ligands (T-Z), with edge weights between 0 and 1 indicating interaction strength]
Introduction
The targets: features for proteins
Possible representations:
amino acid sequence
3D structure
gene expression
cellular location
phylogenetic profiles
...
Introduction
The ligands: features for compounds
Possible representations:
SMILES format and other text-based representations
coloured graph representation
fingerprints based on physicochemical descriptors
...
Introduction
Computational biology deals with interesting problems
We deal with objects that are:
high-dimensional (e.g. microarray or proteomics data)
structured (e.g. gene sequences, small molecules, interaction networks, phylogenetic trees...)
heterogeneous (e.g. vectors, sequences and graphs describing the same protein)
available in large quantities (e.g. more than 10^6 known protein sequences)
noisy (e.g. many features are not relevant)
Introduction
Computational biology often deals with interactions
Relational learning
Predicting properties of pairs of objects, which can be of different types.
Kernel methods
Kernel methods Theoretical overview
A function k : X × X → R is a positive semi-definite kernel if it is
symmetric, that is, k(x, x') = k(x', x) for any two objects x, x' ∈ X, and
positive semi-definite, that is,

$$\sum_{i=1}^{N} \sum_{j=1}^{N} c_i c_j \, k(x_i, x_j) \geq 0$$

for any N > 0, any choice of N objects x_1, ..., x_N ∈ X, and any choice of real numbers c_1, ..., c_N ∈ R.
Kernels can be seen as generalized covariances.
Kernel methods Theoretical overview
Interpretation of kernels
Suppose an object x has an implicit feature representation φ(x) ∈ F.
A kernel function can be seen as a dot product in this feature space:

$$k(x, x') = \langle \phi(x), \phi(x') \rangle$$

Linear models in this feature space F can be made:

$$y(x) = \mathbf{w}^\top \phi(x) = \sum_n a_n \, k(x_n, x)$$
Kernel methods Theoretical overview
Many kernel methods exist
Examples of popular kernel methods:
Support vector machine (SVM)
Regularized least squares (RLS)
Kernel principal component analysis (KPCA)
The learning algorithm is independent of the kernel representation!
Kernel methods Dealing with sequences
Kernels using sequence alignment
sequence alignment optimises a score of how well the residues match
use this score as a kernel value (a similarity for sequences)
Kernel methods Dealing with sequences
Kernels using substrings
Spectrum kernel (SK)
The SK considers the number of k-mers m two sequences s_i and s_j have in common:

$$\mathrm{SK}_k(s_i, s_j) = \sum_{m \in \Sigma^k} N(m, s_i) \, N(m, s_j)$$

with N(m, s) the number of occurrences of k-mer m in sequence s.
Many modifications exist.
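The spectrum kernel above takes only a few lines of Python; a minimal sketch, where the toy amino acid sequences and the choice k = 3 are purely illustrative:

```python
from collections import Counter

def spectrum_kernel(s_i, s_j, k=3):
    """Spectrum kernel: sum over all k-mers of the product of their
    occurrence counts N(m, s_i) * N(m, s_j) in the two sequences."""
    counts_i = Counter(s_i[p:p + k] for p in range(len(s_i) - k + 1))
    counts_j = Counter(s_j[p:p + k] for p in range(len(s_j) - k + 1))
    # Counter returns 0 for absent k-mers, so only shared k-mers contribute.
    return sum(n * counts_j[m] for m, n in counts_i.items())

# The two sequences share exactly one 3-mer, "ACD".
print(spectrum_kernel("ACDEF", "ACDKL", k=3))  # prints 1
```

Because only k-mers that actually occur in a sequence are stored, the cost is linear in the sequence lengths rather than in the size of the full k-mer alphabet.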
Kernel methods Dealing with graphs
What is a graph?
Graph
A graph is a set of interconnected objects, called vertices (or nodes), that are connected through edges.
Graphs can show the structure of an object or interactions between different objects.
Graphs are important in bioinformatics!
Kernel methods Dealing with graphs
Comparing nodes within a graph
Diffusion kernel
Constructs a similarity between vertices within the same graph.
Based on performing a random walk on the graph.
Captures the long-range relationships between vertices.
Inspired by the heat equation: the kernel quantifies how quickly `heat' can spread from one node to another.
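A common concrete form of the diffusion kernel (Kondor and Lafferty's) is the matrix exponential of the negative graph Laplacian; a NumPy sketch under that assumption, with a toy three-node path graph as illustration:

```python
import numpy as np

def diffusion_kernel(A, beta=1.0):
    """Diffusion kernel K = exp(-beta * L), with L = D - A the Laplacian
    of the undirected graph given by adjacency matrix A."""
    L = np.diag(A.sum(axis=1)) - A       # graph Laplacian
    w, V = np.linalg.eigh(L)             # L is symmetric, so eigh applies
    # Matrix exponential via the eigendecomposition of L.
    return V @ np.diag(np.exp(-beta * w)) @ V.T

# Path graph 1-2-3: heat spreads from node 1 to its neighbour 2
# faster than to the distant node 3.
A = np.array([[0., 1., 0.],
              [1., 0., 1.],
              [0., 1., 0.]])
K = diffusion_kernel(A, beta=0.5)
print(K[0, 1] > K[0, 2])  # prints True
```

The parameter beta plays the role of diffusion time: larger values let the `heat' spread further, smoothing the similarities over the graph.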
Kernel methods Dealing with graphs
Comparing two separate graphs
Graph kernel
Constructs a similarity between graphs.
Also based on performing random walks on both graphs and counting the number of matching walks.
Usually very computationally demanding!
Applications in chemoinformatics and in structural bioinformatics.
Kernel methods Other kernels
Kernels using fingerprints
Objects that can be described by a long binary vector x can be compared with the Tanimoto kernel:

$$K_{\mathrm{Tan}}(\mathbf{x}_m, \mathbf{x}_n) = \frac{\langle \mathbf{x}_m, \mathbf{x}_n \rangle}{\langle \mathbf{x}_m, \mathbf{x}_m \rangle + \langle \mathbf{x}_n, \mathbf{x}_n \rangle - \langle \mathbf{x}_m, \mathbf{x}_n \rangle}$$

[Figure: fingerprint representation of a molecule]
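For binary vectors the three inner products are just bit counts, so the Tanimoto kernel is the number of shared set bits over the number of bits set in either vector; a minimal sketch, with illustrative toy fingerprints:

```python
def tanimoto_kernel(x_m, x_n):
    """Tanimoto kernel for binary fingerprint vectors (sequences of 0/1)."""
    dot_mn = sum(a * b for a, b in zip(x_m, x_n))  # bits set in both
    dot_mm = sum(a * a for a in x_m)               # bits set in x_m
    dot_nn = sum(b * b for b in x_n)               # bits set in x_n
    return dot_mn / (dot_mm + dot_nn - dot_mn)

# Two fingerprints with 3 set bits each, 2 of which they share:
# 2 / (3 + 3 - 2) = 0.5
print(tanimoto_kernel([1, 1, 0, 1, 0], [1, 1, 1, 0, 0]))  # prints 0.5
```

The value is always in [0, 1], equalling 1 exactly when the two fingerprints are identical.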
Kernel methods Other kernels
Kernels for other objects
Kernels for texts: often based on word counts (example: medical papers)
Kernels for point clouds (example: using the 3D structure of proteins)
Fisher kernels: use information from a generative model (example: using a hidden Markov model)
Learning relations
Learning relations Kronecker kernels
Define the vectorization operator (stacking the columns):

$$\mathrm{vec}(A) = \begin{bmatrix} a_{11} \\ a_{21} \\ a_{12} \\ a_{22} \end{bmatrix}$$

and the Kronecker product:

$$A \otimes B = \begin{bmatrix}
a_{11}b_{11} & a_{11}b_{12} & a_{12}b_{11} & a_{12}b_{12} \\
a_{11}b_{21} & a_{11}b_{22} & a_{12}b_{21} & a_{12}b_{22} \\
a_{21}b_{11} & a_{21}b_{12} & a_{22}b_{11} & a_{22}b_{12} \\
a_{21}b_{21} & a_{21}b_{22} & a_{22}b_{21} & a_{22}b_{22}
\end{bmatrix}$$

Key equation: $(B^\top \otimes A)\,\mathrm{vec}(X) = \mathrm{vec}(AXB)$
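The key equation is easy to verify numerically; a small NumPy check, using the column-stacking vec (NumPy's `order="F"`) and random 2 × 2 matrices:

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((2, 2))
X = rng.standard_normal((2, 2))
B = rng.standard_normal((2, 2))

def vec(M):
    # Column-stacking vectorization: stack the columns of M into one vector.
    return M.reshape(-1, order="F")

# Key equation: (B^T kron A) vec(X) == vec(A X B)
lhs = np.kron(B.T, A) @ vec(X)
rhs = vec(A @ X @ B)
print(np.allclose(lhs, rhs))  # prints True
```

This identity is what makes Kronecker kernels tractable: matrix-vector products with the large Kronecker matrix reduce to two small matrix multiplications.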
Learning relations Kronecker kernels
Kernels for pairs of objects

Pairwise kernel
Combine the kernel matrices of the individual objects to construct a kernel matrix for pairs of objects, e.g. the Kronecker kernel K⊗ = K ⊗ G, with K and G the kernel matrices of the two object types.

Introductory example (chemogenomics): modelling the binding interactions between a set of proteins and a database of ligands to aid the process of drug design. Kernel methods can be used to model pairwise relations between different types of objects.

By optimizing a ranking loss, the algorithms can also be used for conditional ranking. In short, the framework is ideally suited for bioinformatics challenges:
efficient learning
can handle complex objects (graphs, trees, sequences...)
ability to deal with information retrieval problems

(Slide based on a poster by Michiel Stock, Willem Waegeman and Bernard De Baets, KERMIT, Department of Mathematical Modelling, Statistics and Bioinformatics.)
Learning relations Kronecker kernels
Kernel ridge regression for relations

Setting:
N objects of type U (e.g. proteins)
M objects of type V (e.g. ligands)
Y: the N × M label matrix (e.g. molecular interactions)
K: the N × N kernel matrix for objects of type U
G: the M × M kernel matrix for objects of type V

Set y = vec(Y) and K⊗ = K ⊗ G. We can just use the usual kernel ridge regression:

$$\arg\min_{\mathbf{a}} \; (\mathbf{y} - K_\otimes \mathbf{a})^\top (\mathbf{y} - K_\otimes \mathbf{a}) + \lambda \, \mathbf{a}^\top K_\otimes \mathbf{a}$$

This is equivalent to solving the following linear system:

$$(K_\otimes + \lambda I_{NM}) \, \mathbf{a} = \mathbf{y}$$
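The linear system above can be sketched directly in NumPy. Note this naive version forms the NM × NM Kronecker matrix explicitly, which is only feasible for small data sets; efficient implementations exploit the Kronecker structure instead. The helper name and the toy labels are illustrative:

```python
import numpy as np

def pairwise_ridge(K, G, Y, lam=1.0):
    """Solve (K_kron + lam * I) a = vec(Y) with K_kron = K kron G.
    Row-major vec(Y) pairs entry (i, j) with np.kron(K, G)'s index i*M + j."""
    N, M = Y.shape
    K_kron = np.kron(K, G)                # (NM x NM) pairwise kernel matrix
    y = Y.reshape(-1)                     # vec(Y), row-major to match np.kron
    return np.linalg.solve(K_kron + lam * np.eye(N * M), y)

# Sanity check: with identity kernels the solution is simply y / (1 + lam).
Y = np.array([[0.2, 0.6],
              [0.8, 0.3]])
a = pairwise_ridge(np.eye(2), np.eye(2), Y, lam=1.0)
print(np.allclose(a, Y.reshape(-1) / 2.0))  # prints True
```

Predictions for all pairs are then obtained as K_kron @ a, reshaped back to an N × M matrix.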
Learning relations Conditional ranking
Conditional ranking

Motivation
Suppose one is not particularly interested in the exact value of the interaction, but in the order of the proteins for a given ligand.

[Figure: for each query, the database objects are ordered from more relevant to less relevant]
Learning relations Conditional ranking
Conditional ranking
Suppose e = (u, v) ∈ E = U × V.
Train the model

$$h(e) = \mathbf{w}^\top \Phi(e) = \sum_{\bar{e} \in E} a_{\bar{e}} \, K_\otimes(e, \bar{e})$$

by solving

$$A(T) = \arg\min_{h \in \mathcal{H}} \; \mathcal{L}(h, T) + \lambda \|h\|^2_{\mathcal{H}},$$

where we use a ranking loss:

$$\mathcal{L}(h, T) = \sum_{u, u' \in \mathcal{U}} \sum_{v, v' \in \mathcal{V}} \left( y_{u,v} - y_{u',v'} - h(u, v) + h(u', v') \right)^2.$$

[Figure: example of a preference multi-graph; reproduced from Pahikkala et al. (2010)]

The framework is based on the Kronecker product kernel, which yields implicit joint feature representations of queries and the sets of objects to be ranked. Exactly this kernel construction allows a straightforward extension of the existing framework to dyadic relations and multi-task learning.
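Writing d = vec(Y) − vec(H) for the residuals, the ranking loss above is the sum over all index pairs of (d_p − d_q)², which expands to 2n·Σd² − 2(Σd)², so it never needs the quartic double sum. A small NumPy sketch of this shortcut, with illustrative toy labels:

```python
import numpy as np

def ranking_loss(Y, H):
    """Squared ranking loss sum_{p,q} (y_p - y_q - h_p + h_q)^2 over all
    pairs of pairs, computed in O(n) via the residuals d = vec(Y) - vec(H):
    sum_{p,q} (d_p - d_q)^2 = 2*n*sum(d^2) - 2*(sum(d))^2."""
    d = (Y - H).reshape(-1)
    n = d.size
    return 2 * n * np.dot(d, d) - 2 * d.sum() ** 2

# Perfect predictions incur zero ranking loss.
Y = np.array([[0.2, 0.6],
              [0.8, 0.3]])
print(ranking_loss(Y, Y))  # prints 0.0
```

Note the loss only penalizes differences between residuals: adding a constant to all predictions leaves it unchanged, which is exactly what one wants when only the ordering matters.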
Predicting enzyme function
Predicting enzyme function
The data set
Data:
two data sets of ca. 1600 enzymes with 21 different functions
five different similarity measures of the active site

[Figure: active site of an enzyme]
Predicting enzyme function
The enzyme commission number
Predicting enzyme function Defining the problem
Conditional ranking of enzymes

Ranking enzymes
For an unannotated enzyme, rank the annotated enzymes so that the top has a similar function w.r.t. the query.

Minimize the ranking error: the number of switches needed for a perfect ranking.

Example: suppose one has an enzyme with unknown function (EC ?.?.?.?) and obtains the ranking:
1 EC 2.7.7.12
2 EC 2.7.7.12
3 EC 2.7.7.34
4 EC 2.7.1.12
5 EC 2.7.7.34
6 EC 4.2.3.90
7 EC 1.14.11
8 EC 4.6.1.11
⇒ predicted function: EC 2.7.7.12
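The ranking error above, the number of switches of adjacent items needed to reach a perfect ranking, is exactly the number of inversions in the returned list, which a bubble sort counts directly. A minimal sketch; the relevance scores (higher = functionally more similar to the query) are illustrative:

```python
def ranking_error(relevance):
    """Number of adjacent switches needed to sort a returned list by true
    relevance, highest first; i.e. the number of inversions in the list."""
    swaps = 0
    items = list(relevance)
    for i in range(len(items)):
        for j in range(len(items) - 1 - i):
            if items[j] < items[j + 1]:   # less relevant ranked above more relevant
                items[j], items[j + 1] = items[j + 1], items[j]
                swaps += 1
    return swaps

# One switch is needed: the item with relevance 2 sits above one with relevance 3.
print(ranking_error([3, 2, 3, 1, 1]))  # prints 1
```

A perfectly ordered list gives an error of 0, and a fully reversed list the maximum of n(n-1)/2.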
Predicting enzyme function Defining the problem
Learning the catalytic similarity

pair of enzymes: e = (v, v')
label y_e ∈ {0, 1, 2, 3, 4}: the catalytic similarity
five different structural similarities: K(v, v')

Catalytic similarity between enzymes:

     A  B  C  D  E  F  G
A    4  4  0  0  0
B    4  4  0  0  0
C    0  0  4  2  1
D    0  0  2  4  3
E    0  0  1  3  4
F
G
Predicting enzyme function Results
Qualitative improvement in the enzyme similarities
Example for the CavBase structural similarity:

[Figure: enzyme similarity heatmaps for the unsupervised method, the supervised method and the ground truth; lighter colour = higher similarity]
Predicting enzyme function Results
Five different structural similarity measures: unsupervised and supervised

[Figure: ROC curves (average true positive rate vs. false positive rate) for the five enzyme similarity measurements of data set I, unsupervised and supervised: CB, FP, LPCS, MCS and SW]

Improvement
Increase of AUC from ca. 0.7 to more than 0.8!
Conclusions
kernels can be used to work with structured objects...
... and can encode your prior knowledge
many problems in computational biology can be seen as `learning relations'
relations between objects can be learned elegantly and efficiently using Kronecker kernels