Bioinformatics kernels relations

  1. Kernel Methods and Relational Learning in Bioinformatics
     ir. Michiel Stock, Dr. Willem Waegeman, Prof. dr. Bernard De Baets
     KERMIT, Faculty of Bioscience Engineering, Ghent University
     November 2012
     ir. Michiel Stock (KERMIT) Kernels for Bioinformatics November 2012 1 / 40
  2. Outline
     1. Introduction
     2. Kernel methods
     3. Learning relations
     4. Case studies: enzyme function prediction, protein-ligand interactions, microbial ecology
     5. Conclusions
  3. Introduction: Introductory example
     Problem statement: predict protein-protein interactions based on high-throughput data.
     • Based on a gold standard
     • Typical features that can be used: yeast two-hybrid, Pfam profile, phylogenetic profile, localization, PSI-BLAST, expression, ...
  4. Introduction: Machine learning is widely used in bioinformatics
     [Figure 1, from Larrañaga et al.: Classification of the topics where machine learning methods are applied.]
  5. Introduction: Bioinformatics deals with complex data
     Bioinformatics data is typically:
     • high-dimensional (e.g., microarrays or proteomics data)
     • structured (e.g., gene sequences, small molecules, interaction networks, phylogenetic trees...)
     • heterogeneous (e.g., vectors, sequences, graphs to describe the same protein)
     • available in large quantities (e.g., more than $10^6$ known protein sequences)
     • noisy (e.g., many features are not relevant)
  6. Kernel methods: Formal definition of a kernel
     Kernels are non-linear functions defined over objects $x \in X$.
     Definition: a function $k : X \times X \to \mathbb{R}$ is called a positive definite kernel if it is symmetric, that is, $k(x, x') = k(x', x)$ for any two objects $x, x' \in X$, and positive semi-definite, that is,
     $$\sum_{i=1}^{N} \sum_{j=1}^{N} c_i c_j k(x_i, x_j) \geq 0$$
     for any $N > 0$, any choice of $N$ objects $x_1, \ldots, x_N \in X$, and any choice of real numbers $c_1, \ldots, c_N \in \mathbb{R}$.
     Kernels can be seen as generalized covariances.
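The two conditions in the definition above can be checked numerically on a finite sample. The sketch below is only an illustration: the Gaussian (RBF) kernel, the random data and the helper names `rbf_kernel_matrix` and `is_positive_semidefinite` are assumptions made here and do not appear on the slide.

```python
import numpy as np

def rbf_kernel_matrix(X, gamma=1.0):
    """Gram matrix of the Gaussian kernel k(x, x') = exp(-gamma * ||x - x'||^2).

    The Gaussian kernel is an illustrative choice, not one named on the slide.
    """
    sq_norms = np.sum(X ** 2, axis=1)
    sq_dists = sq_norms[:, None] + sq_norms[None, :] - 2.0 * X @ X.T
    # Clip tiny negative values caused by floating-point rounding.
    return np.exp(-gamma * np.maximum(sq_dists, 0.0))

def is_positive_semidefinite(K, tol=1e-8):
    """Check symmetry and that every eigenvalue is >= -tol."""
    if not np.allclose(K, K.T):
        return False
    return bool(np.linalg.eigvalsh(K).min() >= -tol)

rng = np.random.default_rng(0)
X = rng.normal(size=(20, 3))      # N = 20 hypothetical objects
K = rbf_kernel_matrix(X)

# The double sum in the definition is the quadratic form c^T K c.
c = rng.normal(size=20)
quad_form = float(c @ K @ c)
```

A matrix such as `[[0, 1], [1, 0]]` is symmetric but has eigenvalue -1, so it fails the same check and cannot be a kernel matrix.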
  7. Kernel methods: Interpretation of kernels
     Suppose an object $x$ has an implicit feature representation $\phi(x) \in F$.
     A kernel function can be seen as a dot product in this feature space:
     $$k(x, x') = \langle \phi(x), \phi(x') \rangle$$
     [Figure: mapping $\phi$ from the object space $X$ to the feature space $F$.]
     Linear models in this feature space $F$ can be made:
     $$y(x) = w^T \phi(x) = \sum_n a_n k(x_n, x)$$
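As a small worked example of the identity above, the sketch below uses the homogeneous quadratic kernel in two dimensions, for which the feature map can be written out explicitly. The kernel choice, the data and the coefficients `a` are assumptions picked for illustration.

```python
import numpy as np

def phi(x):
    """Explicit feature map of k(x, z) = (x . z)^2 for 2-dimensional x."""
    x1, x2 = x
    return np.array([x1 * x1, np.sqrt(2.0) * x1 * x2, x2 * x2])

def k_quad(x, z):
    """Homogeneous quadratic kernel, evaluated without building phi."""
    return float(np.dot(x, z)) ** 2

x = np.array([1.0, 2.0])
z = np.array([3.0, -1.0])
explicit = float(phi(x) @ phi(z))   # dot product in feature space
implicit = k_quad(x, z)             # kernel evaluation in input space

# Dual form of a linear model: with w = sum_n a_n phi(x_n),
# w^T phi(x) equals sum_n a_n k(x_n, x).
X_train = np.array([[1.0, 2.0], [0.0, 1.0], [3.0, -1.0]])
a = np.array([0.5, -1.0, 2.0])      # hypothetical dual coefficients
x_new = np.array([2.0, 1.0])
w = sum(a_n * phi(x_n) for a_n, x_n in zip(a, X_train))
primal = float(w @ phi(x_new))
dual = sum(a_n * k_quad(x_n, x_new) for a_n, x_n in zip(a, X_train))
```

The dual form never needs `phi`, which is what makes kernels usable for objects whose feature space is very large or only implicitly defined.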
  8. Kernel methods: Many kernel methods exist
     Examples of popular kernel methods:
     • Support vector machine (SVM)
     • Regularized least squares (RLS)
     • Kernel principal component analysis (KPCA)
     The learning algorithm is independent of the kernel representation!
     [Figures: SVM; KPCA.]
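To illustrate why the learning algorithm is independent of the kernel representation, here is a minimal sketch of regularized least squares in dual form. It only ever sees kernel matrices, never the objects themselves. The function names, the toy data and the regularization convention (adding `lam` times the identity) are assumptions; other texts scale the regularizer differently.

```python
import numpy as np

def fit_kernel_rls(K_train, y, lam=1.0):
    """Solve (K + lam * I) a = y for the dual coefficients a."""
    n = K_train.shape[0]
    return np.linalg.solve(K_train + lam * np.eye(n), y)

def predict_kernel_rls(K_test_train, a):
    """Predict with y(x) = sum_n a_n k(x_n, x)."""
    return K_test_train @ a

# Toy data with a linear kernel; any other valid kernel matrix
# could be passed to the same two functions unchanged.
X = np.array([[0.0], [1.0], [2.0], [3.0]])
y = np.array([0.0, 1.0, 2.0, 3.0])
K_lin = X @ X.T
a = fit_kernel_rls(K_lin, y, lam=1e-3)

x_new = np.array([[4.0]])
pred = predict_kernel_rls(x_new @ X.T, a)
```

Swapping in a sequence, graph or fingerprint kernel only changes how `K_train` and `K_test_train` are computed.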
  9. Kernel methods: Kernels for (protein) sequences
     Spectrum kernel (SK): the SK considers the number of $k$-mers $m$ that two sequences $s_i$ and $s_j$ have in common:
     $$SK_k(s_i, s_j) = \sum_{m \in \Sigma^k} N(m, s_i) \, N(m, s_j)$$
     with $N(m, s)$ the number of occurrences of $k$-mer $m$ in sequence $s$.
     • Used to predict structure, function... of DNA, RNA or proteins.
     • A discriminative alternative to hidden Markov models.
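The spectrum kernel formula above can be sketched in a few lines. The helper names `kmer_counts` and `spectrum_kernel` are hypothetical, and the toy sequences are made up for illustration.

```python
from collections import Counter

def kmer_counts(seq, k):
    """Count every overlapping k-mer in seq, i.e. N(m, s) for all m."""
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))

def spectrum_kernel(s_i, s_j, k):
    """Sum over shared k-mers m of N(m, s_i) * N(m, s_j)."""
    counts_i = kmer_counts(s_i, k)
    counts_j = kmer_counts(s_j, k)
    # k-mers absent from s_j contribute 0 (Counter returns 0 for them).
    return sum(n * counts_j[m] for m, n in counts_i.items())

# "ABAB" has 2-mers AB (x2), BA (x1); "BABA" has BA (x2), AB (x1).
value = spectrum_kernel("ABAB", "BABA", 2)   # 2*1 + 1*2 = 4
```

Only k-mers that actually occur are stored, so the sum over all of $\Sigma^k$ never has to be enumerated.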
  10. Kernel methods: Kernels for graphs (1)
      Graph: a set of objects, called vertices (or nodes), that are connected through edges.
      Graphs can show the structure of an object or interactions between different objects.
      Graphs are important in bioinformatics!
  11. Kernel methods: Kernels for graphs (2)
      Graph kernel: constructing a similarity between graphs.
      • Based on performing a random walk on both graphs and counting the number of matching walks.
      • Usually very computationally demanding!
      [Figures: an example in chemoinformatics; an example in structural bioinformatics.]
  12. Kernel methods: Kernels for graphs (3)
      Diffusion kernel: constructing a similarity between vertices within the same graph.
      • Also based on performing a random walk on a graph.
      • Captures the long-range relationships between vertices.
      • Inspired by the heat equation: the kernel quantifies how quickly 'heat' can spread from one node to another.
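The slide gives no formula for the diffusion kernel. The sketch below assumes one common form, the matrix exponential $K = \exp(-\beta L)$ of the graph Laplacian $L = D - A$, computed through an eigendecomposition. The parameter name `beta` and the toy path graph are also assumptions.

```python
import numpy as np

def diffusion_kernel(adjacency, beta=1.0):
    """K = exp(-beta * L) with L = D - A the (unnormalized) graph Laplacian."""
    A = np.asarray(adjacency, dtype=float)
    L = np.diag(A.sum(axis=1)) - A
    # L is symmetric, so exp(-beta * L) = U diag(exp(-beta * lambda)) U^T.
    eigvals, eigvecs = np.linalg.eigh(L)
    return (eigvecs * np.exp(-beta * eigvals)) @ eigvecs.T

# Path graph 0 - 1 - 2 - 3: 'heat' placed on vertex 0 reaches
# nearby vertices faster than distant ones.
A = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]])
K = diffusion_kernel(A, beta=0.5)
```

Vertices 0 and 3 share no edge, yet their kernel value is non-zero, which is the long-range behaviour mentioned on the slide. Larger `beta` lets the heat spread further.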
  13. Kernel methods: Kernels for fingerprints
      Objects that can be described by a long binary vector $x$ (a fingerprint) can be compared with the Tanimoto kernel:
      $$K_{Tan}(x_m, x_n) = \frac{\langle x_m, x_n \rangle}{\langle x_m, x_m \rangle + \langle x_n, x_n \rangle - \langle x_m, x_n \rangle}$$
      [Figure: fingerprint representation of an object.]
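A minimal sketch of the Tanimoto kernel for binary fingerprints follows. The function name and the example vectors are hypothetical, and returning 0 when both fingerprints are all zeros is a convention assumed here, since the slide does not cover that case.

```python
import numpy as np

def tanimoto_kernel(x_m, x_n):
    """<x_m, x_n> / (<x_m, x_m> + <x_n, x_n> - <x_m, x_n>)."""
    x_m = np.asarray(x_m, dtype=float)
    x_n = np.asarray(x_n, dtype=float)
    inner = x_m @ x_n
    denom = x_m @ x_m + x_n @ x_n - inner
    # Assumed convention: two empty fingerprints get similarity 0.
    return float(inner / denom) if denom > 0 else 0.0

fp_a = [1, 1, 0, 1, 0]   # 3 bits set
fp_b = [1, 0, 0, 1, 1]   # 3 bits set, 2 shared with fp_a
similarity = tanimoto_kernel(fp_a, fp_b)   # 2 / (3 + 3 - 2) = 0.5
```

For binary vectors this is the number of shared bits divided by the number of bits set in either fingerprint.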
  14. Learning relations: Kernels for pairs of objects
      Problem statement: predict the binding interaction between a given protein and a ligand (small molecule).
      The problem deals with two types of objects:
      • Proteins (graph kernel of structure, sequence kernel, fingerprints...)
      • Ligands (fingerprints, graph kernel...)
      The label is for a pair of objects.
      [Figure labels: Learning; Molecular docking.]
  15. Learning relations: Kernels for pairs of objects (example)
      Pairwise kernel: combine the kernel matrices of the individual objects to construct a kernel matrix for pairs of objects.
      Starting from individual kernels for the proteins and ligands:
      [Figure: Data set → Object kernels → Pairwise kernel → Learning algorithm (SVM, RLS, ...).]
      [Overlaid poster excerpt, partly truncated: "Learning and Ranking Algorithms for Bioinformatics Applications", Michiel Stock, Willem Waegeman, Bernard De Baets, KERMIT, Department of Mathematical Modelling, Statistics and Bioinformatics. Introductory example (chemogenomics): binding interactions between a set of proteins and a database of ligands to aid the process of drug […]. Kernel methods allow for the […] to model pairwise relations between different types of objects. By optimizing a ranking loss, our algorithms can also be used for conditional ranking. In short, our framework is ideally suited for bioinformatics challenges: efficient learning process; can handle complex objects (graphs, trees, sequences...); ability to deal with information retrieval problems.]
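One way to combine two object kernels into a kernel on pairs is the Kronecker product, which the deck mentions on a later slide. The sketch below assumes that construction. The kernel values and the helper `pair_index` are made up for illustration.

```python
import numpy as np

# Hypothetical object kernels: 2 proteins and 3 ligands.
K_protein = np.array([[1.0, 0.5],
                      [0.5, 1.0]])
K_ligand = np.array([[1.0, 0.2, 0.0],
                     [0.2, 1.0, 0.3],
                     [0.0, 0.3, 1.0]])

# Kronecker product pairwise kernel:
# K_pair[(p, l), (p', l')] = K_protein[p, p'] * K_ligand[l, l'].
K_pair = np.kron(K_protein, K_ligand)

def pair_index(p, l, n_ligands):
    """Row/column of the (protein p, ligand l) pair in K_pair."""
    return p * n_ligands + l

i = pair_index(0, 1, n_ligands=3)   # pair (protein 0, ligand 1)
j = pair_index(1, 2, n_ligands=3)   # pair (protein 1, ligand 2)
value = K_pair[i, j]                # 0.5 * 0.3
```

The pairwise matrix has one row per protein-ligand pair, so it grows quickly. Practical implementations usually avoid forming it explicitly, which this sketch does not attempt.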
  16. Learning relations: Conditional ranking (1)
      Motivation: suppose one is not particularly interested in the exact value of the interaction but in the order of the proteins for a given ligand.
      [Figure: database objects ranked by relevance for Query 1 and for Query 2.]
  17. Learning relations: Conditional ranking (2)
      Based on a graph description, with $e$ a pair of objects.
      Train the model:
      $$h(e) = \langle w, \Phi(e) \rangle = \sum_{\bar{e} \in E} a_{\bar{e}} K^{\Phi}(e, \bar{e})$$
      using the algorithm:
      $$A(T) = \operatorname{argmin}_{h \in H} L(h, T) + \lambda \|h\|_H^2,$$
      where we use a ranking loss:
      $$L(h, T) = \sum_{v \in V} \sum_{e, \bar{e} \in E_v} (y_e - y_{\bar{e}} - h(e) + h(\bar{e}))^2.$$
      [Figure 1, reproduced from (Pahikkala et al., 2010), caption truncated: Example of a multi-graph. If this graph, on the left, would be used fo[…] conditioned on C, then A scores better than E, which ranks higher than E, w[…] higher than D and D ranks higher than B. There is no information about the re[…] and G, respectively, our model could be used to include these two instances in […] are available. Notice that in this setting unconditional ranking of these objects […] graph is obviously intransitive.]
      [Overlaid text, truncated: The proposed framework is based on the Kronecker product ke[…] implicit joint feature representations of queries and the sets of ob[…]. It has been proposed independently by three […] modeling pairwise inputs in different application domains (Basilico et al. 2004, Ben-Hur et al. 2005).]
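The ranking loss above compares differences of labels with differences of predictions, and only within the set $E_v$ of pairs that share the same query $v$. The sketch below evaluates it for given predictions. The function name, the data layout and the choice to count each unordered pair $\{e, \bar{e}\}$ once are assumptions.

```python
from itertools import combinations

def conditional_ranking_loss(queries, labels, scores):
    """Sum of (y_e - y_e' - h(e) + h(e'))^2 over pairs sharing a query.

    queries[i] is the query v that edge i is conditioned on,
    labels[i] is y_e and scores[i] is the prediction h(e).
    Each unordered pair is counted once (an assumed convention).
    """
    loss = 0.0
    for i, j in combinations(range(len(queries)), 2):
        if queries[i] == queries[j]:
            loss += (labels[i] - labels[j] - scores[i] + scores[j]) ** 2
    return loss

queries = ["v1", "v1", "v1", "v2", "v2"]
labels = [3.0, 1.0, 2.0, 0.0, 1.0]

# Predictions shifted by a constant per query keep every difference intact.
shifted = [2.5, 0.5, 1.5, 4.0, 5.0]
loss_shifted = conditional_ranking_loss(queries, labels, shifted)

# Swapping the two predictions for query v2 reverses their order.
swapped = [2.5, 0.5, 1.5, 5.0, 4.0]
loss_swapped = conditional_ranking_loss(queries, labels, swapped)
```

Because only differences enter the loss, a model can be off by a constant for each query and still incur no penalty, which matches the motivation of caring about the order rather than the exact value.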
  18. Case studies, Enzyme function prediction: Predicting enzyme function
      Problem statement: predict the function (EC number) of an enzyme using structural information of the active site.
      Data:
      • 1730 enzymes with 21 different functions
      • four different structural similarities: CavBase, maximum common subgraph, labeled point cloud superposition, fingerprints
      [Figure: active site of an enzyme.]
  19. Case studies, Enzyme function prediction: EC numbers
      EC number: a functional label of an enzyme, based on the reaction that is catalyzed.
      Example: EC 2.7.6.1 = ribose-phosphate diphosphokinase
  20. Case studies, Enzyme function prediction: Defining catalytic similarity
      Catalytic similarity: the number of successive equal digits in the EC numbers of two enzymes, starting from the first digit.
      [Figure: graph of the enzymes EC 2.7.7.34, EC 4.2.3.90, EC 4.6.1.11, EC 2.7.1.12 and EC 2.7.7.12 and a query enzyme EC ?.?.?.?, with edges labeled by catalytic similarity (values from 0 to 3).]
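The definition of catalytic similarity translates directly into code. The sketch below uses a hypothetical function name and assumes EC numbers are written as on the slide, for example "EC 2.7.7.34".

```python
def catalytic_similarity(ec_a, ec_b):
    """Number of successive equal EC digits, starting from the first."""
    digits_a = ec_a.replace("EC", "").strip().split(".")
    digits_b = ec_b.replace("EC", "").strip().split(".")
    count = 0
    for a, b in zip(digits_a, digits_b):
        if a != b:
            break          # stop at the first mismatch
        count += 1
    return count

# Enzymes taken from the figure on the slide.
same_subsubclass = catalytic_similarity("EC 2.7.7.34", "EC 2.7.7.12")
same_subclass = catalytic_similarity("EC 2.7.1.12", "EC 2.7.7.12")
same_class = catalytic_similarity("EC 4.2.3.90", "EC 4.6.1.11")
different_class = catalytic_similarity("EC 2.7.7.34", "EC 4.2.3.90")
```

Note that EC 2.7.1.12 and EC 2.7.7.12 share their last digit but still score 2, because counting stops at the first mismatch.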
