PMML for QSAR Model Exchange
Usage Rights

© All Rights Reserved

    Presentation Transcript

    • PMML for QSAR Model Exchange
        Rajarshi Guha, Ph.D.
        NIH Center for Advancing Translational Sciences
        guhar@mail.nih.gov / http://rguha.net
    • Background
        • Cheminformatics
            – QSAR, diversity analysis, virtual screening, fragments, polypharmacology, networks
        • RNAi screening, high content imaging
        • Extensive use of machine learning
        • All tied together with software development (GUIs, libraries)
        • Contributed pmml.lm to the PMML package
    • Quantitative Structure Activity Relationships
    • Why is QSAR Useful?
        • Lets us predict whether a chemical is likely to be toxic, avoiding animal testing
        • Prioritize molecules from a high throughput screen of 300K molecules
        • Predict whether a molecule will be (sufficiently) soluble in water
        • Identify molecules with anti-malarial properties
        • Accurate, predictive models can save significant time and money (and cute bunnies)
    • Lots and Lots of Models
        • Hundreds of such models published in the literature
            – Usually in the form of tables of regression coefficients (if we’re lucky)
            – If the paper describes an SVM model, no chance of reproducing the results
        • How can we exchange QSAR models?
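To make the "table of regression coefficients" case concrete, here is a minimal R sketch of applying a published linear model by hand. The coefficient values and descriptor rows are made up for illustration (they come from no paper), and the descriptor names are borrowed from the rcdk example later in the talk. Even in this best case, whoever reuses the model still has to produce the descriptor values themselves.

```r
# Hypothetical coefficients, as they might be copied from a paper's table
coefs <- c("(Intercept)" = 250.0, khs.sCH3 = 12.5, TopoPSA = 1.8)

# Hypothetical descriptor values for two new molecules
newdata <- data.frame(khs.sCH3 = c(2, 0), TopoPSA = c(20.2, 37.3))

# Build the design matrix (this adds the intercept column),
# then multiply by the coefficients in matching column order
X <- model.matrix(~ khs.sCH3 + TopoPSA, data = newdata)
pred <- drop(X %*% coefs[colnames(X)])
print(pred)
```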
    • QSAR Model Exchange
        • Build models in …
        • Save them in PMML
        • Distribute
        • …
        • Profit?
            – Not always
        The bottleneck is evaluating descriptors for the new observations to supply to the model
    • Cheminformatics in R
        • rcdk provides cheminformatics support in R
            – Load and parse molecular file formats
            – Evaluate numerical descriptors from chemical structures
        [Diagram labels: rcdk, CDK, Jmol, rpubchem, rJava, fingerprint, XML, R Programming Environment]
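    • Cheminformatics in R
        library(pmml)
        library(rcdk)
        # Boiling point data set shipped with rcdk; column 1 holds SMILES strings
        data(bpdata)
        mols <- parse.smiles(bpdata[, 1])
        # "topological" is a descriptor category name passed to get.desc.names
        descNames <- unique(unlist(sapply("topological",
                                          get.desc.names)))
        descs <- eval.desc(mols, descNames)
        # Linear model of boiling point on four descriptors
        model <- lm(BP ~ khs.sCH3 + khs.sF + TopoPSA + VABC,
                    data.frame(bpdata, descs))
        # Encode the fitted model as PMML
        pmml(model)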
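The slide stops at pmml(model), which only builds the document in memory. A minimal sketch of the "save and distribute" step follows, assuming the XML package's saveXML() is used for serialization. The data frame is a made-up stand-in for bpdata plus descriptors, so the example runs without rcdk, and the file name is arbitrary.

```r
library(pmml)
library(XML)

# Stand-in data: made-up descriptor values and boiling points (not bpdata)
d <- data.frame(BP      = c(300, 320, 345, 360, 390),
                TopoPSA = c(0, 9.2, 20.2, 26.3, 37.3),
                VABC    = c(60, 75, 88, 101, 118))
model <- lm(BP ~ TopoPSA + VABC, data = d)

# Encode the model and write it to a file that can be shared
doc <- pmml(model)
out <- file.path(tempdir(), "bp_model.pmml")
saveXML(doc, file = out)

# Read it back to confirm the root element is <PMML>
root <- xmlRoot(xmlParse(out))
print(xmlName(root))
```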
    • R, rcdk, PMML
        • rcdk, together with the pmml package, provides the means to take in molecules and output a PMML-encoded model
        • One could record the appropriate functions/classes in the document and use that info to evaluate descriptors for new observations
        • Since rcdk is based on the Java CDK library, one could also use jpmml, a Java API for PMML documents
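One way the "record the classes in the document" idea could look is sketched below using PMML's generic Extension element. This is an illustration and not an established convention: the extender and name values are hypothetical, the PMML node is a bare stand-in for the output of pmml(model), and where the Extension elements would have to sit for schema validity is not addressed. The class names are the CDK descriptor classes assumed to produce the khs.*, TopoPSA and VABC columns.

```r
library(XML)

# CDK descriptor classes assumed to back the model terms
descClasses <- c(
  "org.openscience.cdk.qsar.descriptors.molecular.KierHallSmartsDescriptor",
  "org.openscience.cdk.qsar.descriptors.molecular.TPSADescriptor",
  "org.openscience.cdk.qsar.descriptors.molecular.VABCDescriptor")

# One <Extension> element per class (extender/name values are hypothetical)
exts <- lapply(descClasses, function(cls)
  xmlNode("Extension",
          attrs = c(extender = "rcdk", name = "descriptorClass", value = cls)))

# Stand-in for the node returned by pmml(model)
doc <- xmlNode("PMML", attrs = c(version = "4.1"))
doc <- addChildren(doc, kids = exts)

# Consumer side: recover the class names from the document
found <- Filter(function(n) xmlName(n) == "Extension", xmlChildren(doc))
recovered <- unname(sapply(found, xmlGetAttr, "value"))
print(recovered)

# With rcdk available, the recovered names could be passed straight on:
# descs <- eval.desc(mols, recovered)
```

A Java consumer could read the same attributes and instantiate the CDK classes directly, which is where the jpmml route mentioned on the slide would come in.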