Cloud hosted APIs for cheminformatics on mobile devices (ACS Dallas 2014)

Mobile apps for cheminformatics are quite powerful on their own, but can be significantly boosted by connecting them with cloud-hosted functionality. This talk explores the range of functionality that can be covered simply by making use of apps with stateless webservices, i.e. anonymous access without persistent data.


Cloud hosted APIs for cheminformatics designed for real time user interfaces
Alex M. Clark, Ph.D.
March 2014
© 2014 Molecular Materials Informatics, Inc.
http://molmatinf.com

Data Regimes
• Differences in kind based on size:
  - small: <1000 molecules; document-sized
  - medium: <100K; filesystem, heavy duty
  - large: database servers; limited operations
• Nimble client (mobile apps, web) either:
  - operate on small collections
  - limited window onto large collections
• Workflows using medium data are tricky

Overview
• Describing a workflow for tuberculosis; doing scaffold analysis, model building, open data
• Split into:
  - mobile apps as the user interface
  - cloud-hosted algorithms for hard work and access to large data
  - desktop-based sections for medium data
• Mobile+cloud very convenient for small data, and for well established tasks
• Desktop still primary for method development

TB Mobile
• Begins with a mobile app:
  - ~90 curated targets
  - ~800 molecules
• TB inhibition data abundant, but mostly no target info
• Want all the actives against the inhA target (157)
• Generate leads using scaffold analysis

Scaffold Fragments
• What medicinally relevant scaffolds to use?
[Diagram labels: 157 related compounds; scaffoldy fragments; TB activity structures; templatey scaffolds; …]

Filtering Scaffold Candidates
• Candidates analysed & trimmed
• Overall architecture is a stream
[Diagram labels: Read InhA, Fragment, Merge, Sort, Properties, Filter, Write; HeavyAtoms, Isomorphisms, Macrocycles, Frequency; 157 molecules, 124 fragments]

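The stream architecture named on this slide can be sketched with Python generators, where each node consumes rows from the previous one. This is a minimal illustration only: the function names, the row layout, and the toy fragmenter (which simply splits a string on ".") are assumptions made for the example and are not the author's implementation or real chemistry.

```python
from collections import Counter
from typing import Callable, Iterable, Iterator

Row = dict  # one record flowing through the stream (assumed layout)

def read(molecules: Iterable[str]) -> Iterator[Row]:
    """Source node: wrap each input structure in a row."""
    for mol in molecules:
        yield {"Molecule": mol}

def fragment(rows: Iterable[Row], fragmenter: Callable[[str], list]) -> Iterator[Row]:
    """Emit one row per fragment of each incoming structure."""
    for row in rows:
        for frag in fragmenter(row["Molecule"]):
            yield {"Molecule": frag}

def merge_and_sort(rows: Iterable[Row]) -> Iterator[Row]:
    """Collapse identical fragments and emit them most frequent first.
    This node has to see the whole stream before it can emit anything."""
    counts = Counter(row["Molecule"] for row in rows)
    for frag, n in counts.most_common():
        yield {"Molecule": frag, "Frequency": n}

def keep(rows: Iterable[Row], predicate: Callable[[Row], bool]) -> Iterator[Row]:
    """Filter node: pass through only the rows that satisfy the predicate."""
    return (row for row in rows if predicate(row))

def toy_fragmenter(mol: str) -> list:
    """Placeholder only: splits on '.', standing in for real fragmentation."""
    return [part for part in mol.split(".") if part]

# Chain the nodes; nothing runs until the final stage is consumed.
stream = read(["A.B", "B.C", "B"])
stream = fragment(stream, toy_fragmenter)
stream = merge_and_sort(stream)
stream = keep(stream, lambda row: row["Frequency"] >= 2)
result = list(stream)
```

With generators, per-row nodes such as the fragmenter and filter hold only one row at a time, while the merge step is the single point that buffers the whole collection.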
Pipelining
• Not quite cloud (yet)
• Infrastructure for streaming nodes together: build workflows using a script
• Roadmap: build selected workflows, out of prepackaged nodes
• Expose as webservices: for use by mobile apps

Script excerpt shown on the slide:

    {
        "op":"com.mmi.core.op.CollapseUnique",
        "id":102,
        "name":"Collapse",
        "parameters":
        {
            "keyColumn":"Molecule",
            "countColumn":"Degeneracy",
            "collapseColumn":["Target"],
            "collapseOperator":[","]
        },
        "inputs":[[101,1]],
        "outputs":1
    },
    {
        "op":"com.mmi.core.op.Sort",
        "id":103,
        "name":"Collapse",
        "parameters":
        {
            "columns":["Degeneracy"],
            "directions":[-1]
        },
        "inputs":[[102,1]],
        "outputs":1
    },
    {
        "op":"com.mmi.core.op.MoleculeProperties",
        "id":104,
        "name":"Properties",
        "parameters":
        {
            "heavyAtoms":"HeavyAtoms",
            "isomorphisms":"Isomorphisms",
            "macrocycles":"Macrocycles"
        },
        "inputs":[[103,1]],
        "outputs":1
    },
    {
        "op":"com.mmi.core.op.FilterProperties",
        "id":105,
        "name":"Filter",
        "parameters":
        {
            "name":["HeavyAtoms","Isomorphisms","Macrocycles"],
            "operator":[">=","<=","="],
            "value":[10,4,0]
        },
        "inputs":[[104,1]],
        "outputs":1
    },

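To make the FilterProperties parameters in the script excerpt concrete, the following sketch shows one plausible way to evaluate them against a row of computed properties. It assumes the three parallel arrays ("name", "operator", "value") are read position by position and that all conditions must hold; the slide does not state this, and the `passes` function and sample rows are hypothetical.

```python
import operator

# Map the operator strings from the script onto Python comparisons.
OPERATORS = {
    ">=": operator.ge,
    "<=": operator.le,
    "=": operator.eq,
    ">": operator.gt,
    "<": operator.lt,
}

def passes(row: dict, params: dict) -> bool:
    """Return True when the row satisfies every (name, operator, value) triple.
    Combining the conditions with AND is an assumption, not stated on the slide."""
    triples = zip(params["name"], params["operator"], params["value"])
    return all(OPERATORS[op](row[name], value) for name, op, value in triples)

# Parameters copied from the FilterProperties node in the excerpt.
params = {
    "name": ["HeavyAtoms", "Isomorphisms", "Macrocycles"],
    "operator": [">=", "<=", "="],
    "value": [10, 4, 0],
}

# Made-up rows for illustration only.
rows = [
    {"id": "a", "HeavyAtoms": 12, "Isomorphisms": 2, "Macrocycles": 0},
    {"id": "b", "HeavyAtoms": 8, "Isomorphisms": 1, "Macrocycles": 0},
    {"id": "c", "HeavyAtoms": 15, "Isomorphisms": 6, "Macrocycles": 0},
    {"id": "d", "HeavyAtoms": 20, "Isomorphisms": 1, "Macrocycles": 1},
]
kept = [row["id"] for row in rows if passes(row, params)]
```

Under this reading, the node keeps fragments with at least 10 heavy atoms, at most 4 isomorphisms, and no macrocycles.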
Fragmentation
• Consider each structure: break it into pieces, enumerate scaffold-like fragments

Decorating
• Have scaffoldy fragments, 5425 measurements
• Do a trial matching: templates & stats

Scaffold Selection
• Keep molecules based on at least one template
• Output is suitable for the next stage in the workflow
[Diagram labels: Assays, Filter, 5425 molecules, Templates, Precursor; 87 actives, 138 inactives]

SAR Table App
• Back to mobile apps: want to deliver the 225 compounds to iPad/iPhone…
  - email
  - dropbox
  - web
• SAR Table app designed for small documents: content creation, focused analysis, and cloud-assisted functions

Import
• Launch datasheet, draw first scaffold…

Scaffold Assignment
• Ask the webservice to assist: complex, fast

Multi-Scaffold Assignment
• Assign scaffolds in bulk: complex, quite fast

More Data
• Have scaffolds and substituents assigned
• Can gain valuable insight just from that
• What about public databases: what else do our 3 scaffolds match?

Searching
• Search for a template; optionally narrow substituent values; want only new compounds
• Substructure searches farmed out to well known large data services
• Middleware post-processes with scaffold analysis & assignment
[Diagram labels: initiate, MetaSearch, poll; ChemSpider, PubChem]

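The "initiate" and "poll" labels on this slide suggest a client that starts a search and then repeatedly asks for more results. A minimal sketch of such a loop follows. Everything here is hypothetical: the `FakeMetaSearch` class is an in-memory stand-in rather than the real service, the use of a token to identify a search is an assumption (the slides do not say how a search is tracked between calls), and a real client would pause between polls.

```python
import itertools

class FakeMetaSearch:
    """In-memory stand-in for a remote search service (hypothetical)."""

    def __init__(self, hits, batch_size=2):
        self._hits = list(hits)
        self._batch_size = batch_size
        self._cursor = {}
        self._ids = itertools.count(1)

    def initiate(self, template):
        """Start a search and return a token identifying it (assumed design)."""
        token = f"job-{next(self._ids)}"
        self._cursor[token] = 0
        return token

    def poll(self, token):
        """Return (next batch of hits, finished flag)."""
        start = self._cursor[token]
        end = start + self._batch_size
        self._cursor[token] = end
        return self._hits[start:end], end >= len(self._hits)

def run_search(service, template, known, max_polls=100):
    """Collect hits that are neither already known nor duplicated."""
    token = service.initiate(template)
    seen = set(known)
    new_compounds = []
    for _ in range(max_polls):
        batch, finished = service.poll(token)
        for structure in batch:
            if structure not in seen:
                seen.add(structure)
                new_compounds.append(structure)
        if finished:
            break
    return new_compounds

service = FakeMetaSearch(["c1", "c2", "c1", "c3", "c4"])
found = run_search(service, template="scaffold-1", known=["c2"])
```

Polling in small batches lets a user interface show partial results as they arrive, and filtering against the compounds already in the document reflects the "want only new compounds" requirement.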
Results
• Results are marked up
• Uses existing fragments for context
• No duplicate structures
• All compounds are known…
• … can be made or purchased.

Model Building
• Use structures with known activities to create a structure-activity model
• Slow calculation, small data
[Diagram labels: WebService; data, partial model, final model]

Model Application
• Predicted activities for looked-up compounds…

Matrix View
• Plot R1 vs R4: examine second order SAR

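As an illustration of the R1 vs R4 matrix, the sketch below pivots a small table of compounds into a grid keyed by the two substituents and lists the combinations that have no measurement. The field names, substituent labels, and activity values are invented for the example; the app's actual data model is not described in the slides.

```python
def build_matrix(rows, row_key="R1", col_key="R4"):
    """Pivot rows into a (row substituent, column substituent) grid.
    If two rows share the same pair, the later one overwrites the earlier."""
    row_labels = sorted({r[row_key] for r in rows})
    col_labels = sorted({r[col_key] for r in rows})
    cells = {(r[row_key], r[col_key]): r["activity"] for r in rows}
    blanks = [
        (a, b)
        for a in row_labels
        for b in col_labels
        if (a, b) not in cells
    ]
    return row_labels, col_labels, cells, blanks

# Invented data for illustration only.
compounds = [
    {"R1": "H", "R4": "Cl", "activity": 6.1},
    {"R1": "H", "R4": "OMe", "activity": 5.2},
    {"R1": "Me", "R4": "Cl", "activity": 7.0},
]
row_labels, col_labels, cells, blanks = build_matrix(compounds)
```

The `blanks` list holds the substituent pairs with no measured compound, which are the cells a predictive model could be asked to fill.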
Filling in Blanks
• Each blank cell: create & score chimeric structures
• Gather distribution of activities
• Total calculation: slow
• Performance: overhead amortised in blocks (e.g. 10 cells per request)

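The point about amortising overhead in blocks can be sketched as follows: rather than one request per blank cell, the client groups cells into blocks (10 per request, as in the slide's example) and sends each block in a single call. The `score_block` callable stands in for the webservice request and is hypothetical, as are the function names.

```python
def blocks(items, size=10):
    """Yield consecutive slices of at most `size` items."""
    for start in range(0, len(items), size):
        yield items[start:start + size]

def fill_blanks(blank_cells, score_block, size=10):
    """Score all blank cells, one request per block.
    Returns the predictions and the number of requests made."""
    predictions = {}
    requests = 0
    for block in blocks(blank_cells, size):
        predictions.update(score_block(block))
        requests += 1
    return predictions, requests

def fake_score_block(block):
    """Stand-in for the remote call: returns a dummy score for each cell."""
    return {cell: 0.0 for cell in block}

cells = [(f"R1-{i}", f"R4-{i}") for i in range(25)]
predictions, requests = fill_blanks(cells, fake_score_block)
```

For 25 blank cells this makes 3 requests instead of 25, so the fixed cost of each round trip is shared across up to 10 cells, while the interface can still update after every block.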
Matrix Predictions
• Shows measured, available & hypothetical…

Conclusion
• Mobile+cloud can accomplish many sophisticated tasks
• Stateless webservices very easy to deploy
• Work on small datasets, use large databases
• Medium sized data is problematic
• Can fall back to desktop: facile communication
• Apps & webservices very well suited to mature workflow tasks

Acknowledgments
• Sean Ekins, Barry Bunin & CDD
• RSC & ChemSpider, PubChem, ChEBI
• Inquiries to info@molmatinf.com
http://molmatinf.com
http://molsync.com
http://cheminf20.org
@aclarkxyz
