Interactive Datamining of Large-Scale Screening Datasets

16th Darmstädter Molecular Modeling Workshop, Darmstadt, Germany, 2002

Transcript

  • 1. Interactive Datamining of Large-Scale Screening Datasets
      Frank Oellien, Wolf D. Ihlenfeldt (Computer-Chemie-Centrum, University Erlangen-Nuremberg)
      Klaus Engel, Thomas Ertl (Visualization and Interactive Systems Group, University Stuttgart)
      C 3 © Oellien, Ihlenfeldt, Engel, Ertl, MMWS 2002
  • 2. Overview: Multi-variate and multi-dimensional datasets
      • Motivation
      • Information Visualization Techniques
      • Examples (ChemCodes Inc., NCI)
      • Demo
  • 3. Overview: Multi-variate and multi-dimensional datasets
      • Motivation
      • Information Visualization Techniques
      • Examples (ChemCodes Inc., NCI)
      • Demo
  • 4. Chemical data
      [Bar chart of current datasets: Merck Katalog, Synopsys PG, ACX, NCI DTP, ChemInform, Spresi, Beilstein, CAS; value axis from 0 to 18,000,000]
  • 5. Multi-Variate and Multi-Dimensional Numeric Datasets Today
      Change in chemical synthesis technology
      • new technologies (HTS, combinatorial synthesis) → experiments generate terabytes of data per year
      • development of data mining and visualization tools could not keep pace
      • most critical bottleneck in R&D today!
      → tools for interactive mining and information visualization are needed
  • 6. Tools for Interactive Visualization of Multi-Variate and Multi-Dimensional Data
      Standard applications
      • bar charts, 2D and pseudo-3D scatter plots, molecular spreadsheets
      • limited to small subsets
      • platform-dependent
      Our goal: applications that
      • are simple to use
      • allow straightforward interpretation of results
      • give generalized access to tabular numeric data
      • are platform-independent
  • 7. Overview: Multi-variate and multi-dimensional datasets
      • Motivation
      • Information Visualization Techniques
      • Examples (ChemCodes Inc., NCI)
      • Demo
  • 8. 3D Tools for Interactive Information Visualization
      Information visualization applications that use the 3D capabilities of modern clients
      • Glyph-based InfVis approaches
      • Volume-based InfVis approaches
  • 9. Glyph-based InfVis Tools
      • 3 orthogonal axes
      • color
      • shape
      • size
      • transparency
      • surface effects
      • animation
      • up to ~100 Glyphs
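The channel list on slide 9 can be illustrated with a small sketch of how one data record might be mapped onto a glyph. This is not the applet's code: the `Glyph` class, the field names `a` to `e` and `category`, and the particular channel assignment are hypothetical, chosen only to show three variables going to the axes and further variables going to color, size, and shape.

```python
from dataclasses import dataclass


@dataclass
class Glyph:
    x: float
    y: float
    z: float
    color: tuple  # (r, g, b), each component in 0..1
    size: float
    shape: str


def normalize(value, lo, hi):
    """Scale value linearly into 0..1, clamping values outside [lo, hi]."""
    if hi == lo:
        return 0.0
    return min(1.0, max(0.0, (value - lo) / (hi - lo)))


def to_glyph(record, ranges, shapes):
    """Map one record to a glyph (hypothetical channel assignment).

    record: dict with numeric fields a..e and a categorical field 'category'
    ranges: dict field -> (min, max) used for normalization
    shapes: dict category -> shape name
    """
    n = {key: normalize(record[key], *ranges[key]) for key in ranges}
    return Glyph(
        x=n["a"], y=n["b"], z=n["c"],        # three orthogonal axes
        color=(n["d"], 0.0, 1.0 - n["d"]),   # blue (low) to red (high)
        size=0.5 + n["e"],                   # size grows with the fifth variable
        shape=shapes[record["category"]],    # categorical variable picks the shape
    )
```

Transparency, surface effects, and animation would be further channels of the same kind; they are left out here to keep the sketch short.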
  • 10. Java/Java3D InfVis Applet
      [Screenshot labels: Java3D Canvas; Tool Panel (filters, selection tools, details); Control Panel]
  • 11. Java/Java3D InfVis Applet: 3D Render Panel
      • 3D Glyphs
      • 3D Barchart
  • 12. Java/Java3D InfVis Applet: 3D Tool Panel
      • Dynamic Filter Tools
      • Selection Tools
      • Detail Tools
  • 13. Java/Java3D InfVis Applet: 3D Control Panel
  • 14. Advantages of Volume-based InfVis Tools
      Databases with millions of data points
      – Glyph-based InfVis approaches
        • produce millions of geometric primitives
        • interactive visualization not possible
      – Volume-based InfVis approaches
        • can handle large numbers of data points
        • interactive visualization using low-cost graphics hardware is possible
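One way to read the contrast on slide 14 is that a volume-based view first aggregates the data points into a fixed 3D grid, so the amount of data handed to the renderer depends on the grid resolution rather than on the number of points. The slides do not describe how the volume is built; the sketch below assumes simple per-voxel counting, and the function name `bin_points` and its parameters are hypothetical.

```python
from collections import Counter


def bin_points(points, bounds, resolution):
    """Count data points per voxel of a resolution^3 grid.

    points:     iterable of (x, y, z) tuples
    bounds:     three (min, max) pairs, one per axis
    resolution: number of voxels along each axis
    Returns a Counter mapping (i, j, k) voxel indices to point counts.
    """
    counts = Counter()
    for point in points:
        index = []
        for value, (lo, hi) in zip(point, bounds):
            t = (value - lo) / (hi - lo)
            # Clamp so points on or beyond the upper bound fall into the last voxel.
            index.append(min(resolution - 1, max(0, int(t * resolution))))
        counts[tuple(index)] += 1
    return counts
```

However many points are binned, the result holds at most `resolution ** 3` entries, whereas a glyph-based view would need one geometric object per point.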
  • 15. Overview: Multi-variate and multi-dimensional datasets
      • Motivation
      • Information Visualization Techniques
      • Examples (ChemCodes Inc., NCI)
      • Demo
  • 16. ChemCodes Reaction Database
      • 100 most important FGs (~75% of chemistry)
      • 100 standard reactions
      • Limits of standard reactions
      • Functional Group Compatibility
      • Generating Rules
      Goal: Analysis of the reaction space
  • 17. ChemCodes - Reaction Optimization I
      • Goal: reaction optimization, > 95% yield
      • 7 dimensions: reagent, solvent, time, temperature, stoichiometry, reagent order, FG compatibility
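The optimization goal on slide 17 amounts to narrowing a multi-dimensional table down to the rows above a yield threshold, optionally restricted along other dimensions. A minimal sketch of such a filter follows; the record layout (dicts with a `yield` key plus keys named after the slide's dimensions) and the function `dynamic_filter` are assumptions for illustration, not the ChemCodes schema or the applet's filter tools.

```python
def dynamic_filter(records, min_yield=95.0, allowed=None, ranges=None):
    """Return the records that pass all active filters.

    records:   list of dicts, each with a numeric 'yield' in percent
    min_yield: keep only records with yield strictly above this value
    allowed:   dict field -> set of accepted categorical values (e.g. solvent)
    ranges:    dict field -> (low, high) inclusive numeric range (e.g. temperature)
    """
    allowed = allowed or {}
    ranges = ranges or {}
    hits = []
    for rec in records:
        if rec["yield"] <= min_yield:
            continue
        if any(rec[field] not in values for field, values in allowed.items()):
            continue
        if any(not (lo <= rec[field] <= hi) for field, (lo, hi) in ranges.items()):
            continue
        hits.append(rec)
    return hits
```

Calling it again with a changed `allowed` or `ranges` argument after each slider or checkbox change would give the interactive narrowing that a dynamic filter tool provides.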
  • 18. ChemCodes - Reaction Optimization II
  • 19. ChemCodes - Reaction Planning
      Functional Group Compatibility Check
      [Chemical structure drawing; only the atom labels H, H, NH, O survive]
  • 20. Example 2: NCI Anti-tumor / Anti-viral Database
      • Initiated in April 1990 (modified 1994)
      • ~250,000 compounds
      • ~30,000 with anti-tumor screening data
      Enhanced NCI Database Browser
      • > 30 different molecular properties
      • up to 23 3D conformers per compound
  • 21. Lead Compound Discovery II
  • 22. Lead Compound Discovery II
  • 23. Overview: Multi-variate and multi-dimensional datasets
      • Motivation
      • Information Visualization Techniques
      • Examples (ChemCodes Inc., NCI)
      • Demo
  • 24. Acknowledgment
      • Prof. Johann Gasteiger, Computer-Chemie-Centrum, University of Erlangen-Nuremberg
      • Prof. Thomas Ertl, Dipl. Inf. Klaus Engel, Visualization and Interactive Systems, University of Stuttgart
      • Dr. Patrick Kiser, Dr. Gary Eichenbaum, ChemCodes Inc.
      • Marc Nicklaus, Laboratory of Medicinal Chemistry, NCI, NIH
      • Deutsche Forschungsgemeinschaft
