The new JChem Microservices uses latest-generation search technology that can handle large datasets. A microservice architecture is modular: instead of one monolithic application, it consists of smaller modules with specific functionality. JChem Microservices provides small, separate modules for different areas of ChemAxon functionality, such as chemical dataset searching, conversion between chemical file formats, and chemical property calculation. This setup is scalable, easily manageable, and cloud-agnostic. In this webcast, we will show you how to set up a highly available architecture using Microservices and demonstrate an example using Enamine search as a service.
6. The Three Little Pigs
● Gateway:
Routes requests to APIs; also supports predicates and filters
● Discovery:
Instances can register, and clients can discover them
● Config:
Resource-based API for external configuration
Not ChemAxon specific
[Diagram labels: Config server, Discovery, Gateway, Chemical intelligence, User frontier, REST call]
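The interplay of discovery and gateway described on this slide can be illustrated with a toy model. This is only a sketch of the general pattern, not the implementation used by JChem Microservices: the class names, the service name `db`, the host names, and the port are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Registry:
    """Toy discovery service: instances register, clients look them up."""
    instances: Dict[str, List[str]] = field(default_factory=dict)

    def register(self, service: str, url: str) -> None:
        self.instances.setdefault(service, []).append(url)

    def discover(self, service: str) -> List[str]:
        return list(self.instances.get(service, []))


@dataclass
class Route:
    predicate: Callable[[str], bool]   # decides whether this route applies
    service: str                       # logical service name to look up
    path_filter: Callable[[str], str]  # rewrites the path before forwarding


class Gateway:
    """Toy gateway: picks a route by predicate, then an instance round-robin."""

    def __init__(self, registry: Registry, routes: List[Route]) -> None:
        self.registry = registry
        self.routes = routes
        self._cursor: Dict[str, int] = {}

    def resolve(self, path: str) -> str:
        for route in self.routes:
            if route.predicate(path):
                urls = self.registry.discover(route.service)
                if not urls:
                    raise LookupError(f"no instance registered for {route.service}")
                i = self._cursor.get(route.service, 0)
                self._cursor[route.service] = i + 1
                return urls[i % len(urls)] + route.path_filter(path)
        raise LookupError(f"no route matches {path}")


# Hypothetical setup: two instances of a search service behind one gateway.
registry = Registry()
registry.register("db", "http://db-1:8062")
registry.register("db", "http://db-2:8062")
gateway = Gateway(registry, [
    Route(lambda p: p.startswith("/db/"), "db", lambda p: p[len("/db"):]),
])
first = gateway.resolve("/db/search")
second = gateway.resolve("/db/search")
```

Because the gateway asks the registry on every call, a newly registered instance starts receiving traffic without any change to the client, which is the property a highly available setup relies on.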
11. Goal
Use the Enamine REAL molecule set, containing 720 million structures, for substructure and similarity search
12. Setup
JChem Microservices DB is used for chemical search
Memory requirement: 70577 MB RAM
AWS EC2 r5.4xlarge: 16 vCPU, 128 GB RAM
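As a rough sanity check of these figures, the memory requirement can be related to the 720 million structures from the Goal slide. The sketch below assumes that MB and GB on the slide are binary units (MiB and GiB); the variable names are illustrative only.

```python
STRUCTURES = 720_000_000   # Enamine REAL set size from the Goal slide
REQUIRED_MB = 70_577       # stated memory requirement
INSTANCE_RAM_GB = 128      # r5.4xlarge memory

# Assumption: "MB" means MiB (1024 * 1024 bytes).
required_bytes = REQUIRED_MB * 1024 * 1024
bytes_per_structure = required_bytes / STRUCTURES

# Assumption: "GB" means GiB, so 1 GB = 1024 MB.
required_gb = REQUIRED_MB / 1024
headroom_gb = INSTANCE_RAM_GB - required_gb

print(f"{bytes_per_structure:.1f} bytes per structure")
print(f"{required_gb:.1f} GB required, {headroom_gb:.1f} GB headroom")
```

Under these assumptions the search data takes roughly 100 bytes per structure and a bit more than half of the instance memory, leaving the rest for the operating system and other processes.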
13. Import data
Convert SMILES CSV to JSON
Enamine_REAL_database_smiles.zip: 19 GB
Parallel conversion using 1M chunks, finally ~90 GB (finished in a day)
Import JSON
Took 3 d 9 h
Start service
Cache load: ~5 min
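The chunked conversion step could look roughly like the sketch below. It is not the script used in the webcast. It assumes that "1M chunks" means one million rows per chunk, that the CSV has a header line with the SMILES in the first column and an identifier in the second, and that the JSON records use the hypothetical keys `id` and `structure`; the actual import format should be taken from the JChem Microservices DB documentation.

```python
import csv
import json
from concurrent.futures import ThreadPoolExecutor
from itertools import islice
from pathlib import Path
from typing import Iterable, Iterator, List


def chunks(rows: Iterable[List[str]], size: int) -> Iterator[List[List[str]]]:
    """Yield consecutive batches of at most `size` rows."""
    it = iter(rows)
    while True:
        batch = list(islice(it, size))
        if not batch:
            return
        yield batch


def write_chunk(index: int, batch: List[List[str]], out_dir: Path) -> Path:
    """Write one batch as a JSON array (hypothetical record layout)."""
    records = [{"id": row[1], "structure": row[0]} for row in batch]
    out = out_dir / f"chunk_{index:05d}.json"
    out.write_text(json.dumps(records))
    return out


def convert(csv_path: Path, out_dir: Path,
            chunk_size: int = 1_000_000, workers: int = 4) -> List[Path]:
    """Split the CSV into chunks and write them to JSON files in parallel."""
    out_dir.mkdir(parents=True, exist_ok=True)
    with csv_path.open(newline="") as fh, \
            ThreadPoolExecutor(max_workers=workers) as pool:
        reader = csv.reader(fh)
        next(reader, None)  # assumption: the first line is a header
        futures = [pool.submit(write_chunk, i, batch, out_dir)
                   for i, batch in enumerate(chunks(reader, chunk_size))]
        return [f.result() for f in futures]
```

For a file of this size, a real run would also need to limit how many chunks are held in memory at once, since this sketch submits batches as fast as they can be read.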
35. Search options
• Decoration-free query structures and similarity search (Tanimoto > 0.7) capture a large pool of options
• Some substructures match large sets of compounds
• Ordered retrieval is beneficial
• Fast retrieval is essential
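The combination of a Tanimoto cutoff with ordered, capped retrieval can be sketched as follows. This is a plain in-memory illustration, not how JChem Microservices DB performs the search: fingerprints are modelled as sets of on-bit indices, and the function names, the `max_hits` parameter, and the example data are hypothetical.

```python
import heapq
from typing import Dict, FrozenSet, List, Tuple


def tanimoto(a: FrozenSet[int], b: FrozenSet[int]) -> float:
    """Tanimoto coefficient: shared on-bits divided by all on-bits."""
    if not a and not b:
        return 0.0
    shared = len(a & b)
    return shared / (len(a) + len(b) - shared)


def similarity_search(query: FrozenSet[int],
                      library: Dict[str, FrozenSet[int]],
                      cutoff: float = 0.7,
                      max_hits: int = 200) -> List[Tuple[float, str]]:
    """Return up to `max_hits` (score, name) pairs above the cutoff, best first."""
    scored = ((tanimoto(query, fp), name) for name, fp in library.items())
    hits = ((score, name) for score, name in scored if score > cutoff)
    return heapq.nlargest(max_hits, hits)


# Hypothetical toy fingerprints.
query = frozenset({1, 2, 3, 4})
library = {
    "a": frozenset({1, 2, 3, 4}),      # identical to the query
    "b": frozenset({1, 2, 3, 4, 5}),   # 4 shared of 5 on-bits
    "c": frozenset({1, 2, 9}),         # 2 shared of 5 on-bits
}
results = similarity_search(query, library)
```

Ordering by score before applying the hit limit means that a small `max_hits` still returns the most similar compounds, which matters when a query matches a large share of the dataset.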
37. Enamine REAL search performance in Design Hub
• 50 benchmark compounds from IRAK4 and Pan-Janus related literature
• Substructure searches ordered by similarity to query
• Similarity search cutoff: > 0.7 Tanimoto
max hits    substructure    similarity
200         3.36 s          3.79 s
2000        3.53 s          3.87 s
20000       4.30 s          4.27 s