(ATS3-GS03) Accelrys Enterprise
Platform Architecture Deeper Dive
 Jason Benedict
 Sr. Architect, Platform R&D
 jason.benedict@accelrys.com
The information on the roadmap and future software development efforts is
intended to outline general product direction and should not be relied on in making
a purchasing decision.
Agenda

•   Core services and security
•   Job launching
•   Process management
•   Latency and scalability data
•   Clustering methods
Accelrys Enterprise Platform
• Applications
  – Web Applications
  – Thick Client Applications
  – SharePoint & Office Applications
  – 3rd Party Applications
• Integration layer
  – Web Application Framework
  – Client Integration APIs
  – MS Office Integration
  – SOA Integration
• Accelrys Enterprise Platform
  – Scientific and Generic Services: Work Request, Reports, Experiment Workflow, Instrument Interfaces, Experiment Design, Scheduling, Virtual Chemistry, Data Mining & Analytics, Modeling & Simulation, Biology, Registration, Imaging, …
  – Data Management Services: ORACLE, Docs, Notebook Vault, Isentris, LIMS, LEA, Other
Accelrys Enterprise Platform Integration
• Client Integration: build clients that connect to Pipeline Pilot and run protocol services.
  – Clients: Professional Client, Run Protocol Command Line Client, Web Port (Web Browser), JavaScript Client (Web Browser), .NET Client, Java Client, SOAP Client
  – SDKs and APIs: JavaScript Client SDK, REST API, .NET Client SDK, Java Client SDK, Web Services API
• Pipeline Pilot Enterprise Server
  – Web Apps
  – Web Services API
  – Admin Portal
  – Help Portal
  – Grid System Integration (optional)
  – Protocol Runtime Environment (scisvr)
• Server Integration: extend pipelines with new components that integrate your code, data and services.
  – Integration components: VB Script (On Client), VB Script (On Server), Run Program, Java, Perl, Python, .NET, SOAP & HTTP, Telnet / FTP, SSH / SCP, ODBC / JDBC
  – Integration targets: VB Script, Cmd Line, Java Classes, Perl Scripts, .NET Classes, REST Service, SOAP Service, Cmd Line, DBs
Pipeline Server Architecture
• Apache HTTP Server
  – Authentication and Authorization Security Module
  – Mod_balancer, Mod_proxy
  – Locator Service, XMLDB Service, Runner Service, Logging Service, File Access Service, Protocol Web Services
  – runjob CGI, WSDL CGI
  – Admin Portal, Help Portal
• Multiplicities shown in the diagram: 1 .. 1 and 1 .. N
• Apache Tomcat
  – Query Service
  – Scheduler
• Data Flow Services
• File System
  – XMLDB
  – User Data
  – Job Data
• External systems: Corporate Directory, DB’s, SOA, CMS
Launching Asynchronous (polling) Jobs
Apache HTTP Server (Authentication and Authorization Security Module)

1. Create job directory with compressed protocol.xml and uploaded input files
2. Need to fork scisvr? Write lck file to lck directory
3. Runner Service: scisvr(.exe), with JVM and CLR, writes the sts file and results files to the job directory (Job Folder)
4. Poll job status via the sts file; monitor job existence via the lck file and process status
5. Read result file from disk and return it to the client through Apache
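
The polling steps above can be sketched in a few lines. This is a minimal sketch, not the Runner Service implementation: the status strings, the helper name `wait_for_job`, and the idea that the sts file holds a single status word are all assumptions made for illustration.

```python
import time
from pathlib import Path

# Hypothetical status strings; the real sts file format is not described in the slides.
FINAL_STATES = {"finished", "failed"}


def wait_for_job(sts_file, lck_file, poll_interval=0.5, timeout=30.0):
    """Poll a status file until the job ends, the worker vanishes, or time runs out."""
    sts_file, lck_file = Path(sts_file), Path(lck_file)
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        # Step 4a: poll job status via the sts file.
        status = sts_file.read_text().strip() if sts_file.exists() else ""
        if status in FINAL_STATES:
            return status
        # Step 4b: a missing lck file means no process is working on the job.
        if not lck_file.exists():
            return "lost"
        time.sleep(poll_interval)
    return "timeout"
```

A caller would then read the result file from the job folder only when the function returns `"finished"`.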
Launching Synchronous (blocking) Jobs
Apache HTTP Server (Authentication and Authorization Security Module)

1. XMLDB: get protocol XML from the Protocol DB
2. Need to fork scisvr? Write lck file to lck directory
3. Runner Service: connect to the scisvr pipe; send protocol XML and request parameters to scisvr(.exe) (JVM, CLR)
4. Send notification to Apache via the pipe when done
5. Stream results back to Apache on the pipe

Storage: XMLDB
Job Settings

• Set Max Running Jobs to 2x the available cores
• Set Blocking Job Timeout to between 10 and 30 seconds; longer values risk client starvation
• Maximum Number of Parallel Processing is a guideline, not a strict maximum. Set it to 2x cores
• Set Maximum Job Daemons Per Pool to 2x the available cores
• Job Readiness Refresh Rate helps multipurpose servers, which can become “cold”
• Read the application-specific recommendations for more details
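
The rules of thumb above reduce to simple arithmetic on the core count. The sketch below is only an illustration: the dictionary keys and the function name are hypothetical, not the Admin Portal setting identifiers, and the default timeout of 10 seconds is just the lower end of the recommended range.

```python
import os


def recommended_job_settings(cores=None, blocking_timeout_s=10):
    """Derive the rule-of-thumb values from the number of available cores."""
    if cores is None:
        cores = os.cpu_count() or 1
    # The slide recommends 10-30 seconds and warns against going higher.
    if not 10 <= blocking_timeout_s <= 30:
        raise ValueError("blocking timeout should stay within 10-30 seconds")
    return {
        "max_running_jobs": 2 * cores,
        "max_parallel_processing": 2 * cores,
        "max_job_daemons_per_pool": 2 * cores,
        "blocking_job_timeout_s": blocking_timeout_s,
    }
```

On an 8-core server this gives 16 for each of the three job limits.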
Process Management - Pools
– Identified by the __poolid=<name> parameter on the request
    – Needs to be sent from the client, not from the saved protocol
– Latency of 20-200 ms
– Creates a pool of scisvr.exe processes dedicated to that pool
– Enables caching of expensive resources:
    • JVM
    • CLR
    • Database connections
    • Protocol DB shortcuts and references
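
Because the pool is selected by a request parameter, a client can opt in by adding `__poolid` to the request it sends. The sketch below only shows the query-string handling; the helper name `with_pool`, the server URL, and the other parameter in the usage note are hypothetical.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def with_pool(url, pool_name):
    """Return the URL with a __poolid query parameter added or replaced."""
    parts = urlsplit(url)
    # Drop any existing __poolid so the request names exactly one pool.
    query = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key != "__poolid"
    ]
    query.append(("__poolid", pool_name))
    return urlunsplit(parts._replace(query=urlencode(query)))
```

For example, `with_pool("http://server.example/run?protocol=Fetch", "chem")` returns `http://server.example/run?protocol=Fetch&__poolid=chem`.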
Process Management – Pools w/ Impersonation

– Impersonation creates a small pool for each user for each pool
– Lower the pool sizes to accommodate this behavior
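
A rough way to see why pool sizes need lowering is to multiply the factors out. This is a back-of-the-envelope sketch that assumes the worst case, where every user touches every pool and each per-user pool grows to its full size limit; the function names are hypothetical.

```python
def worst_case_processes(pools, users, max_servers_per_pool):
    """Upper bound on pooled processes when each user gets a pool per pool id."""
    return pools * users * max_servers_per_pool


def max_servers_for_budget(process_budget, pools, users):
    """Largest per-pool size that keeps the worst case within a process budget."""
    return max(1, process_budget // (pools * users))
```

With 3 pools, 20 users and a pool size of 16, the worst case is 960 processes; keeping it under 120 processes means a pool size of 2.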
Scisvr Pool Settings – Config files
Setting                       Default   Description
Start Servers                 0         Number of initial processes in this pool, created when Apache starts
Min Spare Servers             1         Minimum number of idle processes to keep alive
Max Spare Servers             1         Maximum number of “available” processes to keep alive
Max Spare Servers Trim Time   0         Time to wait (seconds) before pruning “available” servers exceeding the Max Spare Servers value
Max Servers                   16        Total number of servers to allow for this pool
Max Queue Depth               32        Maximum number of jobs to queue before rejecting; can be 0 or -1 for infinite
Max Requests Per Server       -1        Maximum number of requests to handle in a single server before exiting; -1 is infinite
Time to Live                  300       Idle timeout (seconds) for a pooled server to live
Warm-up Protocol                        Path to the initial protocol to run
Memory Threshold              80        Maximum % of physical memory used by all processes before pruning
Individual Usage Threshold    15        Maximum % of physical memory used by one process before pruning
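
To make the interaction between Max Servers and Max Queue Depth concrete, the table can be modeled as a small data structure. This is a sketch under stated assumptions: the field names are hypothetical (they are not the real config file keys), and the `admit` decision order is one plausible reading of the descriptions, not the documented scheduler logic.

```python
from dataclasses import dataclass


@dataclass
class PoolSettings:
    """Defaults copied from the table; field names are illustrative only."""
    start_servers: int = 0
    min_spare_servers: int = 1
    max_spare_servers: int = 1
    max_spare_servers_trim_time: int = 0   # seconds
    max_servers: int = 16
    max_queue_depth: int = 32              # 0 or -1 means infinite
    max_requests_per_server: int = -1      # -1 means infinite
    time_to_live: int = 300                # seconds
    warm_up_protocol: str = ""
    memory_threshold: int = 80             # % of physical memory, all processes
    individual_usage_threshold: int = 15   # % of physical memory, one process


def admit(settings, idle, busy, queued):
    """Decide what happens to a new request (assumed ordering)."""
    if idle > 0:
        return "reuse"                     # an idle pooled process is available
    if idle + busy < settings.max_servers:
        return "spawn"                     # room to start another process
    if settings.max_queue_depth in (0, -1) or queued < settings.max_queue_depth:
        return "queue"                     # wait for a process to free up
    return "reject"                        # pool and queue are both full
```

With the defaults, a request arriving when 16 processes are busy and 32 jobs are already queued would be rejected.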
Web Job Launch Scalability Improvements
   Framework overhead on blocking, pooled jobs on 8 core Windows 2008 R2 (64 bit)
Web Job Launch Scalability Improvements Linux
For a simple chemistry fetch of 10 records to JSON on 8 core RedHat Linux ES5 (64 bit)

Identical tests on Windows 2008 R2 on identical hardware
Performance Tuning Document

• Guide available on Accelrys Forums
   – http://doc.accelrys.com/library/PipelinePilot/doc/performance_tuning.pdf
Public Cluster


Diagram elements:
• Clients (users), with Login and Execute connections
• Primary Pipeline Pilot Server
• Secondary Pipeline Pilot Servers
• NFS between the primary and secondary servers
Private Cluster



Diagram elements:
• Clients (users), with Login and Execute connections
• Primary Pipeline Pilot Server
• Secondary Pipeline Pilot Servers
• NFS between the primary and secondary servers
Grid (SGE, PBS, LSF, other)


Diagram elements:
• Clients (users), with Login and Execute connections
• Primary Pipeline Pilot server and grid submission server
• Grid software and SOAP
• Grid nodes: do not require Apache HTTPD
• NFS between the primary server and the grid nodes
IP-based Load Balancing 1


Diagram elements:
• Clients (users), with Login and Execute connections
• Reverse proxy or IP-based load balancer
• Symmetrical Pipeline Pilot server nodes
• Shared storage (file share): XMLDB, Job Folders, User Folders
Summary

• What we learned
  –   Apache service and launching system
  –   Job launching and settings
  –   Process management for pooling
  –   How pooling has improved latency (snappiness)
  –   Clustering and grids
For more information on the Accelrys Tech Summits and other IT & Developer
information, please visit:
https://community.accelrys.com/groups/it-dev
