(ATS6-PLAT06) Maximizing AEP Performance
Steven Bush
R&D, AEP Core Infrastructure
steven.bush@accelrys.com
The information on the roadmap and future software development efforts is
intended to outline general product direction and should not be relied on in making
a purchasing decision.
Content
• Tuning for different types of protocols
• Quick protocols
– Protocol Job Pooling
• Using PoolIDs
• Database connection pooling
• Long protocols
– Profiling protocols
– Tuning parallel subprotocols
– Disk I/O
• Server specifications
– General guidelines
– Cluster, Grid, and Load balancing
• When is it right and how do you choose?
Short Running: General Guidelines
• Job Pooling and blocking requests
– Use Database connection sharing
• Report templates
– “HTML Template” or “Pilotscript” components
– Much faster
– Harder to maintain
– Ideal for reports that rarely change
• Pilotscript is faster than Java, which is faster than Perl
• Minimize disk I/O
• Hash map values instead of “Join Data From …” (see the sketch after this list)
– Use Cache Persistence mode in “SQL Select for Each Data”
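The hash-map advice above is a general pattern: load the lookup table once, keyed by the join field, rather than joining or re-querying per record. A minimal Java sketch of the idea (the compound IDs and values are purely illustrative, not Pipeline Pilot components):

  import java.util.HashMap;
  import java.util.List;
  import java.util.Map;

  public class LookupSketch {
      public static void main(String[] args) {
          // Load the reference data once, keyed by the join field
          // (analogous to caching the SQL result), instead of
          // re-querying or re-scanning it for every record.
          Map<String, Double> activityById = new HashMap<>();
          activityById.put("CPD-1", 6.2);
          activityById.put("CPD-2", 7.9);

          // Per-record processing then becomes a constant-time lookup.
          for (String id : List.of("CPD-2", "CPD-1", "CPD-3")) {
              Double activity = activityById.get(id);  // null when there is no match
              System.out.println(id + " -> " + activity);
          }
      }
  }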
Job Pooling
• Each job execution occurs in a single scisvr process
– Isolated memory
– One bad protocol cannot crash the server
• Without job pooling, each job spawns a new process
• With job pooling, jobs with the same pool ID can reuse
idle processes
Job Pooling Performance
• Prevent reloading system files and configuration data
• Reuse allocated memory
• Skip initialization
• Fast running protocols see substantial improvement
• Longer protocols do not see much improvement
Job Pooling Performance (benchmark charts)
• Fast running protocol (0.1 seconds): 16 simultaneous clients against an 8-core laptop
• Longer running protocol (20 seconds): 16 simultaneous clients against an 8-core laptop
• Zoomed: longer running protocol (20 seconds): 16 simultaneous clients against an 8-core laptop
Job Pooling Disadvantages
• Some components may not reinitialize correctly
– Can be difficult to track down these errors
• Stale resources can cause subsequent protocol failure
– Example: persistent DB connections that have timed out at the DB
• Ties up memory resources
– The AEP server manages this and will shut down job pools when memory
resources begin to get low
• Can tie up 3rd party licenses if they are not properly released
• Harder to gauge how much memory is actually being used
• Not as useful for Windows servers with “full” impersonation
Job Pooling Memory Limits
• Under heavy memory usage, pooled processes will shut
down
– 80% total RAM usage
– 15% total RAM usage for an individual process
– Example: A server has 8 GB of RAM
• Idle pooled processes will shut down when RAM usage reaches 6.4 GB
• If an individual idle process reaches 1.2 GB, it will shut down
Debugging
• http://<server>:<port>/scitegic/managepools?action=debug
– Shows each pool by ID.
• Configuration
• Processes that belong to the pool
– PID
– Owner (impersonation only)
– Number of times the server has executed jobs (including warm ups)
– State
• Queue
– Apache Process/Threads that are waiting for a server in this pool
Using Job Pooling From Clients
• 9.0:
– Set the __poolID parameter on the Implementation tab of
the top level protocol
– Share the same __poolID with related protocols
Using Job Pooling From Clients
• 8.5
– Pro Client
• Automatic based on jobID
– Create Protocol Link…
• Add __poolID as a parameter to your URL (see the Java sketch after this list)
– http://<server>:<port>/auth/launchjob?_protocol=ABC&__poolID=MyPool
– Reporting Forms
• Add __poolID using “Hidden Form Data”
– Protocol Function
• use “Application ID” or “Pool ID” parameters
– Web Port and Reporting Protocol Links
• Add __poolID as a parameter to your protocol
– Client SDKs
• Pass in __poolID as a parameter when you call the LaunchXXX() methods
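The launchjob URL above can be called from any HTTP client. A minimal Java sketch, assuming the server accepts HTTP basic authentication; the host, port, and credentials are placeholders to substitute with your own:

  import java.net.URI;
  import java.net.http.HttpClient;
  import java.net.http.HttpRequest;
  import java.net.http.HttpResponse;
  import java.util.Base64;

  public class LaunchPooledJob {
      public static void main(String[] args) throws Exception {
          String server = "aepserver.example.com";  // placeholder host
          int port = 9944;                          // placeholder port
          String credentials = Base64.getEncoder()
                  .encodeToString("user:password".getBytes());  // placeholder credentials

          // Launch protocol "ABC" into the job pool "MyPool", as in the URL on the slide.
          URI uri = URI.create("http://" + server + ":" + port
                  + "/auth/launchjob?_protocol=ABC&__poolID=MyPool");

          HttpRequest request = HttpRequest.newBuilder(uri)
                  .header("Authorization", "Basic " + credentials)  // assumes basic auth is enabled
                  .GET()
                  .build();

          HttpResponse<String> response = HttpClient.newHttpClient()
                  .send(request, HttpResponse.BodyHandlers.ofString());
          System.out.println(response.statusCode());
          System.out.println(response.body());
      }
  }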
Database Connection Sharing
• Connection Timeout
– Keeps the connection open while scisvr is idle
– Supported by ODBC and JDBC data sources
Report Templates
• Web applications should consider using templates.
– HTML Template component
• Uses the Velocity template engine (see the sketch after this list)
– Pilotscript text processing
• Extremely fast
• Good for reports that rarely change format
– Faster, but harder to maintain
– Difficult to handle images
• Typical timings:
– Table component and Viewer: 1.5 seconds
– HTML Template and Viewer: 0.7 seconds
– Pilotscript text manipulation: 0.05 seconds
• Use the reporting collection to create the original report, then view the source
and convert to a template
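Because the HTML Template component consumes Velocity markup, the same syntax can be prototyped outside AEP with the Velocity library itself. A minimal, hypothetical Java sketch (the engine setup and field names are illustrative, not the component's own API; it needs the Apache Velocity dependency on the classpath):

  import java.io.StringWriter;
  import java.util.List;
  import org.apache.velocity.VelocityContext;
  import org.apache.velocity.app.VelocityEngine;

  public class TemplateSketch {
      public static void main(String[] args) {
          VelocityEngine engine = new VelocityEngine();
          engine.init();

          // Values that would normally come from the data records feeding the report.
          VelocityContext context = new VelocityContext();
          context.put("title", "Assay Results");
          context.put("rows", List.of("CPD-1", "CPD-2"));

          // The template itself: plain HTML with $placeholders and #foreach loops.
          String template = "<h1>$title</h1><ul>#foreach($r in $rows)<li>$r</li>#end</ul>";

          StringWriter out = new StringWriter();
          engine.evaluate(context, out, "report", template);
          System.out.println(out);
      }
  }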
Long Running: General Guidelines
• Profile protocols for bottlenecks using Ctrl-T timings
• Disk I/O Performance
– Consider improving network disk I/O
– Minimize large scale disk I/O
• Use parallel subprotocols to speed up slow sections
– Offload large calculations to additional servers
– Make use of clusters and grids to spread out processing
• Make remote requests asynchronous or batched when possible
• Download large datasets and process locally
• Create custom readers to minimize excess data reading
– Don’t read 100,000 records only to use the first 10 (see the sketch after this list)
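A generic illustration of reading only what you need, in plain Java (not an AEP reader component): stream the source lazily and stop once the required records have been read. The file name is a placeholder.

  import java.io.IOException;
  import java.nio.file.Files;
  import java.nio.file.Path;
  import java.util.List;
  import java.util.stream.Collectors;
  import java.util.stream.Stream;

  public class HeadReader {
      public static void main(String[] args) throws IOException {
          Path input = Path.of("records.txt");  // placeholder input file

          // Read lazily and stop after the first 10 lines, rather than
          // loading 100,000 records and discarding most of them.
          try (Stream<String> lines = Files.lines(input)) {
              List<String> firstTen = lines.limit(10).collect(Collectors.toList());
              firstTen.forEach(System.out::println);
          }
      }
  }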
Component Performance Timings
• Displays either percentage or total time for each
component.
– Subprotocols display total time of internal components plus
overhead
• Press Control-T or Right-Click->Show Process Times
• Useful to track down bottlenecks
• Times are relatively accurate, but not exact
– Timings on Linux in particular are susceptible to discrepancies
Disk I/O
• The performance of your disk I/O has a huge impact
• Linux: Consider switching from NFS to IBM’s GPFS
– Much more scalable
– Much faster
• Minimize large disk read/writes.
Parallel Subprotocols
• Allow parallel execution across multiple CPUs and multiple servers or
cluster/grid nodes
• Work by batching incoming data records and sending out to server list for
processing
• General guidelines:
– Each batch should take a minimum of 10 seconds to see a performance benefit; the longer, the better
– Overhead: 1-3 seconds per batch
• Serializing input and output data records
• Launching
• Polling for completion
– 2 processes per CPU as a starting point
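As a rough worked example of these guidelines: on an 8-CPU node you might start with 16 parallel processes, and if each record takes about 0.2 seconds to process, batches of roughly 100 records (about 20 seconds of work each) comfortably amortize the 1-3 seconds of per-batch overhead, whereas 10-record batches would spend a large fraction of their runtime on serialization and launch overhead.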
Parallel Subprotocol Mechanism
• Modifies and launches the “Parallel Subprotocol Template”
• Input data records are serialized, then shipped to remote
server
• Data is deserialized, processed, then serialized again
• Shipped back to original server and deserialized
• 4 Cache read/write events!
– Avoid sending large data records
– Consider sending file references instead
– For example, this is the approach taken with the Imaging collection
Parallel Subprotocol Debugging
• Most remote errors are swallowed up
• Look in <root>/logs/messages/scitegicerror_scisvr.log of
the remote server to see error stacks
• Run with “Debugging” option
– Use Shift + left-click or Shift-F5
– Debugging messages will show errors and status from the
subprotocol batches
Server Guidelines
• Predict and analyze your usage
– Type of application
– Number of simultaneous users
• Good starting point
– 2 active jobs per CPU
– RAM: Minimum 1 GB per active job + 2 GB for system processes
– Local disk for temporary files
– GPFS instead of NFS
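For example, under these guidelines an 8-CPU server sized for 16 active jobs (2 per CPU) would want at least 16 × 1 GB + 2 GB = 18 GB of RAM, plus local disk for temporary files.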
Deployment Options
• Single Server
– Multiple CPUs
– Ideal for most applications
• Cluster (Linux)
– Distributes individual protocols to remote nodes
– Simple grid
– Ideal for ad-hoc analysis servers that occasionally require heavy processing
• Slower launch times than a single server
• Better data processing scalability
• Grid (Linux)
– Queues individual protocols via 3rd party grid software
– Tested on OGE, PBS, and LSF; a custom option is available
– Ideal for large scale processing with very long application run times
• Slowest launch times
• Best data processing scalability
Deployment Options
• Load Balanced (Windows and Linux)
– Multiple identical single servers behind a 3rd party HTTP proxy
– Each individual request is distributed
– Protocol DB is READ-ONLY
• All changes are made through packages
– Parallel subprotocols do NOT distribute across nodes
– Ideal for canned applications that have large numbers of users
• Launch times are comparable to single server
• High scalability and high availability
• NOT useful as an ad-hoc server
• Cannot be used to build models (due to read-only Protocol DB)
Summary
• Optimization of protocol performance is application dependent
• For fast running protocols
– Look at Job Pooling and Report Templates
– Avoid checkpoints and caches
• For long running protocols
– Use component timings to profile
– Parallelize whenever possible
– Make remote requests batched or asynchronous
– Configure Disk I/O for maximum performance
• Deployment options for different applications