All contents © MuleSoft Inc.
Rupesh Ramachandran
Roy Prins
Performance Tuning
What we’ll cover today…
• What does Mule runtime performance look like
• Why does it perform so well
• How do you tune for high performance
• Testing Process
• Quick Wins
• Fine Tuning
Mule runtime performance
• Real-world customer examples
• Financial institution
• ~100 million transactions per day
• On-premise
• Elastic scale
• Large fast food chain
• ~40 million transactions per day
• CloudHub
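To put the daily volumes above in per-second terms, here is a quick conversion. It assumes traffic is spread evenly over 24 hours, which real workloads rarely are, so the results are averages and say nothing about peak rates.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400


def average_tps(transactions_per_day: float) -> float:
    """Average transactions per second, assuming an even spread over the day."""
    return transactions_per_day / SECONDS_PER_DAY


if __name__ == "__main__":
    for label, per_day in [("Financial institution", 100_000_000),
                           ("Large fast food chain", 40_000_000)]:
        print(f"{label}: ~{average_tps(per_day):,.0f} TPS on average")
```

Peak-hour traffic can be several times the average, so a load test would normally target a rate well above these figures.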
Performance characteristics
Stability under concurrency is very important for APIs
Performance characteristics
Batch jobs and bulk loads are still relevant
Scalability
Why does it perform this well (technology)
• Non-blocking architecture
• Plus many others
• Streaming
• Log4j2
• Optimized XPath, JMS, Jetty
• Kryo serialization
• etc.
Why does it perform this well (people, process)
World-class performance engineering team
• Regression tests for every release
• Realistic customer projects and production traffic
• Profile and optimize new capabilities
• Pass on lessons learnt
• Performance Tuning Guide
• Optimized defaults for tunables
Tuning Mule Runtime
Define Performance Goals
•Memory Footprint
• Large payload steady state
• Less important on 64-bit
• Prevent swapping
•Startup Time
• Somewhat important (e.g. production)
• Microservices, containerization requiring elastic scale up/down
•Throughput
• Transactions processed over time (e.g. TPS)
• Almost always important
•Responsiveness
• Latency, round trip time, user wait time, etc.
• Very important for APIs
• Common mistake: using single-message latency as an indication of machine performance; it does not account for concurrency overheads
You can only pick two: minimize memory footprint, minimize latency, minimize overhead
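The common mistake noted above can be made concrete with Little's Law (throughput = concurrency / latency). The sketch below uses hypothetical numbers, not measurements from the talk. Extrapolating from a single-message latency assumes latency stays flat as concurrency rises, which overstates throughput once contention sets in.

```python
def throughput_tps(concurrency: int, latency_seconds: float) -> float:
    """Little's Law: requests in flight divided by time each request spends in the system."""
    return concurrency / latency_seconds


# Hypothetical figures for illustration only.
single_message_latency = 0.010   # 10 ms, measured with one request in flight
latency_at_100_users = 0.040     # 40 ms, measured with 100 concurrent requests

naive_estimate = throughput_tps(100, single_message_latency)  # assumes no contention
under_load = throughput_tps(100, latency_at_100_users)        # uses the loaded latency

print(f"naive estimate: {naive_estimate:.0f} TPS, under load: {under_load:.0f} TPS")
```

In this made-up case the naive estimate is four times too high, which is why latency should be reported together with the concurrency level at which it was measured.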
Performance Tuning Process
•Repeatability - use a Controlled Environment
•High performance test driver
•External dependencies
•Mock if susceptible to outside constraints
•Keep CPU utilization under 75%
•Find the inflection point
•Tune one parameter at a time
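One way to "find the inflection point" is to step up concurrency and look for the level after which throughput stops growing. The sketch below is an assumption about how that could be automated; the 10% gain threshold and the sample data are hypothetical, not values from the talk.

```python
from typing import List, Optional, Tuple


def find_inflection(samples: List[Tuple[int, float]],
                    min_gain: float = 0.10) -> Optional[int]:
    """Return the first concurrency level after which throughput stops scaling.

    samples: (concurrency, throughput) pairs sorted by concurrency.
    min_gain: smallest relative throughput gain still counted as scaling.
    """
    for (users, tps), (_, next_tps) in zip(samples, samples[1:]):
        if tps > 0 and (next_tps - tps) / tps < min_gain:
            return users
    return None  # throughput was still scaling at the highest level tested


# Hypothetical load-test results: (concurrent users, transactions per second)
results = [(10, 950.0), (20, 1850.0), (40, 3400.0), (80, 3550.0), (160, 3300.0)]
print(find_inflection(results))
```

With this data the function reports 40 users, since doubling to 80 adds under 5% throughput. Changing one tunable and re-running the same sweep shows whether the inflection point moved.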
Performance Tunables
Mule Performance Tunables - Overview
•Mule Design time tuning
•Mule flows
•Mule configuration settings
•Mule Runtime tuning
•JVM
•GC
•Infrastructure tuning
•OS
•Network
Performance Test Lifecycle
• Design Mule Flows
• Functional Tests
• Tune test infra and tools
• Take baseline
• Iterative Test, Monitor, Profile
–Tune test infra
–Tune backend
–Mule flow/config
–JVM/GC
• Scalability Tests
• Longevity Tests
• Final Report
Quick Wins
80/20 Rule
•Runtime tuning
–JVM
• Xmx, Xms
–GC
• CMS for APIs
–On large machines (many cores, lots of RAM), consider running more than one Mule instance per machine
•Infrastructure tuning
–ulimit -n
–Raise the file descriptor limit for high socket connection counts
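As a rough check of the file descriptor point above, the sketch below reads the limits that `ulimit -n` reports and compares the soft limit with an expected connection count. The `has_fd_headroom` helper and its reserve of 256 descriptors are hypothetical, and the limits are those of the process running the script, which may differ from the limits of the Mule process.

```python
import resource  # Unix-only standard library module


def has_fd_headroom(soft_limit: int, expected_connections: int,
                    reserve: int = 256) -> bool:
    """True if the soft limit covers the expected sockets plus a reserve for
    files, logs, and JARs the JVM keeps open (the reserve is a guess)."""
    if soft_limit == resource.RLIM_INFINITY:
        return True
    return soft_limit >= expected_connections + reserve


# Soft and hard limits, as shown by `ulimit -n` and `ulimit -Hn` in the same shell.
soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
print(f"soft={soft} hard={hard} ok_for_10k={has_fd_headroom(soft, 10_000)}")
```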
wrapper.java.additional.16=-XX:NewSize=1536m
wrapper.java.additional.17=-XX:MaxNewSize=1536m
wrapper.java.additional.20=-Xms2048m
wrapper.java.additional.21=-Xmx2048m
# GC Tuning
wrapper.java.additional.22=-XX:+UseConcMarkSweepGC
wrapper.java.additional.23=-XX:CMSInitiatingOccupancyFraction=65
# Replace xz with the next unused numeric index
wrapper.java.additional.xz=-XX:+UseCMSInitiatingOccupancyOnly
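JVM options in wrapper.conf are easy to mistype, and HotSpot refuses to start on a malformed option. The following is a minimal sketch of a checker for entries like those above. The `check_line` function is hypothetical and not part of Mule or the Java Service Wrapper, and it covers only three mistakes: a non-numeric index, an `=` after `-Xms`/`-Xmx`/`-Xmn`, and a boolean `-XX` flag without `+` or `-`.

```python
import re
from typing import List

ENTRY = re.compile(r"^wrapper\.java\.additional\.([^=]+)=(.*)$")


def check_line(line: str) -> List[str]:
    """Return a list of problems found in one wrapper.conf line (empty if none)."""
    match = ENTRY.match(line.strip())
    if not match:
        return []  # comments, blanks, and other properties are not checked
    index, option = match.groups()
    problems = []
    if not index.isdigit():
        problems.append(f"index '{index}' is not a number")
    if re.match(r"^-Xm[sxn]=", option):
        problems.append("heap sizes take no '=': write -Xmx2048m, not -Xmx=2048m")
    # -XX options are boolean (-XX:+Name / -XX:-Name) or valued (-XX:Name=value).
    if option.startswith("-XX:") and not re.match(r"^-XX:([+-]\w+|\w+=.+)$", option):
        problems.append("boolean -XX flags need '+' or '-', valued ones need '=value'")
    return problems


for sample in ["wrapper.java.additional.21=-Xmx=2048m",
               "wrapper.java.additional.22=-XX:+UseConcMarkSweepGC"]:
    print(sample, "->", check_line(sample) or "ok")
```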
Fine tuning Mule Runtime
Design time tuning – Mule Flows
•Session Variables
• Fewer
• Smaller
•Payload Formats
• Java objects fastest
•Data Extraction
• MEL preferred to scripting languages
•Message Transformers
• DataWeave preferred option
• XSLT
• Java
Design time tuning – Mule Flows
•Integration Patterns and performance implications
• Scatter-Gather
–Two or more independent data operations with a single source
–Results of the operations are combined
• Async scope
• Cache scope
• Transactional scope (cost)
• Stateful components (resequencer, idempotent, etc)
• Batch Module for bulk loads
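The scatter-gather pattern described above can be illustrated outside Mule. The Python sketch below shows only the idea (one message, several independent operations run concurrently, results combined) and does not use or mirror Mule's own implementation; the two lookup functions are hypothetical stand-ins for backend calls.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List, Sequence


def scatter_gather(message: str,
                   routes: Sequence[Callable[[str], str]]) -> List[str]:
    """Send the same message to every route concurrently and collect the
    results in route order. Requires at least one route; an exception raised
    by any route propagates to the caller."""
    with ThreadPoolExecutor(max_workers=len(routes)) as pool:
        futures = [pool.submit(route, message) for route in routes]
        return [future.result() for future in futures]


# Hypothetical independent lookups standing in for two backend calls.
def lookup_price(item: str) -> str:
    return f"price({item})"


def lookup_stock(item: str) -> str:
    return f"stock({item})"


print(scatter_gather("sku-42", [lookup_price, lookup_stock]))
```

The performance benefit comes from overlapping the waits: total time approaches that of the slowest route, where running the routes one after another would take the sum of all of them.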
Design time tuning – Mule configurations
•HTTP Connections
• Don’t use the Jetty connector
• Use the latest runtime for best NIO performance
• NIO acceptor threads hand off to worker threads
• Acceptor threads default to the number of cores, which works well
• To tune worker threads, adjust maxThreadsActive on the HTTP listener (default 128)
•HTTP Keep-alive
• On by default for HTTP outbound
• Test driver or HTTP clients should use keep-alive as well
•HTTPS
• Use the latest TLS version available (TLS 1.1 or later)
Design time tuning – Mule configurations
•Messaging
• Producer Mule flows
• JMS Connection Pooling (sessionCacheSize param)
• Consumer Mule flows
• JMS – ‘numberOfConsumers’
• AMQP – ‘numberOfChannels’
• JMS Server
• Disable message persistence if not needed
• Durable subscriber (cost)
• Message filtering (cost)
• VM Endpoints
• Use flow references instead of VM endpoints within the same app
• Unless an HA save point is functionally required
Design time tuning – Mule configurations
•Threading
• Threading Profiles
• Synchronous vs Asynchronous processing
• Thread Pools
•Logging
• Async logger vs sync
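Why an async logger helps can be shown with a small analogy. Mule itself uses Log4j2, whose async loggers are configured in its own configuration file. The Python sketch below only illustrates the principle: the calling thread puts the record on a queue and returns, while a background thread does the slow write. The `ListHandler` class and the logger name are hypothetical.

```python
import logging
import logging.handlers
import queue


class ListHandler(logging.Handler):
    """Stand-in for a slow appender (file, socket); stores messages in memory."""

    def __init__(self) -> None:
        super().__init__()
        self.messages = []

    def emit(self, record: logging.LogRecord) -> None:
        self.messages.append(record.getMessage())


log_queue: queue.Queue = queue.Queue()
slow_appender = ListHandler()

# The background listener thread drains the queue and does the actual writing.
listener = logging.handlers.QueueListener(log_queue, slow_appender)
listener.start()

logger = logging.getLogger("flow")  # hypothetical logger name
logger.setLevel(logging.INFO)
logger.propagate = False
logger.addHandler(logging.handlers.QueueHandler(log_queue))  # caller only enqueues

logger.info("order %s processed", 42)  # returns as soon as the record is queued
listener.stop()                        # processes remaining records, joins the thread
print(slow_appender.messages)
```

The trade-off is that records still in the queue can be lost if the process dies abruptly, which is one reason synchronous logging may still be preferred for audit-style messages.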
Runtime tuning – Mind-map of 80/20 rule for GC tuning
Oracle HotSpot: PermGen sizing (PermSize, MaxPermSize) is deprecated in JDK 8. Use MetaspaceSize and MaxMetaspaceSize instead to bound native memory use.
Thank you!