Mykhailo Drach
Andrii Davydenko
CONFIDENTIAL | © 2020 EPAM Systems, Inc.
Presenters
Mykhailo Drach
Hybris Software Engineer @EPAM
Andrii Davydenko
Senior Performance Analyst @EPAM
Agenda
1. Performance metrics
This part introduces performance metrics, because defining performance criteria and desirable parameters is the starting point of the performance improvement process.
2. Performance monitoring
This part covers continuous monitoring of enterprise Java applications.
3. Performance improvement process
This part covers the optimization and tuning of Java applications.
Performance metrics
Metrics covered: Response time, Responsiveness, Load, Throughput, Capacity, Latency, Efficiency, Scalability
Response time
Response time is the amount of time the system takes to react to a user request.
Response time metrics:
• Peak value
• Average, percentile value
• Error rate
Reference: Patterns of EAA by Martin Fowler
Percentile calculation
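The percentile figure from this slide is not preserved in the transcript. As a minimal sketch, assuming the common nearest-rank definition (the slide may have used a different one), average and percentile values can be computed like this. The class name, method names, and sample values are illustrative only.

```java
import java.util.Arrays;

final class ResponseTimeStats {

    /** Nearest-rank percentile: the smallest sample with at least p% of samples at or below it. */
    static double percentile(double[] samplesMs, double p) {
        if (samplesMs.length == 0) {
            throw new IllegalArgumentException("no samples");
        }
        if (p <= 0 || p > 100) {
            throw new IllegalArgumentException("p must be in (0, 100]");
        }
        double[] sorted = samplesMs.clone();   // do not reorder the caller's array
        Arrays.sort(sorted);
        int rank = (int) Math.ceil(p * sorted.length / 100.0); // 1-based rank
        return sorted[rank - 1];
    }

    static double average(double[] samplesMs) {
        return Arrays.stream(samplesMs).average()
                .orElseThrow(() -> new IllegalArgumentException("no samples"));
    }

    public static void main(String[] args) {
        // Hypothetical response times in milliseconds, with one slow outlier.
        double[] samples = {120, 95, 110, 105, 100, 130, 115, 90, 125, 2000};
        System.out.println("average = " + average(samples));
        System.out.println("p50     = " + percentile(samples, 50));
        System.out.println("p90     = " + percentile(samples, 90));
        System.out.println("p95     = " + percentile(samples, 95));
    }
}
```

With these sample values the single outlier pulls the average to 299 ms while the median stays at 110 ms, which is why percentiles are usually reported alongside the average.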
Responsiveness
Responsiveness is about how quickly the system loads part of the response data. If your system waits for the whole request to finish, then your responsiveness and response time are the same. The earlier a user can interact with your site, the better its responsiveness.
Examples: poor responsiveness vs. good responsiveness (example images not preserved)
Reference: Patterns of EAA by Martin Fowler
Load
Load is a statement of how much stress a system is under, which
might be measured in how many clients are currently connected to it.
The load is usually a context for some other measurement, such as a
response time. Thus, you may say that the response time for some
request is 0.5 seconds with 10 users and 2 seconds with 20 users.
Examples:
• 1000 users are currently connected to the server
• 200 active user sessions
Reference: Patterns of EAA by Martin Fowler
Throughput
Throughput is how much stuff you can do in a given amount of
time. If you’re timing the copying of a file, throughput might be
measured in bytes per second. For enterprise applications a typical
measure is transactions per second (tps), but the problem is that this
depends on the complexity of your transaction.
Examples:
• Server can place 100 orders/min
• DB can commit 100 transactions/s
Reference: Patterns of EAA by Martin Fowler
Capacity
The capacity of a system is an indication of maximum effective
throughput or load. This might be an absolute maximum or a point at
which the performance dips below an acceptable threshold.
Examples:
• Server capacity is 10000 active user sessions
• Server capacity is 2000 active third-party clients
Reference: Patterns of EAA by Martin Fowler
Latency
Latency is the minimum time required to get any form of response, even if the work to be done is nonexistent. It’s usually the big issue in remote systems. As an application developer, you can reduce the impact of latency by minimizing remote calls.
Examples:
• Ping to server is 50 ms
Reference: Patterns of EAA by Martin Fowler
Efficiency
Efficiency is performance divided by resources.
Examples:
• A system with a maximum throughput of 30 tps on two CPUs is more efficient than a system that processes 40 tps on four identical CPUs.
• A system with 1000 active user sessions and 10 GB of RAM is more efficient than a system with 1200 active user sessions and 15 GB of RAM.
Reference: Patterns of EAA by Martin Fowler
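The two efficiency examples can be checked with simple arithmetic. The sketch below applies the slide's definition (performance divided by resources) to the slide's own numbers. The class and method names are illustrative.

```java
final class Efficiency {

    /** Efficiency as defined above: performance divided by resources. */
    static double of(double performance, double resources) {
        if (resources <= 0) {
            throw new IllegalArgumentException("resources must be positive");
        }
        return performance / resources;
    }

    public static void main(String[] args) {
        // Throughput example: tps per CPU.
        System.out.println("30 tps on 2 CPUs: " + of(30, 2) + " tps/CPU");
        System.out.println("40 tps on 4 CPUs: " + of(40, 4) + " tps/CPU");
        // Memory example: active user sessions per GB of RAM.
        System.out.println("1000 sessions on 10 GB: " + of(1000, 10) + " sessions/GB");
        System.out.println("1200 sessions on 15 GB: " + of(1200, 15) + " sessions/GB");
    }
}
```

The first system delivers 15 tps per CPU against 10 for the second, and 100 sessions per GB against 80, so in both examples the smaller system is the more efficient one even though its absolute numbers are lower.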
Scalability
Scalability is a measure of how adding resources (usually hardware)
affects performance. A scalable system is one that allows you to add
hardware and get a performance improvement.
Scalability types:
• Vertical scalability, or scaling up, means adding more power to a
single server, such as more memory.
• Horizontal scalability, or scaling out, means adding more servers.
Reference: Patterns of EAA by Martin Fowler
Performance monitoring
Roadmap
• Performance testing matters: e-commerce
• Preparation for testing
• Toolset
• Typical issues
Performance testing matters
Market growth
• 10% increase in 10 years
• >56% of dollar gain made online
• Switching to mobile is growing
• Impact of COVID-19 lockdown
Why performance test
• Cost of performance issues, reputational risks
• Disaster recovery plan
• Cost optimization
• Building a culture of performance-focused development
• Performance testing under CI/CD
Preparation for testing
NFRs & Load model
NFRs:
• Capacity in terms of users/tasks
• Response/processing time, aggregation type
• Infrastructure setup
• Resource consumption
• Error conditions
• Scalability
Load model:
• Review PROD logs, Adobe/Google Analytics, business data
• New product: business data, competitors
• Distribution of user types, devices on client side
Diff
Percentiles
• show how response times are distributed
• are well suited to automatic baselining
• are better for tuning and for defining optimization goals
*** A bell curve describes a simplified distribution in which the average and median response times are equal. This almost never happens in real life.
Estimate capacity
(Chart series: active users, DB CPU, hits/sec)
Performance environment
Best case
• Scaled as PROD
• Integrations in place
• CDN not ignored
• Data restored from PROD with masked data, refreshed on a regular basis
• New product: generated volume to match expectations
Worst case
• Development machine hosting PERF, load generators, updating Facebook, etc.
• Minimum of data
Toolset
What we can use
Backend
• JMeter, Tsung
• TeraVM, LoadRunner
Frontend
• JMeter+WebDriver
• Sitespeed.io
Analysers
• DevTools
• Lighthouse, HttpWatch
• Wireshark, Fiddler
APMs
• Dynatrace
• NewRelic
• AppDynamics
Profilers
• Jet Profiler
• JProfiler
• Etc.
hAC (Hybris Administration Console)
• Cache
• Thread Dump
• Database
Hybris typical issues
Important notes
• Cart creation, cleanup strategy, recalculations, merging
• FlexibleSearch caching, eviction strategy
• Cluster configuration: task engine, special node groups
• Page fragment caching *
• Table locking *
Cronjob related issues
• Execute on a dedicated node group
• Set up sequential cronjobs when applicable, plan enough buffer time
• Schedule at non-peak hours
• Assess the impact of 3rd party services
• Prioritize the list you have
• Have enough data to process during tests
BO <bombs_away> search
Response time and GC issues
Performance improvement
Software improvement types
• Software optimization: identifying and eliminating inefficient internal designs and implementations
• Software tuning: establishing the optimal setting for every possible external configuration parameter
Defining desired params
Before starting the performance improvement process, we should define the desired performance results.
How to do this:
• Analyze the business
• Ask stakeholders
• Find best practices
Improvement process steps: Defining desired params, Config tuning, Code optimization, Memory analyzing, Profiling, Scaling, Desired performance
Application config tuning
Application performance can be improved by tuning JVM properties.
Configs to look at:
• Memory size
• GC algorithm
• ThreadPool size
• Caching
Heap size
Heap size is one of the most important parameters. We have to find the value that gives the best balance between GC time and performance.
Parameters: -Xms (initial heap size), -Xmx (maximum heap size)
Oracle recommends setting -Xms equal to -Xmx to reduce garbage collections.
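To see which heap limits a running JVM actually uses, the standard management API can be queried. The sketch below assumes the application is started with explicit values such as -Xms512m -Xmx512m (example values, not a recommendation). The class name is illustrative, and the reported maximum can differ slightly from -Xmx depending on the collector.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryUsage;

final class HeapSettings {

    static long toMiB(long bytes) {
        return bytes / (1024L * 1024L);
    }

    public static void main(String[] args) {
        MemoryUsage heap = ManagementFactory.getMemoryMXBean().getHeapMemoryUsage();
        // getInit() roughly corresponds to -Xms, getMax() to -Xmx; both may be -1 if undefined.
        System.out.println("initial heap:   " + toMiB(heap.getInit()) + " MiB");
        System.out.println("maximum heap:   " + toMiB(heap.getMax()) + " MiB");
        System.out.println("committed heap: " + toMiB(heap.getCommitted()) + " MiB");
        System.out.println("used heap:      " + toMiB(heap.getUsed()) + " MiB");
    }
}
```

When -Xms and -Xmx are equal, the initial and maximum values printed here should match, which means the JVM does not need to grow or shrink the heap at run time.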
ThreadPool size parameters
Thread pools have a minimum and a maximum number of threads. The minimum number of threads is kept around, waiting for tasks to be assigned to them. The maximum number of threads also serves as a necessary throttle, preventing too many tasks from executing at once.
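In plain Java, the minimum and maximum map onto the core and maximum pool sizes of java.util.concurrent.ThreadPoolExecutor. The sketch below is one possible configuration, not the setup used in the talk. The sizes, the queue capacity, and the choice of CallerRunsPolicy are assumptions made for illustration.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

final class BoundedPool {

    /** Creates a pool that keeps minThreads alive and never runs more than maxThreads. */
    static ThreadPoolExecutor create(int minThreads, int maxThreads, int queueCapacity) {
        return new ThreadPoolExecutor(
                minThreads,                      // core threads, kept around once started
                maxThreads,                      // upper bound that acts as the throttle
                60L, TimeUnit.SECONDS,           // idle time before extra threads are removed
                new ArrayBlockingQueue<>(queueCapacity),
                // When the queue is full and maxThreads are busy, the submitting thread
                // runs the task itself, which slows down submission.
                new ThreadPoolExecutor.CallerRunsPolicy());
    }

    public static void main(String[] args) throws InterruptedException {
        ThreadPoolExecutor pool = create(2, 4, 10);
        for (int i = 0; i < 20; i++) {
            final int task = i;
            pool.execute(() ->
                    System.out.println("task " + task + " on " + Thread.currentThread().getName()));
        }
        pool.shutdown();
        pool.awaitTermination(10, TimeUnit.SECONDS);
    }
}
```

Note that ThreadPoolExecutor only starts threads beyond the core size once the queue is full, so the queue capacity matters as much as the two thread counts.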
GC algorithm
The most commonly used GC algorithms:
• The serial garbage collector
• The throughput (parallel) collector
• The CMS collector
• The G1 collector
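On HotSpot JVMs these collectors are selected with -XX:+UseSerialGC, -XX:+UseParallelGC, -XX:+UseConcMarkSweepGC (deprecated in JDK 9 and removed in JDK 14), and -XX:+UseG1GC (the default since JDK 9). As a small sketch, the collectors that are actually active in a running JVM can be listed through the management API. The class name is illustrative.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.List;
import java.util.stream.Collectors;

final class GcInfo {

    /** Names of the collectors the current JVM reports, e.g. young and old generation collectors. */
    static List<String> collectorNames() {
        return ManagementFactory.getGarbageCollectorMXBeans().stream()
                .map(GarbageCollectorMXBean::getName)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            // Count and accumulated time since JVM start; -1 means the value is undefined.
            System.out.println(gc.getName() + ": " + gc.getCollectionCount()
                    + " collections, " + gc.getCollectionTime() + " ms");
        }
    }
}
```

Running the same workload under different collector flags and comparing the collection counts and times is a simple first step before choosing one.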
Code optimization
Code optimization is the biggest and most important part of the performance improvement process. It means redesigning and reimplementing parts of the code.
Examples:
• SQL queries
• Concurrency
• Looking for operations to get rid of
• Reducing external calls
Code optimization: DB queries
In large systems, DB querying can be the biggest hotspot. To avoid this, we should carefully investigate each query and tune it to improve performance.
What can be improved?
• Joins
• Indexes
• Analysis of generated queries
Code optimization: DB table joins
In relational databases, all data is stored in tables (relations). In most cases we have to join tables to get the data we need.
How to improve joins?
• Indexes
• Views
Code optimization: DB table indexes
To improve SQL lookups and joins we can use indexes. An index is a special lookup structure that typically stores its data in a tree-based form.
Code optimization: DB table indexes
Pros:
• Speeds up joins and data lookups
• Data lookup in O(log N)
Cons:
• Slows down create, update, and delete operations
• Increases data size
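The trade-off can be illustrated in memory. This is only an analogy: real databases typically use B-tree variants rather than the red-black tree behind java.util.TreeMap, and the Customer row type and its fields are hypothetical. The tree gives O(log N) lookups, but it is an extra structure that takes space and has to be updated on every write.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;

final class IndexSketch {

    /** Hypothetical table row. */
    static final class Customer {
        final int id;
        final String email;

        Customer(int id, String email) {
            this.id = id;
            this.email = email;
        }
    }

    /** Lookup without an index: full scan, O(N) comparisons. */
    static Customer scan(List<Customer> table, String email) {
        for (Customer c : table) {
            if (c.email.equals(email)) {
                return c;
            }
        }
        return null;
    }

    /** Builds a tree-based index on email; it must be kept in sync on every insert, update, and delete. */
    static TreeMap<String, Customer> buildIndex(List<Customer> table) {
        TreeMap<String, Customer> index = new TreeMap<>();
        for (Customer c : table) {
            index.put(c.email, c);
        }
        return index;
    }

    public static void main(String[] args) {
        List<Customer> table = new ArrayList<>();
        for (int i = 0; i < 100_000; i++) {
            table.add(new Customer(i, "user" + i + "@example.com"));
        }
        TreeMap<String, Customer> byEmail = buildIndex(table);

        String wanted = "user99999@example.com";
        System.out.println("scan  found id " + scan(table, wanted).id);      // walks the whole list
        System.out.println("index found id " + byEmail.get(wanted).id);      // about log2(N) steps
    }
}
```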
Code optimization: Prefer Streams over loops
The Java 8 Stream API is a convenient way to process collections of objects in a functional style. It can also speed up data processing, because it makes it possible to parallelize processing through a fairly simple API.
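A minimal comparison of the three styles follows. The class and method names are illustrative. A parallel stream is not automatically faster: for small collections or cheap operations the coordination overhead can outweigh the gain, so the variants should be measured on realistic data before choosing one.

```java
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

final class StreamSketch {

    /** Classic loop. */
    static long sumOfSquaresLoop(List<Integer> values) {
        long total = 0;
        for (int v : values) {
            total += (long) v * v;
        }
        return total;
    }

    /** Sequential stream: same result, functional style. */
    static long sumOfSquaresStream(List<Integer> values) {
        return values.stream().mapToLong(v -> v.longValue() * v).sum();
    }

    /** Parallel stream: the work is split across the common fork/join pool. */
    static long sumOfSquaresParallel(List<Integer> values) {
        return values.parallelStream().mapToLong(v -> v.longValue() * v).sum();
    }

    public static void main(String[] args) {
        List<Integer> values = IntStream.rangeClosed(1, 1_000)
                .boxed()
                .collect(Collectors.toList());
        System.out.println("loop:     " + sumOfSquaresLoop(values));
        System.out.println("stream:   " + sumOfSquaresStream(values));
        System.out.println("parallel: " + sumOfSquaresParallel(values));
    }
}
```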
Code optimization: Long running ORM sessions
If you use an ORM in your Java application, the session keeps a copy of every entity it has already persisted, in case the entity is modified again before the session is closed. In a long-running session this fills memory and can end with an OutOfMemoryError. To fix it, either avoid long-running ORM sessions or clear the PersistenceContext from time to time.
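Clearing the persistence context "from time to time" is usually done in batches. The sketch below shows the pattern with a small hypothetical interface standing in for the persist, flush, and clear methods of a JPA EntityManager, so that it runs without an ORM on the classpath. The batch size and the in-memory FakeContext are assumptions for illustration only.

```java
import java.util.ArrayList;
import java.util.List;

final class BatchPersistSketch {

    /** Hypothetical stand-in for the part of a JPA EntityManager used here. */
    interface PersistenceContext {
        void persist(Object entity);
        void flush();
        void clear();
    }

    /** Persists all entities, flushing and clearing the context after every batchSize entities. */
    static void persistInBatches(PersistenceContext ctx, List<?> entities, int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        int inBatch = 0;
        for (Object entity : entities) {
            ctx.persist(entity);
            if (++inBatch == batchSize) {
                ctx.flush();   // write pending changes to the database
                ctx.clear();   // detach managed entities so they can be garbage collected
                inBatch = 0;
            }
        }
        if (inBatch > 0) {     // last, partially filled batch
            ctx.flush();
            ctx.clear();
        }
    }

    /** In-memory fake that records how many entities the context held at most. */
    static final class FakeContext implements PersistenceContext {
        private final List<Object> managed = new ArrayList<>();
        int maxManaged = 0;
        int flushCount = 0;

        @Override public void persist(Object entity) {
            managed.add(entity);
            maxManaged = Math.max(maxManaged, managed.size());
        }
        @Override public void flush() { flushCount++; }
        @Override public void clear() { managed.clear(); }
        int managedCount() { return managed.size(); }
    }

    public static void main(String[] args) {
        List<Integer> entities = new ArrayList<>();
        for (int i = 0; i < 1050; i++) {
            entities.add(i);
        }
        FakeContext ctx = new FakeContext();
        persistInBatches(ctx, entities, 100);
        System.out.println("max managed entities: " + ctx.maxManaged);
        System.out.println("flushes: " + ctx.flushCount);
    }
}
```

With this pattern the number of entities held by the context is bounded by the batch size instead of growing with the total number of entities processed.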
Memory analyzing
Memory analysis is the process of determining the application's main memory parameters. It includes finding memory leaks.
Tools for monitoring:
• Java VisualVM
• Eclipse Memory Analyzer (MAT)
Java profiling tools
• Stackify Prefix
• JProfiler
• YourKit
Application profiling
Software profiling is a form of dynamic program analysis that measures, for example, the space (memory) or time complexity of a program, or the frequency and duration of function calls.
Types of profiling:
• Sampling
• Instrumentation
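To make the difference concrete: an instrumenting profiler adds measurement code around the calls of interest, while a sampling profiler periodically captures the stack traces of running threads and counts where they are. The sketch below hand-rolls only the instrumentation style for a single call site. It is an illustration with made-up names, not how the profilers listed in this deck are implemented.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Supplier;

final class ManualInstrumentation {

    // Per name: element 0 = number of calls, element 1 = total elapsed nanoseconds.
    private final Map<String, long[]> stats = new ConcurrentHashMap<>();

    /** Runs body and records call frequency and duration under the given name. */
    <T> T timed(String name, Supplier<T> body) {
        long start = System.nanoTime();
        try {
            return body.get();
        } finally {
            long elapsed = System.nanoTime() - start;
            stats.merge(name, new long[] {1, elapsed},
                    (a, b) -> new long[] {a[0] + b[0], a[1] + b[1]});
        }
    }

    long calls(String name) {
        long[] s = stats.get(name);
        return s == null ? 0 : s[0];
    }

    long totalNanos(String name) {
        long[] s = stats.get(name);
        return s == null ? 0 : s[1];
    }

    /** Example workload to measure. */
    static long sumTo(int n) {
        long total = 0;
        for (int i = 1; i <= n; i++) {
            total += i;
        }
        return total;
    }

    public static void main(String[] args) {
        ManualInstrumentation profiler = new ManualInstrumentation();
        for (int i = 0; i < 5; i++) {
            profiler.timed("sumTo", () -> sumTo(1_000_000));
        }
        System.out.println("sumTo: " + profiler.calls("sumTo") + " calls, "
                + profiler.totalNanos("sumTo") / 1_000 + " us in total");
    }
}
```

Instrumentation gives exact call counts but adds overhead to every measured call, whereas sampling has lower overhead and only approximates where time is spent.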
Scaling application resources
Adding more hardware resources can improve performance, but code optimization and config tuning should be done first.
Desired performance
If you have finished all the previous steps, your performance should be at the desired level.
What to do next to keep it working well?
Next steps:
• Look for performance issues in code reviews
• Look for possible performance issues when implementing new features
• Continuous performance testing
DEMO
Q&A
THE END

Java/Hybris performance monitoring and optimization

  • 2. CONFIDENTIAL | © 2020 EPAM Systems, Inc. Presenters Mykhailo Drach Hybris Software Engineer @EPAM Andrii Davydenko Senior Performance Analyst @EPAM
  • 3. 2 1 3 Performance metrics This part introduces performance metrics. As defining performance criteria and desirable parameters is the starting point of performance improvement process. Performance monitoring This part tells us about continuous monitoring of enterprise Java applications. Performance improvement process This part is about optimization and tuning process of Java applications. Agenda
  • 5. CONFIDENTIAL | © 2020 EPAM Systems, Inc. Response time Responsiveness Load Capacity Latency Throughput Efficiency Scalability Response time Response time is amount of time system takes to react to user request Response time metrics: • Peak value • Average, percentile value • Error rate Reference: Patterns of EAA by Martin Fowler Percentile calculation
  • 6. CONFIDENTIAL | © 2020 EPAM Systems, Inc. Response time Responsiveness Load Capacity Latency Throughput Efficiency Scalability Responsiveness Responsiveness is about how quickly the system loads part of response data. If your system waits during the whole request, then your responsiveness and response time are the same. The earlier user can interact with your site the better responsiveness it has. Poor responsiveness Good responsiveness Examples: Reference: Patterns of EAA by Martin Fowler
  • 7. CONFIDENTIAL | © 2020 EPAM Systems, Inc. Response time Responsiveness Load Capacity Latency Throughput Efficiency Scalability Load Load is a statement of how much stress a system is under, which might be measured in how many clients are currently connected to it. The load is usually a context for some other measurement, such as a response time. Thus, you may say that the response time for some request is 0.5 seconds with 10 users and 2 seconds with 20 users. Examples: • 1000 Users are currently connected to the server • 200 Active user sessions Reference: Patterns of EAA by Martin Fowler
  • 8. CONFIDENTIAL | © 2020 EPAM Systems, Inc. Response time Responsiveness Load Throughput Capacity Latency Efficiency Scalability Throughput Throughput is how much stuff you can do in a given amount of time. If you’re timing the copying of a file, throughput might be measured in bytes per second. For enterprise applications a typical measure is transactions per second (tps), but the problem is that this depends on the complexity of your transaction. Examples: • Server can place 100 orders/min • DB can commit 100 transaction/s Reference: Patterns of EAA by Martin Fowler
  • 9. CONFIDENTIAL | © 2020 EPAM Systems, Inc. Response time Responsiveness Load Throughput Capacity Latency Efficiency Scalability Capacity The capacity of a system is an indication of maximum effective throughput or load. This might be an absolute maximum or a point at which the performance dips below an acceptable threshold. Examples: • Server capacity is 10000 active user sessions • Server capacity is 2000 active third party clients Reference: Patterns of EAA by Martin Fowler
  • 10. CONFIDENTIAL | © 2020 EPAM Systems, Inc. Response time Responsiveness Load Throughput Capacity Latency Efficiency Scalability Latency Latency is the minimum time required to get any form of response, even if the work to be done is nonexistent. It’s usually the big issue in remote systems. As an application developer, you can decrease latency impact minimizing remote calls. Examples: • Ping to server is 50 ms Reference: Patterns of EAA by Martin Fowler
  • 11. CONFIDENTIAL | © 2020 EPAM Systems, Inc. Response time Responsiveness Load Capacity Latency Throughput Efficiency Scalability Efficiency Efficiency is performance divided by resources. Examples: • A system with maximum throughput in 30 tps on two CPUs is more efficient than a system that processes 40 tps on four identical CPUs. • A system with 1000 active user`s sessions and 10 GB Ram is more efficient than system with 1200 active user`s sessions and 15 GB Ram Reference: Patterns of EAA by Martin Fowler
  • 12. CONFIDENTIAL | © 2020 EPAM Systems, Inc. Response time Responsiveness Load Capacity Latency Throughput Efficiency Scalability Scalability Scalability is a measure of how adding resources (usually hardware) affects performance. A scalable system is one that allows you to add hardware and get a performance improvement. Scalability types: • Vertical scalability, or scaling up, means adding more power to a single server, such as more memory. • Horizontal scalability, or scaling out, means adding more servers. Reference: Patterns of EAA by Martin Fowler
  • 13. CONFIDENTIAL | © 2020 EPAM Systems, Inc. Performance monitoring
  • 14. CONFIDENTIAL | © 2020 EPAM Systems, Inc. Roadmap Performance testing matters. E-Commerce Preparation fot testing Toolset Typical issues
  • 15. CONFIDENTIAL | © 2020 EPAM Systems, Inc. Performance testing matters
  • 16. CONFIDENTIAL | © 2020 EPAM Systems, Inc. Market growth • 10% increase in 10 years • >56% of dollar gain made online • Switching to mobile is growing • Impact of COVID-19 lockdown
  • 17. CONFIDENTIAL | © 2020 EPAM Systems, Inc. Why performance test • Cost of performance issues, reputational risks • Disaster recovery plan • Cost optimization • Building a culture of performance focused development • Performance Testing Under CI/CD
  • 18. CONFIDENTIAL | © 2020 EPAM Systems, Inc. Preparation for testing
  • 19. CONFIDENTIAL | © 2020 EPAM Systems, Inc. NFRs & Load model • Capacity in terms of users/tasks • Response/processing time, aggregation type • Infrastructure setup • Resource consumption • Error conditions • Scalability • Review PROD logs, Adobe/Google analytics, business data • New product: business data, competitors • Distribution in user types, devices on client side
  • 20. CONFIDENTIAL | © 2020 EPAM Systems, Inc. Diff Percentiles • allow to understand the distribution • perfect for automatic baselining • better for tuning and defining a goal for optimizations *** A Bell curve describes simplified distribution when average and median response times are equal. Almost impossible in real life.
  • 21. CONFIDENTIAL | © 2020 EPAM Systems, Inc. Estimate capacity Active users DB CPU Hits/sec
  • 22. CONFIDENTIAL | © 2020 EPAM Systems, Inc. Performance environment Best case • Scaled as PROD • Integrations in place • CDN not ignored • Data restored from PROD with masked data, refreshed on regular basis • New product: generated volume to match expectation Worst case • Development machine hosting PERF, load generators, updating Facebook, etc. • Minimum of data
  • 23. Toolset
  • 24. What we can use
    Backend:
    • JMeter, Tsung
    • TeraVM, LoadRunner
    Frontend:
    • JMeter+WebDriver
    • Sitespeed.io
    Analysers:
    • DevTools
    • Lighthouse, HttpWatch
    • Wireshark, Fiddler
    APMs:
    • Dynatrace
    • NewRelic
    • AppDynamics
    Profilers:
    • Jet Profiler
    • JProfiler
    • Etc.
    hac:
    • Cache
    • Thread Dump
    • Database
  • 25. Hybris typical issues
  • 26. Important notes
    • Cart creation, cleanup strategy, recalculations, merging
    • FlexibleSearch caching, eviction strategy
    • Cluster configuration: tasks engine, special node groups
    • Page fragment caching *
    • Table locking *
  • 27. Cronjob related issues
    • Execute cronjobs on a dedicated node group
    • Set up sequential cronjobs when applicable and plan enough buffer time
    • Schedule at non-peak hours
    • Assess the impact of 3rd party services
    • Prioritize the list you have
    • Provide enough data to process during tests
  • 28. BO <bombs_away> search
  • 29. Response time and GC issues
  • 30. Performance improvement
  • 31. Software improvement types
    • Software optimization: identifying and eliminating inefficient internal designs and implementations
    • Software tuning: establishing the optimal setting for every possible external configuration parameter
  • 32. Defining desired params
    Before starting the performance improvement process, we should define the desired performance results.
    How to do this:
    • Analyze the business
    • Ask stakeholders
    • Find best practices
  • 33. Application config tuning
    Application performance can be improved by tuning JVM properties.
    Configs to look at:
    • Memory size
    • GC algorithm
    • ThreadPool size
    • Caching
  • 34. Heap size
    Heap size is one of the most important parameters. We have to find the best balance between GC time and performance.
    Parameters:
    • -Xms: initial heap size
    • -Xmx: maximum heap size
    Oracle recommends setting -Xms equal to -Xmx to reduce garbage collections.
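One way to check which heap limits a JVM actually picked up is to read them at runtime. The sketch below is an illustration only: the class name `HeapInfo` and the 512 MB values in the comment are assumptions, not settings recommended by the slides.

```java
class HeapInfo {
    static long toMiB(long bytes) {
        return bytes / (1024L * 1024L);
    }

    public static void main(String[] args) {
        // Example launch (hypothetical values, Java 11+ single-file mode):
        //   java -Xms512m -Xmx512m HeapInfo.java
        Runtime rt = Runtime.getRuntime();
        // maxMemory() roughly corresponds to -Xmx; some collectors report slightly less.
        System.out.println("max heap (MiB):       " + toMiB(rt.maxMemory()));
        // totalMemory() is the heap currently reserved by the JVM; it starts near -Xms.
        System.out.println("committed heap (MiB): " + toMiB(rt.totalMemory()));
        System.out.println("free in heap (MiB):   " + toMiB(rt.freeMemory()));
    }
}
```

When -Xms and -Xmx are equal, the committed and maximum values should be close to each other from startup, which is the situation the recommendation above aims for.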
  • 35. ThreadPool size parameters
    Thread pools have a minimum and maximum number of threads. The minimum number of threads is kept around, waiting for tasks to be assigned to them. The maximum number of threads also serves as a necessary throttle, preventing too many tasks from executing at once.
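The minimum and maximum described above map onto the core and maximum pool sizes of `java.util.concurrent.ThreadPoolExecutor`. The following is a minimal sketch; the class name `BoundedPool`, the sizes, the queue capacity, and the rejection policy are all illustrative assumptions rather than recommended values.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

class BoundedPool {
    static ThreadPoolExecutor newPool(int minThreads, int maxThreads, int queueCapacity) {
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                minThreads,                      // core threads, kept alive while idle
                maxThreads,                      // upper bound that throttles concurrency
                60L, TimeUnit.SECONDS,           // idle timeout for threads above the core size
                new ArrayBlockingQueue<>(queueCapacity),
                // When both the queue and the pool are full, run the task in the
                // submitting thread, which slows the producer down.
                new ThreadPoolExecutor.CallerRunsPolicy());
        pool.prestartAllCoreThreads();           // create the core threads up front
        return pool;
    }

    public static void main(String[] args) throws InterruptedException {
        ThreadPoolExecutor pool = newPool(2, 4, 10);
        for (int i = 0; i < 20; i++) {
            pool.execute(() -> {
                try {
                    Thread.sleep(50);            // simulated work
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
        }
        pool.shutdown();
        pool.awaitTermination(10, TimeUnit.SECONDS);
        System.out.println("largest pool size: " + pool.getLargestPoolSize());
    }
}
```

Note that with a bounded queue, `ThreadPoolExecutor` only grows beyond the core size once the queue is full, so the queue capacity is part of the tuning decision as well.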
  • 36. GC algorithm
    The most commonly used GC algorithms:
    • The serial garbage collector
    • The throughput collector
    • The CMS collector
    • The G1 collector
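To see which collector a running JVM is using and how much time it has spent collecting, the standard management beans can be queried. This is a small sketch with a hypothetical class name `GcInfo`; the HotSpot flags in the comment select the collectors listed above.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;

class GcInfo {
    // HotSpot flags that select a collector:
    //   -XX:+UseSerialGC          serial collector
    //   -XX:+UseParallelGC        throughput (parallel) collector
    //   -XX:+UseConcMarkSweepGC   CMS collector (removed in JDK 14)
    //   -XX:+UseG1GC              G1 collector
    static List<String> collectorNames() {
        List<String> names = new ArrayList<>();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            names.add(gc.getName());
        }
        return names;
    }

    public static void main(String[] args) {
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            System.out.println(gc.getName()
                    + ": collections=" + gc.getCollectionCount()
                    + ", total time ms=" + gc.getCollectionTime());
        }
    }
}
```

The bean names depend on the JVM and the selected collector, so they are best treated as diagnostic output rather than a stable identifier.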
  • 37. Code optimization
    Code optimization is the most important and the biggest part of the performance improvement process. It means that you need to redesign and reimplement some parts of the code.
    Examples:
    • SQL queries
    • Concurrency
    • Looking for rid operations
    • Reducing external calls
  • 38. Code optimization: DB Queries
    DB querying can be the biggest hotspot in large systems. To avoid this, we should carefully investigate each query and tune it to improve performance.
    What can be improved?
    • Joins
    • Indexes
    • Generated queries analysis
  • 39. Code optimization: DB table joins
    In relational databases all data is stored in tables (relations). In most cases we have to join tables to get the data.
    How to improve joins?
    • Indexes
    • Views
  • 40. Code optimization: DB table indexes
    To improve SQL lookups and joins we can use indexes. An index is a special lookup table that stores data in a tree-based form.
  • 41. Code optimization: DB table indexes
    Pros:
    • It speeds up joins and data lookups
    • Data lookup takes O(log₂ N)
    Cons:
    • It slows down Create, Update and Delete operations
    • It increases the stored data size
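The trade-off above can be sketched in plain Java by using a `TreeMap` as a stand-in for a tree-based index. This is an analogy under stated assumptions, not how any particular database implements indexes; the class name `IndexSketch` and the table contents are hypothetical.

```java
import java.util.TreeMap;

class IndexSketch {
    static long comparisons;

    // Full table scan: checks rows one by one, O(N) comparisons.
    static int scan(int[] ids, int wanted) {
        for (int row = 0; row < ids.length; row++) {
            comparisons++;
            if (ids[row] == wanted) {
                return row;
            }
        }
        return -1;
    }

    // Tree-based "index" mapping id -> row position. Every inserted row
    // must also be inserted here, which is the write-side cost of an index.
    static TreeMap<Integer, Integer> buildIndex(int[] ids) {
        TreeMap<Integer, Integer> index = new TreeMap<>((a, b) -> {
            comparisons++;
            return Integer.compare(a, b);
        });
        for (int row = 0; row < ids.length; row++) {
            index.put(ids[row], row);
        }
        return index;
    }

    public static void main(String[] args) {
        int n = 100_000;
        int[] ids = new int[n];
        for (int i = 0; i < n; i++) {
            ids[i] = i;
        }
        TreeMap<Integer, Integer> index = buildIndex(ids);

        comparisons = 0;
        scan(ids, n - 1);
        System.out.println("scan comparisons:  " + comparisons);

        comparisons = 0;
        index.get(n - 1);
        System.out.println("index comparisons: " + comparisons);
    }
}
```

The lookup through the tree needs a number of comparisons proportional to the tree height, while the scan may touch every row. The extra map is also additional data to store and to keep up to date on every write.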
  • 42. Code optimization: Prefer Streams over loops
    The Java 8 Stream API is a convenient way to support a functional approach to processing collections of objects. In addition, it can speed up data processing: the Stream API makes it possible to parallelize data processing with a fairly simple API.
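As a small illustration of how little code the parallel variant requires, the sketch below computes the same sum with a loop, a sequential stream, and a parallel stream. The class and method names are hypothetical, and whether the parallel version is actually faster depends on data size, per-element cost, and available cores, so it should be measured rather than assumed.

```java
import java.util.stream.LongStream;

class StreamSketch {
    // Classic loop.
    static long sumOfSquaresLoop(long n) {
        long sum = 0;
        for (long i = 1; i <= n; i++) {
            sum += i * i;
        }
        return sum;
    }

    // Sequential stream: same computation in a functional style.
    static long sumOfSquaresStream(long n) {
        return LongStream.rangeClosed(1, n).map(i -> i * i).sum();
    }

    // Parallel stream: the only change is the call to parallel().
    static long sumOfSquaresParallel(long n) {
        return LongStream.rangeClosed(1, n).parallel().map(i -> i * i).sum();
    }

    public static void main(String[] args) {
        long n = 1_000;
        System.out.println(sumOfSquaresLoop(n));
        System.out.println(sumOfSquaresStream(n));
        System.out.println(sumOfSquaresParallel(n));
    }
}
```

Parallel streams run on the common fork-join pool by default, so they are a better fit for CPU-bound work than for blocking calls.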
  • 43. Code optimization: Long running ORM sessions
    If you use an ORM in your Java application, the session keeps a copy of every entity that has already been persisted, in case it is modified again before the session is closed. This fills memory and can end in an OutOfMemoryError. To fix it, either avoid long-running ORM sessions or clear the PersistenceContext from time to time.
  • 44. Memory analyzing
    Memory analyzing is the process of determining the main memory parameters of an application. It includes finding memory leaks.
    Tools for monitoring:
    • Java VisualVM
    • Eclipse Memory Analyzer (MAT)
  • 45. Application profiling
    Java profiling tools:
    • Stackify Prefix
    • JProfiler
    • YourKit
  • 46. Application profiling
    Software profiling is a form of dynamic program analysis that measures, for example, the space (memory) or time complexity of a program, or the frequency and duration of function calls.
    Types of profiling:
    • Sampling
    • Instrumentation
  • 47. Scaling application resources
    Adding more hardware resources improves performance, but code optimization and config tuning should be done first.
  • 48. Desired performance
    If you have finished all the previous steps, your performance should be at the desired level. What should you do next to keep it that way?
    Next steps:
    • Look for performance issues in code reviews
    • Look for possible performance issues when implementing new features
    • Continuous performance testing
  • 49. DEMO
  • 50. Q&A