IMPACT 2014 ACU-1523: Performance Optimization Using IBM Java on z/OS & IBM WebSphere Application Server on z/OS V8.5.5
I presented this session as a guest speaker at the IBM IMPACT 2014 conference. The session outlines how to optimize the performance of applications on IBM WebSphere Application Server on z/OS, reduce CPU utilization, and take advantage of the latest zEC12 enhancements. IBM continues to invest in its Java Virtual Machine on IBM System z. zEC12 hardware packs a performance punch with a second-generation out-of-order pipeline design, large caches, and a 5.5 GHz hex-core processor. By exploiting these new features, the IBM Java Runtime Environment continues a long history of aggressive vertical integration on IBM System z. Come hear how HCSC is taking advantage of the latest IBM WebSphere Application Server and Java releases and enhancements. This presentation covers installation of Java V6.1, V7.0, and V7.1 with IBM WebSphere Application Server on z/OS V8.5.5 and exploitation of 1 MB large pages with zEC12 Flash Express and IBM zEnterprise Data Compression with z/OS V2.1. Benchmark performance data is presented.
2. Please Note
IBM’s statements regarding its plans, directions, and intent are subject to change
or withdrawal without notice at IBM’s sole discretion.
Information regarding potential future products is intended to outline our general
product direction and it should not be relied on in making a purchasing decision.
The information mentioned regarding potential future products is not a
commitment, promise, or legal obligation to deliver any material, code or
functionality. Information about potential future products may not be incorporated
into any contract. The development, release, and timing of any future features or
functionality described for our products remains at our sole discretion.
Performance is based on measurements and projections using standard IBM
benchmarks in a controlled environment. The actual throughput or performance
that any user will experience will vary depending upon many factors, including
considerations such as the amount of multiprogramming in the user’s job stream,
the I/O configuration, the storage configuration, and the workload processed.
Therefore, no assurance can be given that an individual user will achieve results
similar to those stated here.
3. Java Road Map
Language Updates
Java 5.0
• New Language features:
• Autoboxing
• Enumerated types
• Generics
• Metadata
Java 6.0
• Performance Improvements
• Client WebServices Support
• Support for dynamic languages
• Improve ease of use for SWING
• New IO APIs (NIO2)
• Java persistence API
• JMX 2.x and WS connection for JMX
agents
• Language Changes
Java 7.0
IBM Java Runtimes
IBM Java 5.0 (J9 R23)
• Improved performance
• Generational Garbage Collector
• Shared classes support
• New J9 Virtual Machine
• New Testarossa JIT technology
• First Failure Data Capture
• Full Speed Debug
• Hot Code Replace
• Common runtime technology
• ME, SE, EE
IBM Java 6.0 (J9 R24)
• Improvements in
• Performance
• Serviceability tooling
• Class Sharing
• XML parser improvements
• z10™ Exploitation
• DFP exploitation for BigDecimal
• Large Pages
• New ISA features
[Timeline figure, 2005-2011: Java SE 5.0 (18 platforms), SE 6.0 (20 platforms); Java EE 5, EE 6.x; WAS 6.0, 6.1, 7.0]
**Timelines and deliveries are subject to change.
IBM Java 6.0.1/Java 7
(J9 R26)
• Improvements in
• Performance
• GC Technology
• z196™ Exploitation
• OOO Pipeline
• 70+ New Instructions
• JZOS/Security Enhancements
[Timeline figure, 2012-2014: WAS 8.5; Java SE 6.0.1/7.x, >=20 platforms]
7.0
• Language improvements
• Closures for simplified fork/join
Java 8.0**
IBM Java 7 (J9 R26 SR3+)
• Improvements in
• Performance
• zEC12™ Exploitation
• Transactional Execution
• Flash 1Meg pageable LPs
• 2G large pages
• Hints/traps
IBM Java 7R1 (J9 R27)
• Improvements in
• Performance
• RAS
• Monitoring
• zEC12™ Exploitation
• zEDC for zip acceleration
• SMC-R integration
• Transactional Execution
• Runtime instrumentation
• Hints/traps
• Data Access Accelerator
Session #1523 - Performance optimization using IBM Java on z/OS and WAS on z/OS V8.5.5
4. zEC12 – More Hardware for Java
Continued aggressive investment in Java on Z
Significant set of new hardware features tailored
and co-designed with Java
Hardware Transaction Memory (HTM)
Better concurrency for multi-threaded applications
e.g. ~2X improvement to java.util.concurrent.ConcurrentLinkedQueue
Run-time Instrumentation (RI)
Innovative new hardware facility designed for managed
runtimes
Enables a new range of JRE optimizations
2GB page frames
Improved performance targeting 64-bit heaps
Pageable 1M large pages with Flash Express
Better versatility of managing memory
Shared-Memory-Communication
RDMA over Converged Ethernet
zEnterprise Data Compression accelerator
gzip accelerator
New software hints/directives/traps
Branch preload improves branch prediction
Reduce overhead of implicit bounds/null checks
New 5.5 GHz 6-Core Processor Chip
Large caches to optimize data serving
Second generation OOO design
Up-to 60% improvement in throughput amongst Java
workloads measured with zEC12 and IBM Java 7
Engineered Together—IBM Java and zEC12 Boost Workload Performance
http://www.ibmsystemsmag.com/mainframe/trends/whatsnew/java_compiler/
5. z/OS IBM Java 7: 16-Way Performance
Aggregate HW and SDK Improvement z9 IBM Java 5 to zEC12 IBM Java 7
(Controlled measurement environment, results may vary)
~12x aggregate hardware and software improvement comparing IBM Java5 on z9 to IBM Java 7 on zEC12
LP=Large Pages for Java heap CR= Java compressed references
Java7SR3 using -Xaggressive + Flash Express pageable 1Meg large pages
z/OS Multi-Threaded 64 bit Java Workload 16-Way
~12x Improvement in Hardware and Software
[Chart: normalized throughput (0-160) versus number of threads (1-32). Series: zEC12 SDK 7 SR3 Aggressive + LP Code Cache; zEC12 SDK 7 SR1; z196 SDK 7 SR1; z196 SDK 6 SR8; z10 SDK 6 SR4; z10 SDK 6 GM, NO (CR or Heap LP); z9 Java 5 SR5, NO (CR or Heap LP)]
6. WAS on z/OS – DayTrader
Aggregate HW, SDK and WAS Improvement: WAS 6.1 (IBM Java 5) on z9 to WAS 8.5 (IBM Java 7) on zEC12
(Controlled measurement environment, results may vary)
6x aggregate hardware and software improvement comparing WAS 6.1 IBM Java5 on z9 to WAS 8.5 IBM Java7 on zEC12
7. IBM SDK for z/OS, Java Tech. Edition, Version 7 Release 1
Expand zEC12/zBC12 exploitation
• More TX, instruction scheduler, traps, branch preload
• Runtime instrumentation exploitation
• zEDC exploitation through java/util/zip
• Integration of SMC-R
Improved native data binding - Data Access Accelerator
• Integrated with JZOS native record binding framework
Improved general performance/throughput
• Up-to 19% improvement to throughput (ODM)
• Up-to 2.4x savings in CPU-time for record parsing batch
application
Improved WLM capabilities
Improved SAF and cryptography support
Additional reliability, availability, and serviceability (RAS)
enhancements
Enhanced monitoring and diagnostics
http://www-01.ibm.com/common/ssi/cgi-bin/ssialias?infotype=AN&subtype=CA&htmlfid=897/ENUS213-498&appname=USN
8. Java-based Store, Inventory and Point-of-Sale App and IBM Java 7R1
10% improvement to Java-based Inventory and Point-of-Sale application with IBM Java 7R1
compared to IBM Java 7
(Controlled measurement environment, results may vary)
[Chart: Java Store, Inventory and Point-of-Sale Application, zEC12 16-way. Y axis: normalized maximum operations per second. Bars: IBM Java 7, IBM Java 7R1]
9. IBM Operational Decision Manager and IBM Java 7R1
19% improvement to ODM with IBM Java 7R1 compared to IBM Java 7
(Controlled measurement environment, results may vary)
[Chart: IBM Operational Decision Management, zEC12 16-way. Y axis: throughput (normalized to IBM Java 7 SR4). Bars: IBM Java 7, IBM Java 7R1]
10. Store your Data - zEnterprise Data Compression and IBM Java 7R1
** IDC: The Digital Universe in 2020: Big Data, Bigger Digital Shadows, and Biggest Growth in the Far East
With IBM Java 7R1 :
Java application to compress files using java.util.zip.GZIPOutputStream class
Up to 91% reduction in CPU time using zEDC hardware versus zlib software
Up to 74% reduction in Elapsed time (not shown)
Compression ratio up-to ~5x
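The measured application compresses files through java.util.zip.GZIPOutputStream. A minimal sketch of that API usage follows, using only standard JDK classes. The class name GzipSketch and the 64 KB buffer size are illustrative assumptions. Whether the deflate work is actually offloaded to zEDC depends on the JVM level and system configuration, which this code neither controls nor checks.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical helper class: plain JDK gzip round trip, no IBM-specific calls.
class GzipSketch {

    // Compress a byte array with GZIPOutputStream (buffer size is an assumption).
    static byte[] gzip(byte[] input) {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        try (GZIPOutputStream gz = new GZIPOutputStream(sink, 64 * 1024)) {
            gz.write(input);
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
        // The stream is closed here, so the gzip trailer has been written.
        return sink.toByteArray();
    }

    // Decompress, to show that the output is ordinary gzip data.
    static byte[] gunzip(byte[] compressed) {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        try (GZIPInputStream gz = new GZIPInputStream(new ByteArrayInputStream(compressed))) {
            byte[] buf = new byte[8192];
            int n;
            while ((n = gz.read(buf)) != -1) {
                sink.write(buf, 0, n);
            }
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
        return sink.toByteArray();
    }

    public static void main(String[] args) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < 10_000; i++) {
            sb.append("SMF record ").append(i % 50).append('\n');
        }
        byte[] raw = sb.toString().getBytes(StandardCharsets.UTF_8);
        byte[] packed = gzip(raw);
        System.out.println("input bytes:      " + raw.length);
        System.out.println("compressed bytes: " + packed.length);
    }
}
```

Because the application code is the same with and without the accelerator, a before/after CPU comparison like the one on this slide needs no source changes.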
Every day over 2000 petabytes of data are created
• Between 2005 and 2020, the digital universe will grow by 300x, going
from 130 to 40,000 exabytes**
• 80% of the world's data was created in the last two years alone
(Controlled measurement environment, results may vary)
What is it?
zEDC Express is an I/O adapter that does high-performance, industry-standard compression.
Used by z/OS operating system components, IBM middleware and ISV products.
Applications can use zEDC via industry-standard APIs (zlib and Java).
Each zEDC Express is sharable across 15 LPARs, with up to 8 devices per CEC.
Raw throughput is up to 1 GB/s per zEDC Express hardware adapter.
[Chart: CPU time for software versus zEDC hardware compression, using java.util.zip.GZIPOutputStream. X axis: compressed data files (Public Domain Books, SVC Dump, SMF Data). Y axis: CPU time (0-8,000). Series: zlib software, zEDC hardware]
[Chart: size of compressed data, software versus zEDC hardware, using java.util.zip.GZIPOutputStream. X axis: compressed data files (Public Domain Books, SVC Dump, SMF Data). Y axis: file size in megabytes (0-120). Series: input file size, zlib software, zEDC hardware]
11. Move your Data - Shared Memory Communications
RDMA enables a host to read or write directly from/to a remote host's memory without
involving the remote host's CPU
SMC-R automatically and transparently exploits RDMA/RoCE for sockets-based TCP
applications
(Controlled measurement environment, results may vary)
[Diagram: WAS Liberty running TradeLite on z/OS SYSA talks to DB2 on z/OS SYSB over JDBC/DRDA (3 per HTTP connection) using SMC-R over RoCE. A workload client simulator (JIBE) on Linux on x drives HTTP/REST requests over 40 concurrent TCP/IP connections.]
WebSphere to DB2 communications using SMC-R
40% reduction in overall transaction response time, as seen from the client's perspective
Small data sizes ~ 100 bytes
12. Transform your Data - Data Access Accelerator in IBM Java 7R1
A Java library for bare-bones data conversion and arithmetic
Operates directly on byte array
No Java object tree created
Orchestrated with JIT for deep platform optimization
Avoids expensive Java object instantiation
Library is platform and JVM-neutral
Current Approach:
byte[] addPacked(byte[] a, byte[] b) {
    BigDecimal a_bd = convertPackedToBd(a);
    BigDecimal b_bd = convertPackedToBd(b);
    BigDecimal sum = a_bd.add(b_bd);  // BigDecimal is immutable, so keep the result
    return convertBDtoPacked(sum);
}
Proposed Solution:
byte[] addPacked(byte[] a, byte[] b) {
    DAA.addPacked(a, b);  // schematic call: operates directly on the byte arrays
    return a;
}
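To make the cost of the "current approach" concrete, here is a self-contained sketch of it in plain Java: decode two packed-decimal operands into BigDecimal objects, add them, and encode the result. The class name PackedDecimalSketch and the helpers unpack and pack are hypothetical, written for this illustration. They are not the DAA library API, and they assume integer values (scale 0) with sign nibbles 0xC for positive and 0xD for negative.

```java
import java.math.BigDecimal;
import java.math.BigInteger;

// Hypothetical illustration of the object-heavy path that DAA is meant to avoid.
class PackedDecimalSketch {

    // Decode packed decimal: two digits per byte, last nibble is the sign.
    static BigDecimal unpack(byte[] pd) {
        StringBuilder digits = new StringBuilder();
        for (int i = 0; i < pd.length; i++) {
            digits.append((char) ('0' + ((pd[i] >> 4) & 0x0F)));
            if (i < pd.length - 1) {
                digits.append((char) ('0' + (pd[i] & 0x0F)));
            }
        }
        int sign = pd[pd.length - 1] & 0x0F;
        BigInteger unscaled = new BigInteger(digits.toString());
        return new BigDecimal(sign == 0x0D ? unscaled.negate() : unscaled);
    }

    // Encode an integer-valued BigDecimal into a packed field of the given length.
    static byte[] pack(BigDecimal value, int lengthInBytes) {
        String digits = value.unscaledValue().abs().toString();
        int capacity = lengthInBytes * 2 - 1;          // digit nibbles available
        if (digits.length() > capacity) {
            throw new ArithmeticException("value does not fit in " + lengthInBytes + " bytes");
        }
        StringBuilder padded = new StringBuilder();
        for (int i = digits.length(); i < capacity; i++) {
            padded.append('0');
        }
        padded.append(digits);
        byte[] out = new byte[lengthInBytes];
        for (int i = 0; i < capacity; i++) {
            int d = padded.charAt(i) - '0';
            out[i / 2] |= (i % 2 == 0) ? (d << 4) : d;  // high nibble first
        }
        out[lengthInBytes - 1] |= (value.signum() < 0) ? 0x0D : 0x0C;
        return out;
    }

    // The slide's "current approach": several temporary objects per addition.
    static byte[] addPacked(byte[] a, byte[] b) {
        return pack(unpack(a).add(unpack(b)), a.length);
    }

    public static void main(String[] args) {
        byte[] a = {0x00, 0x12, 0x3C};          // +123
        byte[] b = {0x00, (byte) 0x87, 0x7C};   // +877
        for (byte x : addPacked(a, b)) {
            System.out.printf("%02X ", x);
        }
        System.out.println();
    }
}
```

Each call allocates strings, BigInteger and BigDecimal instances just to add two small fields. The proposed DAA call on the slide works on the byte arrays directly, which is where the CPU savings reported later in the deck come from.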
Marshalling and Un-marshalling
Transform primitive types (short, int, long, float, double) to and from byte arrays
Support both big-endian and little-endian byte arrays
Packed Decimal (PD) Operations
Arithmetic: +, -, *, /, % on 2 PD operands
Relation: >, <, >=, <=, ==, != on 2 PD operands
Error checking: checks if PD operand is well-formed
Other: shifting and moving ops on PD operand
Decimal Data Type Conversions
Decimal to/from Primitive: convert Packed Decimal (PD), External
Decimal (ED), Unicode Decimal (UD) to and from
primitive types (int, long)
Decimal to/from Decimal: convert between decimal types (PD, ED, UD)
Decimal to/from Java: convert decimal types (PD, ED, UD) to and from
BigDecimal, BigInteger
13. DAA – JZOS Medicare Record Benchmark and IBM Java 7R1
31-bit IBM Java 7R1 with DAA versus IBM Java 7 CPU Time improved by 2.4x
64-bit IBM Java 7R1 with DAA versus IBM Java 7 CPU Time improved by 1.9x
http://www.ibm.com/developerworks/java/zos/javadoc/jzos/index.html?com/ibm/jzos/sample/fields/MedicareRecord.html
(Controlled measurement environment, results may vary)
[Chart: JZOS Medicare Record Parsing Benchmark. X axis: 31-bit, 64-bit. Y axis: CPU time to parse 5M records (normalized to 31-bit IBM Java 7 SR4 w/o DAA). Series: IBM Java 7 SR4, IBM Java 7R1]
14. HCSC Java on z/OS and WebSphere Application Server on z/OS V8.5.5: User Experience
15. About Health Care Service Corporation
Health Care Service Corporation (HCSC) is the fourth
largest health insurance company in the nation.
The largest customer-owned health insurer in the U.S.,
founded in 1936 and now with more than 14 million members,
HCSC operates health insurance Plans in Illinois,
Montana, New Mexico, Oklahoma, Texas, and
Dearborn National.
We are more than 21,000 employees strong, with 60 local
offices and state-of-the-art technology, including two Tier
IV data centers (the industry's highest reliability level)
that provide the speed and data security to meet our
customers' current and future business needs.
16. Why WebSphere on z/OS?
WebSphere on z/OS has been selected at HCSC as a preferred platform to
support development and deployment of the Core processing for new Java
mission-critical Applications for the following reasons:
z/OS Hardware, Software, Storage, and Network are all designed for
maximum application availability
WebSphere on z/OS is designed to support very high transactional
volume
WebSphere on z/OS provides highest Quality of Service:
- Performance
- Scalability
- Recovery/failover capability
- High Availability
- Stability
- Manageability
- Maintainability
- Security/Integrity
By using WebSphere on z/OS you can minimize the number of physical
tiers to get to backend data
Use of single tier removes Network layer and additional overhead
associated with it
Tight integration with DB2, MQ, and CICS
17. Features and Technology Unique to z/OS
Server Architecture
- Control/Servant Region Split
- Multiple Servant Region
Workload Management
- Leverages Workload Manager (WLM)
- WLM/RMF integration
- Work classified according to importance & performance goals
- Work is selected from WLM queue and managed to goal
- Provides Failover to available Servers
- Automatic servant restart after an outage
- Automatic startup of additional servants, as needed, based on Policies
WebSphere on z/OS Network Deployment Clustering across z/OS LPARs
- Horizontal scaling for increased throughput
- Continuous availability & fail-over
MQ Queue Sharing using Shared Queues across LPARs and cross-memory (XM)
communication for optimum performance
DB2 Data Sharing across LPARs, with JDBC Type 4 driver
Sysplex Distributor - workload management and distribution across multiple systems
Coupling Facility - high-speed inter-system communication, used with MQ Queue
Sharing & DB2 Data sharing
Resource Recovery Services - required for 2-phase commits
zSeries Application Assist Processor (zAAP) - specialty assist processor
dedicated exclusively to execution of Java workloads under z/OS
Mainframe security
18. HCSC WebSphere on z/OS Java Exploitation
Currently using WASz V8.5.5.1. The JVM build is JRE 1.6.0
IBM J9 2.6 z/OS s390x-64 Compressed References
20130823_162690
J9VM - R26_Java626_SR6_Ifix_20130823_2006_B162690
JIT - r11.b04_20130528_38954ifx6
GC - R26_Java626_SR6_Ifix_20130823_2006_B162690_CMPRSS
J9CL - 20130823_162690.
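A build string like the one above can also be checked from inside an application. The sketch below prints a few system properties using only System.getProperty. The class name JvmInfoSketch is made up for this example, and sun.arch.data.model and com.ibm.vm.bitmode are vendor-specific keys that I assume may or may not be set on a given JVM, so the code falls back to "(not set)".

```java
// Hypothetical helper: report which JVM level and addressing mode is in use.
class JvmInfoSketch {

    // Standard keys first; the last two are vendor-specific and may be absent.
    private static final String[] KEYS = {
        "java.version",
        "java.vm.name",
        "java.vm.version",
        "java.vm.info",
        "os.arch",
        "sun.arch.data.model",
        "com.ibm.vm.bitmode"
    };

    static String describe() {
        StringBuilder sb = new StringBuilder();
        for (String key : KEYS) {
            sb.append(key)
              .append(" = ")
              .append(System.getProperty(key, "(not set)"))
              .append('\n');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.print(describe());
    }
}
```

Logging this once at application startup makes it easy to confirm, after a fix pack upgrade, that a servant really picked up the intended SDK and bit mode.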
Next: plan to upgrade to WASz V8.5.5.2 (GA 4/28/14),
which adds IBM Java 7R1 (JRE 1.7) support to WASz
IBM SDK Version 7.1 has been released, but it cannot be installed as an optional feature of
IBM WebSphere Application Server, version 8.5.5, until V8.5.5.2.
Enable -Xaggressive with IBM Java7 SR4
Explore exploitation of zEnterprise Data Compression
Make use of WASz Health Center to look for Java tuning
opportunities at the application level
19. HCSC WebSphere on z/OS Cell Architecture
20. Applications #1-3 - WASz V7.0.27 vs V8.5.5.1
Application #1 - MDB Application in 64 bit mode. Compressed references are enabled.
Max Heap size increased from 640M to 1792M, Min Heap size increased from 512M to
1344M.
Encountered Java OOM after upgrade, had to increase max/min heap size.
A memory leak was found in WASz JMS code, which uses ThreadLocals with AlarmManagerThread and
does not release storage. The fixing APAR is PI14746. See the IBM Flash alert "Memory leak in WAS 8.5.x
J2C PoolManager" at http://www-01.ibm.com/support/docview.wss?uid=swg21670448 .
MDB response times went down 25-30% and throughput increased, using a little more CPU.
Application #2 – Activation Specification Application in 31 bit mode. Writes a
message to MQ when DB2 on z/OS update triggers.
Max Heap size 512M, Min Heap 256M.
Performs better in 31 bit mode vs 64 bit mode. Performance did not change after WASz
V8.5.5.1 upgrade.
Application #3 - MDB Application in 64 bit mode. Compressed references are enabled.
Max Heap size is 2100M, Min Heap size is 1024M.
15-20% CPU reduction.
Being tuned to lower heap size under 2048M and exploit Pageable Large Pages, with Flash
Express.
21. Application #1 - WASz V7.0.27 vs V8.5.5.1
MDB Response times – 25-30% reduction under WASz V8.5.5.1
WASz V7.0.27 31 bit - Max Heap 640M
WASz V8.5.5.1 64 bit - Max Heap 1240M
22. Application #1 - WASz V7.0.27 vs V8.5.5.1
Backend DB2 calls Average Response Times –
much lower spikes
23. Application #1 - WASz V7.0.27 31 bit mode
Max Heap size set at 640M, Min Heap size set at 512M.
This was our configuration prior to WASz V8.5.5.1 upgrade.
[Charts: heap storage usage per servant; CPU usage (zAAP & GP)]
24. Application #1 - WASz V8.5.5.1 64 bit mode
Max Heap size increased from 640M to 832M, Min Heap size increased from
512M to 640M .
20% throughput increase with WASz V8.5.5.1 compared with V7.0.27.
Java Out Of Memory errors under heavier load and a major increase in zAAP
usage, due to heavy GC activity. APAR PI14746 resolves this issue.
[Charts: heap storage usage per servant; zAAP CPU increase due to Java OOM]
25. Memory Leak In Application #1
ISSUE
A memory leak in the J2C PoolManager. Observed major increase in Java
Heap size usage, 30% GC overhead and 5-10X CPU usage increase.
It specifically manifests itself when using the MQ JMS Resource adapter
where for each JMS Managed Connection a Session pool and a Connection
pool is created.
When the Managed connection is destroyed due to unused or aged timeouts
or the connection is stale, then the associated JMS Session pool should be
stopped/destroyed and the reaper alarms associated with the PoolManager
should also be cancelled when the pool is stopped.
The Session pool is being destroyed. However, the PoolManager instance
registered in the alarms never gets destroyed, and the alarms are
repeatedly created and cancelled for every reap cycle. As a result, the
PoolManager objects and their associated reaper alarm objects stay on the
heap forever, potentially leading to OOM conditions.
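The general shape of this kind of leak can be reproduced with standard JDK classes: a periodic "reaper" task registered with a scheduler keeps its owning pool reachable until the task is cancelled and removed. The sketch below is a generic illustration under that assumption, not the WebSphere PoolManager code. The names ReaperLeakSketch, Pool and pendingAlarmsAfter are invented for this example.

```java
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.ScheduledThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

// Generic illustration: scheduled alarms keep their owner objects alive.
class ReaperLeakSketch {

    // Stand-in for a pool that registers a periodic reaper alarm.
    static final class Pool {
        private final byte[] state = new byte[1024];
        private final ScheduledFuture<?> reaper;

        Pool(ScheduledThreadPoolExecutor alarms) {
            // The scheduler's queue now references this Pool through the task.
            this.reaper = alarms.scheduleAtFixedRate(this::reap, 1, 1, TimeUnit.HOURS);
        }

        private void reap() {
            state[0]++;   // placeholder for real reaping work
        }

        // Correct shutdown: cancel the alarm so the scheduler can drop it.
        void stop() {
            reaper.cancel(false);
        }
    }

    // Create and discard pools; report how many alarms are still queued.
    static int pendingAlarmsAfter(int pools, boolean cancelOnStop) {
        ScheduledThreadPoolExecutor alarms = new ScheduledThreadPoolExecutor(1);
        alarms.setRemoveOnCancelPolicy(true);   // cancelled tasks leave the queue at once
        try {
            for (int i = 0; i < pools; i++) {
                Pool p = new Pool(alarms);
                if (cancelOnStop) {
                    p.stop();
                }
                // p goes out of scope here, but is still reachable if its alarm remains.
            }
            return alarms.getQueue().size();
        } finally {
            alarms.shutdownNow();
        }
    }

    public static void main(String[] args) {
        System.out.println("alarms left without cancel: " + pendingAlarmsAfter(1000, false));
        System.out.println("alarms left with cancel:    " + pendingAlarmsAfter(1000, true));
    }
}
```

In a heap dump, this pattern typically shows up as a growing number of owner objects whose only path to a GC root runs through the scheduler or alarm manager.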
SOLUTION
Install APAR PI14746. The APAR is in OPEN status. IBM Level 3 can provide the
ifix, which is targeted for inclusion in WAS fix pack 8.5.5.3.
26. Memory Leak in the J2C PoolManager
The PMAT data below shows GC analysis from the server where the
memory leak was observed. The system is almost entirely out of free space.
27. Application #1 - WASz V8.5.5.1 64 bit mode
Max Heap size set at 832M
Response time & CPU usage
28. Application #1 - WASz V8.5.5.1 64 bit mode
Max Heap size increased from 832M to 1280M
Response time & CPU usage
29. Application #1 - WASz V8.5.5.1 64 bit mode
Max Heap size increased from 832M to 1280M
[Charts: heap storage usage per servant; CPU usage back to normal, NO Java OOM]
30. Application #1 - WASz V8.5.5.1 64 bit mode
Max Heap size increased from 832M to 1280M
Again Java OOM, high GC and high CPU usage after a few days
31. Application #1 - WASz V8.5.5.1 64 bit mode
Max Heap size increased from 1280M to 1792M
Steady increase in Heap allocation prior to APAR PI14746 installation.
32. Application #2 - WASz V8.5.5.1
31 Bit vs 64 Bit Mode
Heap storage usage per servant zAAP CPU increase due to Java OOM
WASz V8.5.5.1 CPU usage comparison, Heap size Max=512M, Min=256M
31 bit mode 64 bit mode
33. Application #2 - WASz V7.0.27 vs
WASz V8.5.5.1 CPU Usage
Activation Specification 31 bit mode Application CPU usage in WASz
V7.0.27 vs WASz V8.5.5.1 - slight increase in CPU usage
[Charts: WASz V7.0.27 CPU usage by report class (3/11/14); WASz V8.5.5.1 CPU usage by report class (3/25/14)]
34. Application #3 - WASz V7.0.27 vs
WASz V8.5.5.1 CPU Usage
MDB Application CPU usage in WASz V7.0.27 vs WASz V8.5.5.1
15-20% CPU usage reduction under WASz V8.5.5.1
[Charts: WASz V7.0.27 CPU usage by report class (3/11/14); WASz V8.5.5.1 CPU usage by report class (3/25/14)]
35. Application #4 - WASz V8.5.5.1 64 Bit Mode
APPL4 - Java application that is being converted from COBOL running in CICS.
Compressed references are enabled in 64 bit mode.
The application uses the Spring Framework and does not currently use JPA.
The application has an ASYNC batch process and online HTTP work, using 2 different WASz
clusters. Most of the work is done by the lower-priority ASYNC cluster.
After the WASz V8.5.5.1 upgrade we saw around 28% CPU reduction, and we turned off GP
crossover, using zAAP capacity instead. We were spilling around 1 GP engine under load
with WASz V7.0.27.
We are currently using Health Center to see if there are tuning opportunities at the Java level.
36. Application #4 - WASz V8.5.5.1 64 bit mode
Load & Performance Testing
~30% overall CPU reduction comparing WASz V7.0.27 vs tuned WASz V8.5.5.1 with
Pageable 1M Large Pages, using Flash Express.
WASz V8.5.5.1 upgrade – 11.6% improvement
Exploiting Pageable 1M Large Pages with Flash Express – 4.4 % improvement
Reduced Max Heap size from 2100M to 2047M – 3% improvement, due to more efficient
compressed reference with Heap size < 2G
Increased Min Heap size to 1532M and Java Nursery size to 1023M –
10.5% improvement, less GC overhead – global GC scans were taking 1-2 seconds, now
only around 300ms.
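One way to watch the GC cost that these heap changes target is through the standard java.lang.management beans. The sketch below reports the configured maximum heap and a rough GC overhead figure (accumulated collection time divided by JVM uptime). The class name GcOverheadSketch is illustrative. The figure is an approximation, because collectors may report time differently and some report it as undefined.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

// Rough GC overhead estimate from standard JMX beans (approximation only).
class GcOverheadSketch {

    // Sum the accumulated collection time over all collectors, in milliseconds.
    static long totalGcMillis() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long millis = gc.getCollectionTime();   // -1 means "undefined"
            if (millis > 0) {
                total += millis;
            }
        }
        return total;
    }

    // Collection time as a percentage of JVM uptime.
    static double gcOverheadPercent() {
        long uptime = ManagementFactory.getRuntimeMXBean().getUptime();
        return uptime <= 0 ? 0.0 : 100.0 * totalGcMillis() / uptime;
    }

    public static void main(String[] args) {
        long maxHeapMb = Runtime.getRuntime().maxMemory() >> 20;
        System.out.println("max heap (MB):   " + maxHeapMb);
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            System.out.println(gc.getName() + ": count=" + gc.getCollectionCount()
                    + ", time(ms)=" + gc.getCollectionTime());
        }
        System.out.printf("GC overhead (%%): %.2f%n", gcOverheadPercent());
    }
}
```

Sampling these numbers before and after a heap or nursery change gives a quick sanity check alongside verbose GC logs and tools such as PMAT or Health Center.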
37. Application #4 - WASz V8.5.5.1 64 bit mode
Prod Configuration - decided to run with the larger Max Heap size to accommodate
expected growth in workload. We are expecting 60% increase in volume this year.
Observed ~28% CPU reduction.
Prod CPU usage in seconds before and after WASz upgrade and doubling ASYNC JVMs
& increasing volume, converting more code from COBOL to Java.
38. PMAT and Application #4 Testing
39. WASz V8.5.x Packaging
Liberty Profile is a feature of the WebSphere Application Server install and an independently
installable Installation Manager offering for all WebSphere Application Server editions.
WebSphere Extreme Scale for z/OS is now included with WebSphere Application Server
for z/OS.
SMP/E FMIDs and files included for:
40. WebSphere on z/OS V8.5.5 and Java Installation
Beginning with WASz V8.5.5, WebSphere Application Server provides separate
Installation Manager offerings for the Liberty profile, and for Java 7 for use with
Liberty.
For more details see:
http://publib.boulder.ibm.com/infocenter/ieduasst/v1r1m0/topic/com.ibm.iea.was_v8/was/8.5.5.0/content/WAS855_Install-zOS.pdf?dmuid=20130820122411195350
41. WASz V8.5.5 and Java Installation
In the WASz Admin Console, check which version of Java you are running and in what
mode. 64 bit mode is now the default.
Application Servers > server name > Java SDKs
See the instructions in the Information Center to install Java 7:
http://pic.dhe.ibm.com/infocenter/wasinfo/v8r5/index.jsp?topic=/com.ibm.websphere.installation.zseries.doc/ae/tins_installation_zos_installing_jdk7.html
42. Why Flash Express?
Flash Express is a PCIe IO adapter with NAND Flash SSDs that can help you improve
availability and performance especially during periods of paging spikes.
Paging with small (4K) pages is less efficient than paging with
fewer, larger 1 MB pages.
The translation lookaside buffer (TLB) caches virtual-to-real address
translations. Each entry for a 1 MB page covers 256 times as much memory as an entry
for a 4K page, so the same number of cached entries covers far more of the
working set.
As a result of improved cache hits, exploiters of Pageable Large Pages and Flash
Express experience performance improvements both in elapsed time and CPU.
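The page-count arithmetic behind this argument is easy to check. The sketch below computes how many pages are needed to map a heap at each page size, using the 2047M maximum heap from Application #4 in this deck as the input. The class and method names are illustrative.

```java
// Worked arithmetic: pages needed to back a heap at 4K versus 1M page size.
class PageCountSketch {

    // Number of pages needed to cover the given number of bytes (rounded up).
    static long pagesNeeded(long bytes, long pageSize) {
        return (bytes + pageSize - 1) / pageSize;
    }

    public static void main(String[] args) {
        long heapBytes = 2047L << 20;   // 2047M max heap, as used for Application #4
        long small = pagesNeeded(heapBytes, 4L << 10);
        long large = pagesNeeded(heapBytes, 1L << 20);
        System.out.println("4K pages needed: " + small);
        System.out.println("1M pages needed: " + large);
        System.out.println("ratio:           " + (small / large));
    }
}
```

With 1M pages the same heap is described by 256 times fewer page entries, which is why address-translation caching becomes more effective.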
43. Large Page Support and Flash Express
LFAREA and PAGESCM need to be sized for WASz V8.5.x exploitation of
Large Page support
When the JVM is allocating large pages, if a particular Large Page size cannot
be allocated, the following sizes are attempted, in order, where applicable:
2G nonpageable, 1M nonpageable, 1M pageable, 4K pageable
For example, if 1M nonpageable Large Pages are requested but cannot be allocated, pageable 1M large
pages are attempted, and then pageable 4K pages.
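The fallback order described above can be modelled in a few lines. This is only a sketch of the documented ordering, not the JVM's allocation code. The enum PageKind and the method choose are hypothetical names, and availability is passed in as a set purely for illustration.

```java
import java.util.EnumSet;
import java.util.Set;

// Model of the fallback order: 2G nonpageable, 1M nonpageable, 1M pageable, 4K pageable.
class LargePageFallbackSketch {

    // Declared in fallback order, so ordinal() gives the position in the chain.
    enum PageKind { NONPAGEABLE_2G, NONPAGEABLE_1M, PAGEABLE_1M, PAGEABLE_4K }

    // Start at the requested kind and walk down the chain until one is available.
    static PageKind choose(PageKind requested, Set<PageKind> available) {
        PageKind[] order = PageKind.values();
        for (int i = requested.ordinal(); i < order.length; i++) {
            if (available.contains(order[i])) {
                return order[i];
            }
        }
        throw new IllegalStateException("no page size available");
    }

    public static void main(String[] args) {
        Set<PageKind> available = EnumSet.of(PageKind.PAGEABLE_1M, PageKind.PAGEABLE_4K);
        System.out.println("requested 1M nonpageable, got: "
                + choose(PageKind.NONPAGEABLE_1M, available));
    }
}
```

The practical consequence is that a request can silently end up on a smaller page size, so the page size actually obtained should be verified rather than assumed.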
Flash Express is designed to offer exceptional performance for paging
spikes by reducing paging latency. Flash can be allocated before the LPARs are
activated and detected by z/OS during IPL, or configured online dynamically after IPL.
Use TSO RMF to see Storage Memory Objects usage
44. Exploitation of Large Page Support
An example of setup needed to enable 1 Meg Pageable Large Page support –
-Xlp:codecache:pagesize=1m,pageable -Xlp:objectheap:pagesize=1m,pageable
-Xlp:codecache - Requests the JVM to allocate the JIT code cache by using pageable 1M large page sizes.
-Xlp:codecache:pagesize=1m,pageable - default for Java V7, needs to be set for Java V6.0.1.
-Xlp:objectheap - Requests the JVM to allocate the Java object heap by using pageable 1M large page sizes
-Xlp:objectheap:pagesize=1m,pageable - default for Java V7, needs to be set for Java V6.0.1
Note that -Xlp will override the default -Xlp:objectheap:pagesize=1m,pageable.
If no pageable 1M frames are available, the JVM is not told. RSM operates outside of Java
and satisfies the request dynamically from 1M fixed frames or, as a last resort,
from 4K frames.
To check what page size you are using, look at the WAS servant log.
You can also set -verbose:sizes in IBM_JAVA_OPTIONS, which displays the default Java settings
used for that JVM.
<attribute name="pageSize" value="0x1000" /> getting 4K
<attribute name="requestedPageSize" value=" 0x1000 " /> requested 4K page size
<attribute name="pageSize" value="0x100000" /> getting 1M page size
<attribute name="requestedPageSize" value="0x100000" /> requested 1M page size
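The hexadecimal values in these attributes are byte counts. A small sketch for turning such a log line into a readable page size follows. The class name, the regular expression and the helper names are assumptions written for this example, and the pattern only targets lines shaped like the four shown above.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Decode pageSize / requestedPageSize attribute lines such as those shown above.
class PageSizeAttrSketch {

    // Tolerates blanks inside the quoted value, as in value=" 0x1000 ".
    private static final Pattern ATTR = Pattern.compile(
            "name=\"(pageSize|requestedPageSize)\"\\s+value=\"\\s*(0x[0-9a-fA-F]+)\\s*\"");

    // Returns the page size in bytes, or -1 if the line is not a page-size attribute.
    static long parseBytes(String line) {
        Matcher m = ATTR.matcher(line);
        if (!m.find()) {
            return -1;
        }
        return Long.decode(m.group(2));
    }

    // Render a byte count as K, M or G when it divides evenly.
    static String human(long bytes) {
        if (bytes >= (1L << 30) && bytes % (1L << 30) == 0) return (bytes >> 30) + "G";
        if (bytes >= (1L << 20) && bytes % (1L << 20) == 0) return (bytes >> 20) + "M";
        if (bytes >= (1L << 10) && bytes % (1L << 10) == 0) return (bytes >> 10) + "K";
        return bytes + "B";
    }

    public static void main(String[] args) {
        String[] lines = {
            "<attribute name=\"pageSize\" value=\"0x1000\" />",
            "<attribute name=\"requestedPageSize\" value=\" 0x1000 \" />",
            "<attribute name=\"pageSize\" value=\"0x100000\" />",
            "<attribute name=\"requestedPageSize\" value=\"0x100000\" />"
        };
        for (String line : lines) {
            System.out.println(human(parseBytes(line)) + "  <-  " + line);
        }
    }
}
```

Comparing the requestedPageSize and pageSize values for the same JVM shows whether a large-page request was honored or fell back to a smaller size.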
45. Issue to Watch Out For - High CPU with
Intelligent Management Enabled
ISSUE
The Virtual Enterprise product was integrated in WAS V8.5.x and is now referred to as Intelligent
Management. It is enabled by default. This might increase idle server CPU time considerably.
Major increase in the number of TCP/IP SMF 119 records. We have seen the creation of millions
of SMF type 119 TCP/IP sub type 1 and 2 SMF records (open and close connections), instead of
1,000s prior to WASz V8.5.5 upgrade.
Related APAR - http://www-01.ibm.com/support/docview.wss?uid=swg1PM79754
Review IBM WASz doc "Idle WebSphere Tuning Considerations" -
http://www-01.ibm.com/support/docview.wss?uid=tss1wp101894&aid=1
SOLUTION
In WAS V8.5.5 a new custom property (LargeTopologyOptimization) was added to disable Intelligent
Management for those who do not use the functionality.
To configure the Cell custom property via the administrative console go to System Administration,
Cell, Configuration, Additional Properties, Custom Properties, and create a new entry with Name
LargeTopologyOptimization, and Value false.
46. Issue to Watch Out For - Higher CPU Usage by
Spring Applications Using @Async Annotation
ISSUE
A Spring Application calling ApplicationContext.getBean() on a prototype
bean using the @Async annotation caused an increase in Spring method
calls. This increase in Spring method calls resulted in higher CPU usage
when moving to WebSphere V8.0 and above.
CAUSE
WebSphere V8 and above contain interface classes for the EJB 3.1 specification level which includes the
support for using the @Asynchronous annotation. Prior to WebSphere V8.0 Spring was only searching for
the @Async annotation. At WebSphere V8.0 and above Spring was searching for both @Async and
@Asynchronous annotations. The additional searching for the added support for @Asynchronous was the
cause of the higher CPU.
SOLUTION
Change the logic to cache the bean instance returned from the
ApplicationContext.getBean() call.
Remove annotation-driven @Async searches by Spring. This was
accomplished by removing the configuration option "task:annotation-driven".
More info at http://www-01.ibm.com/support/docview.wss?uid=swg21648523&acss=danl_335_email
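The first part of the solution, caching the looked-up instance, can be sketched without Spring on the classpath. In the example below a java.util.function.Supplier stands in for the ApplicationContext.getBean() call. That stand-in, the class CachedBeanSketch and the wrapper CachedLookup are all hypothetical. Reusing one instance is only appropriate when the bean is safe to share, which is an assumption about the application and not something the sketch verifies.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

// Cache the result of an expensive lookup so it runs once instead of per call.
class CachedBeanSketch {

    // Thread-safe lazy cache around a lookup (the lookup stands in for getBean()).
    static final class CachedLookup<T> implements Supplier<T> {
        private final Supplier<T> lookup;
        private volatile T cached;

        CachedLookup(Supplier<T> lookup) {
            this.lookup = lookup;
        }

        @Override
        public T get() {
            T local = cached;
            if (local == null) {
                synchronized (this) {
                    local = cached;
                    if (local == null) {
                        local = lookup.get();   // expensive call happens once
                        cached = local;
                    }
                }
            }
            return local;
        }
    }

    // Count how many real lookups happen for a given number of calls.
    static int lookupsFor(int calls) {
        AtomicInteger lookups = new AtomicInteger();
        CachedLookup<Object> bean = new CachedLookup<>(() -> {
            lookups.incrementAndGet();
            return new Object();
        });
        for (int i = 0; i < calls; i++) {
            bean.get();
        }
        return lookups.get();
    }

    public static void main(String[] args) {
        System.out.println("real lookups for 1000 calls: " + lookupsFor(1000));
    }
}
```

The design trade-off is that a prototype-scoped bean effectively becomes a shared instance once cached, so any per-call state has to move out of the bean.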
47. Issue to Watch Out For - Memory Leak
Using ThreadLocals
ThreadLocal variables don't work well with thread pools in a J2EE environment
We observed memory leaks in Native Storage caused by using ThreadLocal variables.
ThreadLocals are only garbage collected if their owning thread is destroyed. Thread pooling in
WebSphere Application Server keeps threads alive indefinitely, and as such, ThreadLocal variables
remain alive even after the application is stopped. This problem is compounded by ThreadLocal
variables that consume native storage, such as classloaders.
Best coding practice recommendations
To avoid storage leaking (native or heap):
Use ThreadPool threads, which are managed by WebSphere on z/OS
Avoid the use of ThreadLocal variables
Clear all ThreadLocal variables before returning control from an EJB or Servlet invocation
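The last recommendation, clearing ThreadLocal variables before returning control, is usually done with try/finally. The sketch below shows the effect on a pooled thread, using a single-thread executor from the JDK as a stand-in for a container-managed thread pool. The class, field and method names are invented for this example.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// ThreadLocal values survive on pooled threads unless they are removed explicitly.
class ThreadLocalCleanupSketch {

    private static final ThreadLocal<byte[]> SCRATCH = new ThreadLocal<>();

    // Simulated request: sets a ThreadLocal and optionally cleans it up.
    static void handleRequest(boolean cleanUp) {
        SCRATCH.set(new byte[1024]);
        try {
            // ... request processing would use SCRATCH.get() here ...
        } finally {
            if (cleanUp) {
                SCRATCH.remove();   // release the value before the thread returns to the pool
            }
        }
    }

    // Run one request, then check from the same pooled thread whether the value is still there.
    static boolean leaksAcrossRequests(boolean cleanUp) {
        ExecutorService pool = Executors.newSingleThreadExecutor();
        try {
            pool.submit(() -> handleRequest(cleanUp)).get();
            return pool.submit(() -> SCRATCH.get() != null).get();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdownNow();
        }
    }

    public static void main(String[] args) {
        System.out.println("value survives without remove(): " + leaksAcrossRequests(false));
        System.out.println("value survives with remove():    " + leaksAcrossRequests(true));
    }
}
```

Because pooled threads are long-lived, a value left behind by one request stays reachable for as long as that thread exists, which matches the behavior described on this slide.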
An example of the error to watch out for -
FFDC Exception:java.lang.Exception SourceId:com.ibm.ejs.j2c.PoolManager$2
ProbeId:50 Reporter:com.ibm.ejs.j2c.PoolManager$2@ae185c45
java.lang.Exception: WSThreadLocal: instance count = 200:
Potential memory leak; verify usage.
48. Useful Links
Recommended fixes for WebSphere Application Server
http://www-01.ibm.com/support/docview.wss?uid=swg27004980
Which version of WebSphere MQ is shipped with
WebSphere Application Server?
http://www-01.ibm.com/support/docview.wss?uid=swg21248089
Knowledge Collection: Migrating to WebSphere Application
Server V8.5
http://www-01.ibm.com/support/docview.wss?uid=swg27008727
Introduction to Flash Express Improving Availability with
Flash Express
http://public.dhe.ibm.com/common/ssi/ecm/en/zsl03189usen/ZSL03189USEN.PDF
The Flash Express Feature on IBM zEnterprise EC12 and
z/OS exploitation of flash storage
http://www-03.ibm.com/systems/resources/flash.pdf
49. Join your local WUG - http://www.websphereusergroup.org/
Join Chicago North-West Integration and Cloud Computing
WebSphere User Group, that Cindy Schmoeller (CSC) and Elena Nanos
(HCSC) are leading - http://www.websphereusergroup.org/chicagonw
Join us for annual on-site meeting on June 12th, 2014 in Chicago area
Tentative Agenda
WebSphere Application Server V8.5.x update - Paul Lucas (IBM)
WebSphere Liberty Profile and demo - Bill Killer (IBM)
WebSphere MQ Next.x update - Mitch Johnson (IBM)
Demo – Watch how CICS applications can be integrated with Portal to create an
intuitive, "point and click," mobile front end - Chris Ganim (IBM)
Advantages of a Private Cloud on zEnterprise - Michael J. Casile (IBM)
Demos - CSL Wave, DB2 Analytics Adapter performance demonstration and
Mobile - Michael J. Casile (IBM)
Modernized Digital Experiences on System z - Chris Ganim (IBM)
Global WebSphere Community
51. We Value Your Feedback
Don’t forget to submit your Impact session and speaker
feedback! Your feedback is very important to us – we use it to
continually improve the conference.
Use the Conference Mobile App or the online Agenda Builder to
quickly submit your survey
• Navigate to “Surveys” to see a view of surveys for sessions
you’ve attended