© 2010 IBM Corporation
AIX Performance
Updates & Issues 2011
Session: PE20
Steve Nasypany nasypany@us.ibm.com
IBM Advanced Technical Skills
Power Systems Technical University
Agenda
AIX 6.1 & POWER7
– nmon & topas updates
– Perfstat Library updates
– Dynamic Power Save/Active Energy Management
– 4-way Simultaneous Multi-threading
– Virtual Processor Folding
– Utilization Issues
– Enhanced Affinity
– Partition Placement
– 1TB Segment Aliasing
– JFS2 i-node/metadata
– iostat block IO
AIX 7.1
– Performance tools 1024-way scaling
Java/WAS Performance, Best Practices Links and Java Performance Advisor
FCoE adapter performance
New Free Memory Tool
nmon On Demand Recording (ODR)
New function ideal for benchmarks, proof-of-concepts and problem analysis
Allows “high-resolution” recordings to be made while in monitoring mode
– Records samples at the interactive monitoring rate
– AIX 5.3 TL12, AIX 6.1 TL05 and AIX 7.1
Usage
– Start nmon, use “[“ and “]” brackets to start and end a recording
• Records standard background recording metrics, not just what is on screen.
• You can adjust the recorded sampling interval with -s [seconds] on startup;
the interactive options “-” and “+” (<shift> +) do NOT change the ODR interval
– Generates a standard nmon recording of format:
<host>_<YYYYMMDD>_<HHMMSS>.nmon
– Tested with nmon Analyser v33C, and works fine
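The recording filename pattern above can be reproduced with standard shell utilities. A minimal sketch (the hostname handling is illustrative; nmon itself generates the name):

```shell
#!/bin/sh
# Build a filename matching nmon's <host>_<YYYYMMDD>_<HHMMSS>.nmon pattern.
host=$(hostname | cut -d. -f1)   # short hostname, domain stripped
stamp=$(date +%Y%m%d_%H%M%S)     # YYYYMMDD_HHMMSS timestamp fields
fname="${host}_${stamp}.nmon"
echo "$fname"
```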
nmon ODR
nmon chaos – to be or not to be logical
Unknown to many customers and the field, the per-CPU metrics in the main nmon
panel and that are recorded have always reported logical utilization for active CPUs,
whereas AIX tools (topas, sar, mpstat, vmstat and lparstat) were adjusted in AIX 5.3 to
report physical utilization
– Hardware register represents utilization in terms of absolute core consumption by
hardware context thread (SMT) – this is called the Processor Utilization Resource
Register (PURR)
– Values are relative to physical consumption for that CPU. i.e., 99% busy means
nothing without knowing what that CPU’s physical consumption is – 99% of 0.01
physc is perfectly normal behavior in shared lpars and not a reason for alarm
– Details of the hardware register are provided in last year’s white paper by Saravanan
Devendran
http://www.ibm.com/developerworks/wikis/display/WikiPtype/Understanding+Processor+Utilization+on+POWER+Systems+-+AIX
nmon does provide (and has for many years) global utilization values (EC & VP)
adjusted for PURR, and the ‘#’ option that provides ‘physical’ calculations matching
other AIX tools. However, nmon does not provide physical consumption values for
each CPU (screens or recording) – it simply notes “PURR Stats” are active
– To best understand this, run nmon and topas (or topas –L) and note that wherever
CPU values are reported, topas reports ‘physc’ (global utilization) or ‘pc’ (per CPU
utilization)
nmon chaos – to be or not to be logical
A number of customers complained that the nmon main screen did not match
other AIX tools. Development decided to switch nmon to use the same
calculations as the other AIX tools
– These APARs shipped in 2H 2010 - many customers noticed this change and
complained
– We discovered some ISV products and customers actually use per-logical CPU
recording information for capacity planning. This is a bit of a problem:
• Anyone doing capacity planning in shared environments should be using the
global utilization, physical and entitlement metrics reported by all tools,
including nmon recording
• But many customers still do not understand that per-CPU metrics are relative
to physical values, they are still living in an AIX 5.2 world
– Due to the new complaints, development decided to revert nmon back to the
way it was. Those APARs shipped in Q1/2011
Q4/2011 Updates provide both styles of per-logical CPU consumption in
nmon recordings. Physical PURR numbers under PCPU* and PCPU_ALL
(global) tags.
For other updates, see Nigel Griffiths’ NMON presentation
Dynamic Power Save
POWER6 & POWER7 have the ability to automatically scale down energy usage based on
processor utilization and thermal levels
Static Power Saver Mode
– Lowers the processor frequency and voltage on a system by a fixed amount
– Reduces the power consumption of the system while still delivering predictable
performance
– Percentage of power saved is predetermined and is not user configurable
– Memory also allowed to enter low power state when no access occurs (with supported
firmware and DIMMs)
– Workload performance is not impacted, though CPU utilization may increase due to
reduced frequency
Dynamic Power Saver Mode
– Varies processor frequency and voltage based on the utilization of the system's processors
– System utilization is measured based on real-time utilization data
– Supports two modes:
• Favoring System Performance
• Favoring System Power Savings
Dynamic Power Save
Whitepaper: IBM EnergyScale for POWER7 Processor-Based Systems
http://www-03.ibm.com/systems/power/hardware/whitepapers/energyscale7.html
Capability requires a new type of reporting where processor utilization is provided relative to
both the current running frequency and the rated frequency
Reporting is based on Processor Utilization Resource Registers (PURR), the accounting
framework. See the Processor Utilization in AIX whitepaper for more details
– Actual based on PURR
– Normalized based on Scaled PURR (SPURR)
– Physical processors reported in each mode
– Adding user, system, idle and wait will equal the total entitlement of the partition from an actual and
normalized view
– Shared, uncapped partitions can exceed entitlement
lparstat (utilization and status): AIX 5.3 TL11 & AIX 6.1 TL04
Statistics now available in Q4/2011 updates in nmon recordings, under SCPU* and
SCPU_ALL tags
Dynamic Power Save - lparstat -E
Idle value in this report has been modified to report the actual entitlement available (available
capacity) – so be aware of this and do not directly compare to the legacy lparstat reports (if
you go above entitlement, idle will equal 0)
Available Capacity = Idle = Entitlement - (user + sys + wait)
When the partition is running at a reduced frequency, the actual available capacity (idle)
shown by the two counters differs. The current idle capacity is shown by PURR. The
idle value shown by SPURR is what the idle capacity would be (approximately) if the CPU ran
at the rated frequency.
#lparstat -E 1
System configuration: type=Dedicated mode=Capped smt=Off lcpu=64 mem=262144MB
Physical Processor Utilisation:
--------Actual-------- ------Normalised------
user sys wait idle freq user sys wait idle
---- ---- ---- ---- --------- ---- ---- ---- ----
47.61 6.610 0.004 9.780 3.9GHz[102%] 48.35 6.714 0.004 8.933
46.24 6.743 0.000 11.02 3.9GHz[102%] 46.96 6.849 0.000 10.19
47.84 6.651 0.000 9.505 3.9GHz[102%] 48.59 6.756 0.000 8.653
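The Available Capacity relationship can be checked against the first Actual row of the sample above (Dedicated/Capped, SMT off, 64 logical CPUs, i.e. an entitlement of 64 cores). A quick sketch:

```shell
# Idle = Entitlement - (user + sys + wait), using the first Actual row above.
awk 'BEGIN {
  ent  = 64.0                           # dedicated, capped, 64 cores
  user = 47.61; sys = 6.610; wait = 0.004
  idle = ent - (user + sys + wait)
  printf "available capacity (idle) = %.2f cores\n", idle   # ~ the 9.780 reported
}'
```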
VIOS Monitoring using topas (AIX 6.1 TL04)
Run topas -C and press 'v' to show the VIOS Monitoring Panel
All systems must be at AIX 5.3 TL09, VIOS 2.1 or higher to be monitored
VIOS Monitoring using topas
From topas VIOS panel, move the cursor to a particular VIOS server and press 'd' to get the detailed monitoring
for that server
topas Remote CEC & Cluster Views
AIX 6.1 TL-04
CEC function has been expanded to allow viewing of remote CEC
– topas -C option can attach to remote systems
– All partitions sharing that hardware ID will then be monitored
Can now pre-define sets of partitions to make up Cluster
– topas –G
– Configuration details at end
– SMIT panels available
Firewall support in AIX 6.1 TL06 and AIX 7.1
topas CEC (legacy –C)
topas CEC (new remote function)
Run “topas -C -o xmtopas=ses10.in.ibm.com”
topas Cluster Utilization Panel (topas –G)
A Cluster can be defined as a group of related partitions or nodes. The Cluster utilization view can show the
utilization of either an HACMP Cluster or a user-defined cluster
topas Cluster subcommands
Press ‘g’ - Toggles global section between brief/detailed listing
Press ‘d’ and ‘s’ to toggle between dedicated/shared only partition listings
topas Cluster (remote function)
Run “topas –G –o xmtopas=ses12.in.ibm.com”
Perfstat Library Overview
Perfstat is a documented system library for collection of AIX performance metrics
– Supported since AIX V5
– 32-bit, 64-bit threadsafe API
– Supports all performance-related resources
– Structures defined in /usr/include/libperfstat.h header file
Enhanced with additional metrics and new APIs in AIX 6.1 TL07 & AIX 7.1 TL01
– All CPU metrics – global CPU, physical consumption, physical busy, frequency, etc
– Partition configuration metrics, supplement lpar_get_info()
– Support for process level information outside of legacy libc “procsinfo” API
– Detailed disk statistics (read/write and queue service times)
– New API calls for maintaining state data for interval computations (CPU, process, partition)
– Host Fabric Interface (POWER 775 IH blade)
Examples of API usage in AIX pubs, and installed at /usr/samples/libperfstat
Older examples at http://www.ibm.com/developerworks/wikis/display/WikiPtype/ryo
Libperfstat enhancements
Data structures added:
– perfstat_cpu_util_t Global CPU utilization
– perfstat_rawdata_t Raw input data for the new utilization APIs
– perfstat_process_t Process info
– perfstat_partition_config_t Partition configuration (lparstat –i, etc)
Data structure updated:
– perfstat_disktotal_t Detailed disk statistics
APIs added:
– perfstat_partition_config()
– perfstat_cpu_util()
– perfstat_process()
– perfstat_process_util()
Data Structure Example - perfstat_cpu_util_t
POWER7 Simultaneous Multi-threading Review
POWER7 processors can run in ST, SMT2, SMT4 modes
– Like POWER6, the SMT threads will dynamically adjust based on workload
– Applications that are single process and/or single threaded may benefit from
running in ST mode, particularly if they want to completely consume a single
physical core
– Multi-process applications may run better in ST mode if there are fewer
application processes than cores
– Multi-threaded and/or multi-process applications (with more threads or processes
than cores) will benefit more from running in SMT2 or SMT4 mode
[Diagram: SMT threads 0–3 per core, labeled Primary, Secondary and Tertiary]
POWER7 threads have different priorities, with
Primary, Secondary and Tertiary instances
Work will not be assigned to tertiary threads
until enough workload exists to drive the primary
and secondary threads (the same threads as in POWER5 &
POWER6) – typically ~80% busy
POWER7 Processor Utilization
FX0
FX1
FP0
FP1
LS0
LS1
BRX
CRL
POWER7 4 Way SMT
Calibrated CPU Utilization
– The Processor Utilization Resource Registers (PURR)
hardware counters that are the basis for computing CPU
utilization values in all tools, have been modified in
POWER7
– Internal hardware counters are calibrated to a variety of
commercial workloads to more accurately report real
world utilization
– When SMT2 or SMT4 is enabled, a single hardware
thread context will no longer be reported as consuming
100% of the core
– The goal is to provide a linear relationship between
utilization values and throughput
– Use smtctl –t [1 | 2 | 4] to change SMT mode
Whitepaper on SMT: Simultaneous Multi-Threading on POWER7 Processors by Mark Funk
http://www.ibm.com/systems/resources/pwrsysperf_SMT4OnP7.pdf
Whitepaper on utilization: Processor Utilization in AIX by Saravanan Devendran
http://www.ibm.com/developerworks/wikis/display/WikiPtype/Understanding+Processor+Utilization+on+POWER+Systems+-+AIX
POWER7 Processor Utilization
Simulating a single threaded process on 1 core, utilization values change
– 1 VP system, using simple shell/perl script cpu hogs
– Some variability between tools and reports depending on implementation
– Nigel’s nstress package has a CPU stress tool, ncpu, but may run more than one thread
by default (see command usage)
Calibrated PURR applies whether running in POWER7 or POWER6 modes on POWER7
– AIX 5.3 and/or POWER6 mode on POWER7 can only support SMT2
Real world production workloads will involve dozens to thousands of threads, so many users
may not notice any difference
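A minimal shell CPU hog in the spirit of the scripts mentioned above (bounded here so it terminates; replace the limit with an infinite loop to keep one hardware thread busy):

```shell
#!/bin/sh
# Single-threaded busy loop - consumes one hardware thread while it runs.
i=0
while [ $i -lt 100000 ]; do
  i=$((i + 1))
done
echo "done after $i iterations"
```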
[Diagram: utilization reported when one hardware thread (Htc0) per core is busy]
– POWER6 SMT2: Htc0 busy, Htc1 idle -> 100% busy
– POWER7 SMT2: Htc0 busy, Htc1 idle -> ~70% busy
– POWER7 SMT4: Htc0 busy, Htc1–Htc3 idle -> ~63% busy
– All hardware threads busy -> 100% busy in every mode
POWER7 SMT4 Enhancement (AIX 6.1 TL06)
The original SMT4 algorithm was too aggressive when moving workloads from
secondary & tertiary threads to the primary thread when utilization had dropped
– Mechanism is known as idle-shedding
– This impacted OLTP workloads, so it was disabled in favor of a load-based
algorithm
Field experience on large shared-pool systems uncovered scenarios where
workload changes did not optimally result in switching from SMT4 to ST mode
– Secondary/tertiary threads would hold work longer than desired when
the primary became idle
– Idle shedding algorithm is now enabled, enhanced to support OLTP workloads
and prevent threads from sticking to secondary/tertiary threads
These are optimizations applied from experiences with “real world” customer
workloads on POWER7
APAR IZ97088
POWER7 SMT - 720 Firmware Bug
There is a Hypervisor dispatch bug where only a single thread, rather than all
SMT threads, is evaluated to determine the workload. SMT threads with
work are ignored.
Issue appears in 720_064 and later levels and is fixed in 720_101. It does not
exist in 710 or 730 firmware levels.
http://www-304.ibm.com/webapp/set2/sas/f/power5cm/power7.html
Can impact single or multiple lpars while others are not affected
There is no clear diagnosis from the OS level, but the primary complaint from
customers is that they see distinct differences in application latency issues
between lpars and have no identifiable resource constraints
– Some data appears to show that all VPs are being dispatched to the primary
SMT thread(s), so a normally multi-process or multi-threaded workload is very
busy on the primary thread(s) while the secondary/tertiary threads are idle. sar
–P ALL or mpstat output for all logical CPUs might show little or no distribution
across SMT threads.
POWER7 SMT – Application/DB settings
Where customers have seen performance differences between other systems and
POWER7 with SMT4, we have found storage differences or software settings to be
the primary factor for any delta. Examples:
– One DB environment used Asynchronous IO, whereas the POWER7 system did
not. When the systems were made comparable, the POWER7 system
performed as expected.
– Oracle’s CPU_COUNT (init.ora) and PARALLEL_THREADS_PER_CPU
parameters determine the factor of parallelism
• Number of Virtual Processors * SMT Setting *
PARALLEL_THREADS_PER_CPU
• Our ATS Software Specialists have additional material on these settings for
Oracle
These parameters can have a large impact on performance and must be taken into
account before believing something is wrong with SMT
Virtual Processor Folding - Review
Virtual Processor Folding is the technology which consolidates threads to the
minimum number of Virtual Processors (VP) required to support a workload
– Each virtual processor can consume a maximum of one physical processor
– Operating system constantly assesses workload requirements and folds or
unfolds VPs as required
– Response to customers allocating excessive VPs vs physical cores available
in a shared pool
– Enabled by default since AIX 5.3 ML03
– Also allows dedicated partitions to donate free cycles to shared pools
– Dedicated systems do in fact run under VPs – they are just not enabled for
folding by default
– All of the SMT threads associated with a physical core must be quiesced
before a VP can be folded
– Technology aids the PowerVM hypervisor to put physical cores into lower
energy levels, presuming all the VPs on different partitions within a shared
pool associated with a physical core are “foldable”
POWER7 Virtual Processor Folding - Algorithm
Every second, the OS calculates the physical utilization for the last second
– VPs are activated based on utilization thresholds and the vpm_xvcpus tunable
setting
– Where the schedo vpm_xvcpus setting:
• Defaults to 0 (enabled)
• Disabled with -1
• Can be set to a positive whole number to increase the number of active VPs
Folding is activated and deactivated 1 VP at a time; even when
utilization drops to idle, the VPs fold one at a time
The legacy threshold at which default settings would trigger another VP was a
utilization level of ~80%.
– This threshold has changed, and may evolve at any time
– AIX 6.1 TL05 is more aggressive about unfolding Virtual Processors
Non-IBM ISVs have largely adopted IBM’s recommendations
http://www.oracle.com/technetwork/database/clusterware/overview/rac-aix-system-stability-131022.pdf
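The per-second decision can be sketched roughly as follows (the 80% threshold and one-VP step are simplifications of the legacy behavior described above, not the exact AIX algorithm, which has changed and may evolve):

```shell
# Toy folding decision: compare physical utilization against ~80% of the
# currently unfolded VPs; fold or unfold at most one VP per one-second pass.
awk -v util=2.6 -v vps=3 'BEGIN {
  threshold = 0.80
  if (util > threshold * vps)
    print "unfold: activate one more VP"
  else if (util < threshold * (vps - 1))
    print "fold: deactivate one VP"
  else
    print "no change"
}'
```

With util=2.6 physical processors consumed across 3 unfolded VPs, 2.6 > 0.80*3, so the sketch unfolds a fourth VP.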
POWER7 Virtual Processor Folding - Cost
Disabling folding will result in:
– Overriding optimizations built into the OS schedulers
– All VPs being dispatched to the hypervisor, whether they have work to do or not
– More hypervisor overhead, possible impact on physical resource affinity
The upside to disabling folding is that it can lead to better performance when
lpar(s) are perfectly sized. This typically applies to performance benchmarks and
not a mix of real-world, traditional Unix production workloads sharing CPU
resources.
– If the lpars are perfectly sized, and the pool is never constrained, customers
may not notice much of a performance hit
– When the pool or enough lpars are constrained, excess VPs on other lpars will
hurt performance
Folding is a real-world performance feature
Disabling folding will adversely impact most environments and may result in
IBM Support refusing to analyze a PERFPMR until collections have been
performed with restricted tunables reset to their defaults
Utilization Issues - 2011
AIX 6.1 TL05 SP6 defects can inflate physical utilization
– Symptom is high entitlement/physical consumption AND high %Idle – very
obviously wrong
– APAR IZ94768 in latest TL06
Idle kernel proc looping in dispatch cycles
– APAR IV01111: WAITPROC IDLE LOOPING CONSUMES CPU
Applicable to multi-node systems only, and if encountering problems (see
Enhanced Affinity section)
– APAR IV06194: SRAD LOAD BALANCING ISSUES ON SHARED LPARS
Because newer AIX 6.1 levels are more aggressive at unfolding Virtual
Processors, comparisons with older AIX levels may cause confusion
– Noticed in some POWER6 to POWER7 migrations
– Physical consumption is higher, but User and System time are lower (system is
more idle)
– Review utilization or physical busy metrics before assuming something is wrong
Utilization Issues - 2011
Many customers do not know that lparstat reports physical busy – the User + System
percentage of physical consumption
> lparstat 1
System configuration: type=Shared mode=Uncapped smt=4 lcpu=48 mem=49152MB
psize=15 ent=1.20
%user %sys %wait %idle physc %entc lbusy app vcsw phint %nsp
----- ----- ------ ------ ----- ----- ------ --- ----- ----- -----
61.5 1.3 0.0 37.1 1.36 113.2 15.1 11.39 13333 26 95
61.4 1.2 0.0 37.4 1.35 112.2 14.3 11.43 13664 21 96
59.4 3.0 0.0 37.6 1.39 115.9 15.0 11.36 12400 16 95
> lparstat –l 1
System configuration: type=Shared mode=Uncapped smt=4 lcpu=48 mem=49152MB
psize=15 ent=1.20
%user %sys %wait %idle physc %entc pbusy app vcsw phint %nsp
----- ----- ------ ------ ----- ----- ------ --- ----- ----- -----
14.6 0.2 0.0 85.2 1.35 112.1 84.7 11.29 13461 10 96
14.1 0.0 0.0 85.9 1.38 115.1 86.4 11.22 12677 32 95
14.0 0.4 0.0 85.6 1.36 113.5 85.4 11.38 13397 16 96
NOTE: –l results in utilization values switching to logical, whereas default shows physical “PURR” utilization
POWER7 Prefetch
POWER prefetch instructions can be used to mask latencies of requests to the memory controller and fill
cache.
– The POWER7 chip can recognize memory access patterns and initiate prefetch instructions
automatically.
– How aggressively the hardware prefetches (i.e. how many cache lines are prefetched
for a given reference) is controlled by the Data Streams Control Register (DSCR).
The dscrctl command can be used to query and set the system wide DSCR value
# dscrctl -q
Current DSCR settings:
Data Streams Version = V2.06
number_of_streams = 16
platform_default_pd = 0x5 (DPFD_DEEP)
os_default_pd = 0x0 (DPFD_DEFAULT)
A system administrator can change the system-wide value using the dscrctl command
# dscrctl [-n | -b] -s <value>
Disengage the data prefetch feature: dscrctl -n -s 1
Return to the default: dscrctl -n -s 0
This is a dynamic system-wide setting. Consult AIX Release Notes and Performance Guide for HPC
Applications on IBM Power 755 System for more information
POWER7 Enhanced Affinity
Affinity is a thread’s relationship with CPU and memory resources
– Large partitions and shared pools will span chips and nodes
– Maintaining proximity to a set of resources provides optimal performance
– POWER7 and AIX 6.1 provide much better instrumentation for affinity enhancements than
has previously been available
Enhanced Affinity Summary
– Memory and CPU resources are localized, and form Affinity Domains
• Resource Allocation Domain (RAD) is a collection of physical resources
• Scheduler Resource Allocation Domain (SRAD)
Collection of system resources that are the basis for most resource allocation and
scheduling activities performed by the kernel
– An Affinity Domain (home “node”) is assigned to each thread at startup
• Thread’s private data is affinitized to home node
• Threads may temporarily execute remotely, but will eventually return to their home SRAD
• A single-threaded process’s application data heap will be placed on its home SRAD
• Multi-threaded processes will be balanced on SRADs depending upon footprint
Enhanced Affinity
Resource affinity structures are used by the Enhanced Affinity function to help maintain locality of
threads to hardware resources. New terms describe the distance between two resources
in a two- or three-tier affinity environment (POWER7)
– 2-tier for low-end systems (blades, 710, 720, 730, 740, 750, 755)
Local resources have affinity within a chip
Far resources outside the chip
– 3-tier for multi-node systems (770, 780, 795)
Local resources have affinity within a chip
Near resources share the same node/book
Far resources outside the node/book
AIX Topology Service
– System Detail Level (SDL) is used to identify local (chip), near (within node) and far
(external node)
– The “REF” System Detail Level is used to identify near/far memory boundaries (nodes)
A new tool, lssrad, displays the hierarchy and topology for memory and scheduler. When
dynamically changing CPU/memory configurations, lssrad output can show the system’s
balance.
Affinity metrics can be monitored in dedicated or shared partitions, but a shared partition’s
layout is not a 1:1 mapping to the physical layout
System Topology & lssrad –va (4 node 770)
REF SRAD MEM LCPU
0
0 28250.00 0-31
1 27815.00 32-63
1
2 28233.00 64-95
3 27799.00 96-127
2
4 28281.00 128-159
5 27799.00 160-191
3
6 28016.00 192-223
7 27783.00 224-255
[Diagram: four-node 770 – each node (REF0–REF3) holds two chips (CPU 0–7), each chip with 32GB of memory (MEM 0–7)]
Enhanced Affinity: topas -M
You can select columns to sort on with tab key
Enhanced Affinity: mpstat
System configuration: lcpu=8 ent=1.0 mode=Uncapped
cpu cs ics .. S0rd S1rd S2rd S3rd S4rd S5rd ilcs vlcs S3hrd S4hrd S5hrd
0 12344 4498 .. 95.0 0.0 0.0 5.0 0.0 0.0 3 3095 100.0 0.0 0.0
1 112 56 .. 99.6 0.4 0.0 0.0 0.0 0.0 0 139 100.0 0.0 0.0
2 0 0 .. 50.0 50.0 0.0 0.0 0.0 0.0 0 90 100.0 0.0 0.0
3 0 0 .. 0.0 100.0 0.0 0.0 0.0 0.0 0 90 100.0 0.0 0.0
4 12109 4427 .. 94.9 0.0 0.0 5.1 0.0 0.0 1 3053 100.0 0.0 0.0
5 326 163 .. 99.9 0.1 0.0 0.0 0.0 0.0 0 250 100.0 0.0 0.0
6 0 0 .. 0.0 100.0 0.0 0.0 0.0 0.0 0 90 100.0 0.0 0.0
7 0 0 .. 0.0 100.0 0.0 0.0 0.0 0.0 0 90 100.0 0.0 0.0
ALL 24891 9144 .. 95.0 0.0 0.0 5.0 0.0 0.0 4 6897 100.0 0.0 0.0
mpstat (-a, -d flags) displays logical-CPU SRAD affinity
– Home SRAD redispatch statistics
• S3hrd – local
• S4hrd – near (3-tier only)
• S5hrd – far
Enhanced Affinity: svmon
Global Report
– Affinity domains are represented based on SRADID
Memory information of each SRAD: total, used, free, filecache
Logical CPUs in each SRAD
Process Report
– Displays the ‘home SRAD’ affinity statistics for the threads of a process
– Also provides an application’s memory placement policies
Enhanced Affinity: svmon
# svmon -G -O affinity=on,unit=MB
size inuse free pin virtual available mmode
memory 32768.00 3353.63 29414.37 1850.54 3195.75 29410.62 Ded
pg space 5408.00 10.6
work pers clnt other
pin 896.20 0 0 954.34
in use 3195.75 0.10 157.78
Domain affinity free used total filecache lcpus
0 24011.50 1837.38 25848.88 117.41 0 1 2 3
1 5403.90 560.89 5964.79 26.8 4 5 6 7 8 9 10
# lssrad -va
REF1 SRAD MEM CPU
0
0 25848.88 0-3
1 5964.79 4-10
Enhanced Affinity: svmon
# svmon -P 3670212 -O threadaffinity=on
Pid Command Inuse Pin Pgsp Virtual
3670212 rmcd 20334 10793 0 20162
Tid HomeSRAD LocalDisp NearDisp FarDisp
16449773 0 602 672 0
18808987 0 41 41 0
7864593 1 23 0 0
7930141 1 21 0 0
# svmon -P 1 -O affinity=detail
Pid Command Inuse Pin Pgsp Virtual
1 init 18654 10786 0 18636
Domain affinity Npages Percent Private lcpus
1 9914 53.2 31 2 3 4
0 8722 46.8 147 0 1
POWER7 Affinity & Partition Placement
POWER6 and earlier, Hypervisor (PHYP) minimized the number of affinity domains
(books/drawers/chips) per partition
POWER7 Hypervisor improves affinity by selecting optimized number of domains
–Ensures cores/memory allocated from each domain
PHYP, AIX 7 and IBM i v7r1m0 changed to support both primary and secondary
affinity domains.
–For 795, chip is primary and book is secondary domain.
–OS enforces affinity in SPLPAR partitions in POWER7 (not done in P5 & P6)
PHYP, AIX 7 and IBM i v7r1m0 also added support for home node per shared virtual
processor (previous PHYP internally supported home node per partition) to improve
affinity
New System Partition Processor Limit (SPPL) gives direction to PHYP whether to
contain partitions to minimum domains or spread partitions across multiple domains.
–Applies to shared or dedicated environments
Setting System Partition Processor Limit (SPPL) on the HMC
The following section in Managing the HMC infocenter topic provides a reference to System Partition
Processor Limit (SPPL):
http://publib.boulder.ibm.com/infocenter/powersys/v3r1m5/index.jsp?topic=/p7ha1/smProperties.htm
Systems Management
Properties
Advanced Tab
Placement with Max Partition Size = 32
[Diagram: eight-node system – LPARs of 8, 16, 24, 26, 28 and 32 VPs each packed into single nodes, leaving 4–8 free cores on some nodes]
Partitions will be contained in the minimum number of nodes.
If a partition cannot be contained within a single node, then it will be spread across a
minimum number of nodes.
Partition Placement/Licensing
New Firmware (eFW7.3 and later)
At system power on treat all processors and memory as licensed
Place all the partitions as optimally as possible from a performance viewpoint
May require spreading a partition across multiple chips/drawers/books to keep
memory and processors together on domains (i.e. try to ensure that if memory comes from
a domain, a processor also comes from that domain, and vice versa).
Optimization of other hardware components might also cause spreading of larger
partitions across domains (i.e. to provide additional internal bus bandwidth, spread
>24 way processor partitions across multiple books)
Unlicense individual processors that have not been assigned to partitions
First choice is to unlicense processors that do not have any memory DIMMs
connected to the processor
Second is to spread out the unlicensed processors across the domains such that
each domain would have similar number of unlicensed processors
Placement with Max Partition > 32 + Licensing
[Diagram: eight-node system – LPAR1 (20 VPs) and LPAR4 (16 VPs) each fit in one node; LPAR2 (36 VPs), LPAR3 (64 VPs) and LPAR5 (42 VPs) span multiple nodes; LPAR6 (12 VPs) has memory across 3 books; each node carries 2–12 unlicensed processors]
Partitions of 24 or fewer virtual CPUs (VPs) are packed into a single node if sufficient memory is available
Partitions of >24 processors are spread across multiple books to allow for additional bandwidth
Memory would come from the same books where the processors are located.
Licensed memory is a max value across all 8 books, not specific locations
POWER7 1TB Segment Aliasing
Workloads with large memory footprints and low spatial locality can perform poorly
– Analysis shows that processor Segment Lookaside Buffers (SLB) faults can take
a significant amount of time
– POWER 6
• SLB has 64 entries
• 20 reserved for the kernel
• 44 available for user processes -> yields 11GB of accessible memory
• Many customer workloads do not fit into 11GB
– POWER 7
• SLB has 32 entries - architectural trend towards smaller SLB sizes
• 20 still reserved for the kernel
• 12 available for user processes -> yields 3GB of accessible memory
• Potential for performance regression
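Those reach figures follow from each SLB entry mapping a single 256 MB segment; a quick sketch of the arithmetic behind the bullets above:

```python
SEGMENT_MB = 256  # classic POWER segment size

def slb_user_reach_gb(slb_entries, kernel_reserved=20):
    """User-addressable memory (GB) reachable without SLB faults."""
    user_entries = slb_entries - kernel_reserved
    return user_entries * SEGMENT_MB / 1024

print(slb_user_reach_gb(64))  # POWER6: 44 user entries -> 11.0 GB
print(slb_user_reach_gb(32))  # POWER7: 12 user entries -> 3.0 GB
```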
Example Process Address Space with 1TB Aliasing
Without 1TB aliasing: Kernel, Program Text and Heap (3 and 8 256MB segments) plus an 80GB shared
memory region (320 256MB segments) consume 331 SLB entries, causing heavy SLB thrashing.
With 1TB aliasing: the same address space needs 3 256MB segments, 1 1TB shared alias segment
(covering the 80GB region) and 1 1TB unshared alias segment, for a total of 5 SLB entries and
no SLB faults.
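The SLB entry counts in this example are just segment counts; a sketch, taking the 3 text and 8 heap 256 MB segments from the figure:

```python
SEG_MB = 256

shared_region_segs = 80 * 1024 // SEG_MB        # 80 GB region -> 320 x 256 MB segments
without_aliasing = 3 + shared_region_segs + 8    # text + shared region + heap segments
with_aliasing = 3 + 1 + 1                        # one 1 TB shared + one 1 TB unshared alias

print(without_aliasing, with_aliasing)  # -> 331 5
```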
POWER7 1TB Segment Aliasing
Feature allows user applications to use 1TB segments
– 12 SLB entries can now address 12TB of memory
– SLB fault issue no longer relevant
– Immediate performance boost for applications, new and legacy
– Significant changes under the covers
• New address space allocation policy
• Attempts to group address space requests together to facilitate 1TB aliasing.
• Once certain allocation size thresholds have been reached, OS automatically aliases
memory with 1TB aliases.
– 256MB segments still exist for handling IO
– Currently only available for 64-bit process shared memory regions
– Default in AIX 7.1, optional in AIX 6.1 TL06
Use & diagnosis
– Engage ATS Software Specialists, ask for Ralf Schmidt-Dannert’s whitepaper
– Diagnosis techniques require use of trace tools for analysis. Example:
tprof -T 100000000 -skeux sleep 10
– Review sleep.prof file for noticeable cpu % in kernel routine set_smt_pri_user_slb_found()
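tprof report layouts vary by profiling level, so treat this as a rough sketch: scan the sleep.prof text for the routine name and total any CPU percentages on those lines (the sample line below is invented for illustration):

```python
import re

def slb_fault_pct(prof_text):
    """Total CPU % attributed to set_smt_pri_user_slb_found, or 0.0 if absent."""
    total = 0.0
    for line in prof_text.splitlines():
        if "set_smt_pri_user_slb_found" in line:
            m = re.search(r"(\d+\.\d+)", line)
            if m:
                total += float(m.group(1))
    return total

sample = "  .set_smt_pri_user_slb_found   4.25\n  .other_routine   1.10\n"
print(slb_fault_pct(sample))  # -> 4.25
```

A noticeable percentage here suggests the workload is a candidate for 1TB aliasing.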
JFS2 i-node cache review
The maximum size of the JFS2 i-node and metadata caches are set by two ioo tunables
– j2_inodeCacheSize = 400 (AIX 6.1)
– j2_metadataCacheSize = 400 (AIX 6.1)
– Both default to 400; the units are undocumented and undefined, but we do know that the i-node
structure is 1K
The procfs filesystem provides the only simple way to view usage on AIX 6.1
# cat /proc/sys/fs/jfs2/memory_usage
metadata cache: 48476160
inode cache: 161546240
total: 210022400
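Those byte counts are easier to compare in MB; a small sketch that parses this procfs output:

```python
def parse_jfs2_usage(text):
    """Parse /proc/sys/fs/jfs2/memory_usage-style output into MB."""
    usage = {}
    for line in text.splitlines():
        name, _, value = line.rpartition(":")
        if name:
            usage[name.strip()] = int(value) / (1024 * 1024)
    return usage

sample = """metadata cache: 48476160
inode cache: 161546240
total: 210022400"""

for name, mb in parse_jfs2_usage(sample).items():
    print(f"{name}: {mb:.1f} MB")
```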
Years ago, Ralf Schmidt-Dannert and Doug Ranz investigated the unit settings and determined for a
10GB system:
j2_inodeCacheSize or     Maximum i-node   Cacheable   Maximum metadata
j2_metadataCacheSize     cache            i-nodes     cache
100                      250 MB           250K        100 MB
200                      500 MB           500K        200 MB
300                      750 MB           750K        300 MB
400 (default)            1.0 GB           1000K       400 MB
1000                     2.5 GB           2500K       1.0 GB
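The table is consistent with a simple linear scaling: roughly 2.5 MB of i-node cache (2500 cacheable 1K i-nodes) and 1 MB of metadata cache per unit of the tunable. That scaling is inferred from the figures above, not a documented formula:

```python
def jfs2_cache_mb(tunable):
    """Approximate cache maxima implied by the table (inferred scaling)."""
    inode_mb = tunable * 2.5       # e.g. 400 -> 1000 MB (1.0 GB)
    cacheable_inodes = tunable * 2500  # 1K i-node structures
    metadata_mb = tunable * 1.0    # e.g. 400 -> 400 MB
    return inode_mb, cacheable_inodes, metadata_mb

print(jfs2_cache_mb(400))   # default setting
print(jfs2_cache_mb(1000))
```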
JFS2 i-node cache review
The general answer to the question “svmon isn’t reporting all my memory, where’s the
rest of it?” is to check i-node/metadata cache usage
The behavior of AIX JFS2 i-node and metadata cache can cause problems on customer
systems
– Customers may note system memory increasing over time with no clear explanation, vmstat
and svmon do not report i-node or metadata usage
– Kernel pinned heap is used for these structures
– Cache usage is not reduced when files are deleted, and remounting filesystems does not
release the i-node or metadata
– New i-nodes use new memory until reaching the capacity defined by the tunables. At that
point, older entries may be recycled
– These tunables are dynamic, but the general thinking is that it is not a good idea to tune down
an active system once it has reached these limits. If you are running a 10% i-node cache on a
very large system and want to reduce it, open a PMR or wait for a reboot.
Undocumented formulas generally appear to allow for the defaults to consume these
percentages of real memory:
– Inode cache 10% (AIX 6.1) and 5% (AIX 7.1)
– Metadata cache 4% (AIX 6.1) and 2% (AIX 7.1)
Weblink provides an online reference for customers:
http://www.ibm.com/developerworks/wikis/display/WikiPtype/AIXJ2inode
Support for enhanced iostat metrics
AIX 7.1 and AIX 6.1 TL06 (SP2)
The 'b' option provides detailed I/O statistics for block devices, and is available to both root
and non-root users
Block IO stats collection is disabled by default
– root user can enable with: raso -o biostat=1
Syntax:
iostat -b [ BlockDevice1 [ BlockDevice2 ... ] ] Interval [ Sample ]
Raso tunable - biostat
Purpose:
Specifies whether block IO device statistics collection should be enabled or not.
Values:
Default: 0
Range: 0, 1
Type: Dynamic
Unit: boolean
Tuning:
This tunable is useful in analyzing performance/utilization of various block IO devices. If this tunable is enabled,
we can use iostat -b to show IO statistics for various block IO devices.
Possible Value:
1 : Enabled
0 : Disabled
Block IO Device Utilization Report
The Block IO Device report provides statistics on a per-IO-device basis. The report has the
following format:
device Name of the device
reads Number of read requests over interval
writes Number of write requests over interval
bread Number of bytes read over interval
bwrite Number of bytes written over interval
rserv Read service time in milliseconds per read over interval
wserv Write service time in milliseconds per write over interval
rerr Number of read errors over interval
werr Number of write errors over interval
iostat -b sample output
Support for 1024 Logical CPUs – AIX 7.1
Growth of physical core support and SMT4 for POWER7 will drastically increase the number
of logical CPUs on systems
Presents difficult challenge in analysis of very large partitions
Processor tools need to support new filtering and output options for analysis
– Sorting and filtering options for sar and mpstat
– Screen freezing, scrolling, paging for topas
– New XML formatted reports (vmstat, iostat, mpstat, sar, lparstat)
sar -O option for sorting and filtering
sar [ { -A [ -M ] | [ -a ] [ -b ] [ -c ] [ -d ] [ -k ] [ -m ] [ -q ] [ -r ] [ -u ] [ -v ] [ -w ]
[ -y ] [ -M ] } ] [ -P processoridentifier, ... | ALL | RST ]
[ -O { sortcolumn=col_name [ ,sortorder={asc|desc} ] [ ,topcount=n ] } ]
[ -@ wparname ] [ -e [YYYYMMDD]hh [ :mm [ :ss ] ] ] [ -f file ] [ -i seconds ] [ -o file ]
[ -s [YYYYMMDD]hh [ :mm [ :ss ] ] ] [ -x ] [ Interval [ Number ] ]
-O Options Allows users to specify the command option. -O options=value...
Following are the supported options:
sortcolumn = Name of the metrics in the sar command output
sortorder = [asc|desc] - Default value of sortorder is “desc”
topcount = Number of CPUs to be displayed in the sar command sorted output
To display the sorted output for the column cswch/s with the -w flag, enter the following command:
sar -w -P ALL -O sortcolumn=cswch/s 1 1
To list the top ten CPUs, sorted on the scall/s column, enter the following command:
sar -c -O sortcolumn=scall/s,sortorder=desc,topcount=10 -P ALL 1
Support for 1024 Logical CPUs – sar filtering
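Conceptually, -O just sorts the per-CPU table on one column and truncates it; a Python sketch of the same idea (the metric names and values are only illustrative):

```python
def sort_cpus(rows, sortcolumn, sortorder="desc", topcount=None):
    """Mimic sar/mpstat -O: sort per-CPU rows on one metric, keep the top N."""
    ordered = sorted(rows, key=lambda r: r[sortcolumn],
                     reverse=(sortorder == "desc"))
    return ordered[:topcount] if topcount else ordered

cpus = [
    {"cpu": 0, "cswch/s": 120.0},
    {"cpu": 1, "cswch/s": 845.0},
    {"cpu": 2, "cswch/s": 310.0},
]
print(sort_cpus(cpus, "cswch/s", topcount=2))  # busiest two CPUs first
```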
mpstat -O option for sorting and filtering
mpstat [ { -d | -i | -s | -a | -h } ] [ -w ] [ -O Options ] [ -@ wparname ] [ interval [ count ] ]
-O Options Specifies the command option. -O options=value...
Following are the supported options:
sortcolumn = Name of the metrics in the mpstat command output
sortorder = [asc|desc] - Default value of sortorder is “desc”
topcount = Number of CPUs to be displayed in the mpstat command sorted output
To see the sorted output for the column cs (context switches), enter the following command:
mpstat -d -O sortcolumn=cs
To see the list of the top 10 CPUs, enter the following command:
mpstat -a -O sortcolumn=min,sortorder=desc,topcount=10
Support for 1024 Logical CPUs – mpstat filtering
topas panel freezing: the Space Bar is used to toggle a screen freeze
Sort using left/right arrows to select column
Scroll using PgUp/PgDn
Support for 1024 Logical CPUs - topas
XML output for commands lparstat, vmstat, iostat, mpstat, sar
Default output file name is command_DDMMYYHHMM.xml and is generated in current directory
User can specify output file name and directory using “-o”
lparstat -X -o /tmp/lparstat_data.xml
XML schema files are shipped with base OS under /usr/lib/perf
iostat_schema.xsd, lparstat_schema.xsd, mpstat_schema.xsd, sar_schema.xsd,
vmstat_schema.xsd
Currently, the xml output generated by these commands is not validated as per schema. It is up to the
application to do this validation
sar [-X [-o filename]] [interval[count]]
mpstat [-X [-o filename]] [interval[count]]
lparstat [-X [-o filename]] [interval[count]]
vmstat [ -X [ -o filename]] [interval [ count ]]]
iostat [-X [-o filename]] [interval[count]]
Support for 1024 Logical CPUs – XML output
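Since validation is left to the application, consumers typically parse the reports themselves. The element layout below is an invented illustration, not the shipped schema (check the .xsd files in /usr/lib/perf for the real layout):

```python
import xml.etree.ElementTree as ET

# Hypothetical fragment in the spirit of an lparstat -X report.
doc = """<lparstat>
  <sample time="10:00:01"><metric name="physc" value="1.25"/></sample>
  <sample time="10:00:02"><metric name="physc" value="1.40"/></sample>
</lparstat>"""

root = ET.fromstring(doc)
physc = [float(m.get("value"))
         for m in root.iter("metric") if m.get("name") == "physc"]
print(sum(physc) / len(physc))  # average physical processor consumption
```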
Miscellaneous AIX 6.1 TL06 & AIX 7.1
CPU Interrupt Disabling
– Minimize interrupt jitter that impacts application performance
– Quiesce external interrupts on a set of logical processors
– Control interface (subroutine, kernel service, command line)
– POWER5 and later systems, dedicated or shared
Kernel memory pinning
– Default in AIX 7.1, option in AIX 6.1
– vmo vmm_klock_mode tunable
– Should you do this in AIX 6.1? No – you should be talking to SupportLine with problems involving kernel
segment(s) being paged before this is done. But there is more flexibility now to protect kernel memory.
Hot Files Detection Subsystem
– A new subsystem for detecting hot files in JFS2 filesystems
– Currently, only a program interface using ioctl() calls on active file descriptors is available.
Contact us if you are interested in using this interface.
– System header and structures, see /usr/include/sys/hfd.h
Perfstat and PTX SPMI APIs have been extended to cover the various new technologies supported (Active
Memory Expansion, etc.)
Java & POWER7
Java 6 SR7 enhanced for POWER7 instructions, pre-fetch and autonomic 64KB page sizes
http://www.ibm.com/developerworks/java/jdk/aix/faqs.html
Best Practices for Java performance on POWER7
https://www.ibm.com/developerworks/wikis/display/LinuxP/Java+Performance+on+POWER7
Websphere Application Server (WAS)
– V7 & V8 provide specific exploitation of POWER6 & POWER7 instructions and 64KB page
sizes
– V8 includes scaling, footprint reduction, Java Persistence API (JPA) improvements
Java Performance Advisor (JPA)
– Provides performance recommendations for Java/WAS applications on AIX
https://www.ibm.com/developerworks/wikis/display/WikiPtype/Java+Performance+Advisor
Java Performance Advisor
FC over Ethernet – 5708 Adapter
Test          Direction  Sessions  1500 MTU                    9000 MTU
                                   Single port   Both ports    Single port   Both ports
TCP STREAM    send       1         870 MB/s      1311 MB/s     1076 MB/s     1647 MB/s
                         4         1068 MB/s     1402 MB/s     1111 MB/s     1668 MB/s
              receive    1         785 MB/s      1015 MB/s     1173 MB/s     1393 MB/s
                         4         925 MB/s      992 MB/s      1179 MB/s     1393 MB/s
              duplex     1         1439 MB/s     1712 MB/s     1733 MB/s     2106 MB/s
                         4         1527 MB/s     1914 MB/s     1756 MB/s     2176 MB/s
TCP_Request & 1 byte     1¹        13324 TPS     26171 TPS
Response      message    150       182062 TPS    237415 TPS
Host: P7 750 4-way, SMT-2, 3.3 GHz, AIX 5.3 TL12, dedicated LPAR, dedicated adapter
Client: P6 570 with two single-port 10 Gb (FC 5769), point-to-point wiring (no ethernet switch)
¹ Single session 1/TPS round trip latency is 75 microseconds, default ISNO settings, no interrupt coalescing
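The footnoted latency is simply the reciprocal of the single-session TPS figure:

```python
def round_trip_us(tps):
    """Round-trip latency (microseconds) implied by a single-session RR rate."""
    return 1_000_000 / tps

print(round_trip_us(13324))  # ~75 microseconds, matching the footnote
```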
FCoE – Ethernet Performance
Note that receive is the lowest performance
– Lower thruput
– Both ports only get slightly more thruput than a single port as sessions are added
– Cannot provide 10 Gb receive bandwidth on two ports
– A 50% duty cycle should be OK
Disk IO does better due to larger blocks/buffers
AIX 6.1 SMT4 should have better throughput
New Free Memory Tool
Memory Tools
– An IBM FTSS maintains a webpage reviewing AIX memory and paging issues
http://www.ibm.com/developerworks/wikis/display/WikiPtype/AIXV53memory
– He has also developed a script that post-processes the output of various memory tools
to provide detailed breakdowns of system and user memory usage
• The tool is called pmrmemuse, with a wrapper script called showmemsuse
https://www.ibm.com/developerworks/wikis/display/WikiPtype/AIXmemuse
– Website provides usage, output examples and extensive FAQ on the caveats of
svmon-based output
pmrmemuse – System Summary
Inuse Pin Pgsp Virtual Inuse Pin Pgsp Virtual Segment Process
(4K pages) (4K pages) (4K pages) (4K pages) (MBs) (MBs) (MBs) (MBs) count count
------------------------------------------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- -------- -------
Free (on AIX free list) 39548 0 0 0 154.484 0.000 0.000 0.000
Used in AIX kernel but not in any segment 212304 212304 0 212304 829.312 829.312 0.000 829.312
Used in AIX kernel & extension segments 1279266 947052 31815 1292591 4997.133 3699.422 124.277 5049.184 337
Used in segments shared by several users 27528 0 620 27974 107.531 0.000 2.422 109.273 3
Used for clnt (JFS2 & NFS) file cache segs 177201 0 0 0 692.191 0.000 0.000 0.000 3126
Empty clnt (JFS2 & NFS) file cache segs 0.000 0.000 0.000 0.000 415659
Used for pers (JFS) file cache segments 5 0 0 0 0.020 0.000 0.000 0.000 4
Empty pers (JFS) file cache segs 0.000 0.000 0.000 0.000 5
Used by user wmethods in unshared segments 3945684 404 742594 4251313 15412.828 1.578 2900.758 16606.691 120 12
Used by user wmethods in shared segments 1 0 0 1 0.004 0.000 0.000 0.004 1 2
Used by user oracle in unshared segments 74554 5368 98428 173070 291.227 20.969 384.484 676.055 572 24
Used by user oracle in shared segments 627361 0 479823 760211 2450.629 0.000 1874.309 2969.574 13 89
Used by user root in unshared segments 25784 6436 11055 36485 100.719 25.141 43.184 142.520 261 108
Used by user root in shared segments 1 0 1 2 0.004 0.000 0.004 0.008 2 4
Used by user wmuser in unshared segments 620 24 273 877 2.422 0.094 1.066 3.426 12 3
Used by user wmuser in shared segments 2 0 1 3 0.008 0.000 0.004 0.012 3 6
….
Used by user daemon in unshared segments 122 4 519 602 0.477 0.016 2.027 2.352 3 1
Used by user flexsens in unshared segments 74 4 76 132 0.289 0.016 0.297 0.516 2 1
Unused work segments 12706 200 388 13142 49.633 0.781 1.516 51.336 55
Segments found only in svmon -S output 2065 160 45 2107 8.066 0.625 0.176 8.230 16
Empty work segments 0.000 0.000 0.000 0.000 4327
Empty rmap segments 0.000 0.000 0.000 0.000 10
Empty mmap segments 0.000 0.000 0.000 0.000 110
Empty clnt segs found only in svmon -S otpt 0.000 0.000 0.000 0.000 2
Empty work segs found only in svmon -S otpt 0.000 0.000 0.000 0.000 23
Segs not found in svmon -S otpt (see below) 0.000 0.000 0.000 0.000 49 2
------------------------------------------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- -------- -------
Totals (Free + Used): 6425414 1172020 1365638 6771402 25099.273 4578.203 5334.523 26450.789 424729 258
pmrmemuse – System Summary
Totals reported in svmon -G output (for comparison to summary above):
Inuse Pin Pgsp Virtual Inuse Pin Pgsp Virtual Segment Process
(4K pages) (4K pages) (4K pages) (4K pages) (MBs) (MBs) (MBs) (MBs) count count
------------------------------------------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- -------- -------
Total real memory 6422528 0 0 0 25088.000 0.000 0.000 0.000
Free (on AIX free list) 39548 0 0 0 154.484 0.000 0.000 0.000
Used in AIX kernel but not in any segment 212304 212304 0 212304 829.312 829.312 0.000 829.312
Used for clnt (JFS2 & NFS) file cache 178644 0 0 0 697.828 0.000 0.000 0.000
Used for pers (JFS) file cache 5 0 0 0 0.020 0.000 0.000 0.000
Total memory used 6382980 1172180 1348900 6759780 24933.516 4578.828 5269.141 26405.391
------------------------------------------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- -------- -------
Information reported in vmo -a and vmstat -v output (for comparison to summary above):
Value Value
(4K pages) (MBs)
-------------------------------------------------------- ---------- ----------
Number of memory pages 6422528 25088.000
Number of lruable pages 6194912 24198.875
minfree (2048) * number of memory pools (6) 12288 48.000
number of free pages 38987 152.293
maxfree (3072) * number of memory pools (6) 18432 72.000
minperm% (3.0%) of number of lruable pages (6194912) 185847 725.965
numperm% (2.2%) of number of lruable pages (6194912) 136288 532.375
maxperm% (90.0%) of number of lruable pages (6194912) 5575421 21778.988
numclient% (2.2%) of number of lruable pages (6194912) 136288 532.375
maxclient% (90.0%) of number of lruable pages (6194912) 5575421 21778.988
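The derived rows in that vmo/vmstat comparison are simple products of the tunables and the lruable page count; a sketch reproducing them:

```python
lruable = 6194912   # lruable 4 KB pages from the report above
mem_pools = 6

rows = {
    "minfree total": 2048 * mem_pools,           # 12288 pages = 48 MB
    "maxfree total": 3072 * mem_pools,           # 18432 pages = 72 MB
    "minperm (3.0%)": round(lruable * 0.03),     # 185847 pages
    "numperm (2.2%)": round(lruable * 0.022),    # 136288 pages
    "maxperm (90.0%)": round(lruable * 0.90),    # 5575421 pages
}
for name, pages in rows.items():
    print(f"{name}: {pages} pages = {pages * 4 / 1024:.3f} MB")
```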
pmrmemuse – Detailed Summary by User
Inuse Pin Pgsp Virtual Inuse Pin Pgsp Virtual Segment Process
(4K pages) (4K pages) (4K pages) (4K pages) (MBs) (MBs) (MBs) (MBs) count count
------------------------------------------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- -------- -------
Used by user wmethods in unshared segments 3945684 404 742594 4251313 15412.828 1.578 2900.758 16606.691 120 12
------------------------------------------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- -------- -------
Vsid Esid Type Description PSize Inuse Pin Pgsp Virtual Comments
---- ---- ---- ------------------------- ----- ------ ------ ------ ------- -----------------------------
9f5819 - work mmap source sm 65536 0 0 65536 65536 x 4 KB = 256.000 MB, pid=4784560: Inuse=6110.3 MB, pscmd=java
aeb72b - work mmap source sm 65536 0 0 65536 65536 x 4 KB = 256.000 MB, pid=4784560: Inuse=6110.3 MB, pscmd=java
c45442 - work mmap source sm 65536 0 0 65536 65536 x 4 KB = 256.000 MB, pid=4784560: Inuse=6110.3 MB, pscmd=java
e6afe2 - work mmap source sm 65172 0 1959 65536 65172 x 4 KB = 254.578 MB, pid=4784560: Inuse=6110.3 MB, pscmd=java
91d991 - work mmap source sm 65078 0 4996 65536 65078 x 4 KB = 254.211 MB, pid=4784560: Inuse=6110.3 MB, pscmd=java
e75a66 - work mmap source sm 65026 0 2435 65536 65026 x 4 KB = 254.008 MB, pid=4784560: Inuse=6110.3 MB, pscmd=java
b414b1 - work mmap source sm 65021 0 2355 65536 65021 x 4 KB = 253.988 MB, pid=4784560: Inuse=6110.3 MB, pscmd=java
c325c7 - work mmap source sm 64996 0 2479 65536 64996 x 4 KB = 253.891 MB, pid=4784560: Inuse=6110.3 MB, pscmd=java
f1d171 - work mmap source sm 64948 0 5770 65536 64948 x 4 KB = 253.703 MB, pid=4784560: Inuse=6110.3 MB, pscmd=java
91af93 - work mmap source sm 64938 0 2586 65536 64938 x 4 KB = 253.664 MB, pid=4784560: Inuse=6110.3 MB, pscmd=java
…
a626a2 11 work text data BSS heap sm 48275 0 18642 62762 48275 x 4 KB = 188.574 MB, pid=4784560: Inuse=6110.3 MB, pscmd=java
e769e1 - work mmap source sm 48028 0 26780 65536 48028 x 4 KB = 187.609 MB, pid=4784560: Inuse=6110.3 MB, pscmd=java
a4eca1 13 work text data BSS heap sm 6517 0 768 6803 6517 x 4 KB = 25.457 MB, pid=4784560: Inuse=6110.3 MB, pscmd=java
f0a4f4 9001000a work shared library data sm 333 0 70 449 333 x 4 KB = 1.301 MB, pid=4784560: Inuse=6110.3 MB, pscmd=java
b73c31 f00000002 work process private m 5 3 0 5 5 x 64 KB = 0.312 MB, pid=4784560: Inuse=6110.3 MB, pscmd=java
ba5e3c 80020014 work USLA heap sm 46 0 1 47 46 x 4 KB = 0.180 MB, pid=4784560: Inuse=6110.3 MB, pscmd=java
b62632 ffffffff work application stack sm 3 0 12 15 3 x 4 KB = 0.012 MB, pid=4784560: Inuse=6110.3 MB, pscmd=java
dc2f59 - work mmap source sm 0 0 1 1 0 x 4 KB = 0.000 MB, pid=4784560: Inuse=6110.3 MB, pscmd=java
…
b32437 - work mmap source sm 65536 0 0 65536 65536 x 4 KB = 256.000 MB, pid=7798878: Inuse=5361.5 MB, pscmd=java
950f90 - work mmap source sm 65536 0 450 65536 65536 x 4 KB = 256.000 MB, pid=7798878: Inuse=5361.5 MB, pscmd=java
a9f3ac - work mmap source sm 65536 0 0 65536 65536 x 4 KB = 256.000 MB, pid=7798878: Inuse=5361.5 MB, pscmd=java
Stay Connected & Continue Skills Transfer via IBM Training
Training paths: what to take, when to take it
Social media: join the conversation
Custom catalog: create a catalog that meets your interest areas
RSS feeds: up-to-date information on the training you need
IBM Training News: targeted to your needs
New to Instructor Led Online (ILO)? Take a free test drive!
Education Packs: online discount program for ALL IBM Training courses for your company
Questions? Email Lisa Ryan (lisaryan@us.ibm.com)
This document was developed for IBM offerings in the United States as of the date of publication. IBM may not make these offerings available in
other countries, and the information is subject to change without notice. Consult your local IBM business contact for information on the IBM
offerings available in your area.
Information in this document concerning non-IBM products was obtained from the suppliers of these products or other public sources. Questions
on the capabilities of non-IBM products should be addressed to the suppliers of those products.
IBM may have patents or pending patent applications covering subject matter in this document. The furnishing of this document does not give
you any license to these patents. Send license inquires, in writing, to IBM Director of Licensing, IBM Corporation, New Castle Drive, Armonk, NY
10504-1785 USA.
All statements regarding IBM future direction and intent are subject to change or withdrawal without notice, and represent goals and objectives
only.
The information contained in this document has not been submitted to any formal IBM test and is provided "AS IS" with no warranties or
guarantees either expressed or implied.
All examples cited or described in this document are presented as illustrations of the manner in which some IBM products can be used and the
results that may be achieved. Actual environmental costs and performance characteristics will vary depending on individual client configurations
and conditions.
IBM Global Financing offerings are provided through IBM Credit Corporation in the United States and other IBM subsidiaries and divisions
worldwide to qualified commercial and government clients. Rates are based on a client's credit rating, financing terms, offering type, equipment
type and options, and may vary by country. Other restrictions may apply. Rates and offerings are subject to change, extension or withdrawal
without notice.
IBM is not responsible for printing errors in this document that result in pricing or information inaccuracies.
All prices shown are IBM's United States suggested list prices and are subject to change without notice; reseller prices may vary.
IBM hardware products are manufactured from new parts, or new and serviceable used parts. Regardless, our warranty terms apply.
Any performance data contained in this document was determined in a controlled environment. Actual results may vary significantly and are
dependent on many factors including system hardware configuration and software design and configuration. Some measurements quoted in this
document may have been made on development-level systems. There is no guarantee these measurements will be the same on generally-
available systems. Some measurements quoted in this document may have been estimated through extrapolation. Users of this document
should verify the applicable data for their specific environment.
Revised September 26, 2006
Special notices
IBM, the IBM logo, ibm.com, AIX, AIX (logo), AIX 6 (logo), AS/400, Active Memory, BladeCenter, Blue Gene, CacheFlow, ClusterProven, DB2, ESCON, i5/OS, i5/OS
(logo), IBM Business Partner (logo), IntelliStation, LoadLeveler, Lotus, Lotus Notes, Notes, Operating System/400, OS/400, PartnerLink, PartnerWorld, PowerPC, pSeries,
Rational, RISC System/6000, RS/6000, THINK, Tivoli, Tivoli (logo), Tivoli Management Environment, WebSphere, xSeries, z/OS, zSeries, AIX 5L, Chiphopper, Chipkill,
Cloudscape, DB2 Universal Database, DS4000, DS6000, DS8000, EnergyScale, Enterprise Workload Manager, General Purpose File System, GPFS, HACMP,
HACMP/6000, HASM, IBM Systems Director Active Energy Manager, iSeries, Micro-Partitioning, POWER, PowerExecutive, PowerVM, PowerVM (logo), PowerHA, Power
Architecture, Power Everywhere, Power Family, POWER Hypervisor, Power Systems, Power Systems (logo), Power Systems Software, Power Systems Software (logo),
POWER2, POWER3, POWER4, POWER4+, POWER5, POWER5+, POWER6, POWER7, pureScale, System i, System p, System p5, System Storage, System z, Tivoli
Enterprise, TME 10, TurboCore, Workload Partitions Manager and X-Architecture are trademarks or registered trademarks of International Business Machines Corporation
in the United States, other countries, or both. If these and other IBM trademarked terms are marked on their first occurrence in this information with a trademark symbol (®
or ™), these symbols indicate U.S. registered or common law trademarks owned by IBM at the time this information was published. Such trademarks may also be
registered or common law trademarks in other countries. A current list of IBM trademarks is available on the Web at "Copyright and trademark information" at
www.ibm.com/legal/copytrade.shtml
The Power Architecture and Power.org wordmarks and the Power and Power.org logos and related marks are trademarks and service marks licensed by Power.org.
UNIX is a registered trademark of The Open Group in the United States, other countries or both.
Linux is a registered trademark of Linus Torvalds in the United States, other countries or both.
Microsoft, Windows and the Windows logo are registered trademarks of Microsoft Corporation in the United States, other countries or both.
Intel, Itanium, Pentium are registered trademarks and Xeon is a trademark of Intel Corporation or its subsidiaries in the United States, other countries or both.
AMD Opteron is a trademark of Advanced Micro Devices, Inc.
Java and all Java-based trademarks and logos are trademarks of Sun Microsystems, Inc. in the United States, other countries or both.
TPC-C and TPC-H are trademarks of the Transaction Performance Processing Council (TPPC).
SPECint, SPECfp, SPECjbb, SPECweb, SPECjAppServer, SPEC OMP, SPECviewperf, SPECapc, SPEChpc, SPECjvm, SPECmail, SPECimap and SPECsfs are
trademarks of the Standard Performance Evaluation Corp (SPEC).
NetBench is a registered trademark of Ziff Davis Media in the United States, other countries or both.
AltiVec is a trademark of Freescale Semiconductor, Inc.
Cell Broadband Engine is a trademark of Sony Computer Entertainment Inc.
InfiniBand, InfiniBand Trade Association and the InfiniBand design marks are trademarks and/or service marks of the InfiniBand Trade Association.
Other company, product and service names may be trademarks or service marks of others.
Revised February 9, 2010
Special notices (cont.)
The IBM benchmarks results shown herein were derived using particular, well configured, development-level and generally-available computer systems. Buyers should
consult other sources of information to evaluate the performance of systems they are considering buying and should consider conducting application oriented testing. For
additional information about the benchmarks, values and systems tested, contact your local IBM office or IBM authorized reseller or access the Web site of the benchmark
consortium or benchmark vendor.
IBM benchmark results can be found in the IBM Power Systems Performance Report at http://www.ibm.com/systems/p/hardware/system_perf.html .
All performance measurements were made with AIX or AIX 5L operating systems unless otherwise indicated to have used Linux. For new and upgraded systems, AIX
Version 4.3, AIX 5L or AIX 6 were used. All other systems used previous versions of AIX. The SPEC CPU2006, SPEC2000, LINPACK, and Technical Computing
benchmarks were compiled using IBM's high performance C, C++, and FORTRAN compilers for AIX 5L and Linux. For new and upgraded systems, the latest versions of
these compilers were used: XL C Enterprise Edition V7.0 for AIX, XL C/C++ Enterprise Edition V7.0 for AIX, XL FORTRAN Enterprise Edition V9.1 for AIX, XL C/C++
Advanced Edition V7.0 for Linux, and XL FORTRAN Advanced Edition V9.1 for Linux. The SPEC CPU95 (retired in 2000) tests used preprocessors, KAP 3.2 for FORTRAN
and KAP/C 1.4.2 from Kuck & Associates and VAST-2 v4.01X8 from Pacific-Sierra Research. The preprocessors were purchased separately from these vendors. Other
software packages like IBM ESSL for AIX, MASS for AIX and Kazushige Goto’s BLAS Library for Linux were also used in some benchmarks.
For a definition/explanation of each benchmark and the full list of detailed results, visit the Web site of the benchmark consortium or benchmark vendor.
TPC http://www.tpc.org
SPEC http://www.spec.org
LINPACK http://www.netlib.org/benchmark/performance.pdf
Pro/E http://www.proe.com
GPC http://www.spec.org/gpc
VolanoMark http://www.volano.com
STREAM http://www.cs.virginia.edu/stream/
SAP http://www.sap.com/benchmark/
Oracle Applications http://www.oracle.com/apps_benchmark/
PeopleSoft - To get information on PeopleSoft benchmarks, contact PeopleSoft directly
Siebel http://www.siebel.com/crm/performance_benchmark/index.shtm
Baan http://www.ssaglobal.com
Fluent http://www.fluent.com/software/fluent/index.htm
TOP500 Supercomputers http://www.top500.org/
Ideas International http://www.ideasinternational.com/benchmark/bench.html
Storage Performance Council http://www.storageperformance.org/results
Revised March 12, 2009
Notes on benchmarks and values
© 2010 IBM Corporation 73
Power Systems Technical University
Revised March 12, 2009
Notes on HPC benchmarks and values
The IBM benchmark results shown herein were derived using particular, well configured, development-level and generally-available computer systems. Buyers should
consult other sources of information to evaluate the performance of systems they are considering buying and should consider conducting application oriented testing. For
additional information about the benchmarks, values and systems tested, contact your local IBM office or IBM authorized reseller or access the Web site of the benchmark
consortium or benchmark vendor.
IBM benchmark results can be found in the IBM Power Systems Performance Report at http://www.ibm.com/systems/p/hardware/system_perf.html .
All performance measurements were made with AIX or AIX 5L operating systems unless otherwise indicated to have used Linux. For new and upgraded systems, AIX
Version 4.3 or AIX 5L were used. All other systems used previous versions of AIX. The SPEC CPU2000, LINPACK, and Technical Computing benchmarks were compiled
using IBM's high performance C, C++, and FORTRAN compilers for AIX 5L and Linux. For new and upgraded systems, the latest versions of these compilers were used: XL
C Enterprise Edition V7.0 for AIX, XL C/C++ Enterprise Edition V7.0 for AIX, XL FORTRAN Enterprise Edition V9.1 for AIX, XL C/C++ Advanced Edition V7.0 for Linux, and
XL FORTRAN Advanced Edition V9.1 for Linux. The SPEC CPU95 (retired in 2000) tests used preprocessors, KAP 3.2 for FORTRAN and KAP/C 1.4.2 from Kuck &
Associates and VAST-2 v4.01X8 from Pacific-Sierra Research. The preprocessors were purchased separately from these vendors. Other software packages like IBM ESSL
for AIX, MASS for AIX and Kazushige Goto’s BLAS Library for Linux were also used in some benchmarks.
For a definition/explanation of each benchmark and the full list of detailed results, visit the Web site of the benchmark consortium or benchmark vendor.
SPEC http://www.spec.org
LINPACK http://www.netlib.org/benchmark/performance.pdf
Pro/E http://www.proe.com
GPC http://www.spec.org/gpc
STREAM http://www.cs.virginia.edu/stream/
Fluent http://www.fluent.com/software/fluent/index.htm
TOP500 Supercomputers http://www.top500.org/
AMBER http://amber.scripps.edu/
FLUENT http://www.fluent.com/software/fluent/fl5bench/index.htm
GAMESS http://www.msg.chem.iastate.edu/gamess
GAUSSIAN http://www.gaussian.com
ANSYS http://www.ansys.com/services/hardware-support-db.htm
Click on the "Benchmarks" icon on the left hand side frame to expand. Click on "Benchmark Results in a Table" icon for benchmark results.
ABAQUS http://www.simulia.com/support/v68/v68_performance.php
ECLIPSE http://www.sis.slb.com/content/software/simulation/index.asp?seg=geoquest&
MM5 http://www.mmm.ucar.edu/mm5/
MSC.NASTRAN http://www.mscsoftware.com/support/prod%5Fsupport/nastran/performance/v04_sngl.cfm
STAR-CD www.cd-adapco.com/products/STAR-CD/performance/320/index/html
NAMD http://www.ks.uiuc.edu/Research/namd
HMMER http://hmmer.janelia.org/
http://powerdev.osuosl.org/project/hmmerAltivecGen2mod
© 2010 IBM Corporation 74
Power Systems Technical University
Revised April 2, 2007
Notes on performance estimates
rPerf for AIX
rPerf (Relative Performance) is an estimate of commercial processing performance relative to other IBM UNIX systems. It is derived from an
IBM analytical model which uses characteristics from IBM internal workloads, TPC and SPEC benchmarks. The rPerf model is not
intended to represent any specific public benchmark results and should not be reasonably used in that way. The model simulates some of
the system operations such as CPU, cache and memory. However, the model does not simulate disk or network I/O operations.
rPerf estimates are calculated based on systems with the latest levels of AIX and other pertinent software at the time of system
announcement. Actual performance will vary based on application and configuration specifics. The IBM eServer pSeries 640 is the
baseline reference system and has a value of 1.0. Although rPerf may be used to approximate relative IBM UNIX commercial processing
performance, actual system performance may vary and is dependent upon many factors including system hardware configuration and
software design and configuration. Note that the rPerf methodology used for the POWER6 systems is identical to that used for the
POWER5 systems. Variations in incremental system performance may be observed in commercial workloads due to changes in the
underlying system architecture.
All performance estimates are provided "AS IS" and no warranties or guarantees are expressed or implied by IBM. Buyers should consult
other sources of information, including system benchmarks, and application sizing guides to evaluate the performance of a system they are
considering buying. For additional information about rPerf, contact your local IBM office or IBM authorized reseller.
========================================================================
CPW for IBM i
Commercial Processing Workload (CPW) is a relative measure of performance of processors running the IBM i operating system.
Performance in customer environments may vary. The value is based on maximum configurations. More performance information is
available in the Performance Capabilities Reference at: www.ibm.com/systems/i/solutions/perfmgmt/resource.html
Ibm spectrum scale fundamentals workshop for americas part 4 Replication, Str...Ibm spectrum scale fundamentals workshop for americas part 4 Replication, Str...
Ibm spectrum scale fundamentals workshop for americas part 4 Replication, Str...
 
Ibm spectrum scale fundamentals workshop for americas part 5 spectrum scale_c...
Ibm spectrum scale fundamentals workshop for americas part 5 spectrum scale_c...Ibm spectrum scale fundamentals workshop for americas part 5 spectrum scale_c...
Ibm spectrum scale fundamentals workshop for americas part 5 spectrum scale_c...
 
Ibm spectrum scale fundamentals workshop for americas part 6 spectrumscale el...
Ibm spectrum scale fundamentals workshop for americas part 6 spectrumscale el...Ibm spectrum scale fundamentals workshop for americas part 6 spectrumscale el...
Ibm spectrum scale fundamentals workshop for americas part 6 spectrumscale el...
 
Ibm spectrum scale fundamentals workshop for americas part 7 spectrumscale el...
Ibm spectrum scale fundamentals workshop for americas part 7 spectrumscale el...Ibm spectrum scale fundamentals workshop for americas part 7 spectrumscale el...
Ibm spectrum scale fundamentals workshop for americas part 7 spectrumscale el...
 
Ibm spectrum scale fundamentals workshop for americas part 8 spectrumscale ba...
Ibm spectrum scale fundamentals workshop for americas part 8 spectrumscale ba...Ibm spectrum scale fundamentals workshop for americas part 8 spectrumscale ba...
Ibm spectrum scale fundamentals workshop for americas part 8 spectrumscale ba...
 
Ibm spectrum scale fundamentals workshop for americas part 5 ess gnr-usecases...
Ibm spectrum scale fundamentals workshop for americas part 5 ess gnr-usecases...Ibm spectrum scale fundamentals workshop for americas part 5 ess gnr-usecases...
Ibm spectrum scale fundamentals workshop for americas part 5 ess gnr-usecases...
 
Presentation disaster recovery in virtualization and cloud
Presentation   disaster recovery in virtualization and cloudPresentation   disaster recovery in virtualization and cloud
Presentation disaster recovery in virtualization and cloud
 
Presentation disaster recovery for oracle fusion middleware with the zfs st...
Presentation   disaster recovery for oracle fusion middleware with the zfs st...Presentation   disaster recovery for oracle fusion middleware with the zfs st...
Presentation disaster recovery for oracle fusion middleware with the zfs st...
 
Presentation differentiated virtualization for enterprise clouds, large and...
Presentation   differentiated virtualization for enterprise clouds, large and...Presentation   differentiated virtualization for enterprise clouds, large and...
Presentation differentiated virtualization for enterprise clouds, large and...
 
Presentation desktops for the cloud the view rollout
Presentation   desktops for the cloud the view rolloutPresentation   desktops for the cloud the view rollout
Presentation desktops for the cloud the view rollout
 
Presentation design - key concepts and approaches for designing your deskto...
Presentation   design - key concepts and approaches for designing your deskto...Presentation   design - key concepts and approaches for designing your deskto...
Presentation design - key concepts and approaches for designing your deskto...
 

Recently uploaded

WordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your BrandWordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your Brandgvaughan
 
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024BookNet Canada
 
Install Stable Diffusion in windows machine
Install Stable Diffusion in windows machineInstall Stable Diffusion in windows machine
Install Stable Diffusion in windows machinePadma Pradeep
 
Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024Scott Keck-Warren
 
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...shyamraj55
 
My Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationMy Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationRidwan Fadjar
 
Commit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easyCommit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easyAlfredo García Lavilla
 
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Mark Simos
 
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks..."LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...Fwdays
 
Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Commit University
 
Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 3652toLead Limited
 
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)Wonjun Hwang
 
Streamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupStreamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupFlorian Wilhelm
 
Understanding the Laravel MVC Architecture
Understanding the Laravel MVC ArchitectureUnderstanding the Laravel MVC Architecture
Understanding the Laravel MVC ArchitecturePixlogix Infotech
 
APIForce Zurich 5 April Automation LPDG
APIForce Zurich 5 April  Automation LPDGAPIForce Zurich 5 April  Automation LPDG
APIForce Zurich 5 April Automation LPDGMarianaLemus7
 
SAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptxSAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptxNavinnSomaal
 
"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii SoldatenkoFwdays
 
Vertex AI Gemini Prompt Engineering Tips
Vertex AI Gemini Prompt Engineering TipsVertex AI Gemini Prompt Engineering Tips
Vertex AI Gemini Prompt Engineering TipsMiki Katsuragi
 
Developer Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLDeveloper Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLScyllaDB
 

Recently uploaded (20)

WordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your BrandWordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your Brand
 
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
 
Install Stable Diffusion in windows machine
Install Stable Diffusion in windows machineInstall Stable Diffusion in windows machine
Install Stable Diffusion in windows machine
 
Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024
 
Hot Sexy call girls in Panjabi Bagh 🔝 9953056974 🔝 Delhi escort Service
Hot Sexy call girls in Panjabi Bagh 🔝 9953056974 🔝 Delhi escort ServiceHot Sexy call girls in Panjabi Bagh 🔝 9953056974 🔝 Delhi escort Service
Hot Sexy call girls in Panjabi Bagh 🔝 9953056974 🔝 Delhi escort Service
 
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
 
My Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationMy Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 Presentation
 
Commit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easyCommit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easy
 
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
 
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks..."LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
 
Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!
 
Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365
 
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
 
Streamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupStreamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project Setup
 
Understanding the Laravel MVC Architecture
Understanding the Laravel MVC ArchitectureUnderstanding the Laravel MVC Architecture
Understanding the Laravel MVC Architecture
 
APIForce Zurich 5 April Automation LPDG
APIForce Zurich 5 April  Automation LPDGAPIForce Zurich 5 April  Automation LPDG
APIForce Zurich 5 April Automation LPDG
 
SAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptxSAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptx
 
"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko
 
Vertex AI Gemini Prompt Engineering Tips
Vertex AI Gemini Prompt Engineering TipsVertex AI Gemini Prompt Engineering Tips
Vertex AI Gemini Prompt Engineering Tips
 
Developer Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLDeveloper Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQL
 

AIX Performance Updates & Issues 2011

  • 1. © 2010 IBM Corporation AIX Performance Updates & Issues 2011 Session: PE20 Steve Nasypany nasypany@us.ibm.com IBM Advanced Technical Skills
  • 2. Power Systems Technical University. Agenda: AIX 6.1 & POWER7 – nmon & topas updates – Perfstat Library updates – Dynamic Power Save/Active Energy Management – 4-way Simultaneous Multi-threading – Virtual Processor Folding – Utilization Issues – Enhanced Affinity – Partition Placement – 1TB Segment Aliasing – JFS2 i-node/metadata – iostat block IO. AIX 7.1 – Performance tools 1024-way scaling. Java/WAS Performance, Best Practices Links and Java Performance Advisor. FCoE adapter performance. New Free Memory Tool
  • 3. nmon On Demand Recording (ODR). New function ideal for benchmarks, proof-of-concepts and problem analysis. Allows “high-resolution” recordings to be made while in monitoring mode – Records samples at the interactive monitoring rate – AIX 5.3 TL12, AIX 6.1 TL05 and AIX 7.1. Usage – Start nmon, use “[” and “]” brackets to start and end a recording • Records standard background recording metrics, not just what is on screen • You can adjust the recorded sampling interval with -s [seconds] on startup; interactive options “-” and “+” (<shift> +) do NOT change the ODR interval – Generates a standard nmon recording of the format <host>_<YYYYMMDD>_<HHMMSS>.nmon – Tested with nmon Analyser v33C, and works fine
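The ODR naming convention can be illustrated with a short sketch. This is not nmon itself (nmon generates the file on its own); it simply builds a name in the same <host>_<YYYYMMDD>_<HHMMSS>.nmon format, using whatever hostname and timestamp the current machine produces:

```shell
# Build a recording name in the ODR format <host>_<YYYYMMDD>_<HHMMSS>.nmon
# (illustration only - nmon creates this file automatically when ODR ends).
odr_filename() {
  printf '%s_%s.nmon\n' "$(hostname)" "$(date +%Y%m%d_%H%M%S)"
}

odr_filename
```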
  • 4. nmon ODR
  • 5. nmon chaos – to be or not to be logical. Unknown to many customers and the field, the per-CPU metrics in the main nmon panel, and those that are recorded, have always reported logical utilization for active CPUs, whereas AIX tools (topas, sar, mpstat, vmstat and lparstat) were adjusted in AIX 5.3 to report physical utilization – The hardware register represents utilization in terms of absolute core consumption by hardware context thread (SMT) – this is called the Processor Utilization Resource Register (PURR) – Values are relative to physical consumption for that CPU, i.e., 99% busy means nothing without knowing what that CPU’s physical consumption is: 99% of 0.01 physc is perfectly normal behavior in shared LPARs and no reason for alarm – Details of the hardware register are provided in last year’s white paper by Saravanan Devendra http://www.ibm.com/developerworks/wikis/display/WikiPtype/Understanding+Processor+Utilization+on+POWER+Systems+-+AIX. nmon does (and has for many years) provide global utilization values (EC & VP) adjusted for PURR, and the ‘#’ option provides ‘physical’ calculations that match other AIX tools. However, nmon does not provide physical consumption values for each CPU (screens or recording) – it simply notes “PURR Stats” are active – To best understand this, run nmon and topas (or topas -L) and note that wherever CPU values are reported, topas reports ‘physc’ (global utilization) or ‘pc’ (per-CPU utilization)
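The logical-versus-physical point can be made concrete with a quick calculation: a per-CPU busy percentage only means something once it is scaled by that CPU's physical consumption (physc). A minimal sketch, using illustrative values rather than live tool output:

```shell
# Convert a logical busy% into core consumption by scaling with physc.
# The slide's example: 99% busy at physc 0.01 is only ~0.0099 cores,
# which is perfectly normal in a shared LPAR.
logical_to_physical() {
  # $1 = logical busy %, $2 = physical consumption (physc) of that CPU
  awk -v busy="$1" -v physc="$2" 'BEGIN { printf "%.4f\n", busy / 100 * physc }'
}

logical_to_physical 99 0.01   # "99% busy" that consumes almost nothing
logical_to_physical 99 1.00   # the same 99% on a fully backed CPU
```

The same ratio is why capacity planning should use the global physical and entitlement metrics rather than per-logical-CPU percentages.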
  • 6. nmon chaos – to be or not to be logical. A number of customers complained that the nmon main screen did not match other AIX tools, so development decided to switch nmon to use the same calculations as the other AIX tools – These APARs shipped in 2H 2010; many customers noticed this change and complained – We discovered some ISV products and customers actually use per-logical-CPU recording information for capacity planning. This is a bit of a problem: • Anyone doing capacity planning in shared environments should be using the global utilization, physical and entitlement metrics reported by all tools, including nmon recordings • But many customers still do not understand that per-CPU metrics are relative to physical values; they are still living in an AIX 5.2 world – Due to the new complaints, development decided to revert nmon back to the way it was. Those APARs shipped in Q1/2011. Q4/2011 updates provide both styles of per-logical-CPU consumption in nmon recordings: physical PURR numbers appear under the PCPU* and PCPU_ALL (global) tags. For other updates, see Nigel Griffiths’ NMON presentation
  • 7. Dynamic Power Save. POWER6 & POWER7 have the ability to automatically scale down energy usage based on processor utilization and thermal levels. Static Power Saver Mode – Lowers the processor frequency and voltage on a system by a fixed amount – Reduces the power consumption of the system while still delivering predictable performance – The percentage of power saved is predetermined and is not user configurable – Memory is also allowed to enter a low-power state when no access occurs (with supported firmware and DIMMs) – Workload performance is not impacted, though CPU utilization may increase due to the reduced frequency. Dynamic Power Saver Mode – Varies processor frequency and voltage based on the utilization of the system's processors – System utilization is measured based on real-time utilization data – Supports two modes: • Favoring System Performance • Favoring System Power Savings
  • 8. Dynamic Power Save. Whitepaper: IBM EnergyScale for POWER7 Processor-Based Systems http://www-03.ibm.com/systems/power/hardware/whitepapers/energyscale7.html. This capability requires a new type of reporting where processor utilization is provided at both the current running frequency and the rated frequency. Reporting is based on the Processor Utilization Resource Registers (PURR), the accounting framework; see the Processor Utilization in AIX whitepaper for more details – Actual is based on PURR – Normalized is based on the Scaled PURR (SPURR) – Physical processors are reported in each mode – Adding user, system, idle and wait will equal the total entitlement of the partition from an actual and normalized view – Shared, uncapped partitions can exceed entitlement. lparstat (utilization and status): AIX 5.3 TL11 & AIX 6.1 TL04. Statistics are now available in Q4/2011 updates in nmon recordings, under the SCPU* and SCPU_ALL tags
  • 9. Dynamic Power Save - lparstat -E. The idle value in this report has been modified to report the actual entitlement available (available capacity) – be aware of this and do not directly compare it to the legacy lparstat reports (if you go above entitlement, idle will equal 0). Available Capacity = Idle = Entitlement - (user + sys + wait). When the partition is running at a reduced frequency, the actual available capacity (idle) shown by the two counters differs: the current idle capacity is shown by PURR, while the idle value shown by SPURR is what the idle capacity would be (approximately) if the CPU ran at the rated frequency.
    # lparstat -E 1
    System configuration: type=Dedicated mode=Capped smt=Off lcpu=64 mem=262144MB
    Physical Processor Utilisation:
      --------Actual--------              ------Normalised------
      user  sys   wait  idle  freq         user  sys   wait  idle
      ----  ----  ----  ----  ---------    ----  ----  ----  ----
      47.61 6.610 0.004 9.780 3.9GHz[102%] 48.35 6.714 0.004 8.933
      46.24 6.743 0.000 11.02 3.9GHz[102%] 46.96 6.849 0.000 10.19
      47.84 6.651 0.000 9.505 3.9GHz[102%] 48.59 6.756 0.000 8.653
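The Available Capacity formula can be checked numerically. A small helper sketch, assuming an entitlement of 64 cores for this dedicated smt=Off, lcpu=64 sample, fed with the first Actual row for illustration:

```shell
# Available Capacity = Idle = Entitlement - (user + sys + wait),
# clamped at 0 once the partition exceeds entitlement (per the slide).
available_capacity() {
  # $1 = entitlement (cores), $2 = user, $3 = sys, $4 = wait
  awk -v e="$1" -v u="$2" -v s="$3" -v w="$4" \
    'BEGIN { idle = e - (u + s + w); if (idle < 0) idle = 0; printf "%.3f\n", idle }'
}

available_capacity 64 47.61 6.610 0.004   # first Actual row above
available_capacity 4 3 1.5 0              # above entitlement: clamps to 0
```

The first call lands within a few thousandths of the 9.780 the tool itself reported for that interval; the small gap is down to measurement rounding in the displayed columns.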
  • 10. VIOS Monitoring using topas (AIX 6.1 TL04). Run topas -C and press 'v' to show the VIOS Monitoring Panel. All systems must be at AIX 5.3 TL09, VIOS 2.1 or higher to be monitored
  • 11. VIOS Monitoring using topas. From the topas VIOS panel, move the cursor to a particular VIOS server and press 'd' to get detailed monitoring for that server
  • 12. topas Remote CEC & Cluster Views. The AIX 6.1 TL04 CEC function has been expanded to allow viewing of a remote CEC – The topas -C option can attach to remote systems – All partitions sharing that hardware ID will then be monitored. You can now pre-define sets of partitions to make up a Cluster – topas -G – Configuration details at end – SMIT panels available. Firewall support in AIX 6.1 TL06 and AIX 7.1
  • 13. topas CEC (legacy -C)
  • 14. topas CEC (new remote function). Run “topas -C -o xmtopas=ses10.in.ibm.com”
  • 15. topas Cluster Utilization Panel (topas -G). A Cluster can be defined as a group of related partitions or nodes. The Cluster utilization view can show the utilization of either an HACMP Cluster or a user-defined cluster
  • 16. topas Cluster subcommands. Press 'g' to toggle the global section between brief/detailed listing. Press 'd' and 's' to toggle between dedicated-only/shared-only partition listings
  • 17. topas Cluster (remote function). Run “topas -G -o xmtopas=ses12.in.ibm.com”
  • 18. Perfstat Library Overview. Perfstat is a documented system library for collection of AIX performance metrics – Supported since AIX V5 – 32-bit, 64-bit threadsafe API – Supports all performance-related resources – Structures defined in the /usr/include/libperfstat.h header file. Enhanced with additional metrics and new APIs in AIX 6.1 TL07 & AIX 7.1 TL01 – All CPU metrics: global CPU, physical consumption, physical busy, frequency, etc. – Partition configuration metrics, supplementing lpar_get_info() – Support for process-level information outside of the legacy libc “procsinfo” API – Detailed disk statistics (read/write and queue service times) – New API calls for maintaining state data for interval computations (CPU, process, partition) – Host Fabric Interface (POWER 775 IH blade). Examples of API usage are in the AIX pubs and installed at /usr/samples/libperfstat; older examples at http://www.ibm.com/developerworks/wikis/display/WikiPtype/ryo
  • 19. Libperfstat enhancements. Data structures added: perfstat_cpu_util_t (global CPU utilization), perfstat_rawdata_t, perfstat_process_t (process info), perfstat_partition_config_t (partition configuration; lparstat -i, etc). Data structure updated: perfstat_disktotal_t (detailed disk statistics). APIs added: perfstat_partition_config(), perfstat_cpu_util(), perfstat_process(), perfstat_process_util()
  • 20. Data Structure Example - perfstat_cpu_util_t
  • 21. POWER7 Simultaneous Multi-threading Review. POWER7 processors can run in ST, SMT2 and SMT4 modes – Like POWER6, the SMT threads will dynamically adjust based on workload – Applications that are single-process and/or single-threaded may benefit from running in ST mode, particularly if they want to completely consume a single physical core – Multi-process applications may run better in ST mode if there are fewer application processes than cores – Multi-threaded and/or multi-process applications (where there are more than the number of cores) will benefit more from running in SMT2 or SMT4 mode. [Figure: SMT threads 0-3, with primary, secondary and tertiary priorities.] POWER7 threads have different priorities, with primary, secondary and tertiary instances. Work will not be assigned to tertiary threads until enough workload exists to drive the primary and secondary threads (the same threads as in POWER5 & POWER6) – typically ~80%
  • 22. POWER7 Processor Utilization. [Figure: POWER7 core execution units (FX0, FX1, FP0, FP1, LS0, LS1, BRX, CRL) under 4-way SMT.] Calibrated CPU Utilization – The Processor Utilization Resource Register (PURR) hardware counters that are the basis for computing CPU utilization values in all tools have been modified in POWER7 – Internal hardware counters are calibrated against a variety of commercial workloads to more accurately report real-world utilization – When SMT2 or SMT4 is enabled, a single hardware thread context will no longer be reported as consuming 100% of the core – The goal is to provide a linear relationship between utilization values and throughput – Use smtctl -t [1 | 2 | 4] to change SMT mode. Whitepaper on SMT: Simultaneous Multi-Threading on POWER7 Processors by Mark Funk http://www.ibm.com/systems/resources/pwrsysperf_SMT4OnP7.pdf. Whitepaper on utilization: Processor Utilization in AIX by Saravanan Devendran http://www.ibm.com/developerworks/wikis/display/WikiPtype/Understanding+Processor+Utilization+on+POWER+Systems+-+AIX
  • 23. POWER7 Processor Utilization. Simulating a single-threaded process on 1 core, utilization values change – 1 VP system, using simple shell/perl script CPU hogs – Some variability between tools and reports depending on implementation – Nigel's nstress package has a CPU stress tool, ncpu, but it may run more than one thread by default (see command usage). Calibrated PURR applies whether running in POWER7 or POWER6 mode on POWER7 – AIX 5.3 and/or POWER6 mode on POWER7 can only support SMT2. Real-world production workloads will involve dozens to thousands of threads, so many users may not notice any difference. [Figure: one busy hardware thread reports 100% core utilization on POWER6 SMT2, ~70% on POWER7 SMT2 and ~63% on POWER7 SMT4; with all hardware threads busy, each mode reports 100%.]
  • 24. POWER7 SMT4 Enhancement (AIX 6.1 TL06). The original SMT4 algorithm was too aggressive when moving workloads from secondary & tertiary threads to the primary thread when utilization had dropped – The mechanism is known as idle-shedding – This impacted OLTP workloads, so it was disabled in favor of a load-based algorithm. Field experience on large shared-pool systems uncovered scenarios where workload changes did not optimally result in switching from SMT4 to ST mode – A secondary/tertiary context would hold a thread longer than desired when the primary became idle – The idle-shedding algorithm is now enabled, enhanced to support OLTP workloads and prevent threads from sticking to secondary/tertiary contexts. These are optimizations applied from experience with “real world” customer workloads on POWER7. APAR IZ97088
  • 25. POWER7 SMT - 720 Firmware Bug. There is a Hypervisor dispatch bug where only a single thread, rather than all SMT threads, is evaluated to determine the workload; SMT threads with work are ignored. The issue appears in 720_064 and later levels and is fixed in 720_101. It does not exist in 710 or 730 firmware levels. http://www-304.ibm.com/webapp/set2/sas/f/power5cm/power7.html. It can impact single or multiple LPARs while others are not affected. There is no clear diagnosis from the OS level, but the primary complaint from customers is that they see distinct differences in application latency between LPARs and have no identifiable resource constraints – Some data appears to show that all VPs are being dispatched to the primary SMT thread(s), so a normally multi-process or multi-threaded workload is very busy on the primary thread(s) while the secondary/tertiary threads are idle. sar -P ALL or mpstat output for all logical CPUs might show little or no distribution across SMT threads
  • 26. POWER7 SMT – Application/DB settings. Where customers have seen performance differences between other systems and POWER7 with SMT4, we have found storage differences or software settings to be the primary factor for any delta. Examples: – One DB environment used Asynchronous IO, whereas the POWER7 system did not. When the systems were made comparable, the POWER7 system performed as expected. – Oracle's CPU_COUNT (init.ora) and PARALLEL_THREADS_PER_CPU parameters determine the factor of parallelism: • Number of Virtual Processors * SMT Setting * PARALLEL_THREADS_PER_CPU • Our ATS Software Specialists have additional material on these settings for Oracle. These parameters can have a large impact on performance and must be taken into account before believing something is wrong with SMT
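The parallelism arithmetic on this slide reduces to a one-line product. A sketch of the formula as stated; the example values (8 VPs, SMT4, PARALLEL_THREADS_PER_CPU of 2) are illustrative and not taken from the slide:

```shell
# Degree of parallelism derived from the partition configuration:
#   Virtual Processors * SMT setting * PARALLEL_THREADS_PER_CPU
parallel_capacity() {
  # $1 = virtual processors, $2 = SMT mode (1, 2 or 4), $3 = PARALLEL_THREADS_PER_CPU
  echo $(( $1 * $2 * $3 ))
}

parallel_capacity 8 4 2   # 8 VPs in SMT4 with PARALLEL_THREADS_PER_CPU=2
```

The same partition moved from SMT2 to SMT4 doubles this product, which is why these settings must be reviewed before concluding anything about SMT itself.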
  • 27. Virtual Processor Folding - Review. Virtual Processor Folding is the technology that consolidates threads onto the minimum number of Virtual Processors (VPs) required to support a workload – Each virtual processor can consume a maximum of one physical processor – The operating system constantly assesses workload requirements and folds or unfolds VPs as required – A response to customers allocating excessive VPs vs. the physical cores available in a shared pool – Enabled by default since AIX 5.3 ML03 – Also allows dedicated partitions to donate free cycles to shared pools – Dedicated systems do in fact run under VPs; they are just not enabled for folding by default – All of the SMT threads associated with a physical core must be quiesced before a VP can be folded – The technology aids the PowerVM hypervisor in putting physical cores into lower energy levels, presuming all the VPs on different partitions within a shared pool associated with a physical core are “foldable”
  • 28. © 2010 IBM Corporation - Power Systems Technical University
POWER7 Virtual Processor Folding - Algorithm
Every second, the OS calculates the physical utilization for the last second
– VPs are activated based on utilization thresholds and the vpm_xvcpus tunable setting
– Where the schedo vpm_xvcpus setting:
  • Defaults to 0 (enabled)
  • Is disabled with -1
  • Can be set to a positive whole number to increase the number of active VPs
Folding is activated and deactivated 1 VP at a time; even in the case where utilization drops to idle, the VPs fold one at a time
The legacy threshold at which default settings would trigger another VP was a utilization level of ~80%.
– This threshold has changed, and may evolve at any time
– AIX 6.1 TL05 is more aggressive about unfolding Virtual Processors
Non-IBM ISVs have largely adopted IBM’s recommendations
http://www.oracle.com/technetwork/database/clusterware/overview/rac-aix-system-stability-131022.pdf
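The unfold decision described above can be approximated with a simple model. This is an illustrative sketch only: the real AIX algorithm is more sophisticated and its thresholds change between releases; the only inputs taken from the slide are the ~80% legacy threshold and the vpm_xvcpus semantics.

```python
import math

def active_vps(physical_busy, max_vps, threshold=0.80, vpm_xvcpus=0):
    """Approximate the number of unfolded Virtual Processors.

    physical_busy: physical processors consumed over the last second
    max_vps:       Virtual Processors configured for the partition
    threshold:     utilization level (~80%) at which another VP unfolds
    vpm_xvcpus:    schedo tunable; -1 disables folding entirely,
                   a positive value keeps that many extra VPs active
    """
    if vpm_xvcpus == -1:  # folding disabled: all VPs stay unfolded
        return max_vps
    # enough VPs so each runs below the unfold threshold, plus the tunable delta
    needed = math.ceil(physical_busy / threshold) + max(vpm_xvcpus, 0)
    return max(1, min(needed, max_vps))

print(active_vps(3.2, 8))  # 4 VPs unfolded for 3.2 physical processors of work
```

For example, a nearly idle partition folds down to a single VP, while setting vpm_xvcpus=-1 keeps all eight unfolded regardless of load.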
  • 29. © 20101IBM Corporation29 Power Systems Technical UniversityPower Systems Technical University POWER7 Virtual Processor Folding - Cost Disabling folding will result in: – Overriding optimizations built into the OS schedulers – All VPs being dispatched to the hypervisor, whether they have work to do or not – More hypervisor overhead, possible impact on physical resource affinity The upside to disabling folding is that it can lead to better performance when lpar(s) are perfectly sized. This typically applies to performance benchmarks and not a mix of real-world, traditional Unix production workloads sharing CPU resources. – If the lpars are perfectly sized, and the pool is never constrained, customers may not notice much of a performance hit – When the pool or enough lpars are constrained, excess VPs on other lpars will hurt performance Folding is a real-world performance feature Disabling folding will adversely impact most environments and may result in IBM Support refusing to analyze a PERFPMR until collections have been performed with restricted tunables reset to their defaults
  • 30. © 20101IBM Corporation30 Power Systems Technical UniversityPower Systems Technical University Utilization Issues - 2011 AIX 6.1 TL05 SP6 defects can inflate physical utilization – Symptom is high entitlement/physical consumption AND high %Idle – very obviously wrong – APAR IZ94768 in latest TL06 Idle kernel proc looping in dispatch cycles – APAR IV01111: WAITPROC IDLE LOOPING CONSUMES CPU Applicable to multi-node systems only, and if encountering problems (see Enhanced Affinity section) – APAR IV06194: SRAD LOAD BALANCING ISSUES ON SHARED LPARS Because newer AIX 6.1 levels are more aggressive at unfolding Virtual Processors, comparisons with older AIX levels may cause confusion – Noticed in some POWER6 to POWER7 migrations – Physical consumption is higher, but User and System time are lower (system is more idle) – Review utilization or physical busy metrics before assuming something is wrong
  • 31. © 2010 IBM Corporation - Power Systems Technical University
Utilization Issues - 2011
Many customers do not know that physical busy, the User + System percentage of physical consumption, is reported by lparstat:

> lparstat 1
System configuration: type=Shared mode=Uncapped smt=4 lcpu=48 mem=49152MB psize=15 ent=1.20
%user %sys %wait %idle physc %entc lbusy   app  vcsw phint %nsp
----- ---- ----- ----- ----- ----- ----- ----- ----- ----- ----
 61.5  1.3   0.0  37.1  1.36 113.2  15.1 11.39 13333    26   95
 61.4  1.2   0.0  37.4  1.35 112.2  14.3 11.43 13664    21   96
 59.4  3.0   0.0  37.6  1.39 115.9  15.0 11.36 12400    16   95

> lparstat -l 1
System configuration: type=Shared mode=Uncapped smt=4 lcpu=48 mem=49152MB psize=15 ent=1.20
%user %sys %wait %idle physc %entc pbusy   app  vcsw phint %nsp
----- ---- ----- ----- ----- ----- ----- ----- ----- ----- ----
 14.6  0.2   0.0  85.2  1.35 112.1  84.7 11.29 13461    10   96
 14.1  0.0   0.0  85.9  1.38 115.1  86.4 11.22 12677    32   95
 14.0  0.4   0.0  85.6  1.36 113.5  85.4 11.38 13397    16   96

NOTE: -l results in utilization values switching to logical, whereas the default shows physical “PURR” utilization
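The relationship between the physc and %entc columns in the output above is simple arithmetic: entitlement consumption is physical consumption divided by entitled capacity. A minimal sketch, using numbers from the sample output:

```python
def entc(physc, ent):
    """Entitlement consumption (%entc) = physical processors consumed / entitled capacity."""
    return physc / ent * 100

# First sample line above: physc=1.36 against ent=1.20
print(round(entc(1.36, 1.20), 1))  # 113.3 (lparstat shows 113.2 because it rounds physc for display)
```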
  • 32. © 2010 IBM Corporation - Power Systems Technical University
POWER7 Prefetch
POWER prefetch instructions can be used to mask latencies of requests to the memory controller and fill cache.
– The POWER7 chip can recognize memory access patterns and initiate prefetch instructions automatically.
– Control over how aggressively the hardware prefetches (i.e., how many cache lines will be prefetched for a given reference) is governed by the Data Streams Control Register (DSCR).
The dscrctl command can be used to query and set the system-wide DSCR value:
# dscrctl -q
Current DSCR settings:
Data Streams Version = V2.06
number_of_streams = 16
platform_default_pd = 0x5 (DPFD_DEEP)
os_default_pd = 0x0 (DPFD_DEFAULT)
A system administrator can change the system-wide value using the dscrctl command:
# dscrctl [-n | -b] -s <value>
Disengage the data prefetch feature: dscrctl -n -s 1
Return to the default: dscrctl -n -s 0
This is a dynamic system-wide setting. Consult the AIX Release Notes and the Performance Guide for HPC Applications on IBM Power 755 for more information
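The prefetch depth values passed to dscrctl -s follow the default prefetch depth (DPFD) encoding. Only 0x0 (DPFD_DEFAULT) and 0x5 (DPFD_DEEP) are confirmed by the dscrctl -q output above; the remaining names in this sketch are assumptions based on the Power ISA depth encoding, listed for illustration only.

```python
# Default prefetch depth (DPFD) encoding for the low-order DSCR bits.
# 0x0 and 0x5 match the dscrctl -q output above; the intermediate names
# are assumptions following the Power ISA depth encoding.
DPFD = {
    0x0: "DPFD_DEFAULT",     # use the platform default depth
    0x1: "DPFD_NONE",        # prefetch disabled (dscrctl -n -s 1)
    0x2: "DPFD_SHALLOWEST",
    0x3: "DPFD_SHALLOW",
    0x4: "DPFD_MEDIUM",
    0x5: "DPFD_DEEP",        # the platform default on the system above
    0x6: "DPFD_DEEPER",
    0x7: "DPFD_DEEPEST",
}

print(DPFD[0x5])  # DPFD_DEEP
```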
  • 33. © 2010 IBM Corporation - Power Systems Technical University
POWER7 Enhanced Affinity
Affinity is a thread’s relationship with CPU and memory resources
– Large partitions and shared pools will span chips and nodes
– Maintaining proximity to a set of resources provides optimal performance
– POWER7 and AIX 6.1 provide much better instrumentation for affinity enhancements than was previously available
Enhanced Affinity Summary
– Memory and CPU resources are localized, and form Affinity Domains
  • Resource Allocation Domain (RAD): a collection of physical resources
  • Scheduler Resource Allocation Domain (SRAD): the collection of system resources that is the basis for most resource allocation and scheduling activities performed by the kernel
– An Affinity Domain (home “node”) is assigned to each thread at startup
  • The thread’s private data is affinitized to its home node
  • Threads may temporarily execute remotely, but will eventually return to their home SRAD
  • A single-threaded process’s application data heap will be placed on its home SRAD
  • Multi-threaded processes will be balanced across SRADs depending upon footprint
  • 34. © 2010 IBM Corporation - Power Systems Technical University
Enhanced Affinity
Resource affinity structures are used by the Enhanced Affinity function to help maintain locality for threads to hardware resources. New terms describe the distance between two resources in a two- or three-tier affinity environment (POWER7):
– 2-tier for low-end systems (blades, 710, 720, 730, 740, 750, 755): Local resources have affinity within a chip; Far resources are outside the chip
– 3-tier for multi-node systems (770, 780, 795): Local resources have affinity within a chip; Near resources share the same node/book; Far resources are outside the node/book
AIX Topology Service
– System Detail Level (SDL) is used to identify local (chip), near (within node) and far (external node) resources
– The “REF” System Detail Level is used to identify near/far memory boundaries (nodes)
A new tool, lssrad, displays hierarchy and topology for memory and the scheduler. When dynamically changing CPU/memory configurations, lssrad output can show the system’s balance. Affinity metrics can be monitored in dedicated or shared partitions, but a shared partition’s layout is not a 1:1 mapping to the physical layout
  • 35. © 2010 IBM Corporation - Power Systems Technical University
System Topology & lssrad -va (4-node 770)

REF SRAD      MEM     LCPU
0   0     28250.00    0-31
    1     27815.00    32-63
1   2     28233.00    64-95
    3     27799.00    96-127
2   4     28281.00    128-159
    5     27799.00    160-191
3   6     28016.00    192-223
    7     27783.00    224-255

[Topology diagram: four nodes REF0-REF3, each containing two POWER7 chips (CPU 0-7) and two 32GB memory banks (MEM 0-7)]
  • 36. © 20101IBM Corporation36 Power Systems Technical UniversityPower Systems Technical University Enhanced Affinity: topas -M You can select columns to sort on with tab key
  • 37. © 2010 IBM Corporation - Power Systems Technical University
Enhanced Affinity: mpstat
System configuration: lcpu=8 ent=1.0 mode=Uncapped

cpu    cs   ics .. S0rd  S1rd S2rd S3rd S4rd S5rd ilcs vlcs S3hrd S4hrd S5hrd
  0 12344  4498 .. 95.0   0.0  0.0  5.0  0.0  0.0    3 3095 100.0   0.0   0.0
  1   112    56 .. 99.6   0.4  0.0  0.0  0.0  0.0    0  139 100.0   0.0   0.0
  2     0     0 .. 50.0  50.0  0.0  0.0  0.0  0.0    0   90 100.0   0.0   0.0
  3     0     0 ..  0.0 100.0  0.0  0.0  0.0  0.0    0   90 100.0   0.0   0.0
  4 12109  4427 .. 94.9   0.0  0.0  5.1  0.0  0.0    1 3053 100.0   0.0   0.0
  5   326   163 .. 99.9   0.1  0.0  0.0  0.0  0.0    0  250 100.0   0.0   0.0
  6     0     0 ..  0.0 100.0  0.0  0.0  0.0  0.0    0   90 100.0   0.0   0.0
  7     0     0 ..  0.0 100.0  0.0  0.0  0.0  0.0    0   90 100.0   0.0   0.0
ALL 24891  9144 .. 95.0   0.0  0.0  5.0  0.0  0.0    4 6897 100.0   0.0   0.0

mpstat (-a, -d flags) displays logical-CPU SRAD affinity
– Home SRAD redispatch statistics
  • S3hrd – local
  • S4hrd – near (3-tier only)
  • S5hrd – far
  • 38. © 20101IBM Corporation38 Power Systems Technical UniversityPower Systems Technical University Enhanced Affinity: svmon Global Report – Affinity domains are represented based on SRADID Memory information of each SRAD: total, used, free, filecache Logical CPUs in each SRAD Process Report – Displays the ‘home SRAD’ affinity statistics for the threads of a process – Also provides an application’s memory placement policies
  • 39. © 2010 IBM Corporation - Power Systems Technical University
Enhanced Affinity: svmon

# svmon -G -O affinity=on,unit=MB
              size     inuse      free      pin   virtual  available  mmode
memory    32768.00   3353.63  29414.37  1850.54   3195.75   29410.62    Ded
pg space   5408.00      10.6

             work  pers    clnt   other
pin        896.20     0       0  954.34
in use    3195.75  0.10  157.78

Domain affinity      free     used     total  filecache  lcpus
0                24011.50  1837.38  25848.88     117.41  0 1 2 3
1                 5403.90   560.89   5964.79      26.8   4 5 6 7 8 9 10

# lssrad -va
REF1 SRAD       MEM   CPU
0
        0  25848.88   0-3
        1   5964.79   4-10
  • 40. © 2010 IBM Corporation - Power Systems Technical University
Enhanced Affinity: svmon

# svmon -P 3670212 -O threadaffinity=on
Pid      Command  Inuse    Pin  Pgsp  Virtual
3670212  rmcd     20334  10793     0    20162

     Tid  HomeSRAD  LocalDisp  NearDisp  FarDisp
16449773         0        602       672        0
18808987         0         41        41        0
 7864593         1         23         0        0
 7930141         1         21         0        0

# svmon -P 1 -O affinity=detail
Pid  Command  Inuse    Pin  Pgsp  Virtual
1    init     18654  10786     0    18636

Domain affinity  Npages  Percent  Private  lcpus
1                  9914     53.2       31  2 3 4
0                  8722     46.8      147  0 1
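A quick way to turn the threadaffinity counters above into a health metric is the fraction of dispatches that landed on the thread's home SRAD. This helper is a hypothetical post-processing sketch, not part of svmon; the sample numbers come from the output above.

```python
def local_dispatch_pct(local, near, far):
    """Percentage of dispatches that occurred on the thread's home SRAD."""
    total = local + near + far
    return 100.0 * local / total if total else 100.0

# Thread 16449773 above: 602 local, 672 near, 0 far dispatches
print(round(local_dispatch_pct(602, 672, 0), 1))  # 47.3
```

A value near 100% means the scheduler is keeping the thread close to its home-node memory; the first thread above is being redispatched off-node more than half the time.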
  • 41. © 20101IBM Corporation41 Power Systems Technical UniversityPower Systems Technical University POWER7 Affinity & Partition Placement POWER6 and earlier, Hypervisor (PHYP) minimized the number of affinity domains (books/drawers/chips) per partition POWER7 Hypervisor improves affinity by selecting optimized number of domains –Ensures cores/memory allocated from each domain PHYP, AIX 7 and IBM i v7r1m0 changed to support both a primary and secondary affinity domains. –For 795, chip is primary and book is secondary domain. –OS enforces affinity in SPLPAR partitions in POWER7 (not done in P5 & P6) PHYP, AIX 7 and IBM i v7r1m0 also added support for home node per shared virtual processor (previous PHYP internally supported home node per partition) to improve affinity New System Partition Processor Limit (SPPL) gives direction to PHYP whether to contain partitions to minimum domains or spread partitions across multiple domains. –Applies to shared or dedicated environments
  • 42. © 2010 IBM Corporation - Power Systems Technical University
Setting System Partition Processor Limit (SPPL) on the HMC
The Managing the HMC infocenter topic provides a reference for the System Partition Processor Limit (SPPL): http://publib.boulder.ibm.com/infocenter/powersys/v3r1m5/index.jsp?topic=/p7ha1/smProperties.htm
(Systems Management > Properties > Advanced tab)
  • 43. © 2010 IBM Corporation - Power Systems Technical University
Placement with Max Partition Size = 32
[Diagram: LPARs of 8, 16, 24, 26, 28 and 32 VPs placed across Nodes 1-8, leaving 8, 6 and 4 free cores on some nodes]
Partitions will be contained in the minimum number of nodes. If a partition cannot be contained within a single node, then it will be spread across a minimum number of nodes.
  • 44. © 2010 IBM Corporation - Power Systems Technical University
Partition Placement/Licensing
New firmware (eFW7.3 and later):
At system power-on, treat all processors and memory as licensed
Place all the partitions as optimally as possible from a performance viewpoint
May require spreading a partition across multiple chips/drawers/books to ensure memory and processors are on the same domains (i.e., try to ensure that if there is memory from a domain there is also a processor from the domain, and vice versa)
Optimization of other hardware components might also cause spreading of larger partitions across domains (e.g., to provide additional internal bus bandwidth, spread >24-way processor partitions across multiple books)
Unlicense individual processors that have not been assigned to partitions
First choice is to unlicense processors that do not have any memory DIMMs connected to the processor
Second is to spread the unlicensed processors across the domains such that each domain has a similar number of unlicensed processors
  • 45. © 2010 IBM Corporation - Power Systems Technical University
Placement with Max Partition > 32 + Licensing
[Diagram: LPAR1 (20 VPs), LPAR2 (36 VPs), LPAR3 (64 VPs), LPAR4 (16 VPs), LPAR5 (42 VPs) and LPAR6 (12 VPs, memory across 3 books) placed across Nodes 1-8, with between 2 and 12 unlicensed processors per node]
Partitions of 24 or fewer virtual CPUs (VPs) are packed into a single node if there is sufficient memory
Partitions of >24 processors are spread across multiple books to allow for additional bandwidth
Memory would come from the same books where processors are located. Licensed memory is a max value across all 8 books, not specific locations
  • 46. © 2010 IBM Corporation - Power Systems Technical University
POWER7 1TB Segment Aliasing
Workloads with large memory footprints and low spatial locality can perform poorly
– Analysis shows that processor Segment Lookaside Buffer (SLB) faults can take a significant amount of time
– POWER6
  • SLB has 64 entries
  • 20 reserved for the kernel
  • 44 available for user processes -> yields 11GB of accessible memory
  • Many customer workloads do not fit into 11GB
– POWER7
  • SLB has 32 entries - architectural trend towards smaller SLB sizes
  • 20 still reserved for the kernel
  • 12 available for user processes -> yields 3GB of accessible memory
  • Potential for performance regression
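The 11GB and 3GB figures above are just the user-visible SLB entries multiplied by the classic 256MB segment size; with 1TB segments, the same 12 entries cover 12TB. A quick check of that arithmetic:

```python
SEGMENT_256MB = 256  # MB, classic POWER segment size

def addressable_gb(user_slb_entries, segment_mb=SEGMENT_256MB):
    """Memory addressable without SLB faults = SLB entries x segment size."""
    return user_slb_entries * segment_mb / 1024  # GB

print(addressable_gb(44))                      # POWER6: 11.0 GB
print(addressable_gb(12))                      # POWER7: 3.0 GB
print(addressable_gb(12, 1024 * 1024) / 1024)  # POWER7, 1TB segments: 12.0 TB
```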
  • 47. © 2010 IBM Corporation - Power Systems Technical University
Example Process Address Space with 1TB Aliasing
[Diagram: the same address space (kernel, program text, heap, and an 80GB shared memory region) shown with and without 1TB aliasing]
Without aliasing: 3 x 256MB segments (program text) + 320 x 256MB segments (80GB shared memory region) + 8 x 256MB segments (heap) = 331 SLB entries, heavy SLB thrashing
With aliasing: 3 x 256MB segments + 1 x 1TB shared alias segment + 1 x 1TB unshared alias segment = 5 SLB entries, no SLB faults
  • 48. © 2010 IBM Corporation - Power Systems Technical University
POWER7 1TB Segment Aliasing
Feature allows user applications to use 1TB segments
– 12 SLB entries can now address 12TB of memory
– The SLB fault issue is no longer relevant
– Immediate performance boost for applications, new and legacy
– Significant changes under the covers
  • New address space allocation policy
  • Attempts to group address space requests together to facilitate 1TB aliasing
  • Once certain allocation size thresholds have been reached, the OS automatically aliases memory with 1TB aliases
– 256MB segments still exist for handling IO
– Currently only available for 64-bit process shared memory regions
– Default in AIX 7.1, optional in AIX 6.1 TL06
Use & diagnosis
– Engage ATS Software Specialists, ask for Ralf Schmidt-Dannert’s whitepaper
– Diagnosis techniques require use of trace tools for analysis. Example: tprof -T 100000000 -skeux sleep 10
– Review the sleep.prof file for noticeable cpu % in the kernel routine set_smt_pri_user_slb_found()
  • 49. © 2010 IBM Corporation - Power Systems Technical University
JFS2 i-node cache review
The maximum sizes of the JFS2 i-node and metadata caches are set by two ioo tunables:
– j2_inodeCacheSize = 400 (AIX 6.1)
– j2_metadataCacheSize = 400 (AIX 6.1)
– Both default to 400; the units are undocumented and undefined, but we do know that the i-node structure is 1K
The procfs interface is the only simple way to view usage on AIX 6.1:
# cat /proc/sys/fs/jfs2/memory_usage
metadata cache: 48476160
inode cache: 161546240
total: 210022400
Years ago, Ralf Schmidt-Dannert and Doug Ranz investigated the unit settings and determined, for a 10GB system:

j2_inodeCacheSize or    Maximum       Cacheable  Maximum
j2_metadataCacheSize    i-node cache  i-nodes    metadata cache
100                     250 MB        250K       100 MB
200                     500 MB        500K       200 MB
300                     750 MB        750K       300 MB
400 (default)           1.0 GB        1000K      400 MB
1000                    2.5 GB        2500K      1.0 GB
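Based on the table above, the cache limits scale linearly with the tunable: roughly 2.5 MB of i-node cache (2,500 cacheable 1K i-nodes) per unit of j2_inodeCacheSize, and 1 MB per unit of j2_metadataCacheSize. A small sketch of that rule of thumb; this is an extrapolation from the table, not a documented formula.

```python
def jfs2_cache_estimate(tunable_value):
    """Estimate JFS2 cache limits from the ioo tunable value.

    Rule of thumb from the table above: each unit of j2_inodeCacheSize
    allows ~2.5 MB of i-node cache (~2500 i-nodes at 1K each); each unit
    of j2_metadataCacheSize allows ~1 MB of metadata cache.
    """
    inode_cache_mb = tunable_value * 2.5
    cacheable_inodes = tunable_value * 2500
    metadata_cache_mb = tunable_value * 1.0
    return inode_cache_mb, cacheable_inodes, metadata_cache_mb

# Default of 400 -> 1000 MB (1.0 GB) i-node cache, 1,000,000 i-nodes, 400 MB metadata
print(jfs2_cache_estimate(400))  # (1000.0, 1000000, 400.0)
```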
  • 50. © 2010 IBM Corporation - Power Systems Technical University
JFS2 i-node cache review
The general answer to the question “svmon isn’t reporting all my memory; where’s the rest of it?” is to check i-node/metadata cache usage
The behavior of the AIX JFS2 i-node and metadata caches can cause problems on customer systems
– Customers may note system memory increasing over time with no clear explanation; vmstat and svmon do not report i-node or metadata usage
– Kernel pinned heap is used for these structures
– Cache usage is not reduced when files are deleted, and remounting filesystems does not release the i-node or metadata cache
– New i-nodes use new memory until reaching the capacity defined by the tunables. At that point, older entries may be recycled
– These tunables are dynamic, but the general thinking is that it is not a good idea to tune an active system down once it has reached these limits. If you are running a 10% i-node cache on a very large system and want to reduce that, open a PMR or wait for a reboot.
Undocumented formulas generally appear to allow the defaults to consume these percentages of real memory:
– i-node cache: 10% (AIX 6.1) and 5% (AIX 7.1)
– Metadata cache: 4% (AIX 6.1) and 2% (AIX 7.1)
This web link provides an online reference for customers: http://www.ibm.com/developerworks/wikis/display/WikiPtype/AIXJ2inode
  • 51. © 20101IBM Corporation51 Power Systems Technical UniversityPower Systems Technical University Support for enhanced iostat metrics AIX 7.1 and AIX 6.1 TL06 (SP2) The option 'b' provides detailed I/O statistics for block devices. Enabled for root and non-root user Block IO stats collection is disabled by default – root user can enable with: raso -o biostat=1 Syntax: iostat -b [block Device1 [block Device [...]]] Interval [sample]
  • 52. © 20101IBM Corporation52 Power Systems Technical UniversityPower Systems Technical University Raso tunable - biostat Purpose: Specifies whether block IO device statistics collection should be enabled or not. Values: Default: 0 Range: 0, 1 Type: Dynamic Unit: boolean Tuning: This tunable is useful in analyzing performance/utilization of various block IO devices. If this tunable is enabled, we can use iostat -b to show IO statistics for various block IO devices. Possible Value: 1 : Enabled 0 : Disabled
  • 53. © 2010 IBM Corporation - Power Systems Technical University
Block IO Device Utilization Report
The Block IO Device report provides statistics on a per-IO-device basis. The report has the following format:
device   Name of the device
reads    Number of read requests over the interval
writes   Number of write requests over the interval
bread    Number of bytes read over the interval
bwrite   Number of bytes written over the interval
rserv    Read service time in milliseconds per read over the interval
wserv    Write service time in milliseconds per write over the interval
rerr     Number of read errors over the interval
werr     Number of write errors over the interval
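Given the fields documented above, a report line can be post-processed easily. A hypothetical parsing sketch: the field names and order are taken from the list above, but the exact column spacing of real iostat -b output may differ, and the sample line is invented for illustration.

```python
FIELDS = ("device", "reads", "writes", "bread", "bwrite",
          "rserv", "wserv", "rerr", "werr")

def parse_biostat_line(line):
    """Split one iostat -b data line into a dict keyed by the fields above."""
    row = dict(zip(FIELDS, line.split()))
    # everything except the device name is numeric
    for key in FIELDS[1:]:
        row[key] = float(row[key])
    return row

row = parse_biostat_line("hdisk0 120 30 491520 122880 4.2 6.1 0 0")
print(row["device"], row["bread"])  # hdisk0 491520.0
```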
  • 54. © 20101IBM Corporation54 Power Systems Technical UniversityPower Systems Technical University iostat –b sample output
  • 55. © 20101IBM Corporation55 Power Systems Technical UniversityPower Systems Technical University Support for 1024 Logical CPUs – AIX 7.1 Growth of physical core support and SMT4 for POWER7 will drastically increase the number of logical CPUs on systems Presents difficult challenge in analysis of very large partitions Processor tools need to support new filtering and output options for analysis – Sorting and filtering options for sar and mpstat – Screen freezing, scrolling, paging for topas – New XML formatted reports (vmstat, iostat, mpstat, sar, lparstat)
  • 56. © 2010 IBM Corporation - Power Systems Technical University
Support for 1024 Logical CPUs – sar filtering
sar –O option for sorting and filtering
sar [ { -A [ -M ] | [ -a ] [ -b ] [ -c ] [ -d ] [ -k ] [ -m ] [ -q ] [ -r ] [ -u ] [ -v ] [ -w ] [ -y ] [ -M ] } ]
    [ -P processoridentifier, ... | ALL | RST [ -O {sortcolumn=col_name[,sortorder={asc|desc}][,topcount=n]} ] ]
    [ [ -@ wparname ] [ -e [YYYYMMDD]hh[:mm[:ss]] ] [ -f file ] [ -i seconds ] [ -o file ] [ -s [YYYYMMDD]hh[:mm[:ss]] ] ] [ -x ] [ Interval [ Number ] ]
-O Options allows users to specify the command option: -O options=value...
The supported options are:
sortcolumn = name of the metric in the sar command output
sortorder = [asc|desc]; the default value of sortorder is “desc”
topcount = number of CPUs to be displayed in the sorted sar output
To display the output sorted on the cswch/s column with the -w flag, enter the following command:
sar -w -P ALL -O sortcolumn=cswch/s 1 1
To list the top ten CPUs, sorted on the scall/s column, enter the following command:
sar -c -O sortcolumn=scall/s,sortorder=desc,topcount=10 -P ALL 1
  • 57. © 2010 IBM Corporation - Power Systems Technical University
Support for 1024 Logical CPUs – mpstat filtering
mpstat –O option for sorting and filtering
mpstat [ { -d | -i | -s | -a | -h } ] [ -w ] [ -O Options ] [ -@ wparname ] [ interval [ count ] ]
-O Options specifies the command option: -O options=value...
The supported options are:
sortcolumn = name of the metric in the mpstat command output
sortorder = [asc|desc]; the default value of sortorder is “desc”
topcount = number of CPUs to be displayed in the sorted mpstat output
To see the output sorted on the cs (context switches) column, enter the following command:
mpstat -d -O sortcolumn=cs
To see the list of the top 10 CPUs, enter the following command:
mpstat -a -O sortcolumn=min,sortorder=desc,topcount=10
  • 58. © 2010 IBM Corporation - Power Systems Technical University
Support for 1024 Logical CPUs – topas
topas panel freezing:
– 'Space Bar' is used to toggle a screen freeze
– Sort using left/right arrows to select a column
– Scroll using PgUp/PgDn
  • 59. © 2010 IBM Corporation - Power Systems Technical University
Support for 1024 Logical CPUs – XML output
XML output for the commands lparstat, vmstat, iostat, mpstat, sar
The default output file name is command_DDMMYYHHMM.xml and is generated in the current directory
Users can specify the output file name and directory using “-o”:
lparstat -X -o /tmp/lparstat_data.xml
XML schema files are shipped with the base OS under /usr/lib/perf:
iostat_schema.xsd, lparstat_schema.xsd, mpstat_schema.xsd, sar_schema.xsd, vmstat_schema.xsd
Currently, the XML output generated by these commands is not validated against the schema; it is up to the application to do this validation
sar [-X [-o filename]] [interval [count]]
mpstat [-X [-o filename]] [interval [count]]
lparstat [-X [-o filename]] [interval [count]]
vmstat [-X [-o filename]] [interval [count]]
iostat [-X [-o filename]] [interval [count]]
  • 60. © 2010 IBM Corporation - Power Systems Technical University
Miscellaneous AIX 6.1 TL06 & AIX 7.1
CPU Interrupt Disabling
– Minimize interrupt jitter that impacts application performance
– Quiesce external interrupts on a set of logical processors
– Control interface (subroutine, kernel service, command line)
– POWER5 and later systems, dedicated or shared
Kernel memory pinning
– Default in AIX 7.1, option in AIX 6.1
– vmo vmm_klock_mode tunable
– Should you do this in AIX 6.1? No: you should be talking to SupportLine about problems involving kernel segment(s) being paged before this is done. But there is more flexibility now to protect kernel memory.
Hot Files Detection Subsystem
– A new subsystem for detecting hot files in JFS2 filesystems
– Currently, only a program interface using ioctl() calls on active file descriptors. Contact us if you are interested in using this interface.
– For the system header and structures, see /usr/include/sys/hfd.h
The Perfstat and PTX SPMI APIs have been extended to cover the various new technologies supported (Active Memory Expansion, etc.)
  • 61. © 2010 IBM Corporation - Power Systems Technical University
Java & POWER7
Java 6 SR7 is enhanced for POWER7 instructions, pre-fetch and autonomic 64KB page sizes
http://www.ibm.com/developerworks/java/jdk/aix/faqs.html
Best Practices for Java performance on POWER7
https://www.ibm.com/developerworks/wikis/display/LinuxP/Java+Performance+on+POWER7
Websphere Application Server (WAS)
– V7 & V8 provide specific exploitation of POWER6 & POWER7 instructions and 64KB page sizes
– V8 includes scaling, footprint reduction, and Java Persistence API (JPI) improvements
Java Performance Advisor (JPA)
– Provides performance recommendations for Java/WAS applications on AIX
https://www.ibm.com/developerworks/wikis/display/WikiPtype/Java+Performance+Advisor
  • 62. © 20101IBM Corporation62 Power Systems Technical UniversityPower Systems Technical University Java Performance Advisor
  • 63. © 2010 IBM Corporation - Power Systems Technical University
FC over Ethernet – 5708 Adapter

TCP STREAM (MB/s):
                     Single port           Both ports
Direction  Sessions  1500 MTU  9000 MTU    1500 MTU  9000 MTU
send       1          870      1076        1647      1311
send       4         1068      1111        1668      1402
receive    1          785      1173        1393      1015
receive    4          925      1179        1393       992
duplex     1         1439      1733        2106      1712
duplex     4         1527      1756        2176      1914

TCP_Request & Response (1-byte message):
Sessions  Single port  Both ports
1         13324 TPS*   26171 TPS
50        182062 TPS   237415 TPS

Host: P7 750 4-way, SMT-2, 3.3 GHz, AIX 5.3 TL12, dedicated LPAR, dedicated adapter
Client: P6 570 with two single-port 10 Gb (FC 5769), point-to-point wiring (no ethernet switch)
* Single session: 1/TPS round-trip latency is 75 microseconds; default ISNO settings, no interrupt coalescing
  • 64. © 20101IBM Corporation64 Power Systems Technical UniversityPower Systems Technical University FCoE – Ethernet Performance Note that receive is the lowest performance – Lower thruput – Both ports only get slightly more thruput than a single port as sessions are added – Cannot provide 10 Gb receive bandwidth on two ports – A 50% duty cycle should be OK Disk IO does better due to larger blocks/buffers AIX 6.1 SMT4 should have better throughput
  • 65. © 20101IBM Corporation65 Power Systems Technical UniversityPower Systems Technical University New Free Memory Tool Memory Tools – An IBM FTSS maintains a webpage reviewing AIX memory and paging issues http://www.ibm.com/developerworks/wikis/display/WikiPtype/AIXV53memory – He has also developed a script that post-processes various memory tool outputs to provide detailed breakdowns of system and user memory usage • Tool is called pmrmemuse and a wrapper script called showmemsuse https://www.ibm.com/developerworks/wikis/display/WikiPtype/AIXmemuse – Website provides usage, output examples and extensive FAQ on the caveats of svmon-based output
  • 66. © 20101IBM Corporation66 Power Systems Technical UniversityPower Systems Technical University pmrmemuse – System Summary Inuse Pin Pgsp Virtual Inuse Pin Pgsp Virtual Segment Process (4K pages) (4K pages) (4K pages) (4K pages) (MBs) (MBs) (MBs) (MBs) count count ------------------------------------------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- -------- ------- Free (on AIX free list) 39548 0 0 0 154.484 0.000 0.000 0.000 Used in AIX kernel but not in any segment 212304 212304 0 212304 829.312 829.312 0.000 829.312 Used in AIX kernel & extension segments 1279266 947052 31815 1292591 4997.133 3699.422 124.277 5049.184 337 Used in segments shared by several users 27528 0 620 27974 107.531 0.000 2.422 109.273 3 Used for clnt (JFS2 & NFS) file cache segs 177201 0 0 0 692.191 0.000 0.000 0.000 3126 Empty clnt (JFS2 & NFS) file cache segs 0.000 0.000 0.000 0.000 415659 Used for pers (JFS) file cache segments 5 0 0 0 0.020 0.000 0.000 0.000 4 Empty pers (JFS) file cache segs 0.000 0.000 0.000 0.000 5 Used by user wmethods in unshared segments 3945684 404 742594 4251313 15412.828 1.578 2900.758 16606.691 120 12 Used by user wmethods in shared segments 1 0 0 1 0.004 0.000 0.000 0.004 1 2 Used by user oracle in unshared segments 74554 5368 98428 173070 291.227 20.969 384.484 676.055 572 24 Used by user oracle in shared segments 627361 0 479823 760211 2450.629 0.000 1874.309 2969.574 13 89 Used by user root in unshared segments 25784 6436 11055 36485 100.719 25.141 43.184 142.520 261 108 Used by user root in shared segments 1 0 1 2 0.004 0.000 0.004 0.008 2 4 Used by user wmuser in unshared segments 620 24 273 877 2.422 0.094 1.066 3.426 12 3 Used by user wmuser in shared segments 2 0 1 3 0.008 0.000 0.004 0.012 3 6 …. 
Used by user daemon in unshared segments 122 4 519 602 0.477 0.016 2.027 2.352 3 1 Used by user flexsens in unshared segments 74 4 76 132 0.289 0.016 0.297 0.516 2 1 Unused work segments 12706 200 388 13142 49.633 0.781 1.516 51.336 55 Segments found only in svmon -S output 2065 160 45 2107 8.066 0.625 0.176 8.230 16 Empty work segments 0.000 0.000 0.000 0.000 4327 Empty rmap segments 0.000 0.000 0.000 0.000 10 Empty mmap segments 0.000 0.000 0.000 0.000 110 Empty clnt segs found only in svmon -S otpt 0.000 0.000 0.000 0.000 2 Empty work segs found only in svmon -S otpt 0.000 0.000 0.000 0.000 23 Segs not found in svmon -S otpt (see below) 0.000 0.000 0.000 0.000 49 2 ------------------------------------------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- -------- ------- Totals (Free + Used): 6425414 1172020 1365638 6771402 25099.273 4578.203 5334.523 26450.789 424729 258
  • 67. © 20101IBM Corporation67 Power Systems Technical UniversityPower Systems Technical University pmrmemuse – System Summary Totals reported in svmon -G output (for comparison to summary above): Inuse Pin Pgsp Virtual Inuse Pin Pgsp Virtual Segment Process (4K pages) (4K pages) (4K pages) (4K pages) (MBs) (MBs) (MBs) (MBs) count count ------------------------------------------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- -------- ------- Total real memory 6422528 0 0 0 25088.000 0.000 0.000 0.000 Free (on AIX free list) 39548 0 0 0 154.484 0.000 0.000 0.000 Used in AIX kernel but not in any segment 212304 212304 0 212304 829.312 829.312 0.000 829.312 Used for clnt (JFS2 & NFS) file cache 178644 0 0 0 697.828 0.000 0.000 0.000 Used for pers (JFS) file cache 5 0 0 0 0.020 0.000 0.000 0.000 Total memory used 6382980 1172180 1348900 6759780 24933.516 4578.828 5269.141 26405.391 ------------------------------------------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- -------- ------- Information reported in vmo -a and vmstat -v output (for comparison to summary above): Value Value (4K pages) (MBs) -------------------------------------------------------- ---------- ---------- Number of memory pages 6422528 25088.000 Number of lruable pages 6194912 24198.875 minfree (2048) * number of memory pools (6) 12288 48.000 number of free pages 38987 152.293 maxfree (3072) * number of memory pools (6) 18432 72.000 minperm% (3.0%) of number of lruable pages (6194912) 185847 725.965 numperm% (2.2%) of number of lruable pages (6194912) 136288 532.375 maxperm% (90.0%) of number of lruable pages (6194912) 5575421 21778.988 numclient% (2.2%) of number of lruable pages (6194912) 136288 532.375 maxclient% (90.0%) of number of lruable pages (6194912) 5575421 21778.988
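The derived rows in the vmo/vmstat comparison above are straightforward percentage computations over the number of lruable pages. A sketch reproducing them, using 4K pages and the values from the output above:

```python
LRUABLE_PAGES = 6194912  # "Number of lruable pages" from the report above

def pct_pages(percent, lruable=LRUABLE_PAGES):
    """Pages corresponding to a vmo percentage tunable (e.g. minperm%)."""
    return round(lruable * percent / 100)

def pages_to_mb(pages, page_kb=4):
    """Convert a 4K-page count to megabytes."""
    return pages * page_kb / 1024

print(pct_pages(3.0))                         # minperm% -> 185847 pages
print(pct_pages(90.0))                        # maxperm% -> 5575421 pages
print(round(pages_to_mb(pct_pages(3.0)), 3))  # 725.965 MB, matching the report
```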
  • 68. pmrmemuse – Detailed Summary by User

                                                 Inuse        Pin       Pgsp    Virtual      Inuse      Pin      Pgsp   Virtual  Segment  Process
                                            (4K pages) (4K pages) (4K pages) (4K pages)      (MBs)    (MBs)     (MBs)     (MBs)    count    count
------------------------------------------- ---------- ---------- ---------- ---------- ---------- -------- --------- --------- -------- -------
Used by user wmethods in unshared segments     3945684        404     742594    4251313  15412.828    1.578  2900.758 16606.691      120      12
------------------------------------------- ---------- ---------- ---------- ---------- ---------- -------- --------- --------- -------- -------

Vsid   Esid      Type Description               PSize  Inuse    Pin   Pgsp Virtual Comments
------ --------- ---- ------------------------- ----- ------ ------ ------ ------- -----------------------------
9f5819 -         work mmap source                  sm  65536      0      0   65536 65536 x 4 KB = 256.000 MB, pid=4784560: Inuse=6110.3 MB, pscmd=java
aeb72b -         work mmap source                  sm  65536      0      0   65536 65536 x 4 KB = 256.000 MB, pid=4784560: Inuse=6110.3 MB, pscmd=java
c45442 -         work mmap source                  sm  65536      0      0   65536 65536 x 4 KB = 256.000 MB, pid=4784560: Inuse=6110.3 MB, pscmd=java
e6afe2 -         work mmap source                  sm  65172      0   1959   65536 65172 x 4 KB = 254.578 MB, pid=4784560: Inuse=6110.3 MB, pscmd=java
91d991 -         work mmap source                  sm  65078      0   4996   65536 65078 x 4 KB = 254.211 MB, pid=4784560: Inuse=6110.3 MB, pscmd=java
e75a66 -         work mmap source                  sm  65026      0   2435   65536 65026 x 4 KB = 254.008 MB, pid=4784560: Inuse=6110.3 MB, pscmd=java
b414b1 -         work mmap source                  sm  65021      0   2355   65536 65021 x 4 KB = 253.988 MB, pid=4784560: Inuse=6110.3 MB, pscmd=java
c325c7 -         work mmap source                  sm  64996      0   2479   65536 64996 x 4 KB = 253.891 MB, pid=4784560: Inuse=6110.3 MB, pscmd=java
f1d171 -         work mmap source                  sm  64948      0   5770   65536 64948 x 4 KB = 253.703 MB, pid=4784560: Inuse=6110.3 MB, pscmd=java
91af93 -         work mmap source                  sm  64938      0   2586   65536 64938 x 4 KB = 253.664 MB, pid=4784560: Inuse=6110.3 MB, pscmd=java
…
a626a2 11        work text data BSS heap           sm  48275      0  18642   62762 48275 x 4 KB = 188.574 MB, pid=4784560: Inuse=6110.3 MB, pscmd=java
e769e1 -         work mmap source                  sm  48028      0  26780   65536 48028 x 4 KB = 187.609 MB, pid=4784560: Inuse=6110.3 MB, pscmd=java
a4eca1 13        work text data BSS heap           sm   6517      0    768    6803 6517 x 4 KB = 25.457 MB, pid=4784560: Inuse=6110.3 MB, pscmd=java
f0a4f4 9001000a  work shared library data          sm    333      0     70     449 333 x 4 KB = 1.301 MB, pid=4784560: Inuse=6110.3 MB, pscmd=java
b73c31 f00000002 work process private               m      5      3      0       5 5 x 64 KB = 0.312 MB, pid=4784560: Inuse=6110.3 MB, pscmd=java
ba5e3c 80020014  work USLA heap                    sm     46      0      1      47 46 x 4 KB = 0.180 MB, pid=4784560: Inuse=6110.3 MB, pscmd=java
b62632 ffffffff  work application stack            sm      3      0     12      15 3 x 4 KB = 0.012 MB, pid=4784560: Inuse=6110.3 MB, pscmd=java
dc2f59 -         work mmap source                  sm      0      0      1       1 0 x 4 KB = 0.000 MB, pid=4784560: Inuse=6110.3 MB, pscmd=java
…
b32437 -         work mmap source                  sm  65536      0      0   65536 65536 x 4 KB = 256.000 MB, pid=7798878: Inuse=5361.5 MB, pscmd=java
950f90 -         work mmap source                  sm  65536      0    450   65536 65536 x 4 KB = 256.000 MB, pid=7798878: Inuse=5361.5 MB, pscmd=java
a9f3ac -         work mmap source                  sm  65536      0      0   65536 65536 x 4 KB = 256.000 MB, pid=7798878: Inuse=5361.5 MB, pscmd=java
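Each per-segment Comments entry above is simply the Inuse page count times the page size ("sm" segments use 4 KB pages, "m" segments 64 KB). A short sketch that reproduces the annotation (the helper name is hypothetical, not an AIX API):

```python
# Reproduce the "<inuse> x <psize> KB = <MB> MB" annotation from the
# Comments column above. Helper name is hypothetical.

PSIZE_KB = {"sm": 4, "m": 64}  # AIX small (4 KB) and medium (64 KB) pages


def segment_comment(inuse_pages, psize="sm"):
    kb = PSIZE_KB[psize]
    mb = inuse_pages * kb / 1024
    return f"{inuse_pages} x {kb} KB = {mb:.3f} MB"


print(segment_comment(65536))   # 65536 x 4 KB = 256.000 MB
print(segment_comment(65172))   # 65172 x 4 KB = 254.578 MB
print(segment_comment(5, "m"))  # 5 x 64 KB = 0.312 MB
```

Note that Pgsp and Virtual can exceed Inuse for a segment (pages paged out to paging space are not in memory), which is why the 742594-page Pgsp total for user wmethods dwarfs any single segment's resident size.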
  • 69. Stay Connected & Continue Skills Transfer via IBM Training
– Training paths – What to take, when to take it
– Social media – Join the conversation
– Custom catalog – Create a catalog that meets your interest areas
– RSS feeds – Up-to-date information on the training you need
– IBM Training News – Targeted to your needs
– New to Instructor Led Online (ILO)? Take a free test drive!
– Education Packs – Online discount program for ALL IBM Training courses for your company
– Questions? Email Lisa Ryan (lisaryan@us.ibm.com)
  • 70. Special notices

This document was developed for IBM offerings in the United States as of the date of publication. IBM may not make these offerings available in other countries, and the information is subject to change without notice. Consult your local IBM business contact for information on the IBM offerings available in your area.

Information in this document concerning non-IBM products was obtained from the suppliers of these products or other public sources. Questions on the capabilities of non-IBM products should be addressed to the suppliers of those products.

IBM may have patents or pending patent applications covering subject matter in this document. The furnishing of this document does not give you any license to these patents. Send license inquiries, in writing, to IBM Director of Licensing, IBM Corporation, New Castle Drive, Armonk, NY 10504-1785 USA.

All statements regarding IBM future direction and intent are subject to change or withdrawal without notice, and represent goals and objectives only.

The information contained in this document has not been submitted to any formal IBM test and is provided "AS IS" with no warranties or guarantees either expressed or implied. All examples cited or described in this document are presented as illustrations of the manner in which some IBM products can be used and the results that may be achieved. Actual environmental costs and performance characteristics will vary depending on individual client configurations and conditions.

IBM Global Financing offerings are provided through IBM Credit Corporation in the United States and other IBM subsidiaries and divisions worldwide to qualified commercial and government clients. Rates are based on a client's credit rating, financing terms, offering type, equipment type and options, and may vary by country. Other restrictions may apply. Rates and offerings are subject to change, extension or withdrawal without notice.

IBM is not responsible for printing errors in this document that result in pricing or information inaccuracies. All prices shown are IBM's United States suggested list prices and are subject to change without notice; reseller prices may vary.

IBM hardware products are manufactured from new parts, or new and serviceable used parts. Regardless, our warranty terms apply.

Any performance data contained in this document was determined in a controlled environment. Actual results may vary significantly and are dependent on many factors including system hardware configuration and software design and configuration. Some measurements quoted in this document may have been made on development-level systems. There is no guarantee these measurements will be the same on generally-available systems. Some measurements quoted in this document may have been estimated through extrapolation. Users of this document should verify the applicable data for their specific environment.

Revised September 26, 2006
  • 71. Special notices (cont.)

IBM, the IBM logo, ibm.com, AIX, AIX (logo), AIX 6 (logo), AS/400, Active Memory, BladeCenter, Blue Gene, CacheFlow, ClusterProven, DB2, ESCON, i5/OS, i5/OS (logo), IBM Business Partner (logo), IntelliStation, LoadLeveler, Lotus, Lotus Notes, Notes, Operating System/400, OS/400, PartnerLink, PartnerWorld, PowerPC, pSeries, Rational, RISC System/6000, RS/6000, THINK, Tivoli, Tivoli (logo), Tivoli Management Environment, WebSphere, xSeries, z/OS, zSeries, AIX 5L, Chiphopper, Chipkill, Cloudscape, DB2 Universal Database, DS4000, DS6000, DS8000, EnergyScale, Enterprise Workload Manager, General Purpose File System, GPFS, HACMP, HACMP/6000, HASM, IBM Systems Director Active Energy Manager, iSeries, Micro-Partitioning, POWER, PowerExecutive, PowerVM, PowerVM (logo), PowerHA, Power Architecture, Power Everywhere, Power Family, POWER Hypervisor, Power Systems, Power Systems (logo), Power Systems Software, Power Systems Software (logo), POWER2, POWER3, POWER4, POWER4+, POWER5, POWER5+, POWER6, POWER7, pureScale, System i, System p, System p5, System Storage, System z, Tivoli Enterprise, TME 10, TurboCore, Workload Partitions Manager and X-Architecture are trademarks or registered trademarks of International Business Machines Corporation in the United States, other countries, or both.

If these and other IBM trademarked terms are marked on their first occurrence in this information with a trademark symbol (® or ™), these symbols indicate U.S. registered or common law trademarks owned by IBM at the time this information was published. Such trademarks may also be registered or common law trademarks in other countries. A current list of IBM trademarks is available on the Web at "Copyright and trademark information" at www.ibm.com/legal/copytrade.shtml

The Power Architecture and Power.org wordmarks and the Power and Power.org logos and related marks are trademarks and service marks licensed by Power.org.
UNIX is a registered trademark of The Open Group in the United States, other countries or both.
Linux is a registered trademark of Linus Torvalds in the United States, other countries or both.
Microsoft, Windows and the Windows logo are registered trademarks of Microsoft Corporation in the United States, other countries or both.
Intel, Itanium, Pentium are registered trademarks and Xeon is a trademark of Intel Corporation or its subsidiaries in the United States, other countries or both.
AMD Opteron is a trademark of Advanced Micro Devices, Inc.
Java and all Java-based trademarks and logos are trademarks of Sun Microsystems, Inc. in the United States, other countries or both.
TPC-C and TPC-H are trademarks of the Transaction Performance Processing Council (TPPC).
SPECint, SPECfp, SPECjbb, SPECweb, SPECjAppServer, SPEC OMP, SPECviewperf, SPECapc, SPEChpc, SPECjvm, SPECmail, SPECimap and SPECsfs are trademarks of the Standard Performance Evaluation Corp (SPEC).
NetBench is a registered trademark of Ziff Davis Media in the United States, other countries or both.
AltiVec is a trademark of Freescale Semiconductor, Inc.
Cell Broadband Engine is a trademark of Sony Computer Entertainment Inc.
InfiniBand, InfiniBand Trade Association and the InfiniBand design marks are trademarks and/or service marks of the InfiniBand Trade Association.
Other company, product and service names may be trademarks or service marks of others.

Revised February 9, 2010
  • 72. Notes on benchmarks and values

The IBM benchmark results shown herein were derived using particular, well configured, development-level and generally-available computer systems. Buyers should consult other sources of information to evaluate the performance of systems they are considering buying and should consider conducting application oriented testing. For additional information about the benchmarks, values and systems tested, contact your local IBM office or IBM authorized reseller or access the Web site of the benchmark consortium or benchmark vendor. IBM benchmark results can be found in the IBM Power Systems Performance Report at http://www.ibm.com/systems/p/hardware/system_perf.html .

All performance measurements were made with AIX or AIX 5L operating systems unless otherwise indicated to have used Linux. For new and upgraded systems, AIX Version 4.3, AIX 5L or AIX 6 were used. All other systems used previous versions of AIX. The SPEC CPU2006, SPEC2000, LINPACK, and Technical Computing benchmarks were compiled using IBM's high performance C, C++, and FORTRAN compilers for AIX 5L and Linux. For new and upgraded systems, the latest versions of these compilers were used: XL C Enterprise Edition V7.0 for AIX, XL C/C++ Enterprise Edition V7.0 for AIX, XL FORTRAN Enterprise Edition V9.1 for AIX, XL C/C++ Advanced Edition V7.0 for Linux, and XL FORTRAN Advanced Edition V9.1 for Linux. The SPEC CPU95 (retired in 2000) tests used preprocessors, KAP 3.2 for FORTRAN and KAP/C 1.4.2 from Kuck & Associates and VAST-2 v4.01X8 from Pacific-Sierra Research. The preprocessors were purchased separately from these vendors. Other software packages like IBM ESSL for AIX, MASS for AIX and Kazushige Goto’s BLAS Library for Linux were also used in some benchmarks.

For a definition/explanation of each benchmark and the full list of detailed results, visit the Web site of the benchmark consortium or benchmark vendor:
TPC http://www.tpc.org
SPEC http://www.spec.org
LINPACK http://www.netlib.org/benchmark/performance.pdf
Pro/E http://www.proe.com
GPC http://www.spec.org/gpc
VolanoMark http://www.volano.com
STREAM http://www.cs.virginia.edu/stream/
SAP http://www.sap.com/benchmark/
Oracle Applications http://www.oracle.com/apps_benchmark/
PeopleSoft – To get information on PeopleSoft benchmarks, contact PeopleSoft directly
Siebel http://www.siebel.com/crm/performance_benchmark/index.shtm
Baan http://www.ssaglobal.com
Fluent http://www.fluent.com/software/fluent/index.htm
TOP500 Supercomputers http://www.top500.org/
Ideas International http://www.ideasinternational.com/benchmark/bench.html
Storage Performance Council http://www.storageperformance.org/results

Revised March 12, 2009
  • 73. Notes on HPC benchmarks and values

The IBM benchmark results shown herein were derived using particular, well configured, development-level and generally-available computer systems. Buyers should consult other sources of information to evaluate the performance of systems they are considering buying and should consider conducting application oriented testing. For additional information about the benchmarks, values and systems tested, contact your local IBM office or IBM authorized reseller or access the Web site of the benchmark consortium or benchmark vendor. IBM benchmark results can be found in the IBM Power Systems Performance Report at http://www.ibm.com/systems/p/hardware/system_perf.html .

All performance measurements were made with AIX or AIX 5L operating systems unless otherwise indicated to have used Linux. For new and upgraded systems, AIX Version 4.3 or AIX 5L were used. All other systems used previous versions of AIX. The SPEC CPU2000, LINPACK, and Technical Computing benchmarks were compiled using IBM's high performance C, C++, and FORTRAN compilers for AIX 5L and Linux. For new and upgraded systems, the latest versions of these compilers were used: XL C Enterprise Edition V7.0 for AIX, XL C/C++ Enterprise Edition V7.0 for AIX, XL FORTRAN Enterprise Edition V9.1 for AIX, XL C/C++ Advanced Edition V7.0 for Linux, and XL FORTRAN Advanced Edition V9.1 for Linux. The SPEC CPU95 (retired in 2000) tests used preprocessors, KAP 3.2 for FORTRAN and KAP/C 1.4.2 from Kuck & Associates and VAST-2 v4.01X8 from Pacific-Sierra Research. The preprocessors were purchased separately from these vendors. Other software packages like IBM ESSL for AIX, MASS for AIX and Kazushige Goto’s BLAS Library for Linux were also used in some benchmarks.

For a definition/explanation of each benchmark and the full list of detailed results, visit the Web site of the benchmark consortium or benchmark vendor:
SPEC http://www.spec.org
LINPACK http://www.netlib.org/benchmark/performance.pdf
Pro/E http://www.proe.com
GPC http://www.spec.org/gpc
STREAM http://www.cs.virginia.edu/stream/
Fluent http://www.fluent.com/software/fluent/index.htm
TOP500 Supercomputers http://www.top500.org/
AMBER http://amber.scripps.edu/
FLUENT http://www.fluent.com/software/fluent/fl5bench/index.htm
GAMESS http://www.msg.chem.iastate.edu/gamess
GAUSSIAN http://www.gaussian.com
ANSYS http://www.ansys.com/services/hardware-support-db.htm – Click on the "Benchmarks" icon on the left hand side frame to expand. Click on "Benchmark Results in a Table" icon for benchmark results.
ABAQUS http://www.simulia.com/support/v68/v68_performance.php
ECLIPSE http://www.sis.slb.com/content/software/simulation/index.asp?seg=geoquest&
MM5 http://www.mmm.ucar.edu/mm5/
MSC.NASTRAN http://www.mscsoftware.com/support/prod%5Fsupport/nastran/performance/v04_sngl.cfm
STAR-CD www.cd-adapco.com/products/STAR-CD/performance/320/index/html
NAMD http://www.ks.uiuc.edu/Research/namd
HMMER http://hmmer.janelia.org/ and http://powerdev.osuosl.org/project/hmmerAltivecGen2mod

Revised March 12, 2009
  • 74. Notes on performance estimates

rPerf for AIX

rPerf (Relative Performance) is an estimate of commercial processing performance relative to other IBM UNIX systems. It is derived from an IBM analytical model which uses characteristics from IBM internal workloads, TPC and SPEC benchmarks. The rPerf model is not intended to represent any specific public benchmark results and should not be reasonably used in that way. The model simulates some of the system operations such as CPU, cache and memory. However, the model does not simulate disk or network I/O operations.

rPerf estimates are calculated based on systems with the latest levels of AIX and other pertinent software at the time of system announcement. Actual performance will vary based on application and configuration specifics. The IBM eServer pSeries 640 is the baseline reference system and has a value of 1.0.

Although rPerf may be used to approximate relative IBM UNIX commercial processing performance, actual system performance may vary and is dependent upon many factors including system hardware configuration and software design and configuration. Note that the rPerf methodology used for the POWER6 systems is identical to that used for the POWER5 systems. Variations in incremental system performance may be observed in commercial workloads due to changes in the underlying system architecture.

All performance estimates are provided "AS IS" and no warranties or guarantees are expressed or implied by IBM. Buyers should consult other sources of information, including system benchmarks, and application sizing guides to evaluate the performance of a system they are considering buying. For additional information about rPerf, contact your local IBM office or IBM authorized reseller.

CPW for IBM i

Commercial Processing Workload (CPW) is a relative measure of performance of processors running the IBM i operating system. Performance in customer environments may vary. The value is based on maximum configurations. More performance information is available in the Performance Capabilities Reference at: www.ibm.com/systems/i/solutions/perfmgmt/resource.html

Revised April 2, 2007