1. IBM Advanced Technical Support - Americas
AIX Performance:
Configuration & Tuning for Oracle
Vijay Adik
vadik@us.ibm.com
ATS - Oracle Solutions Team
11/15/13
© 2008 IBM Corporation
2. IBM Advanced Technical Support - Americas
Legal information
The information in this presentation is provided by IBM on an "AS IS"
basis without any warranty, guarantee or assurance of any kind. IBM also
does not provide any warranty, guarantee or assurance that the
information in this paper is free from errors or omissions. Information is
believed to be accurate as of the date of publication. You should check
with the appropriate vendor to obtain current product information.
Any proposed use of claims in this presentation outside of the United
States must be reviewed by local IBM country counsel prior to such use.
IBM, RS6000, System p, AIX, AIX 5L, GPFS, and Enterprise
Storage Server (ESS) are trademarks or registered trademarks of the
International Business Machines Corporation.
Oracle, Oracle9i and Oracle10g are trademarks or registered trademarks
of Oracle Corporation.
All other products or company names are used for identification purposes
only, and may be trademarks of their respective owners.
3. IBM Advanced Technical Support - Americas
Agenda
AIX Configuration Best Practices for Oracle
– Memory
– I/O
– Network
– Miscellaneous
4. IBM Advanced Technical Support - Americas
AIX Configuration Best Practices for Oracle
The suggestions presented here are considered to
be basic configuration “starting points” for
general Oracle workloads
Your workloads may vary
Ongoing performance monitoring and tuning is
recommended to ensure that the configuration is
optimal for the particular workload characteristics
5. IBM Advanced Technical Support - Americas
Performance Overview – Tuning Methodology
Iterative Tuning Process
• Understand the external view of system performance
The external view of system performance is the
observable event that causes someone to say
the system is performing poorly. Typically this is (1)
end-user response time, (2) application (or task)
response time or (3) throughput. System metrics
should not be used to judge improvement.
• Performance only improves when the predominant
bottleneck is fixed
Fixing a secondary bottleneck will not improve
performance and typically results in overloading an
already overloaded predominant bottleneck.
• Monitor performance after a change – tuning is an
iterative process
Monitoring is required after making a change for
two reasons: (1) fixing the predominant bottleneck
typically uncovers another bottleneck, and (2) not
all changes yield positive results. If possible, you
should have a “repeatable” test so that each change can
be accurately evaluated.
Tuning loop: Stress System (i.e., tune at peak workload) -> Monitor Sub-Systems (CPU, Memory, Network, I/O) -> Identify Predominant Bottleneck -> Tune Bottleneck -> Repeat
• End-User Response time is the elapsed time between when a user submits a request and receives a response.
• Application Response time is the elapsed time required for one or more jobs to complete. Historically, these jobs have been called batch jobs.
• Throughput is the amount of work that can be accomplished per unit time. This metric is typically expressed in terms of transactions per minute.
6. IBM Advanced Technical Support - Americas
Performance Monitoring and Tuning Tools
Status Commands
– CPU: vmstat, topas, iostat, ps, mpstat, lparstat, sar, time/timex, emstat/alstat
– Memory: vmstat, topas, ps, lsps, ipcs
– I/O Subsystem: vmstat, topas, iostat, lvmstat, lsps, lsattr/lsdev, lspv/lsvg/lslv
– Network: netstat, topas, atmstat, entstat, tokstat, fddistat, nfsstat, ifconfig
– Processes & Threads: ps, pstat, topas, emstat/alstat
Monitor Commands
– CPU: netpmon
– Memory: svmon, netpmon, filemon
– I/O Subsystem: fileplace, filemon
– Network: netpmon, tcpdump
– Processes & Threads: svmon, truss, kdb, dbx, gprof, fuser, prof
Trace Level Commands
– CPU: tprof, curt, splat, trace, trcrpt
– Memory: trace, trcrpt
– I/O Subsystem: trace, trcrpt
– Network: iptrace, ipreport, trace, trcrpt
– Processes & Threads: truss, pprof, curt, splat, trace, trcrpt
Tuning Tools
– CPU: schedo, fdpr, bindprocessor, bindintcpu, nice/renice, setpri
– Memory: vmo, rmss, fdpr, chps/mkps
– I/O Subsystem: ioo, lvmo, chdev, migratepv, chlv, reorgvg
– Network: no, chdev, ifconfig
– Processes & Threads: nfso, chdev, fdpr
7. IBM Advanced Technical Support - Americas
Agenda
AIX Configuration Best Practices for Oracle
– Memory
– I/O
– Network
– Miscellaneous
8. Advanced Technical Support – System p
AIX Memory Management Overview
The role of Virtual Memory Manager (VMM) is to provide the capability for programs to address
more memory locations than are actually available in physical memory.
On AIX this is accomplished using segments that are partitioned into fixed sizes called “pages”.
– A segment is 256M
– default page size 4K
– POWER 4+ and POWER5 can define large pages, which are 16M
The 32-bit or 64-bit address translates into a 52-bit or 80-bit virtual address
– 32-bit system: the upper 4 bits select a segment register that contains a 24-bit segment ID; the lower 28 bits are the offset.
• 24-bit segment ID + 28-bit offset = 52-bit VA
– 64-bit system: the upper 36 bits select a segment register that contains a 52-bit segment ID; the lower 28 bits are the offset.
• 52-bit segment ID + 28-bit offset = 80-bit VA
The VMM maintains a list of free frames that can be used to retrieve pages that need to be
brought into memory.
– The VMM replenishes the free list by removing some of the current pages from real memory (i.e., steal
memory).
– The process of moving data between memory and disk is called “paging”.
The VMM uses a Page Replacement Algorithm (implemented in the lrud kernel threads) to select
pages that will be removed from memory.
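The segment arithmetic on this slide can be checked with a few lines of Python. This is only an illustrative sketch: the function and constant names are made up for the example, and it models the bit-field split alone, not how AIX actually looks up a segment ID.

```python
# Illustrative sketch of the 64-bit address split described above.
# All names here are hypothetical; they are not AIX interfaces.

SEGMENT_OFFSET_BITS = 28   # 2**28 bytes = 256 MB per segment
PAGE_SIZE = 4096           # default 4 KB page
SEGMENT_ID_BITS_64 = 52    # width of a segment ID on a 64-bit system


def split_effective_address(ea):
    """Split an effective address into (segment selector, offset in segment)."""
    offset = ea & ((1 << SEGMENT_OFFSET_BITS) - 1)  # low 28 bits
    selector = ea >> SEGMENT_OFFSET_BITS            # remaining high bits
    return selector, offset


segment_size = 1 << SEGMENT_OFFSET_BITS
pages_per_segment = segment_size // PAGE_SIZE
virtual_address_bits = SEGMENT_ID_BITS_64 + SEGMENT_OFFSET_BITS

print(segment_size // (1024 * 1024), "MB per segment")
print(pages_per_segment, "4 KB pages per segment")
print(virtual_address_bits, "bit virtual address")
```

The 4 KB page size is the default quoted on the slide; with 16 MB large pages the pages-per-segment figure would be correspondingly smaller.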
9. Advanced Technical Support – System p
Virtual Memory Space – 64 Bits
64-bit Address
– Upper 36 bits select the Segment Register
– Lower 28 bits are the offset within the Segment
Each Segment Register contains a 52-bit Segment ID
Example segments: Kernel Segment, Page Space Disk Map, Kernel Heap
256 Mbyte Segment
– 28-bit offset, to access a specific location in the segment (2^28 = 256M)
– Segment is divided into 4096 byte chunks called pages
– Each Segment can have a maximum of 65536 pages
52-bit Segment Id + 28-bit offset = 80-bit Virtual Address
Virtual Memory: 1 Trillion Terabytes or 1 Yottabyte
10. Advanced Technical Support – System p
Memory Tuning Overview
vmo -p -o <parameter name>=<new value>
The -p flag updates /etc/tunables/nextboot
Memory tunables by area:
– Virtual Memory (General): minfree, maxfree, lru_file_repage, lru_poll_interval
– JFS: maxperm, strict_maxperm
– Enhanced JFS (JFS2): maxclient, strict_maxclient
– Large Pages (Pinned Memory): v_pinshm, lgpg_regions, lgpg_size

NAME               CUR    DEF    BOOT   MIN    MAX    UNIT          TYPE
--------------------------------------------------------------------------------
lru_file_repage    1      1      1      0      1      boolean       D
lru_poll_interval  0      0      0      0      60000  milliseconds  D
maxclient%         80     80     80     1      100    % memory      D
maxfree            1088   1088   1088   8      200K   4KB pages     D
maxperm%           80     80     80     1      100    % memory      D
minfree            960    960    960    8      200K   4KB pages     D
strict_maxclient   1      1      1      0      1      boolean       D
strict_maxperm     0      0      0      0      1      boolean       D
minperm%           20     20     20     1      100    % memory      D
11. IBM Advanced Technical Support - Americas
Virtual Memory Manager (VMM) Tuning
The AIX “vmo” command provides for the display and/or
update of several parameters which influence the way AIX
manages physical memory
– The “-a” option displays current parameter settings
vmo -a
– The “-o” option is used to change parameter values
vmo -o minfree=1440
– The “-p” option is used to make changes persist across a reboot
vmo -p -o minfree=1440
A number of the default “vmo” settings are not optimized for
database workloads and should be modified for Oracle environments
12. IBM Advanced Technical Support - Americas
VMM Tuning
Suggested Combination
– maxperm%=maxclient%=<High Percentage>
– minperm% = <Low Percentage>
– strict_maxperm=0
– strict_maxclient=1
– lru_file_repage=0
– lru_poll_interval=10
The file cache will be allowed to grow; however, when the VMM needs
memory it will steal only file pages. Why? Because we’ve set
lru_file_repage=0.
What is <High Percentage>?
– If possible, set so that maxclient% is always greater than numclient% (vmstat -v)
• Why? With strict_maxclient=1, maxclient% is a hard limit; as long as numclient% stays below it, lrud does not have to run to enforce the limit
What is <Low Percentage>?
– Set so that numperm% (vmstat -v) is always greater than minperm%
• Why? If numperm% drops below minperm%, the VMM behaves as if lru_file_repage were set to 1 and will steal computational pages
13. Advanced Technical Support – System p
VMM Tuning Combination Summary – Goal is to
prevent paging of computational memory.
Recommended Method:
– lru_file_repage = 0
– strict_maxperm = 0
– strict_maxclient = 1
– maxperm% = maxclient% = High Percentage
– minperm% = Low Percentage
– lru_poll_interval = 10
Classic Method*:
– lru_file_repage = 1
– strict_maxperm = 0
– strict_maxclient = 0
– maxperm% = maxclient% = 20% (or small number)
– minperm% = 5
– lru_poll_interval = 10
* This method is appropriate for systems that don’t have the
‘lru_file_repage’ tunable.
Calculated Method:
– lru_file_repage = 0
– strict_maxperm = 0
– strict_maxclient = 1
– maxperm% = maxclient% = 1 - %Computational + 20%
– lru_poll_interval = 10
Where,
%Computational = max. AVM / Real Memory Frames
Avoid:
– strict_maxperm = 1 and strict_maxclient = 0
– strict_maxperm = strict_maxclient = 0 & lru_file_repage = 0
14. IBM Advanced Technical Support - Americas
Virtual Memory Management (VMM) Thresholds
[Figure: physical memory usage (0% to 100%) over time, showing numperm%, comp% and Free% against the minperm%, maxperm%, minfree and maxfree thresholds]
– Start stealing pages when free memory falls below minfree
– Stop stealing pages when free memory rises above maxfree
– When numperm% > maxperm%, steal only file system pages
– When minperm% < numperm% < maxperm%, steal file system or computational pages, depending on repage rate
– When numperm% < minperm%, steal both file system and computational pages
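The threshold rules on this slide can be written down as two small decision functions. This is a rough sketch for reasoning about the thresholds, not a model of the real lrud implementation: the function names and return strings are invented for the example, the behavior exactly at a threshold is an assumption, and the effect of lru_file_repage=0 is not modeled.

```python
# Hypothetical sketch of the page-stealing thresholds described on the slide.

def lrud_active(free_frames, minfree, maxfree, was_active):
    """Stealing starts below minfree and stops above maxfree (hysteresis)."""
    if free_frames < minfree:
        return True
    if free_frames > maxfree:
        return False
    return was_active  # between the thresholds: keep doing what we were doing


def pages_eligible_for_stealing(numperm_pct, minperm_pct, maxperm_pct):
    """Which kind of pages the slide says are stolen for a given numperm%."""
    if numperm_pct > maxperm_pct:
        return "file"
    if numperm_pct < minperm_pct:
        return "file+computational"
    # Assumption: values exactly on a threshold fall into the middle band.
    return "file or computational (repage rate decides)"


print(lrud_active(900, 960, 1088, was_active=False))
print(pages_eligible_for_stealing(85, 20, 80))
print(pages_eligible_for_stealing(10, 20, 80))
```

The example values 960, 1088, 20 and 80 are the defaults shown in the vmo listing earlier in the deck.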
15. IBM Advanced Technical Support - Americas
VMM Page Stealing Thresholds
The following define thresholds for the VMM page stealing process (lrud):
minfree
– Set minfree = 120 x # logical CPUs / # Memory pools
– Consider increasing if vmstat “fre” column frequently approaches zero or
if “vmstat -s” shows significant “free frame waits”
maxfree
– Set maxfree = minfree + (MAX(maxpgahead, j2_maxPageReadAhead) x #
logical CPUs)
Example:
For a 6-way LPAR with SMT enabled (12 logical CPUs), 4 memory pools, maxpgahead=8 and
j2_maxPageReadAhead=8:
– minfree = 360 = 120 x 6 x 2 / 4
– maxfree = 456 = 360 + (max(8,8) x 6 x 2)
vmo -p -o minfree=360 -o maxfree=456
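The two sizing formulas on this slide are simple enough to script. The sketch below evaluates them exactly as written; the function names are hypothetical, integer division is an assumption, and the sample LPAR (16 logical CPUs, 2 memory pools) is invented for illustration rather than taken from the slide.

```python
# Hypothetical helpers that evaluate the slide's minfree/maxfree formulas.

def minfree(logical_cpus, memory_pools):
    """minfree = 120 x # logical CPUs / # memory pools (integer division assumed)."""
    return 120 * logical_cpus // memory_pools


def maxfree(minfree_value, maxpgahead, j2_max_page_read_ahead, logical_cpus):
    """maxfree = minfree + MAX(maxpgahead, j2_maxPageReadAhead) x # logical CPUs."""
    return minfree_value + max(maxpgahead, j2_max_page_read_ahead) * logical_cpus


# Invented example: 16 logical CPUs, 2 memory pools, both read-ahead values 8.
lcpus, pools = 16, 2
mf = minfree(lcpus, pools)
xf = maxfree(mf, 8, 8, lcpus)
print("minfree =", mf, "maxfree =", xf)
```

On AIX the number of logical CPUs and memory pools would have to be read from the running system; those inputs are deliberately left as plain arguments here.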
16. Advanced Technical Support – System p
AIX 5.3/6.1 – minfree and maxfree changes
minfree and maxfree on AIX 5.3/6.1 are now applied to each memory pool.
total free list = minfree * # of memory pools
In earlier releases of AIX (5.2 and 5.1), minfree was divided by the number of memory pools
so that the total free list (determined by adding minfree for *each* memory pool) equaled the
vmo/vmtune value of minfree.

AIX Level   minfree   mempools   LRUD starts when
5.1/5.2     1024      4          free_list <= 1024
5.3         1024      4          free_list <= (4 * 1024)

Initial Setting AIX 5.3/6.1
minfree = max( 960, lcpus * 120) / # of mempools
maxfree = minfree + (Max Read Ahead * lcpus) / # of mempools
Initial Setting AIX 5.2
minfree = max( 960, lcpus * 120)
maxfree = minfree + (Max Read Ahead * lcpus)
Where,
Max Read Ahead = max( maxpgahead, j2_maxPageReadAhead)
17. IBM Advanced Technical Support - Americas
AIX Paging Space
Allocate Paging Space:
Configure Server/LPAR with enough physical memory to satisfy memory requirements
With AIX demand paging, paging space does not have to be large
Provides safety net to prevent system crashes when memory overcommitted.
Generally, keep within internal drive or high performing SAN storage
Monitor paging activity:
vmstat -s
sar -r
nmon
Resolve paging issues:
Reduce file system cache size (MAXPERM, MAXCLIENT)
Reduce Oracle SGA or PGA (9i or later) size
Add physical memory
Do not over commit real memory!
18. IBM Advanced Technical Support - Americas
AIX 5.3/6.1 Multiple Page Size Support
AIX 5.3 5300-04 introduces two new page sizes:
– 64K
– 16M (large pages)
Requires p5+ hardware
Requires p5 System Release 240, Service Level 202 microcode
16MB support requires Version 5 Release 2 of the Hardware
Management Console (HMC) machine code
User/Application must request preferred page size
– 64K pages appear very promising, since they do not need to be
configured/reserved in advance
– Will require Oracle code changes to explicitly support (10.2.0.4)
– If preferred size not available, the largest available smaller size will
be used
• Current Oracle versions should end up using 64KB pages if 16 MB pages are
not configured?
19. IBM Advanced Technical Support - Americas
Large Page Support (optional)
Pinning shared memory
AIX Parameters
• vmo -p -o v_pinshm=1
• Leave maxpin% at the default of 80% unless the SGA exceeds 77% of real memory
– vmo -p -o maxpin%=[(SGA size * 100 / total mem) + 3]
Oracle Parameters
• LOCK_SGA = TRUE
Enabling Large Page Support
vmo -r -o lgpg_size=16777216 -o lgpg_regions=<SGA size / 16 MB>
Allowing Oracle to use Large Pages
chuser capabilities=CAP_BYPASS_RAC_VMM,CAP_PROPAGATE oracle
Using Monitoring Tools
svmon -G
svmon -P
Oracle metalink note# 372157.1
20. IBM Advanced Technical Support - Americas
Determining SGA size
SGA Memory Summary for DB: test01 Instance: test01 Snaps: 1046 -1047
SGA regions          Size in Bytes
-------------------  ----------------
Database Buffers     16,928,210,944
Fixed Size                  768,448
Redo Buffers              2,371,584
Variable Size         1,241,513,984
                     ----------------
sum                  18,172,864,960
lgpg_regions = 18,172,864,960 / 16,777,216 = 1084 (rounded up)
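The same calculation can be reproduced from the four SGA region sizes in the report above. This is a small sketch with hypothetical names; the only inputs are the byte counts shown on the slide and the 16 MB large page size.

```python
# Recompute lgpg_regions from the SGA summary shown on the slide.

LARGE_PAGE_BYTES = 16 * 1024 * 1024  # 16,777,216 bytes

sga_regions = {
    "Database Buffers": 16_928_210_944,
    "Fixed Size": 768_448,
    "Redo Buffers": 2_371_584,
    "Variable Size": 1_241_513_984,
}


def large_page_regions(sga_bytes, page_bytes=LARGE_PAGE_BYTES):
    """Number of large pages needed to hold the SGA, rounded up."""
    return -(-sga_bytes // page_bytes)  # ceiling division on integers


sga_total = sum(sga_regions.values())
print("SGA bytes:", sga_total)
print("lgpg_regions:", large_page_regions(sga_total))
```

Rounding up matters because a partially filled last region still needs a whole 16 MB page.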
21. Advanced Technical Support – System p
Tuning and Improving System Performance
Adjust the VMM Tuning Parameters
– Key parameters listed on word document
Implement VMM related Mount Options
– DIO / CIO
– Release-behind on read and/or write
Reduce Application Memory Requirements
Memory Model
– %Computational < 70% - Large Memory Model – Goal is to adjust tuning parameters to
prevent paging
• Multiple Memory pools
• Page Space smaller than Memory
• Must Tune VMM key parameters
– %Computational > 70% - Small Memory Model – Goal is to make paging as efficient as
possible
• Add multiple page spaces on different spindles
• Make all paging spaces the same size to ensure round-robin scheduling
• PS = 1.5 x computational requirements
• Turn off DEFPS
• Memory Load Control
Add additional Memory
22. IBM Advanced Technical Support - Americas
Agenda
AIX Configuration Best Practices for Oracle
– Memory
– I/O
– Network
– Miscellaneous
23. IBM Advanced Technical Support - Americas
The AIX IO stack
Stack layers, top to bottom:
– Application
– Logical File System: Local FS (JFS/JFS2), Remote FS (NFS); alternatively Raw LVs or Raw disks
– VMM
– LVM
– Device Driver(s)
– Disk Subsystem (optional)
– Disk (with cache)
Notes on each layer:
– Application memory area caches data to avoid IO
– NFS caches file attributes
– NFS has a cached filesystem for NFS clients
– JFS and JFS2 cache use extra system RAM
– JFS uses persistent pages for cache
– JFS2 uses client pages for cache
– Queues exist for both adapters and disks
– Adapter device drivers use DMA for IO
– Disk subsystems have read and write cache
– Disks have memory to store commands/data
– Write Cache - ack sent back to application
24. IBM Advanced Technical Support - Americas
Asynchronous I/O
AIX parameters (smit aio)
minservers = 10 * # cpus
maxservers = (10 * # disks) / # cpus
maxreqs = a multiple of 4096 > 4 * #disks * queue_depth
“enable” at system restart
Typical settings: minservers=100, maxservers=200, maxreqs=16384
Oracle parameters (init.ora)
disk_asynch_io = TRUE
filesystemio_options = {ASYNCH | SETALL}
db_writer_processes = n (normally left at default, 1)
db_writer_io_slaves = n (don’t use – implements AIO simulation)
Monitor usage:
• Watch for Oracle alert log or trace file messages:
– Warning “lio_listio returned EAGAIN”
• AIX Monitoring
– “pstat -a | grep aios”
– Use “-A” and “-t” options for NMON
Note: AIO requests can be handled by AIO server processes (process-based IO) or by FASTPATH, which is
kernel based (interrupt based) and performs much better. Make sure FASTPATH is enabled by using the following command:
– lsattr -El aio0 and look for the value "fastpath", which should be enabled
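The AIX sizing rules at the top of this slide can be sketched as a small calculator. Everything here is an illustration under stated assumptions: the function name is hypothetical, integer division is assumed for maxservers, and the sample machine (10 CPUs, 200 disks, queue depth 20) is invented so that the results line up with the typical settings quoted on the slide.

```python
# Hypothetical calculator for the AIO starting values listed on the slide.

def aio_settings(cpus, disks, queue_depth):
    minservers = 10 * cpus
    maxservers = (10 * disks) // cpus           # integer division is an assumption
    lower_bound = 4 * disks * queue_depth
    # Smallest multiple of 4096 that is strictly greater than the lower bound.
    maxreqs = (lower_bound // 4096 + 1) * 4096
    return {"minservers": minservers, "maxservers": maxservers, "maxreqs": maxreqs}


# Invented example machine: 10 CPUs, 200 disks, queue_depth 20 per disk.
print(aio_settings(cpus=10, disks=200, queue_depth=20))
```

These are starting points only; the monitoring steps on the slide (EAGAIN warnings, the number of running aioserver processes) are what indicate whether the values need to grow.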
25. IBM Advanced Technical Support - Americas
AIX Filesystems
Journaled File System (JFS)
Better for lots of small file creates & deletes
– Buffer caching (default) provides Sequential Read-Ahead, cached writes, etc.
– Direct I/O (DIO) mount/open option: no caching on reads
Enhanced JFS (JFS2)
Better for large files/filesystems
– Buffer caching (default) provides Sequential Read-Ahead, cached writes, etc.
– Direct I/O (DIO) mount/open option: no caching on reads
– Concurrent I/O (CIO) mount/open option: DIO, with write serialization
disabled
• Use for Oracle .dbf, control files and online redo logs only!!!
GPFS
Clustered filesystem – the IBM filesystem for RAC
– Non-cached, non-blocking I/Os (similar to JFS2 CIO) for all Oracle files
GPFS and JFS2 with CIO offer performance similar to Raw Devices
26. IBM Advanced Technical Support - Americas
Cached vs. non-Cached (Direct) I/O
File System caching tends to benefit heavily sequential workloads with low
write content. To enable caching for JFS/JFS2:
Use default filesystem mount options
Set Oracle filesystemio_options=ASYNCH
DIO tends to benefit heavily random access workloads and CIO tends to
benefit heavy update workloads. To disable JFS, JFS2 caching, see the
following table:
JFS
– Oracle 9i: Set filesystemio_options=SETALL -or- use “dio” mount option
– Oracle 10g: Set filesystemio_options=SETALL -or- use “dio” mount option
JFS2
– Oracle 9i: Use “cio” mount option
– Oracle 10g: Set filesystemio_options=SETALL -or- use “cio” mount option
27. IBM Advanced Technical Support - Americas
CIO Demotion and Filesystem Block Size
Data Base Files (DBF)
If db_block_size = 2048 set agblksize=2048
If db_block_size >= 4096 set agblksize=4096
Redo Log Files
Set agblksize=512 and use CIO or DIO
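The block-size rules above amount to a small lookup, sketched here in Python. The function name and the "datafile"/"redo" labels are hypothetical, and the slide gives no guidance for db_block_size values other than 2048 or 4096 and above, so the sketch rejects them rather than guess.

```python
# Hypothetical helper mapping the slide's rules to a JFS2 agblksize value.

def agblksize_for(file_kind, db_block_size=None):
    """Return the filesystem block size the slide suggests for CIO/DIO use."""
    if file_kind == "redo":
        return 512
    if file_kind == "datafile":
        if db_block_size == 2048:
            return 2048
        if db_block_size is not None and db_block_size >= 4096:
            return 4096
        raise ValueError("slide gives no rule for db_block_size=%r" % db_block_size)
    raise ValueError("unknown file kind: %r" % file_kind)


print(agblksize_for("datafile", 8192))
print(agblksize_for("redo"))
```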
28. IBM Advanced Technical Support - Americas
I/O Tuning (ioo)
READ-AHEAD (Only applicable to JFS/JFS2 with caching enabled)
MINPGAHEAD (JFS) or j2_minPageReadAhead (JFS2)
– Default: 2
– Starting value: MAX(2,DB_BLOCK_SIZE / 4096)
MAXPGAHEAD (JFS) or j2_maxPageReadAhead (JFS2)
– Default: 8 (JFS), 128 (JFS2)
– Set equal to (or multiple of) size of largest Oracle I/O request
• DB_BLOCK_SIZE * DB_FILE_MULTIBLOCK_READ_COUNT
Number of buffer structures per filesystem:
NUMFSBUFS:
– Default: 196, Starting Value: 568
j2_nBufferPerPagerDevice (replaced by j2_dynamicBufferPreallocation)
– Default: 512, Starting Value: 2048
Monitor with “vmstat -v”
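The read-ahead starting values on this slide follow directly from the Oracle block size and multiblock read count. The sketch below assumes the tunables are expressed in 4 KB pages and uses hypothetical function names; the sample values (8 KB blocks, read count 16) are invented for illustration.

```python
# Hypothetical helpers for the read-ahead starting values on the slide.

PAGE_BYTES = 4096  # assumption: read-ahead tunables are counted in 4 KB pages


def min_read_ahead_pages(db_block_size):
    """Starting value: MAX(2, DB_BLOCK_SIZE / 4096)."""
    return max(2, db_block_size // PAGE_BYTES)


def max_read_ahead_pages(db_block_size, multiblock_read_count):
    """Pages needed to cover the largest Oracle I/O request, rounded up."""
    largest_io = db_block_size * multiblock_read_count
    return -(-largest_io // PAGE_BYTES)  # ceiling division


print(min_read_ahead_pages(8192))
print(max_read_ahead_pages(8192, 16))
```

The slide also allows a multiple of the largest I/O size for the maximum, so the second value is a lower bound rather than a fixed answer.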
29. IBM Advanced Technical Support - Americas
Data Layout for Optimal I/O Performance
Stripe and mirror everything (SAME) approach:
Goal is to balance I/O activity across all disks, loops, adapters, etc...
Avoid/Eliminate I/O hotspots
Manual file-by-file data placement is time consuming, resource intensive and iterative
Use RAID-5 or RAID-10 to create striped LUNs (hdisks)
Create AIX Volume Group(s) (VG) w/ LUNs from multiple
arrays, striping on the front end as well for maximum
distribution
Physical Partition Spreading (mklv -e x)
-or-
Large Grained LVM striping (>= 1MB stripe size)
http://www1.ibm.com/support/techdocs/atsmastr.nsf/WebIndex/WP100319
30. IBM Advanced Technical Support - Americas
Data Layout cont’d…
Stripe using Logical Volume (LV) or Physical Partition (PP) striping
LV Striping
– Oracle recommends a stripe width that is a multiple of
• db_block_size * db_file_multiblock_read_count
• Usually around 1 MB
– Valid LV Strip sizes:
• AIX 5.2: 4k, 8k, 16k, 32k, 64k, 128k, 256k, 512k, 1 MB
• AIX 5.3: AIX 5.2 Stripe sizes + 2M, 4M, 16 MB, 32M, 64M, 128M
– Use AIX Logical Volume 0 offset (9i Release 2 or later)
• Use Scalable Volume Groups (VGs), or use “mklv -T O” with Big VGs
• Requires AIX APAR IY36656 and Oracle patch (bug 2620053)
PP Striping
– Use minimum Physical Partition (PP) size (mklv -t, -s parms)
• Spread AIX Logical Volume (LV) PPs across multiple hdisks in VG
(mklv -e x)
31. IBM Advanced Technical Support - Americas
Tuning and Improving System Performance
Adjust the key IOO Tuning Parameters
Adjust device specific tuning Parameters
Other I/O tuning Options
– DIO / CIO
– Release-behind on read and/or write
– IO Pacing
– Write Behind
Improve the data layout
Add additional hardware resources
32. IBM Advanced Technical Support - Americas
Agenda
AIX Configuration Best Practices for Oracle
– Memory
– I/O
– Network
– Miscellaneous
33. IBM Advanced Technical Support - Americas
Network Options (no) Parameters
• Set sb_max >= 1 MB (1048576)
• Set tcp_sendspace >= 262144
• Set tcp_recvspace >= 262144
• Set rfc1323=1
34. IBM Advanced Technical Support - Americas
Additional Network (no) Parameters for RAC:
Set udp_sendspace = db_block_size *
db_file_multiblock_read_count
(not less than 65536)
Set udp_recvspace = 4 * udp_sendspace
– Must be < sb_max
Increase if buffer overflows occur
Examples:
no -a |grep udp_sendspace
no -p -o udp_sendspace=65536
netstat -s |grep "socket buffer overflows"
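The RAC UDP buffer rules above can be combined into one check. This is a sketch with hypothetical names; the default sb_max of 1048576 is simply the minimum suggested on the previous slide, and the sample Oracle settings are invented.

```python
# Hypothetical helper for the RAC UDP buffer sizing rules on the slide.

def rac_udp_buffers(db_block_size, multiblock_read_count, sb_max=1048576):
    """Return (udp_sendspace, udp_recvspace), enforcing recvspace < sb_max."""
    udp_sendspace = max(65536, db_block_size * multiblock_read_count)
    udp_recvspace = 4 * udp_sendspace
    if udp_recvspace >= sb_max:
        raise ValueError(
            "udp_recvspace=%d must be < sb_max=%d; raise sb_max first"
            % (udp_recvspace, sb_max)
        )
    return udp_sendspace, udp_recvspace


# Invented example: 8 KB blocks with db_file_multiblock_read_count=16.
print(rac_udp_buffers(8192, 16))
```

Raising an error instead of silently clamping keeps the sb_max dependency visible, since both values would need to be changed together with the `no` command.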
35. IBM Advanced Technical Support - Americas
Agenda
AIX Configuration Best Practices for Oracle
– Memory
– I/O
– Network
– Miscellaneous
36. IBM Advanced Technical Support - Americas
Miscellaneous parameters
User Limits (smit chuser)
– Soft FILE size = -1 (Unlimited)
– Soft CPU time = -1 (Unlimited)
– Soft DATA segment = -1 (Unlimited)
– Soft STACK size = -1 (Unlimited)
– /etc/security/limits
Maximum number of PROCESSES allowed per user (smit chgsys)
– maxuproc >= 2048
Environment variables:
– AIXTHREAD_SCOPE=S
37. IBM Advanced Technical Support - Americas
DLPAR & Oracle
CPU
Oracle 9i
– Oracle CPU_COUNT does not recognize change in # logical cpus
– AIX scheduler can still use the added CPUs
Oracle 10g
Oracle CPU_COUNT is dynamically updated for change in # logical cpus
Memory
Oracle 9i or 10g
– SGA can be dynamically resized, but has an upper bound of SGA_MAX_SIZE.
• SGA_TARGET (10g)
• DB_CACHE_SIZE, SHARED_POOL_SIZE, etc.
– PGA_AGGREGATE_TARGET can be dynamically resized
SGA_TARGET and PGA_AGGREGATE_TARGET are not hard limits
38. IBM Advanced Technical Support - Americas
Micro-Partitioning technology
[Figure: Hypervisor hosting dynamic LPARs on whole processors and micro-partitions that draw entitled capacity (between a min and a max) from a pool of 6 CPUs; partitions run AIX 5L V5.2, AIX 5L V5.3, Linux and i5/OS V5R3**]
Note: Micro-partitions are optional.
Micro-Partitioning technology allows
each processor to be subdivided into as
many as 10 “virtual servers”
Partitioning options
– Micro-partitions: Up to 254*
– Dynamic LPARs: Up to 32*
– Combination of both
Configured via the HMC
Number of logical processors
– Minimum/maximum
Entitled capacity
– In units of 1/100 of a CPU
– Minimum 1/10 of a CPU
Variable weight
– % share (priority) of
surplus capacity
Capped or uncapped partitions
*on p5-590 and p5-595
** on p5-570, p5-590, and p5-595
39. IBM Advanced Technical Support - Americas
Shared Processor Logical Partitions – Terminology
[Figure: four LPARs running AIX 5.3 (two without SMT, two with SMT), their logical and virtual processors, and a shared processor pool with a capacity of 6 processing units built from POWER5 chips and processor cores]
Shared Processor Logical Partition (splpar)
key terms that will be discussed:
Physical Processors (PP) – An 8-way p5 590.
For this configuration one MCM
houses 4 POWER5 chips and each
POWER5 chip has two processor cores.
With SMT enabled each processor core can
simultaneously execute two instruction
threads. The four POWER5 chips are
packaged on a Multi-Chip Module (MCM).
Shared Processor Pool – 6 processors
have been allocated to the shared
processor pool and 2 processors have
been allocated to a dedicated partition.
Virtual Processors (VP) – The operating
system views the virtual processors as a
“physical processor”.
Logical Processors – With SMT enabled
each VP is viewed by the operating system
as having two logical processors.
Processor Capacity specification for splpars – Each splpar has an entitled processing
capacity, which is defined via a number of
partition configuration parameters.
Now, let’s discuss processor
specification in more detail.
40. IBM Advanced Technical Support - Americas
Capped Shared Processor LPAR
[Figure: processor capacity utilization over time for a capped LPAR. Labels: pool idle capacity available, maximum processor capacity, entitled processor capacity, minimum processor capacity, LPAR capacity utilization (utilized capacity, ceded capacity)]
41. IBM Advanced Technical Support - Americas
Uncapped Shared Processor LPAR
[Figure: processor capacity utilization over time for an uncapped LPAR. Labels: pool idle capacity available, maximum processor capacity, entitled processor capacity, minimum processor capacity, utilized capacity, ceded capacity]
42. IBM Advanced Technical Support - Americas
Simultaneous Multithreading (SMT) & Oracle
Without SMT:
[Figure: “CPU Total AIX52 3/9/2004” (User%, Sys%, Wait% over time) and “Processes AIX52 3/9/2004” (RunQueue, Swap-in over time)]
With SMT:
[Figure: “CPU Total AIX53 10/9/2004” (User%, Sys%, Wait% over time) and “Processes AIX53 10/9/2004” (RunQueue, Swap-in over time)]
44. IBM Advanced Technical Support - Americas
Reference Material:
Oracle Technical Documentation
http://technet.oracle.com
Oracle Support
http://metalink.oracle.com (requires support license)
Check metalink note ID 282036.1
IBM Redbooks on Oracle
http://www.redbooks.ibm.com
Advanced Technical Support (Techdocs)
http://www.ibm.com/support/techdocs
http://w3.ibm.com/support/techdocs (IBM Internal)
GPFS Documentation
http://publib.boulder.ibm.com/clresctr/library/gpfs_faqs.html
AIX Documentation
http://www.ibm.com/servers/eserver/pseries/library/
46. IBM Advanced Technical Support - Americas
Trademarks
The following are trademarks of the International Business Machines Corporation in the United States and/or other countries. For a complete list of IBM Trademarks, see www.ibm.com/legal/copytrade.shtml: AS/400,
DBE, e-business logo, ESCO, eServer, FICON, IBM, IBM Logo, iSeries, MVS, OS/390, pSeries, RS/6000, S/30, VM/ESA, VSE/ESA, Websphere, xSeries, z/OS, zSeries, z/VM
The following are trademarks or registered trademarks of other companies
Lotus, Notes, and Domino are trademarks or registered trademarks of Lotus Development Corporation
Java and all Java-related trademarks and logos are trademarks of Sun Microsystems, Inc., in the United States and other countries
LINUX is a registered trademark of Linus Torvalds
UNIX is a registered trademark of The Open Group in the United States and other countries.
Microsoft, Windows and Windows NT are registered trademarks of Microsoft Corporation.
SET and Secure Electronic Transaction are trademarks owned by SET Secure Electronic Transaction LLC.
Intel is a registered trademark of Intel Corporation
* All other products may be trademarks or registered trademarks of their respective companies.
NOTES:
Performance is in Internal Throughput Rate (ITR) ratio based on measurements and projections using standard IBM benchmarks in a controlled environment. The actual throughput that any user will experience will vary
depending upon considerations such as the amount of multiprogramming in the user's job stream, the I/O configuration, the storage configuration, and the workload processed. Therefore, no assurance can be given that
an individual user will achieve throughput improvements equivalent to the performance ratios stated here.
IBM hardware products are manufactured from new parts, or new and serviceable used parts. Regardless, our warranty terms apply.
All customer examples cited or described in this presentation are presented as illustrations of the manner in which some customers have used IBM products and the results they may have achieved. Actual environmental
costs and performance characteristics will vary depending on individual customer configurations and conditions.
This publication was produced in the United States. IBM may not offer the products, services or features discussed in this document in other countries, and the information may be subject to change without notice.
Consult your local IBM business contact for information on the product or services available in your area.
All statements regarding IBM's future direction and intent are subject to change or withdrawal without notice, and represent goals and objectives only.
Information about non-IBM products is obtained from the manufacturers of those products or their published announcements. IBM has not tested those products and cannot confirm the performance, compatibility, or any
other claims related to non-IBM products. Questions on the capabilities of non-IBM products should be addressed to the suppliers of those products.
Prices subject to change without notice. Contact your IBM representative or Business Partner for the most current pricing in your geography.
References in this document to IBM products or services do not imply that IBM intends to make them available in every country.
Any proposed use of claims in this presentation outside of the United States must be reviewed by local IBM country counsel prior to such use.
The information could include technical inaccuracies or typographical errors. Changes are periodically made to the information herein; these changes will be incorporated in new editions of the publication. IBM may
make improvements and/or changes in the product(s) and/or the program(s) described in this publication at any time without notice.
Any references in this information to non-IBM Web sites are provided for convenience only and do not in any manner serve as an endorsement of those Web sites. The materials at those Web sites are not part of the
materials for this IBM product and use of those Web sites is at your own risk.
Editor's Notes
This shows the IO stack for AIX. When tuning, we have to be aware of all the layers, as each layer impacts performance, and there are knobs to turn at each layer.
IOs can be coalesced into fewer IOs, or broken up into more IOs, as they go up and down through the IO layers. Generally, one gets better performance, in MB/s, with fewer, larger IOs, and with fewer IOs there is less CPU overhead to handle them.
Note that system setup, from a data layout viewpoint, is generally done from the bottom up. First the disk subsystem is configured, then the device layer (hdisks, vpaths, etc.), then the LVM layer (VGs then LVs), then the filesystems, and finally the files.
The disk interconnection technology exists below the device driver level, sometimes prior to the disk subsystem and within the disk subsystem if it exists. SANs, NAS and iSCSI add latency for getting the IO across the disk network.
Direct IO (DIO) and concurrent IO (CIO) bypass the JFS cache, which is beneficial in some circumstances, e.g., when updating log files. Direct IO can be specified either by the mount option mount -o dio or by a program opening a file with the O_DIRECT open flag.
Synchronous and asynchronous IO refer to whether the application is coded to wait for the IO to complete: if it must wait, the IO is synchronous. Default write IOs to JFS or JFS2 are asynchronous unless specifically coded to be synchronous.
Note that most applications use the character device (the r device, e.g. /dev/r<lvname>) for IO, though it is also possible to use the block device.
NFS file attribute caching is specified via the actimeo, acregmin, acregmax, acdirmin and acdirmax attributes in /etc/filesystems. NFS also allows a cached filesystem on NFS clients via the cfsadmin command, so that files from the NFS server are copied to local disk.
Maxreqs: minimum 8192. Note: find out WHY this parameter has so much impact on performance; the variable description makes no sense.