August 2018
Refreshing our knowledge
HugePages:
Why, what and how
2© The Pythian Group Inc., 2018
What's up with
HugePages?
Jose Rodriguez
Project engineer at Pythian
● 10+ years of experience, mainly Oracle but also SQL Server and others such as DB2 LUW or PostgreSQL
● Solaris, Linux and Windows RAC and HA with DG and GG
Other areas of expertise, i.e. things I like doing
● Scripting and automation (lazy DBA)
● Machine Learning
● GoldenGate replication
● Cloud-related stuff (who doesn't nowadays, eh?)
About Pythian
Pythian’s 400+ IT professionals
help companies adopt
and manage disruptive data
technologies to better compete
© 2018 Pythian. Confidential 4
EXPERIENCED: 11,800 systems currently managed by Pythian
GLOBAL: 400+ Pythian experts in 35 countries
EXPERTS: 2 millennia of experience gathered and shared over 19 years
What are you taking away
So you can leave now if you already have it
Agenda
Why do we care?
What are HugePages?
How to implement?
What can happen - Case Studies
Why do we care?
● It is 2019, yet HugePages still seem to be neither well understood nor broadly implemented
● More memory -> more problems
● Systems with >= 1 TB of RAM are common nowadays
● Problems caused by a lack of HugePages are not always easy to spot
What Are HugePages
Behind the scenes
Virtual to physical memory mapping
[Diagram: pages of the user process (0x00, 0x01, ..., 0x23, 0x24) mapped to scattered frames in main memory (42, 157, 245)]
Memory allocations are tracked in PageTables
virtual real
0x00 42
0x01 175
0x02 176
0x03 177
0x04 178
... ...
Page size: 4 KiB
Virtual to physical memory mapping
[Diagram: the user process reaches physical memory through the PageTable; virtual pages 0x00 to 0x04 map to frames 42, 175, 176, 177, 178]
What's up with PageTables?
● Allocating 100 GiB takes 26,214,400 memory pages of 4 KiB each
● An OS would typically group and map them hierarchically in frames
■ i.e. contiguous space can be mapped more efficiently
● Each PageTable Entry (PTE) is around 8 bytes on 64-bit systems
● Vmem offset + physical address + flags
● PageTables are also stored in memory. Their size would be 200 MiB in our example
● For shared memory segments (e.g. SGA) each process has its own copy of the PageTable
● A regular single instance with 1000 sessions * 200 MiB each ends up using about 200 GB (~195 GiB) of RAM just to track RAM
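The arithmetic on this slide can be reproduced with a few lines of Python. This is only a sketch of the slide's round numbers: the 8-byte PTE, the 1000 sessions and the helper name `page_table_bytes` are illustrative assumptions, and upper levels of the page table hierarchy are ignored.

```python
# Sketch of the slide's arithmetic; PTE size and session count are the
# slide's round numbers, not measurements from a real system.
KIB = 1024
MIB = 1024 * KIB
GIB = 1024 * MIB


def page_table_bytes(mapped_bytes: int, page_bytes: int = 4 * KIB, pte_bytes: int = 8) -> int:
    """Bytes of last-level page table entries needed to map `mapped_bytes`."""
    pages = -(-mapped_bytes // page_bytes)  # ceiling division
    return pages * pte_bytes


sga = 100 * GIB
pages = sga // (4 * KIB)
per_process = page_table_bytes(sga)
sessions = 1000
total = per_process * sessions

print(f"pages: {pages:,}")
print(f"page table per process: {per_process / MIB:.0f} MiB")
print(f"page tables for {sessions} sessions: {total / GIB:.1f} GiB")
```

The per-process figure only becomes real once a process has touched the whole SGA, so this is an upper bound rather than what every session pays from the start.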
HugePages to the Rescue!
virtual real
0x00 1
0x01 ...
0x02 ...
0x03 ...
0x04 ...
... ...
Page size: 2048 KiB instead of 4 KiB
PageTable reduced 512 times, to only ~400 KiB
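As a quick check of the 512x figure, the sketch below compares the entries needed to map the same 100 GiB with 4 KiB and with 2 MiB pages. The 8-byte entry size is again an assumption carried over from the earlier example, and `entries_needed` is a hypothetical helper.

```python
KIB = 1024
MIB = 1024 * KIB
GIB = 1024 * MIB
PTE_BYTES = 8  # assumed entry size on a 64-bit system


def entries_needed(mapped_bytes: int, page_bytes: int) -> int:
    """Number of page table entries needed to map `mapped_bytes`."""
    return mapped_bytes // page_bytes


small_pages = entries_needed(100 * GIB, 4 * KIB) * PTE_BYTES
huge_pages = entries_needed(100 * GIB, 2 * MIB) * PTE_BYTES

print(f"4 KiB pages: {small_pages / MIB:.0f} MiB of entries")
print(f"2 MiB pages: {huge_pages / KIB:.0f} KiB of entries")
print(f"reduction factor: {small_pages // huge_pages}")
```

The factor is simply the ratio of the page sizes (2 MiB / 4 KiB), so it holds for any SGA size.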
HugePages additional facts
● Allocate only as many HugePages as needed
● HugePages cannot be swapped out
● Oracle Automatic Memory Management (AMM) is incompatible with HugePages
● Transparent HugePages (THP) do not get along well with Oracle; they are disabled by default in UEK2+
● Platforms other than Linux x64 offer even bigger large page sizes, up to 1 GiB
● In extreme cases (SGAs of TiBs in size) instance startup may be slow. PRE_PAGE_SGA may help here
● AMM is forbidden in 12.2 if RAM > 4 GiB, so HP should be used there
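The THP point can be checked on a running host. On mainline Linux kernels the current mode is exposed in `/sys/kernel/mm/transparent_hugepage/enabled`, with the active value in square brackets (for example `always madvise [never]`). The sketch below assumes that path; some older vendor kernels use a different one, and the helper name `active_thp_mode` is made up for illustration.

```python
from pathlib import Path

# Assumed location on mainline kernels; older vendor kernels may differ.
THP_ENABLED = Path("/sys/kernel/mm/transparent_hugepage/enabled")


def active_thp_mode(text: str) -> str:
    """Return the bracketed (active) mode from e.g. 'always madvise [never]'."""
    for word in text.split():
        if word.startswith("[") and word.endswith("]"):
            return word[1:-1]
    raise ValueError(f"no active mode found in {text!r}")


if THP_ENABLED.exists():
    print("THP mode:", active_thp_mode(THP_ENABLED.read_text()))
else:
    # Fall back to a sample string so the sketch runs anywhere.
    print("THP mode (sample):", active_thp_mode("always madvise [never]"))
```

For an Oracle host the value one would hope to see is `never`.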
HugePages and ASM
● Do we really need/want HugePages for ASM?
● ASM uses AMM by default, so initially it is not HP compatible. /dev/shm is important here.
● For "regular" ASM instances, we don't. Documentation and best practices say this clearly, although this may change in future releases.
● Highly recommended for Exadata. MOS notes 2062068.1 and 2111010.1 clearly indicate that ASMM should be enabled and HugePages made available for ASM.
HugePages and /dev/shm
● /dev/shm is automatically sized to 50% of total RAM
● Oracle AMM uses /dev/shm to "store" shared memory pages
● We may be tempted to reduce the size of /dev/shm to leave more room for HugePages. No need
● HugePages and AMM are incompatible
HugePages and PGA
● Does Oracle use HugePages for PGA?
● No, it doesn't (currently)
● No hard evidence against it in docs or MOS
● Tests show that Oracle is not allocating HugePages for it
● Counterintuitive for small memory allocations
● May change in the future (DWH or DSS sort area)
HugePages on the Cloud
● Supported on AWS RDS since July 2017, but not enabled by default. There are limitations on the instance types you can enable HP on.
● No official documentation on Azure, but a recent test showed that we can set up HP in a Linux VM running on Azure.
● Google Cloud Platform supports HugePages.
● Oracle Cloud Service: officially supported for Exadata Cloud Service. OCI allows it, but not by default. The Classic platform has it enabled by default.
Let's do it!
How many HugePages do I need?
● Script provided in MOS: "Oracle Linux: Shell Script to Calculate Values Recommended Linux HugePages / HugeTLB Configuration (Doc ID 401749.1)"
● Or use the following formula:
SGA size (MiB) / 2 (MiB) + 42
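The formula is easy to wrap in a helper. This is a minimal sketch, not a replacement for the MOS script: the function name `hugepages_needed` is hypothetical, the 2 MiB page size is the Linux x86-64 default shown elsewhere in this deck, and the margin of 42 pages is taken straight from the slide. Rounding up is an added assumption so that a partial page still gets a full HugePage.

```python
import math


def hugepages_needed(sga_mib: float, hugepage_mib: int = 2, margin_pages: int = 42) -> int:
    """Slide formula: SGA size (MiB) / HugePage size (MiB) + margin."""
    return math.ceil(sga_mib / hugepage_mib) + margin_pages


# Example: the 1540 MiB SGA used in the alert log on the "Success!" slide.
print(hugepages_needed(1540))
```

With several instances on one host, the page counts of all SGAs would be added up before setting vm.nr_hugepages.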
Implementation steps
● May need extra work on VMs
● Disable AMM
● Set use_large_pages=only
● Disable THP
● Set the memlock user limit
● Set vm.nr_hugepages
● Set vm.hugetlb_shm_group as required (SUSE)
● Reboot the OS (not always required)
● Restart the Oracle instance
● Use TuneD profiles on RHEL 7 and above
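To make the steps more concrete, the sketch below renders the settings that typically go with them. Everything specific here is an assumption for illustration: the `oracle` OS user, the usual file locations (/etc/sysctl.conf and /etc/security/limits.conf), and a memlock limit set exactly to the size of the HugePages pool (it must be at least that large; many sites set it higher). The helper `render_settings` is hypothetical and only prints text, it changes nothing on the host.

```python
def render_settings(nr_hugepages: int, hugepage_kib: int = 2048, os_user: str = "oracle") -> str:
    """Render example config lines for a HugePages pool of `nr_hugepages` pages."""
    memlock_kib = nr_hugepages * hugepage_kib  # limits.conf memlock is expressed in KB
    lines = [
        "# /etc/sysctl.conf (assumed location)",
        f"vm.nr_hugepages = {nr_hugepages}",
        "",
        "# /etc/security/limits.conf (assumed location)",
        f"{os_user} soft memlock {memlock_kib}",
        f"{os_user} hard memlock {memlock_kib}",
        "",
        "-- SQL*Plus as SYSDBA; static parameter, needs an instance restart",
        "ALTER SYSTEM SET use_large_pages=only SCOPE=SPFILE;",
    ]
    return "\n".join(lines)


print(render_settings(812))
```

Disabling AMM and THP, and the SUSE-specific vm.hugetlb_shm_group setting, are left out of the sketch on purpose because they depend on the release and distribution in use.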
Success!
2018-08-20T12:43:18.163509+00:00
Dump of system resources acquired for SHARED GLOBAL AREA (SGA)
2018-08-20T12:43:18.163653+00:00
Per process system memlock (soft) limit = 2048M
2018-08-20T12:43:18.163821+00:00
Expected per process system memlock (soft) limit to lock
SHARED GLOBAL AREA (SGA) into memory: 1540M
2018-08-20T12:43:18.163952+00:00
Available system pagesizes:
4K, 2048K
2018-08-20T12:43:18.164143+00:00
Supported system pagesize(s):
2018-08-20T12:43:18.164220+00:00
PAGESIZE AVAILABLE_PAGES EXPECTED_PAGES ALLOCATED_PAGES ERROR(s)
2018-08-20T12:43:18.164382+00:00
2048K 1056 770 770 NONE
[oracle@HPtesting ~]$ grep ^HugePages /proc/meminfo
HugePages_Total: 1056
HugePages_Free: 287
HugePages_Rsvd: 1
HugePages_Surp: 0
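The counters above can be cross-checked against the alert log: pages in use plus pages reserved should match what the instance asked for. A small sketch, using the values from this slide as sample input (the helper name `hugepage_counters` is made up):

```python
import re

# Sample input copied from the slide; on a live host read /proc/meminfo instead.
SAMPLE = """\
HugePages_Total:    1056
HugePages_Free:      287
HugePages_Rsvd:        1
HugePages_Surp:        0
"""


def hugepage_counters(meminfo_text: str) -> dict:
    """Extract the HugePages_* counters from /proc/meminfo-style text."""
    counters = {}
    pattern = r"^HugePages_(\w+):\s+(\d+)[ \t]*$"
    for match in re.finditer(pattern, meminfo_text, re.MULTILINE):
        counters[match.group(1)] = int(match.group(2))
    return counters


c = hugepage_counters(SAMPLE)
# Pages already touched plus pages reserved but not yet touched.
committed = c["Total"] - c["Free"] + c["Rsvd"]
print(f"committed to the instance: {committed} pages")
```

For the sample this gives 770 pages, the same number the alert log reports under ALLOCATED_PAGES.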
LargePages (A.K.A. HugePages in Windows)
● Available since Oracle 10.1
● Enabled by adding an entry to the registry, ideally per SID rather than globally
● Again, only used for the SGA
● Not counted in the "Working Set", so memory usage metrics become somewhat misleading
● Startup may be slow and may heavily impact server performance on older versions
● Oriented to DWH-type databases
Case Studies
Lack of HugePages causing trouble
RAC node eviction
● 1 node of a 2-node cluster evicted
● Logs show a timeout responding to something prior to eviction
● We found no other errors or evidence
● sar to the rescue!
RAC node eviction – “sar -r”
05:20:01 AM kbmemfree kbmemused %memused kbbuffers kbcached kbcommit %commit kbactive kbinact kbdirty
05:30:01 AM 361136 65389512 99.45 932 24477352 41041528 29.48 29509228 1880236 564
05:40:01 AM 354896 65395752 99.46 95164 24434320 41039432 29.48 29504552 1902320 552
05:50:01 AM 382940 65367708 99.42 87912 24420284 41021636 29.47 29474908 1902904 496
06:00:01 AM 385016 65365632 99.41 52432 24414712 41053708 29.49 29477412 1878860 488
06:10:01 AM 386796 65363852 99.41 596 24416944 41046880 29.48 29412032 1909420 628
06:20:02 AM 376484 65374164 99.43 596 24546212 41069336 29.50 29603108 2107020 460
06:30:01 AM 335176 65415472 99.49 596 24893684 41094396 29.52 29676840 2078424 648
06:40:05 AM 334152 65416496 99.49 596 24554064 41222332 29.61 29453168 2061660 0
06:50:03 AM 349392 65401256 99.47 596 22963852 41360864 29.71 28031816 1851900 72
07:00:10 AM 342752 65407896 99.48 596 21190320 41768848 30.00 26854676 1723480 0
07:10:04 AM 341756 65408892 99.48 596 20787592 41769828 30.00 26706944 1765980 12
Average: 414530 65336118 99.37 19907 24589646 41094908 29.52 29439910 1903200 2315
07:16:28 AM LINUX RESTART
RAC node eviction – “sar -B”
05:20:01 AM pgpgin/s pgpgout/s fault/s majflt/s pgfree/s pgscank/s pgscand/s pgsteal/s %vmeff
05:30:01 AM 7257.49 91.49 8704.61 1.28 6571.59 44.62 0.00 30.63 68.65
05:40:01 AM 2021.61 2486.25 141607.78 5.77 60734.72 451.59 38.26 386.08 78.82
05:50:01 AM 6980.26 71.57 7241.62 0.56 6380.14 35.86 7.72 38.22 87.71
06:00:01 AM 7262.73 67.56 8717.63 1.10 6549.42 47.03 1.18 40.58 84.18
06:10:01 AM 1759.35 379.66 15556.00 4.59 7320.75 185.60 2.75 143.16 76.01
06:20:02 AM 63309.67 3624.60 34222.39 267.30 50019.66 42307.14 982.44 13754.06 31.77
06:30:01 AM 115962.81 2730.86 30665.11 373.74 86055.51 843180.51 16021.66 26924.73 3.13
06:40:05 AM 83609.10 1331.45 20393.23 235.71 62484.15 1104330.76 25458.10 20843.61 1.84
06:50:03 AM 158193.69 4252.68 27395.53 375.73 111261.98 1848753.28 61689.42 38619.69 2.02
07:00:10 AM 98699.51 4257.23 23771.15 292.99 70708.84 590100.80 12354.06 23573.29 3.91
07:10:04 AM 125777.83 2409.66 23413.33 415.65 91748.06 952301.48 24672.30 31401.32 3.21
Average: 20671.81 1108.76 15287.47 51.49 18890.14 126225.66 3483.02 4065.45 3.13
07:16:28 AM LINUX RESTART
RAC node eviction - “cat /proc/meminfo” (after incident)
$ cat /proc/meminfo
MemTotal: 65918584 kB
MemFree: 1583912 kB
MemAvailable: 20034320 kB
Buffers: 416208 kB
Cached: 41349928 kB
SwapTotal: 73469948 kB
SwapFree: 73334068 kB
KernelStack: 23392 kB
PageTables: 12495120 kB
AnonHugePages: 1478656 kB
HugePages_Total: 0
HugePages_Free: 0
HugePages_Rsvd: 0
HugePages_Surp: 0
Hugepagesize: 2048 kB
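The telling lines in this output are PageTables and HugePages_Total. A short sketch that puts PageTables in proportion to total RAM, using three lines from the slide as sample input (the parser `meminfo_kib` is a hypothetical helper, and what counts as "too much" is left to the reader):

```python
# Sample lines copied from the slide; on a live host read /proc/meminfo instead.
SAMPLE = """\
MemTotal:       65918584 kB
PageTables:     12495120 kB
HugePages_Total:       0
"""


def meminfo_kib(text: str) -> dict:
    """Parse 'Key: value [kB]' lines into a dict of integers."""
    values = {}
    for line in text.splitlines():
        key, sep, rest = line.partition(":")
        fields = rest.split()
        if sep and fields and fields[0].isdigit():
            values[key.strip()] = int(fields[0])
    return values


info = meminfo_kib(SAMPLE)
share = info["PageTables"] / info["MemTotal"]
print(f"PageTables: {info['PageTables'] / 1024**2:.1f} GiB ({share:.1%} of RAM)")
print(f"HugePages configured: {info['HugePages_Total']}")
```

Close to a fifth of the node's memory spent on page tables, with no HugePages configured, fits the memory pressure seen in the sar output before the restart.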
Unexpected Swapping
● Lots of notes about swapping in the alert log
● Small database (2 GB SGA)
● Rarely used database
● vm.swappiness was not reviewed, probably at the default of 60
WARNING: Heavy swapping observed on system in last 5 mins.
pct of memory swapped in [0.27%] pct of memory swapped out [2.22%].
Please make sure there is no memory pressure and the SGA and PGA are configured correctly.
Look at DBRM trace file for more details.
CPU stealing
● Once a month, load > 400
● System unusable, but no crash
Yet again - sar is the star
[oracle@oramstr01 oracle]$ sar -f sa07 -s 14:30:00 -e 18:30:00 -u
Linux 2.6.32-573.8.1.el6.x86_64 (oramstr01.testing.com) 12/07/2016 _x86_64_ (80 CPU)
02:30:01 PM CPU %user %nice %system %iowait %steal %idle
02:40:01 PM all 38.31 0.00 3.35 3.02 0.00 55.32
02:50:01 PM all 34.02 0.00 3.15 2.63 0.00 60.19
03:00:01 PM all 34.20 0.00 3.20 1.68 0.00 60.92
03:10:01 PM all 40.79 0.00 3.81 2.96 0.00 52.44
03:20:01 PM all 37.33 0.00 3.43 2.40 0.00 56.83
03:30:04 PM all 40.72 0.00 6.12 2.62 0.00 50.54
03:40:01 PM all 10.08 0.00 88.36 0.30 0.00 1.26
03:50:02 PM all 8.66 0.00 90.82 0.05 0.00 0.47
04:00:03 PM all 31.66 0.00 68.27 0.02 0.00 0.04
04:10:03 PM all 45.84 0.00 49.19 0.90 0.00 4.07
04:20:01 PM all 40.68 0.00 54.30 0.97 0.00 4.05
04:30:01 PM all 37.81 0.00 43.22 1.13 0.00 17.84
04:40:02 PM all 15.68 0.00 84.18 0.05 0.00 0.09
04:50:02 PM all 12.76 0.00 87.23 0.00 0.00 0.01
05:00:03 PM all 11.84 0.00 88.14 0.00 0.00 0.01
05:10:01 PM all 18.56 0.00 62.74 0.71 0.00 17.99
05:20:01 PM all 15.84 0.00 1.73 1.17 0.00 81.27
05:30:01 PM all 19.22 0.00 1.75 0.71 0.00 78.33
05:40:01 PM all 25.51 0.00 2.02 1.15 0.00 71.32
05:50:01 PM all 23.78 0.00 1.85 1.05 0.00 73.32
06:00:01 PM all 20.15 0.00 1.66 0.88 0.00 77.30
06:10:01 PM all 21.14 0.00 2.68 2.93 0.00 73.25
06:20:01 PM all 18.94 0.00 2.33 2.26 0.00 76.46
Average: all 26.23 0.00 32.81 1.29 0.00 39.67
Sessions and pagetable memory
[oracle@oramstr01 ~]$ date
Wed Dec 14 10:23:22 EST 2016
[oracle@oramstr01 ~]$ ps -ef | grep -c oracleccxp
2717
[oracle@oramstr01 ~]$ grep PageTable /proc/meminfo
PageTables: 461002976 kB
Yes, that is 440 GiB of PageTables!
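Dividing the two numbers from this slide gives a feel for the cost per session. The sketch below uses only the figures shown; the last step, which translates the per-process page table size back into mapped memory, assumes 8-byte entries and 4 KiB pages as on the earlier slides and should be read as a rough estimate.

```python
pagetables_kib = 461_002_976  # PageTables line from /proc/meminfo on the slide
processes = 2717              # result of the ps | grep -c command on the slide

total_gib = pagetables_kib / 1024**2
per_process_mib = pagetables_kib / processes / 1024

# Assumption: 8-byte entries, each mapping one 4 KiB page.
mapped_per_process_gib = per_process_mib * 1024**2 / 8 * 4096 / 1024**3

print(f"total PageTables: {total_gib:.0f} GiB")
print(f"per process: {per_process_mib:.0f} MiB")
print(f"roughly {mapped_per_process_gib:.0f} GiB mapped per process")
```

Note that the process count also includes anything else matching the grep pattern, so the per-process figure is an approximation.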
Yet again - sar is the star
[oracle@oramstr01 oracle]$ sar -r -f sa07 -s 14:30:00 -e 17:30:00
Linux 2.6.32-573.8.1.el6.x86_64 (oramstr01.testing.com) 12/07/2016 _x86_64_ (80 CPU)
02:30:01 PM kbmemfree kbmemused %memused kbbuffers kbcached kbcommit %commit
02:40:01 PM 57063440 1001654588 94.61 1610536 412910108 311506624 26.72
02:50:01 PM 57747440 1000970588 94.55 1610564 412911112 306685964 26.31
03:00:01 PM 46592180 1012125848 95.60 1610596 412914608 308191992 26.44
03:10:01 PM 31680840 1027037188 97.01 1610608 412917688 310993660 26.68
03:20:01 PM 16457972 1042260056 98.45 1610628 412920472 309580976 26.56
03:30:04 PM 1739692 1056978336 99.84 1610628 411393436 317613764 27.25
03:40:01 PM 5066928 1053651100 99.52 1538352 395198928 324298580 27.82
03:50:02 PM 28196104 1030521924 97.34 1342292 381568100 324394208 27.83
04:00:03 PM 11313156 1047404872 98.93 1359396 378901468 326693864 28.03
04:10:03 PM 80061500 978656528 92.44 1359488 375162128 321167148 27.55
04:20:01 PM 64494004 994224024 93.91 1359508 375163768 322061964 27.63
04:30:01 PM 108230896 950487132 89.78 1359532 375166776 313685004 26.91
04:40:02 PM 135833716 922884312 87.17 1359548 375168248 318691876 27.34
04:50:02 PM 192323488 866394540 81.83 1359556 375169736 315572568 27.07
05:00:03 PM 235108136 823609892 77.79 1359648 375172216 312304460 26.79
05:10:01 PM 360281464 698436564 65.97 1359724 375173424 295083536 25.32
05:20:01 PM 357150032 701567996 66.27 1359748 375175952 296449248 25.43
Summary
● HugePages are usually good to have
● How to implement
● Know where to look
● /proc/meminfo
■ HugePages
■ PageTables
● Remember the power of sar/OSWbb
● Following best practices prevents issues
References
● Oracle 11g internals part 1: Automatic Memory Management, by Tanel Poder
● Oracle SGA memory allocation on startup, by Frits Hoogland
● Oracle Linux: Shell Script to Calculate Values Recommended Linux HugePages / HugeTLB Configuration (Doc ID 401749.1)
● Oracle Exadata Initialization Parameters and Diskgroup Attributes Best Practices (Doc ID 2062068.1)
● 12.2 Grid Infrastructure and Database Upgrade steps for Exadata Database Machine running 11.2.0.3 and later on Oracle Linux (Doc ID 2111010.1)
● ASM & Shared Pool (ORA-4031) (Doc ID 437924.1)
Q&A
Ask now or reach out later, but don't keep the question to yourself
THANK YOU
Hope you enjoyed it

More Related Content

What's hot

Introduction to CNI (Container Network Interface)
Introduction to CNI (Container Network Interface)Introduction to CNI (Container Network Interface)
Introduction to CNI (Container Network Interface)HungWei Chiu
 
iptables 101- bottom-up
iptables 101- bottom-upiptables 101- bottom-up
iptables 101- bottom-upHungWei Chiu
 
Ceph Object Storage Reference Architecture Performance and Sizing Guide
Ceph Object Storage Reference Architecture Performance and Sizing GuideCeph Object Storage Reference Architecture Performance and Sizing Guide
Ceph Object Storage Reference Architecture Performance and Sizing GuideKaran Singh
 
PostgreSQL on EXT4, XFS, BTRFS and ZFS
PostgreSQL on EXT4, XFS, BTRFS and ZFSPostgreSQL on EXT4, XFS, BTRFS and ZFS
PostgreSQL on EXT4, XFS, BTRFS and ZFSTomas Vondra
 
TripleOの光と闇
TripleOの光と闇TripleOの光と闇
TripleOの光と闇Manabu Ori
 
Nick Fisk - low latency Ceph
Nick Fisk - low latency CephNick Fisk - low latency Ceph
Nick Fisk - low latency CephShapeBlue
 
Container Networking Deep Dive
Container Networking Deep DiveContainer Networking Deep Dive
Container Networking Deep DiveHirofumi Ichihara
 
Crimson: Ceph for the Age of NVMe and Persistent Memory
Crimson: Ceph for the Age of NVMe and Persistent MemoryCrimson: Ceph for the Age of NVMe and Persistent Memory
Crimson: Ceph for the Age of NVMe and Persistent MemoryScyllaDB
 
ceph optimization on ssd ilsoo byun-short
ceph optimization on ssd ilsoo byun-shortceph optimization on ssd ilsoo byun-short
ceph optimization on ssd ilsoo byun-shortNAVER D2
 
ML2/OVN アーキテクチャ概観
ML2/OVN アーキテクチャ概観ML2/OVN アーキテクチャ概観
ML2/OVN アーキテクチャ概観Yamato Tanaka
 
PCE 〜MPLSネットワークのSDN化を本気で実現する"唯一の"方法〜
PCE 〜MPLSネットワークのSDN化を本気で実現する"唯一の"方法〜PCE 〜MPLSネットワークのSDN化を本気で実現する"唯一の"方法〜
PCE 〜MPLSネットワークのSDN化を本気で実現する"唯一の"方法〜Takuya Miyasaka
 
Performance tuning in BlueStore & RocksDB - Li Xiaoyan
Performance tuning in BlueStore & RocksDB - Li XiaoyanPerformance tuning in BlueStore & RocksDB - Li Xiaoyan
Performance tuning in BlueStore & RocksDB - Li XiaoyanCeph Community
 
Writing the Container Network Interface(CNI) plugin in golang
Writing the Container Network Interface(CNI) plugin in golangWriting the Container Network Interface(CNI) plugin in golang
Writing the Container Network Interface(CNI) plugin in golangHungWei Chiu
 
Overview of Distributed Virtual Router (DVR) in Openstack/Neutron
Overview of Distributed Virtual Router (DVR) in Openstack/NeutronOverview of Distributed Virtual Router (DVR) in Openstack/Neutron
Overview of Distributed Virtual Router (DVR) in Openstack/Neutronvivekkonnect
 
BlueStore, A New Storage Backend for Ceph, One Year In
BlueStore, A New Storage Backend for Ceph, One Year InBlueStore, A New Storage Backend for Ceph, One Year In
BlueStore, A New Storage Backend for Ceph, One Year InSage Weil
 
VPP事始め
VPP事始めVPP事始め
VPP事始めnpsg
 
Kvm performance optimization for ubuntu
Kvm performance optimization for ubuntuKvm performance optimization for ubuntu
Kvm performance optimization for ubuntuSim Janghoon
 
AF Ceph: Ceph Performance Analysis and Improvement on Flash
AF Ceph: Ceph Performance Analysis and Improvement on FlashAF Ceph: Ceph Performance Analysis and Improvement on Flash
AF Ceph: Ceph Performance Analysis and Improvement on FlashCeph Community
 
Linux Kernel vs DPDK: HTTP Performance Showdown
Linux Kernel vs DPDK: HTTP Performance ShowdownLinux Kernel vs DPDK: HTTP Performance Showdown
Linux Kernel vs DPDK: HTTP Performance ShowdownScyllaDB
 

What's hot (20)

Introduction to CNI (Container Network Interface)
Introduction to CNI (Container Network Interface)Introduction to CNI (Container Network Interface)
Introduction to CNI (Container Network Interface)
 
iptables 101- bottom-up
iptables 101- bottom-upiptables 101- bottom-up
iptables 101- bottom-up
 
Ceph Object Storage Reference Architecture Performance and Sizing Guide
Ceph Object Storage Reference Architecture Performance and Sizing GuideCeph Object Storage Reference Architecture Performance and Sizing Guide
Ceph Object Storage Reference Architecture Performance and Sizing Guide
 
PostgreSQL on EXT4, XFS, BTRFS and ZFS
PostgreSQL on EXT4, XFS, BTRFS and ZFSPostgreSQL on EXT4, XFS, BTRFS and ZFS
PostgreSQL on EXT4, XFS, BTRFS and ZFS
 
TripleOの光と闇
TripleOの光と闇TripleOの光と闇
TripleOの光と闇
 
Nick Fisk - low latency Ceph
Nick Fisk - low latency CephNick Fisk - low latency Ceph
Nick Fisk - low latency Ceph
 
TripleO Deep Dive 1.1
TripleO Deep Dive 1.1TripleO Deep Dive 1.1
TripleO Deep Dive 1.1
 
Container Networking Deep Dive
Container Networking Deep DiveContainer Networking Deep Dive
Container Networking Deep Dive
 
Crimson: Ceph for the Age of NVMe and Persistent Memory
Crimson: Ceph for the Age of NVMe and Persistent MemoryCrimson: Ceph for the Age of NVMe and Persistent Memory
Crimson: Ceph for the Age of NVMe and Persistent Memory
 
ceph optimization on ssd ilsoo byun-short
ceph optimization on ssd ilsoo byun-shortceph optimization on ssd ilsoo byun-short
ceph optimization on ssd ilsoo byun-short
 
ML2/OVN アーキテクチャ概観
ML2/OVN アーキテクチャ概観ML2/OVN アーキテクチャ概観
ML2/OVN アーキテクチャ概観
 
PCE 〜MPLSネットワークのSDN化を本気で実現する"唯一の"方法〜
PCE 〜MPLSネットワークのSDN化を本気で実現する"唯一の"方法〜PCE 〜MPLSネットワークのSDN化を本気で実現する"唯一の"方法〜
PCE 〜MPLSネットワークのSDN化を本気で実現する"唯一の"方法〜
 
Performance tuning in BlueStore & RocksDB - Li Xiaoyan
Performance tuning in BlueStore & RocksDB - Li XiaoyanPerformance tuning in BlueStore & RocksDB - Li Xiaoyan
Performance tuning in BlueStore & RocksDB - Li Xiaoyan
 
Writing the Container Network Interface(CNI) plugin in golang
Writing the Container Network Interface(CNI) plugin in golangWriting the Container Network Interface(CNI) plugin in golang
Writing the Container Network Interface(CNI) plugin in golang
 
Overview of Distributed Virtual Router (DVR) in Openstack/Neutron
Overview of Distributed Virtual Router (DVR) in Openstack/NeutronOverview of Distributed Virtual Router (DVR) in Openstack/Neutron
Overview of Distributed Virtual Router (DVR) in Openstack/Neutron
 
BlueStore, A New Storage Backend for Ceph, One Year In
BlueStore, A New Storage Backend for Ceph, One Year InBlueStore, A New Storage Backend for Ceph, One Year In
BlueStore, A New Storage Backend for Ceph, One Year In
 
VPP事始め
VPP事始めVPP事始め
VPP事始め
 
Kvm performance optimization for ubuntu
Kvm performance optimization for ubuntuKvm performance optimization for ubuntu
Kvm performance optimization for ubuntu
 
AF Ceph: Ceph Performance Analysis and Improvement on Flash
AF Ceph: Ceph Performance Analysis and Improvement on FlashAF Ceph: Ceph Performance Analysis and Improvement on Flash
AF Ceph: Ceph Performance Analysis and Improvement on Flash
 
Linux Kernel vs DPDK: HTTP Performance Showdown
Linux Kernel vs DPDK: HTTP Performance ShowdownLinux Kernel vs DPDK: HTTP Performance Showdown
Linux Kernel vs DPDK: HTTP Performance Showdown
 

Similar to Huge pages why-what-how

Deploying MariaDB for HA on Google Cloud Platform
Deploying MariaDB for HA on Google Cloud PlatformDeploying MariaDB for HA on Google Cloud Platform
Deploying MariaDB for HA on Google Cloud PlatformMariaDB plc
 
Monitoring with Clickhouse
Monitoring with ClickhouseMonitoring with Clickhouse
Monitoring with Clickhouseunicast
 
Implementing MySQL Database-as-a-Service using open source tools
Implementing MySQL Database-as-a-Service using open source toolsImplementing MySQL Database-as-a-Service using open source tools
Implementing MySQL Database-as-a-Service using open source toolsAll Things Open
 
One bridge to connect them all. Oracle GoldenGate for Big Data.UKOUG Tech 2018
One bridge to connect them all. Oracle GoldenGate for Big Data.UKOUG Tech 2018One bridge to connect them all. Oracle GoldenGate for Big Data.UKOUG Tech 2018
One bridge to connect them all. Oracle GoldenGate for Big Data.UKOUG Tech 2018Gleb Otochkin
 
Yarn optimization (Real life use case)
Yarn optimization (Real life use case)Yarn optimization (Real life use case)
Yarn optimization (Real life use case)Jean-Louis Quéguiner
 
Full scan frenzy at amadeus
Full scan frenzy at amadeusFull scan frenzy at amadeus
Full scan frenzy at amadeusMongoDB
 
[RakutenTechConf2013] [C-1] Rakuten new infrastructure
[RakutenTechConf2013] [C-1] Rakuten new infrastructure[RakutenTechConf2013] [C-1] Rakuten new infrastructure
[RakutenTechConf2013] [C-1] Rakuten new infrastructureRakuten Group, Inc.
 
Make your data fly - Building data platform in AWS
Make your data fly - Building data platform in AWSMake your data fly - Building data platform in AWS
Make your data fly - Building data platform in AWSKimmo Kantojärvi
 
Key considerations in productionizing streaming applications
Key considerations in productionizing streaming applicationsKey considerations in productionizing streaming applications
Key considerations in productionizing streaming applicationsKafkaZone
 
“Building consistent and highly available distributed systems with Apache Ign...
“Building consistent and highly available distributed systems with Apache Ign...“Building consistent and highly available distributed systems with Apache Ign...
“Building consistent and highly available distributed systems with Apache Ign...Tom Diederich
 
Amazon EC2 deepdive and a sprinkel of AWS Compute | AWS Floor28
Amazon EC2 deepdive and a sprinkel of AWS Compute | AWS Floor28Amazon EC2 deepdive and a sprinkel of AWS Compute | AWS Floor28
Amazon EC2 deepdive and a sprinkel of AWS Compute | AWS Floor28Amazon Web Services
 
MySQL Scalability and Reliability for Replicated Environment
MySQL Scalability and Reliability for Replicated EnvironmentMySQL Scalability and Reliability for Replicated Environment
MySQL Scalability and Reliability for Replicated EnvironmentJean-François Gagné
 
20190909_PGconf.ASIA_KaiGai
20190909_PGconf.ASIA_KaiGai20190909_PGconf.ASIA_KaiGai
20190909_PGconf.ASIA_KaiGaiKohei KaiGai
 
PGConf.ASIA 2019 Bali - Full-throttle Running on Terabytes Log-data - Kohei K...
PGConf.ASIA 2019 Bali - Full-throttle Running on Terabytes Log-data - Kohei K...PGConf.ASIA 2019 Bali - Full-throttle Running on Terabytes Log-data - Kohei K...
PGConf.ASIA 2019 Bali - Full-throttle Running on Terabytes Log-data - Kohei K...Equnix Business Solutions
 
Webinar slides: How to Automate & Manage PostgreSQL with ClusterControl
Webinar slides: How to Automate & Manage PostgreSQL with ClusterControlWebinar slides: How to Automate & Manage PostgreSQL with ClusterControl
Webinar slides: How to Automate & Manage PostgreSQL with ClusterControlSeveralnines
 
From the trenches: scaling a large log management deployment
From the trenches: scaling a large log management deploymentFrom the trenches: scaling a large log management deployment
From the trenches: scaling a large log management deploymentFaithWestdorp
 
Container Attached Storage (CAS) with OpenEBS - SDC 2018
Container Attached Storage (CAS) with OpenEBS -  SDC 2018Container Attached Storage (CAS) with OpenEBS -  SDC 2018
Container Attached Storage (CAS) with OpenEBS - SDC 2018OpenEBS
 
Scalable Multi-Node Deep Learning Training in the Cloud (CMP368-R1) - AWS re:...
Scalable Multi-Node Deep Learning Training in the Cloud (CMP368-R1) - AWS re:...Scalable Multi-Node Deep Learning Training in the Cloud (CMP368-R1) - AWS re:...
Scalable Multi-Node Deep Learning Training in the Cloud (CMP368-R1) - AWS re:...Amazon Web Services
 

Similar to Huge pages why-what-how (20)

Deploying MariaDB for HA on Google Cloud Platform
Deploying MariaDB for HA on Google Cloud PlatformDeploying MariaDB for HA on Google Cloud Platform
Deploying MariaDB for HA on Google Cloud Platform
 
Monitoring with Clickhouse
Monitoring with ClickhouseMonitoring with Clickhouse
Monitoring with Clickhouse
 
Implementing MySQL Database-as-a-Service using open source tools
Implementing MySQL Database-as-a-Service using open source toolsImplementing MySQL Database-as-a-Service using open source tools
Implementing MySQL Database-as-a-Service using open source tools
 
One bridge to connect them all. Oracle GoldenGate for Big Data.UKOUG Tech 2018
One bridge to connect them all. Oracle GoldenGate for Big Data.UKOUG Tech 2018One bridge to connect them all. Oracle GoldenGate for Big Data.UKOUG Tech 2018
One bridge to connect them all. Oracle GoldenGate for Big Data.UKOUG Tech 2018
 
Yarn optimization (Real life use case)
Yarn optimization (Real life use case)Yarn optimization (Real life use case)
Yarn optimization (Real life use case)
 
Full scan frenzy at amadeus
Full scan frenzy at amadeusFull scan frenzy at amadeus
Full scan frenzy at amadeus
 
[RakutenTechConf2013] [C-1] Rakuten new infrastructure
[RakutenTechConf2013] [C-1] Rakuten new infrastructure[RakutenTechConf2013] [C-1] Rakuten new infrastructure
[RakutenTechConf2013] [C-1] Rakuten new infrastructure
 
Make your data fly - Building data platform in AWS
Make your data fly - Building data platform in AWSMake your data fly - Building data platform in AWS
Make your data fly - Building data platform in AWS
 
Key considerations in productionizing streaming applications
Key considerations in productionizing streaming applicationsKey considerations in productionizing streaming applications
Key considerations in productionizing streaming applications
 
“Building consistent and highly available distributed systems with Apache Ign...
“Building consistent and highly available distributed systems with Apache Ign...“Building consistent and highly available distributed systems with Apache Ign...
“Building consistent and highly available distributed systems with Apache Ign...
 
ch9_virMem.pdf
ch9_virMem.pdfch9_virMem.pdf
ch9_virMem.pdf
 
Amazon EC2 deepdive and a sprinkel of AWS Compute | AWS Floor28
Amazon EC2 deepdive and a sprinkel of AWS Compute | AWS Floor28Amazon EC2 deepdive and a sprinkel of AWS Compute | AWS Floor28
Amazon EC2 deepdive and a sprinkel of AWS Compute | AWS Floor28
 
Cloud Native with Kyma
Cloud Native with KymaCloud Native with Kyma
Cloud Native with Kyma
 
MySQL Scalability and Reliability for Replicated Environment
MySQL Scalability and Reliability for Replicated EnvironmentMySQL Scalability and Reliability for Replicated Environment
MySQL Scalability and Reliability for Replicated Environment
 
20190909_PGconf.ASIA_KaiGai
20190909_PGconf.ASIA_KaiGai20190909_PGconf.ASIA_KaiGai
20190909_PGconf.ASIA_KaiGai
 
PGConf.ASIA 2019 Bali - Full-throttle Running on Terabytes Log-data - Kohei K...
PGConf.ASIA 2019 Bali - Full-throttle Running on Terabytes Log-data - Kohei K...PGConf.ASIA 2019 Bali - Full-throttle Running on Terabytes Log-data - Kohei K...
PGConf.ASIA 2019 Bali - Full-throttle Running on Terabytes Log-data - Kohei K...
 
Webinar slides: How to Automate & Manage PostgreSQL with ClusterControl
Webinar slides: How to Automate & Manage PostgreSQL with ClusterControlWebinar slides: How to Automate & Manage PostgreSQL with ClusterControl
Webinar slides: How to Automate & Manage PostgreSQL with ClusterControl
 
From the trenches: scaling a large log management deployment
From the trenches: scaling a large log management deploymentFrom the trenches: scaling a large log management deployment
From the trenches: scaling a large log management deployment
 
Container Attached Storage (CAS) with OpenEBS - SDC 2018
Container Attached Storage (CAS) with OpenEBS -  SDC 2018Container Attached Storage (CAS) with OpenEBS -  SDC 2018
Container Attached Storage (CAS) with OpenEBS - SDC 2018
 
Scalable Multi-Node Deep Learning Training in the Cloud (CMP368-R1) - AWS re:...
Scalable Multi-Node Deep Learning Training in the Cloud (CMP368-R1) - AWS re:...Scalable Multi-Node Deep Learning Training in the Cloud (CMP368-R1) - AWS re:...
Scalable Multi-Node Deep Learning Training in the Cloud (CMP368-R1) - AWS re:...
 

Huge pages why-what-how

  • 1. August 2018 Refreshing our knowledge HugePages: Why, what and how
  • 2. What's up with HugePages? © The Pythian Group Inc., 2018
  • 3. Jose Rodriguez, Project engineer at Pythian
      ● 10+ years of experience, mainly Oracle but also SQL Server and others like DB2 LUW or PostgreSQL
      ● Solaris, Linux and Windows RAC and HA with DG and GG
      Other areas of expertise, i.e., things I like doing
      ● Scripting and automation (lazy DBA)
      ● Machine Learning
      ● Golden Gate replication
      ● Cloud related stuff (who doesn't nowadays, eh?)
  • 4. About Pythian Pythian’s 400+ IT professionals help companies adopt and manage disruptive data technologies to better compete © 2018 Pythian. Confidential 4
  • 5. Systems currently managed by Pythian EXPERIENCED Pythian experts in 35 countries GLOBAL Millennia of experience gathered and shared over 19 years EXPERTS 11,800 2400+
  • 6. What are you taking away So you can leave now if you already have it
  • 7. Agenda
      ● Why do we care?
      ● What are HugePages?
      ● How to implement?
      ● What can happen - Case Studies
  • 8. Why do we care?
      ● It is 2019, yet HugePages still do not seem to be widely understood or implemented
      ● More memory -> more problems
      ● Systems with >= 1 TB RAM are common nowadays
      ● Problems caused by lack of HugePages are not always easy to spot
  • 9. What Are HugePages: Behind the scenes
  • 10. Virtual to physical memory mapping
      [Diagram: virtual addresses in the user process (0x00, 0x01, ..., 0x23, 0x24) mapped to pages in main memory (42, 157, 245)]
  • 11. Memory allocations are tracked in PageTables (4 kB pages)
      virtual  real
      0x00     42
      0x01     175
      0x02     176
      0x03     177
      0x04     178
      ...      ...
  • 12. Virtual to physical memory mapping
      [Diagram: user process -> PageTable -> physical memory]
      PageTable (virtual -> real): 0x00 -> 42, 0x01 -> 175, 0x02 -> 176, 0x03 -> 177, 0x04 -> 178, ...
  • 13. What's up with PageTables?
      ● To allocate 100 GiB there will be 26,214,400 memory pages of 4 KiB each
      ● An OS would typically group and map them hierarchically in frames
          ■ i.e. contiguous space can be mapped more efficiently
      ● Each PageTable Entry (PTE) is around 8 bytes on 64-bit systems
      ● Vmem offset + Physical address + Flags
      ● PageTables are also stored in memory. Size would be 200 MiB in our example
      ● For shared memory segments (e.g. SGA) each process has its own copy of the PageTable
      ● A regular single instance may have 1000 sessions: 1000 * 200 MiB each is about 195 GiB (roughly 200 GB) of RAM just to track RAM
  • 14. HugePages to the Rescue!
      [Diagram: PageTable (virtual -> real) where each entry maps a 2048 KiB page instead of a 4 KiB page]
      ● PageTable reduced 512 times, to only ~400 KiB
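
The figures on the PageTables and HugePages slides can be re-derived with a few lines of arithmetic. This is only a sketch: the function name `pagetable_bytes` is made up for illustration, and it assumes a flat model of one 8-byte entry per mapped page, ignoring the upper levels of the real hierarchical page table.

```python
GIB = 1024 ** 3
MIB = 1024 ** 2
KIB = 1024


def pagetable_bytes(mapped_bytes: int, page_size: int, pte_size: int = 8) -> int:
    """Estimate PageTable size: one PTE per mapped page (flat model, assumption)."""
    pages = -(-mapped_bytes // page_size)  # ceiling division
    return pages * pte_size


if __name__ == "__main__":
    sga = 100 * GIB
    small = pagetable_bytes(sga, 4 * KIB)   # regular 4 KiB pages
    huge = pagetable_bytes(sga, 2 * MIB)    # 2 MiB HugePages
    print(f"4 KiB pages: {small / MIB:.0f} MiB of PageTable per process")
    print(f"2 MiB pages: {huge / KIB:.0f} KiB of PageTable per process")
    print(f"1000 sessions on 4 KiB pages: {1000 * small / GIB:.1f} GiB")
```

The per-process cost matters because every session attached to the same shared segment carries its own PageTable, so the total scales with the number of sessions.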
  • 15. HugePages additional facts
      ● Allocate only as many HugePages as needed
      ● HugePages cannot be swapped out
      ● Oracle Automatic Memory Management (AMM) is incompatible with HugePages
      ● Transparent HugePages (THP) do not get along well with Oracle; they are disabled by default in UEK2+
      ● Platforms other than Linux x64 offer even bigger large page sizes, up to 1 GiB
      ● In extreme cases, an SGA of several TiB may lead to slow instance startup. PRE_PAGE_SGA may help here
      ● AMM is forbidden in 12.2 if RAM > 4 GiB, so HugePages should be used here
  • 16. HugePages and ASM
      ● Do we really need/want HugePages for ASM?
      ● ASM uses AMM by default so initially not HP compatible. /dev/shm is important here.
      ● We don't for "regular" ASM instances. Documentation and best practices say this clearly, although this may change in future releases.
      ● Highly recommended for Exadata. MOS notes 2062068.1 and 2111010.1 clearly indicate that ASMM should be enabled and HugePages available for ASM.
  • 17. HugePages and /dev/shm
      ● /dev/shm is automatically set to 50% of total RAM
      ● Oracle AMM uses /dev/shm to "store" shared memory pages
      ● We may be tempted to reduce the size of /dev/shm to leave more room for HugePages. No need
      ● HugePages and AMM are incompatible
  • 18. HugePages and PGA
      ● Does Oracle use HugePages for PGA?
      ● No, it doesn't (currently)
      ● No hard evidence against it in docs or MOS
      ● Tests show that Oracle is not allocating HugePages for it
      ● Counterintuitive for small memory allocations
      ● May change in the future (DWH or DSS sort area)
  • 19. HugePages on the Cloud
      ● Supported on AWS RDS since July 2017 but not enabled by default. There are limitations on the instance types you can enable HP on.
      ● No official documentation on Azure, but a recent test showed that we can set up HP in a Linux VM running on Azure.
      ● Google Cloud Platform supports HugePages.
      ● Oracle Cloud Service: officially supported for Exadata Cloud Service. OCI allows it but not by default. Classic platform has it enabled by default.
  • 20. Let's do it!
  • 21. How many HugePages do I need?
      ● Script provided in MOS: "Oracle Linux: Shell Script to Calculate Values Recommended Linux HugePages / HugeTLB Configuration (Doc ID 401749.1)"
      ● Or use the following formula: SGA size (MiB) / 2 (MiB) + 42
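
As a rough illustration of the formula on this slide, the sketch below turns an SGA size into a page count. The function name `hugepages_needed` is hypothetical, the 2 MiB page size matches the Linux x64 default shown elsewhere in the deck, and the constant 42 is simply the safety margin the slide uses; the MOS script remains the supported way to size this.

```python
import math


def hugepages_needed(sga_mib: float, hugepage_mib: int = 2, margin: int = 42) -> int:
    """Slide formula: SGA size (MiB) / 2 (MiB) + 42, rounded up to whole pages."""
    return math.ceil(sga_mib / hugepage_mib) + margin


if __name__ == "__main__":
    # 1540 MiB is the SGA size reported in the alert log excerpt in this deck.
    pages = hugepages_needed(1540)
    print(f"vm.nr_hugepages = {pages}")
```

When several instances share a host, the page counts for each SGA would need to be added together before setting vm.nr_hugepages.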
  • 22. Implementation steps
      ● May need extra work on VMs
      ● Disable AMM
      ● Set use_large_pages=only
      ● Disable THP
      ● Set memlock user limit
      ● Set vm.nr_hugepages
      ● Set vm.hugetlb_shm_group as required (SUSE)
      ● Reboot OS (not always required)
      ● Restart Oracle instance
      ● Use TuneD profiles on RHEL 7 and above
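
One of the steps above, disabling THP, is easy to verify after the fact. The sketch below reads the kernel's THP setting and reports the active mode, which the kernel marks with square brackets. The helper name `active_thp_mode` is made up for illustration, and the sysfs path is the usual mainline location; some older vendor kernels expose it under a different directory, so treat the path as an assumption.

```python
from pathlib import Path

# Usual mainline location (assumption: may differ on older vendor kernels).
THP_ENABLED = Path("/sys/kernel/mm/transparent_hugepage/enabled")


def active_thp_mode(text: str) -> str:
    """Return the bracketed (active) mode from text like 'always madvise [never]'."""
    for token in text.split():
        if token.startswith("[") and token.endswith("]"):
            return token[1:-1]
    raise ValueError(f"no active mode found in {text!r}")


if __name__ == "__main__":
    if THP_ENABLED.exists():
        mode = active_thp_mode(THP_ENABLED.read_text())
        note = "" if mode == "never" else " (THP is not disabled)"
        print(f"THP mode: {mode}{note}")
    else:
        print("THP sysfs file not found; the path may differ on this kernel")
```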
  • 23. Success!
      2018-08-20T12:43:18.163509+00:00 Dump of system resources acquired for SHARED GLOBAL AREA (SGA)
      2018-08-20T12:43:18.163653+00:00 Per process system memlock (soft) limit = 2048M
      2018-08-20T12:43:18.163821+00:00 Expected per process system memlock (soft) limit to lock SHARED GLOBAL AREA (SGA) into memory: 1540M
      2018-08-20T12:43:18.163952+00:00 Available system pagesizes: 4K, 2048K
      2018-08-20T12:43:18.164143+00:00 Supported system pagesize(s):
      2018-08-20T12:43:18.164220+00:00 PAGESIZE AVAILABLE_PAGES EXPECTED_PAGES ALLOCATED_PAGES ERROR(s)
      2018-08-20T12:43:18.164382+00:00 2048K 1056 770 770 NONE

      [oracle@HPtesting ~]$ grep ^HugePages /proc/meminfo
      HugePages_Total: 1056
      HugePages_Free: 287
      HugePages_Rsvd: 1
      HugePages_Surp: 0
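
The two outputs on this slide can be reconciled with a small calculation: pages in use are the total minus the free pages, plus the pages reserved but not yet touched. The sketch below parses /proc/meminfo-style text and applies that rule to the values shown on the slide. Function names are hypothetical, and the sample text is copied from the slide rather than read from a live system.

```python
def hugepage_counters(meminfo_text: str) -> dict:
    """Extract the HugePages_* counters from /proc/meminfo-style text."""
    counters = {}
    for line in meminfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip().startswith("HugePages_"):
            counters[key.strip()] = int(value.split()[0])
    return counters


def pages_in_use(counters: dict) -> int:
    """Pages already touched plus pages reserved but not yet touched."""
    return (counters["HugePages_Total"]
            - counters["HugePages_Free"]
            + counters["HugePages_Rsvd"])


# Values taken from the slide; on a real host read /proc/meminfo instead.
SAMPLE = """\
HugePages_Total:    1056
HugePages_Free:      287
HugePages_Rsvd:        1
HugePages_Surp:        0
"""

if __name__ == "__main__":
    counters = hugepage_counters(SAMPLE)
    print(f"HugePages in use or reserved: {pages_in_use(counters)}")
```

The result lines up with the ALLOCATED_PAGES column of the alert log, which is a quick way to confirm that the SGA really landed in HugePages.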
  • 24. LargePages (A.K.A. HugePages in Windows)
      ● Available since Oracle 10.1
      ● Enabled by adding an entry to the registry, ideally per SID rather than globally
      ● Again only used for SGA
      ● Not counted in the "Working Set", so memory usage metrics become somewhat misleading
      ● On older versions, startup may be slow and have a high impact on server performance
      ● Oriented to DWH type databases
  • 25. Case Studies: Lack of HugePages causing trouble
  • 26. RAC node eviction
      ● 1 node of 2-node cluster evicted
      ● Logs show a timeout responding to something prior to eviction
      ● We found no other errors or evidence
      ● sar to the rescue!
  • 27. RAC node eviction - “sar -r”
      05:20:01 AM kbmemfree kbmemused %memused kbbuffers kbcached kbcommit %commit kbactive kbinact kbdirty
      05:30:01 AM 361136 65389512 99.45 932 24477352 41041528 29.48 29509228 1880236 564
      05:40:01 AM 354896 65395752 99.46 95164 24434320 41039432 29.48 29504552 1902320 552
      05:50:01 AM 382940 65367708 99.42 87912 24420284 41021636 29.47 29474908 1902904 496
      06:00:01 AM 385016 65365632 99.41 52432 24414712 41053708 29.49 29477412 1878860 488
      06:10:01 AM 386796 65363852 99.41 596 24416944 41046880 29.48 29412032 1909420 628
      06:20:02 AM 376484 65374164 99.43 596 24546212 41069336 29.50 29603108 2107020 460
      06:30:01 AM 335176 65415472 99.49 596 24893684 41094396 29.52 29676840 2078424 648
      06:40:05 AM 334152 65416496 99.49 596 24554064 41222332 29.61 29453168 2061660 0
      06:50:03 AM 349392 65401256 99.47 596 22963852 41360864 29.71 28031816 1851900 72
      07:00:10 AM 342752 65407896 99.48 596 21190320 41768848 30.00 26854676 1723480 0
      07:10:04 AM 341756 65408892 99.48 596 20787592 41769828 30.00 26706944 1765980 12
      Average: 414530 65336118 99.37 19907 24589646 41094908 29.52 29439910 1903200 2315
      07:16:28 AM LINUX RESTART
  • 28. RAC node eviction - “sar -B”
      05:20:01 AM pgpgin/s pgpgout/s fault/s majflt/s pgfree/s pgscank/s pgscand/s pgsteal/s %vmeff
      05:30:01 AM 7257.49 91.49 8704.61 1.28 6571.59 44.62 0.00 30.63 68.65
      05:40:01 AM 2021.61 2486.25 141607.78 5.77 60734.72 451.59 38.26 386.08 78.82
      05:50:01 AM 6980.26 71.57 7241.62 0.56 6380.14 35.86 7.72 38.22 87.71
      06:00:01 AM 7262.73 67.56 8717.63 1.10 6549.42 47.03 1.18 40.58 84.18
      06:10:01 AM 1759.35 379.66 15556.00 4.59 7320.75 185.60 2.75 143.16 76.01
      06:20:02 AM 63309.67 3624.60 34222.39 267.30 50019.66 42307.14 982.44 13754.06 31.77
      06:30:01 AM 115962.81 2730.86 30665.11 373.74 86055.51 843180.51 16021.66 26924.73 3.13
      06:40:05 AM 83609.10 1331.45 20393.23 235.71 62484.15 1104330.76 25458.10 20843.61 1.84
      06:50:03 AM 158193.69 4252.68 27395.53 375.73 111261.98 1848753.28 61689.42 38619.69 2.02
      07:00:10 AM 98699.51 4257.23 23771.15 292.99 70708.84 590100.80 12354.06 23573.29 3.91
      07:10:04 AM 125777.83 2409.66 23413.33 415.65 91748.06 952301.48 24672.30 31401.32 3.21
      Average: 20671.81 1108.76 15287.47 51.49 18890.14 126225.66 3483.02 4065.45 3.13
      07:16:28 AM LINUX RESTART
  • 29. RAC node eviction - “cat /proc/meminfo” (after incident)
      $ cat /proc/meminfo
      MemTotal: 65918584 kB
      MemFree: 1583912 kB
      MemAvailable: 20034320 kB
      Buffers: 416208 kB
      Cached: 41349928 kB
      SwapTotal: 73469948 kB
      SwapFree: 73334068 kB
      KernelStack: 23392 kB
      PageTables: 12495120 kB
      AnonHugePages: 1478656 kB
      HugePages_Total: 0
      HugePages_Free: 0
      HugePages_Rsvd: 0
      HugePages_Surp: 0
      Hugepagesize: 2048 kB
  • 30. Unexpected Swapping
      ● Lots of notes about swapping in alert log
      ● Small (2GB SGA)
      ● Rarely used database
      ● vm.swappiness was not reviewed, probably at 60
      WARNING: Heavy swapping observed on system in last 5 mins.
      pct of memory swapped in [0.27%] pct of memory swapped out [2.22%].
      Please make sure there is no memory pressure and the SGA and PGA are configured correctly.
      Look at DBRM trace file for more details.
  • 31. CPU stealing
      ● Once a month load > 400
      ● System unusable but no crash
  • 32. Yet again - sar is the star
      [oracle@oramstr01 oracle]$ sar -f sa07 -s 14:30:00 -e 18:30:00 -u
      Linux 2.6.32-573.8.1.el6.x86_64 (oramstr01.testing.com) 12/07/2016 _x86_64_ (80 CPU)
      02:30:01 PM CPU %user %nice %system %iowait %steal %idle
      02:40:01 PM all 38.31 0.00 3.35 3.02 0.00 55.32
      02:50:01 PM all 34.02 0.00 3.15 2.63 0.00 60.19
      03:00:01 PM all 34.20 0.00 3.20 1.68 0.00 60.92
      03:10:01 PM all 40.79 0.00 3.81 2.96 0.00 52.44
      03:20:01 PM all 37.33 0.00 3.43 2.40 0.00 56.83
      03:30:04 PM all 40.72 0.00 6.12 2.62 0.00 50.54
      03:40:01 PM all 10.08 0.00 88.36 0.30 0.00 1.26
      03:50:02 PM all 8.66 0.00 90.82 0.05 0.00 0.47
      04:00:03 PM all 31.66 0.00 68.27 0.02 0.00 0.04
      04:10:03 PM all 45.84 0.00 49.19 0.90 0.00 4.07
      04:20:01 PM all 40.68 0.00 54.30 0.97 0.00 4.05
      04:30:01 PM all 37.81 0.00 43.22 1.13 0.00 17.84
      04:40:02 PM all 15.68 0.00 84.18 0.05 0.00 0.09
      04:50:02 PM all 12.76 0.00 87.23 0.00 0.00 0.01
      05:00:03 PM all 11.84 0.00 88.14 0.00 0.00 0.01
      05:10:01 PM all 18.56 0.00 62.74 0.71 0.00 17.99
      05:20:01 PM all 15.84 0.00 1.73 1.17 0.00 81.27
      05:30:01 PM all 19.22 0.00 1.75 0.71 0.00 78.33
      05:40:01 PM all 25.51 0.00 2.02 1.15 0.00 71.32
      05:50:01 PM all 23.78 0.00 1.85 1.05 0.00 73.32
      06:00:01 PM all 20.15 0.00 1.66 0.88 0.00 77.30
      06:10:01 PM all 21.14 0.00 2.68 2.93 0.00 73.25
      06:20:01 PM all 18.94 0.00 2.33 2.26 0.00 76.46
      Average: all 26.23 0.00 32.81 1.29 0.00 39.67
  • 33. Sessions and pagetable memory
      [oracle@oramstr01 ~]$ date
      Wed Dec 14 10:23:22 EST 2016
      [oracle@oramstr01 ~]$ ps -ef | grep -c oracleccxp
      2717
      [oracle@oramstr01 ~]$ grep PageTable /proc/meminfo
      PageTables: 461002976 kB
      Yes, that is about 440 GiB of PageTables!
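
To put the numbers on this slide in perspective, the sketch below converts the PageTables reading to GiB and spreads it over the process count returned by ps. The function name `pagetable_summary` is hypothetical, and the per-process figure is only an average: the ps count may include the grep itself and sessions do not all map the same amount of memory.

```python
KIB_PER_MIB = 1024
KIB_PER_GIB = 1024 ** 2


def pagetable_summary(pagetables_kib: int, processes: int) -> tuple:
    """Return (total GiB, average MiB per process) for a PageTables reading in kB."""
    total_gib = pagetables_kib / KIB_PER_GIB
    per_process_mib = pagetables_kib / processes / KIB_PER_MIB
    return total_gib, per_process_mib


if __name__ == "__main__":
    # Values taken from the slide: PageTables in kB and the ps process count.
    total, each = pagetable_summary(461002976, 2717)
    print(f"PageTables: {total:.1f} GiB total, {each:.1f} MiB per process on average")
```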
  • 34. Yet again - sar is the star
      [oracle@oramstr01 oracle]$ sar -r -f sa07 -s 14:30:00 -e 17:30:00
      Linux 2.6.32-573.8.1.el6.x86_64 (oramstr01.testing.com) 12/07/2016 _x86_64_ (80 CPU)
      02:30:01 PM kbmemfree kbmemused %memused kbbuffers kbcached kbcommit %commit
      02:40:01 PM 57063440 1001654588 94.61 1610536 412910108 311506624 26.72
      02:50:01 PM 57747440 1000970588 94.55 1610564 412911112 306685964 26.31
      03:00:01 PM 46592180 1012125848 95.60 1610596 412914608 308191992 26.44
      03:10:01 PM 31680840 1027037188 97.01 1610608 412917688 310993660 26.68
      03:20:01 PM 16457972 1042260056 98.45 1610628 412920472 309580976 26.56
      03:30:04 PM 1739692 1056978336 99.84 1610628 411393436 317613764 27.25
      03:40:01 PM 5066928 1053651100 99.52 1538352 395198928 324298580 27.82
      03:50:02 PM 28196104 1030521924 97.34 1342292 381568100 324394208 27.83
      04:00:03 PM 11313156 1047404872 98.93 1359396 378901468 326693864 28.03
      04:10:03 PM 80061500 978656528 92.44 1359488 375162128 321167148 27.55
      04:20:01 PM 64494004 994224024 93.91 1359508 375163768 322061964 27.63
      04:30:01 PM 108230896 950487132 89.78 1359532 375166776 313685004 26.91
      04:40:02 PM 135833716 922884312 87.17 1359548 375168248 318691876 27.34
      04:50:02 PM 192323488 866394540 81.83 1359556 375169736 315572568 27.07
      05:00:03 PM 235108136 823609892 77.79 1359648 375172216 312304460 26.79
      05:10:01 PM 360281464 698436564 65.97 1359724 375173424 295083536 25.32
      05:20:01 PM 357150032 701567996 66.27 1359748 375175952 296449248 25.43
  • 35. Summary
      ● HugePages are usually good to have
      ● How to implement
      ● Know where to look
      ● /proc/meminfo
          ■ HugePages
          ■ PageTables
      ● Remember the power of sar/OSwBB
      ● Following best practices prevents issues
  • 36. References
      ● Oracle 11g internals part 1: Automatic Memory Management, by Tanel Poder
      ● Oracle SGA memory allocation on startup, by Frits Hoogland
      ● Oracle Linux: Shell Script to Calculate Values Recommended Linux HugePages / HugeTLB Configuration (Doc ID 401749.1)
      ● Oracle Exadata Initialization Parameters and Diskgroup Attributes Best Practices (Doc ID 2062068.1)
      ● 12.2 Grid Infrastructure and Database Upgrade steps for Exadata Database Machine running 11.2.0.3 and later on Oracle Linux (Doc ID 2111010.1)
      ● ASM & Shared Pool (ORA-4031) (Doc ID 437924.1)
  • 37. Q&A: Ask now or reach out later, but don't keep the question to yourself
  • 38. THANK YOU. Hope you enjoyed it