Linux Memory Management
Kamal Maiti
Sr. Linux System Engineer
Amdocs DVCI, Pune, India
AGENDA
 Basic concept of computer
 Hardware, firmware, driver, software, application
 CPU, RAM, How RAM used
 Moving Information within Computer
 Primary & Other Memory,
 Segment of RAM
 Memory Mapping, Process Address Space
 Page, Frame, Hugepage, MMU etc.
 Virtual Memory, PageCache
 Memory nodes, zones, lowmem
 NUMA
 Kernel Memory allocator
 Pagefault Handling, Tools, Memory leak, Memory related issues
 Hands-on Troubleshooting : sysrq, backtrace analysis, OOM messages investigation etc
BASIC CONCEPTS OF COMPUTER HARDWARE
 This model of the typical digital computer is often called the von
Neumann computer.
 Programs and data are stored in the same memory: primary memory
[Figure: block diagram connecting Input Units, the CPU (Central Processing Unit), Primary Memory and Output Units]
HARDWARE, FIRMWARE, DRIVER, SOFTWARE, APPLICATION
Hardware : All physical computer devices, such as input and output
devices, the motherboard, mouse and keyboard.
Firmware : Vendor-provided low-level code that interacts with the
hardware to carry out the instructions passed to the device.
Driver : Sits on top of the firmware; the kernel uses it to interact
with the firmware or with the hardware directly.
Software/Application : Uses system calls to ask the kernel for
services; the kernel in turn talks to the driver to get the output.
CPU
 The three major components of the CPU are:
1. Arithmetic Unit (Computations performed)
Accumulator (Results of computations kept here)
2. Control Unit (Has two locations where numbers are kept)
Instruction Register (Instruction placed here for
analysis)
Program Counter (Which instruction will be
performed next?)
3. Instruction Decoding Unit (Decodes the instruction)
 Motherboard: The place where most of the electronics including
the CPU are mounted.
RAM
 Commonly known as random access memory, or just
RAM
 Holds instructions and data needed for programs that
are currently running
 RAM is usually a volatile type of memory
 Contents of RAM are lost when power is turned off
HOW IS RAM USED?
Memory is used to store:
 i) instructions -> to execute a program
 ii) data -> while the computer is doing a job, the data to be
processed is stored in primary memory. This data may come from an
input device such as a keyboard or from a secondary storage device
such as a floppy disk.
MOVING INFORMATION WITHIN THE COMPUTER
 How do binary numerals move into, out of, and within the computer?
 Information is moved about in bytes, or multiple bytes called
words.
 Words are the fundamental units of information.
 The number of bits per word may vary per computer.
 A word length for most large IBM computers is 32 bits.
MOVING INFORMATION WITHIN THE COMPUTER …
 Bits that compose a word are passed in parallel from place to
place.
 Ribbon cables:
 Consist of several wires, molded together.
 One wire for each bit of the word or byte.
 Additional wires coordinate the activity of moving
information.
 Each wire sends information in the form of a voltage
pulse.
MOVING INFORMATION WITHIN THE COMPUTER …
 Example of sending the word WOW over the ribbon cable
 Voltage pulses corresponding to the ASCII codes would pass
through the cable.
PRIMARY MEMORY
 Primary storage or memory: Where the data & program that are
currently in operation or being accessed are stored during use.
 Consists of electronic circuits: Extremely fast and expensive.
 Two types:
 RAM (non-permanent)
 Programs and data can be stored here for the
computer’s use.
 Volatile: All information will be lost once the computer
shuts down.
 ROM (permanent)
 Contents do not change.
OTHER MEMORY
 ROM : Read-only memory [e.g. storing video game software or the firmware of electronic musical
instruments]. ROM is mostly used to hold firmware.
 EPROM : Erasable Programmable Read-Only Memory
 EEPROM : Electrically Erasable Programmable Read-Only Memory
 Cache : Location in RAM where data is kept for a certain amount of time so
that it can be reused.
 Registers : built from flip-flops [RS, D, JK, shift etc.]; hold information inside the CPU
 Swap : Disk space used to meet the demand for more memory than the RAM provides.
SEGMENT OF RAM
 Low mem, high mem, Normal mem, DMA, DMA32
 On a 32-bit architecture [DMA, Normal & HighMem] : the
address space range for addressing RAM is:
0x00000000 - 0xffffffff, i.e. 4,294,967,296 bytes (4 GB).
The user space range: 0x00000000 - 0xbfffffff or 3 GB
The kernel space range: 0xc0000000 - 0xffffffff or 1 GB
Linux splits the physical memory seen through the 1 GB kernel space into 2 pieces: LOWMEM (directly mapped, about the first 896 MB) and HIGHMEM (the rest, mapped only on demand).
 On 64 bit machine[DMA, DMA32 & Normal] : Normal
memory available beyond 4 GB
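The 3 GB / 1 GB split above can be checked with a little arithmetic. The following is a minimal sketch, assuming the boundaries given on this slide; the function names are made up for illustration and are not kernel APIs.

```python
# Boundaries taken from the slide: on a 32-bit 3G/1G split, user space
# ends at 0xbfffffff and kernel space starts at 0xc0000000.
KERNEL_START = 0xC0000000
ADDRESS_LIMIT = 0xFFFFFFFF


def classify_address(addr: int) -> str:
    """Return 'user' or 'kernel' for a 32-bit virtual address."""
    if not 0 <= addr <= ADDRESS_LIMIT:
        raise ValueError("address outside the 32-bit range")
    return "kernel" if addr >= KERNEL_START else "user"


def size_gib(start: int, end: int) -> float:
    """Size in GiB of the inclusive address range [start, end]."""
    return (end - start + 1) / 2**30


print(classify_address(0xBFFFFFFF), size_gib(0x00000000, 0xBFFFFFFF))
print(classify_address(0xC0000000), size_gib(0xC0000000, 0xFFFFFFFF))
```

The two print calls report the size of the user range and of the kernel range in GiB.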
MEMORY MAPPING
 Linux uses only 4 segments in 32 bit arch:
 2 segments (code and data/stack) for KERNEL SPACE from [0xC000 0000] (3 GB) to [0xFFFF FFFF] (4 GB)
 2 segments (code and data/stack) for USER SPACE from [0] (0 GB) to [0xBFFF FFFF] (3 GB)
See virtual map : $ pmap <PID> , see stack : $ pstack <PID>
 Segmentation, Paging [to overcome the flaws of segmentation] :
 Paging allocates small virtual pages to each process so that they fit in RAM without wasting it.
PROCESS ADDRESS SPACE - 32 BIT ARCH
[Figure: process address space layout, listed from high to low addresses]
4 GB --> top of the address space
Kernel Space : Kernel
3 GB --> 0xC0000000
User Space :
File name, Environment, Arguments
Stack
Unused Memory
Shared Libs
Heap
Bss [Block Started by Symbol], between _bss_start and _end
Data, ends at _edata
Text/code, ends at _etext
Header
0 GB --> bottom of the address space
Address label shown in the figure: 0x84000000
Text/Code Segment: contains the actual code
Data: contains initialized global variables
BSS: contains uninitialized global variables
Heap: dynamic memory
Stack: stack frames of the active function calls
PAGE & FRAME
 Paging, Demand Paging, Swapping
 Page table levels [4 on 64-bit, 2 on 32-bit]: Page Global Directory, Page Upper Directory,
Page Middle Directory, Page Table
 Base page size : getconf -a | grep -i page
 Life cycle of a page: active --> inactive list --> dirty --> clean
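To make the 2-level page table case concrete, here is a minimal sketch of how a 32-bit virtual address could be split into its lookup indexes. It assumes the classic x86 layout without PAE (10-bit directory index, 10-bit table index, 12-bit offset for 4 KB pages); the function name is hypothetical.

```python
PAGE_SHIFT = 12   # assumption: 4 KiB pages, so the offset takes 12 bits
INDEX_BITS = 10   # assumption: 1024 entries per table level (x86, no PAE)


def split_address(vaddr: int):
    """Split a 32-bit virtual address into (directory index, table index, offset)."""
    index_mask = (1 << INDEX_BITS) - 1
    offset = vaddr & ((1 << PAGE_SHIFT) - 1)
    table_index = (vaddr >> PAGE_SHIFT) & index_mask
    dir_index = (vaddr >> (PAGE_SHIFT + INDEX_BITS)) & index_mask
    return dir_index, table_index, offset


print(split_address(0xC0000000))
```

The MMU walks the directory with the first index, the page table with the second, and adds the offset to the frame address it finds there.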
SWAP, HUGE PAGE, MMU,TLB
 SWAP : Not all pages fit in RAM, so pages are moved to and from the storage
disk as needed
 Hugepage : the default page size is 4 KB, but large programs use big chunks of memory.
Hence, larger pages are allowed, e.g. 2 MB on x86_64. [sysctl -a | grep -i huge]
 MMU/TLB : The MMU translates logical (virtual) addresses to physical addresses. The TLB is a
cache of recent translations used by the MMU.
 Active/Inactive regions [cat /proc/meminfo]
 Shmem : shared memory area [ipcs -m]
 Buddyinfo : view memory fragmentation/allocation [cat /proc/buddyinfo]
 Cache : speeds up access; sync flushes it and forces the data to be written to disk, bdflush does this
in the background [flush-253:0 in RHEL 6]
buffer's policy is first-in, first-out
cache's policy is Least Recently Used [LRU] [$ vmstat -S M 1]
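Why huge pages help large programs can be seen by counting pages. The sketch below assumes 4 KiB base pages and 2 MiB huge pages (typical for x86_64); it only does the arithmetic and does not query the system.

```python
KIB = 1024
MIB = 1024 * KIB
GIB = 1024 * MIB


def pages_needed(region_bytes: int, page_bytes: int) -> int:
    """Number of pages required to cover a region (ceiling division)."""
    return -(-region_bytes // page_bytes)


small = pages_needed(1 * GIB, 4 * KIB)
huge = pages_needed(1 * GIB, 2 * MIB)
print(small, huge)
```

Each page needs its own translation, so a region covered by fewer, larger pages puts far less pressure on the TLB.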
VIRTUAL MEMORY, HOW PROGRAM MAPS?
 Executable text
 Executable data
 Heap space
 Stack
 Get the memory actually used by a process :
 $ pmap -x <pid>
 $ cat /proc/<pid>/status
PAGE CACHE MEMORY CONTROL
 vm.dirty_expire_centisecs=2000 // dirty data older than this (in 1/100 s) is eligible to be written out
 vm.dirty_writeback_centisecs=400 // how long the flusher threads wait between wake-ups (in 1/100 s)
 vm.dirty_background_ratio=5 // when dirty pages reach this percentage of available memory, the pdflush/flush daemon
starts writing dirty data to disk in the background
 vm.dirty_ratio=20 // when dirty pages reach this percentage of available memory, the writing process itself starts writing data to disk
 vfs_cache_pressure [100] : controls the tendency of the kernel to reclaim the memory which is
used for caching of directory and inode objects
 Swappiness [60] : controls how aggressively the kernel uses swap space.
 To drop caches:
To free pagecache: echo 1 > /proc/sys/vm/drop_caches
To free dentries and inodes: echo 2 > /proc/sys/vm/drop_caches
To free pagecache, dentries and inodes: echo 3 > /proc/sys/vm/drop_caches
 Cache writes are done by the kernel thread pdflush/bdflush; in RHEL 6 it is flush.
 Life cycle of pages :
active --> inactive list --> dirty --> clean
Link : https://www.kernel.org/doc/Documentation/sysctl/vm.txt
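The two ratio settings translate into byte thresholds. The following is a rough sketch, assuming the ratios apply to a single "available memory" figure supplied by the caller; the kernel's own accounting is more involved, and the function name is hypothetical.

```python
def dirty_thresholds(available_bytes: int,
                     background_ratio: int = 5,
                     dirty_ratio: int = 20):
    """Return (background, blocking) dirty-page thresholds in bytes.

    background: the flusher starts writing in the background.
    blocking:   the writing process itself has to write data out.
    """
    background = available_bytes * background_ratio // 100
    blocking = available_bytes * dirty_ratio // 100
    return background, blocking


# Example with the slide's values and an assumed 8 GiB of available memory.
print(dirty_thresholds(8 * 2**30))
```

With the values shown on this slide, background writeback starts at a quarter of the level at which writers are throttled.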
PHYSICAL MEMORY ALLOCATION LIMIT
 CommitLimit : total memory that can be allocated, based on overcommit_ratio
 Committed_AS : currently allocated
 overcommit_memory : from 0 to 2
0 = heuristic overcommit: obvious overcommits are refused // default
1 = always overcommit, no overcommit handling
2 = strict: do not allocate beyond swap + overcommit_ratio % of RAM
 overcommit_ratio : % of RAM counted when overcommit_memory is set to 2, default value 50
Example : 4 GB RAM, 2 GB swap, overcommit_memory=2, overcommit_ratio=50, so
CommitLimit = 2 + (4*50/100) = 2 + 2 = 4 GB
Issue : Application failed to start due to shortage of memory, Needed to disable
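The CommitLimit example can be reproduced in a few lines. This is a minimal sketch of the formula as stated on this slide; it ignores memory reserved for huge pages, which the kernel subtracts from RAM first, and the function name is made up.

```python
def commit_limit_gb(ram_gb: float, swap_gb: float, overcommit_ratio: int = 50) -> float:
    """CommitLimit = swap + RAM * overcommit_ratio / 100 (strict mode, overcommit_memory=2)."""
    return swap_gb + ram_gb * overcommit_ratio / 100


# The slide's example: 4 GB RAM, 2 GB swap, ratio 50.
print(commit_limit_gb(4, 2, 50))
```

Raising overcommit_ratio or adding swap raises the limit, which is one way to address allocation failures in strict mode.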
WHY MEMORY CACHE IS REALLY REQUIRED
Speed up processing :
 $ cat > XYZ
 $ echo 3 > /proc/sys/vm/drop_caches
 $ time cat XYZ // takes longer: read from disk
 $ time cat XYZ // takes less time: served from the page cache
MEMORY NODES, ZONES IN 32 BIT & 64 BIT
 Below zones are in 32 bits :
 ZONE_DMA (0-16MB)
 ZONE_NORMAL (16MB-896MB)
 ZONE_HIGHMEM (896MB and above)
HIGHMEM's lower zones are NORMAL+DMA, NORMAL's lower zone is DMA.
 Below zones are in 64 bits :
 DMA : up to 16 MB
 DMA32 : up to 4 GB
 Normal : beyond 4 GB
 $ cat /proc/zoneinfo
 $ cat /proc/pagetypeinfo
 $cat /proc/<pid>/numa_maps
 $ cat /proc/buddyinfo
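The zone boundaries listed on this slide can be expressed as a simple lookup. This is an illustrative sketch using only those boundaries; the helper names are hypothetical, and real zone limits depend on the architecture and kernel configuration.

```python
MB = 1024 * 1024
GB = 1024 * MB


def zone_32bit(phys_addr: int) -> str:
    """Zone of a physical address on a 32-bit system, per the slide's boundaries."""
    if phys_addr < 16 * MB:
        return "ZONE_DMA"
    if phys_addr < 896 * MB:
        return "ZONE_NORMAL"
    return "ZONE_HIGHMEM"


def zone_64bit(phys_addr: int) -> str:
    """Zone of a physical address on a 64-bit system, per the slide's boundaries."""
    if phys_addr < 16 * MB:
        return "DMA"
    if phys_addr < 4 * GB:
        return "DMA32"
    return "Normal"


print(zone_32bit(900 * MB), zone_64bit(900 * MB))
```

The same physical address lands in different zones on the two architectures, which is why HighMem issues do not appear on 64-bit systems.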
LOW MEMORY, ZONE_RECLAIM
 "lowmem" often means NORMAL+DMA
 "lowmem" is not present in RHEL 6, 64-bit
 Reservation is controlled by : lowmem_reserve_ratio [DMA NORMAL HIGHMEM]
 cat /proc/sys/vm/lowmem_reserve_ratio
256 256 32 // (1/256)*100 % = 0.39% of the nearest zone is reserved
 zone_reclaim_mode : how aggressively memory is reclaimed
when a zone runs out of memory; the values can be ORed together
1 = Zone reclaim on
2 = Zone reclaim writes dirty pages out
4 = Zone reclaim swaps pages
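Since zone_reclaim_mode is a bit mask, a value such as 3 combines two of the settings above. The sketch below decodes a value and redoes the reserve percentage arithmetic from this slide; the helper names are made up for illustration.

```python
# Bit values and descriptions as listed on the slide.
RECLAIM_FLAGS = {
    1: "zone reclaim on",
    2: "zone reclaim writes dirty pages out",
    4: "zone reclaim swaps pages",
}


def decode_zone_reclaim_mode(mode: int):
    """Return the descriptions of all bits set in a zone_reclaim_mode value."""
    return [name for bit, name in RECLAIM_FLAGS.items() if mode & bit]


def reserve_percent(ratio: int) -> float:
    """Percentage reserved for a given lowmem_reserve_ratio entry: (1/ratio) * 100."""
    return 100 / ratio


print(decode_zone_reclaim_mode(3))
print(round(reserve_percent(256), 2), reserve_percent(32))
```

A smaller ratio means a larger reservation, so the entry of 32 protects a much bigger share than the entries of 256.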
NON-UNIFORM MEMORY ACCESS(NUMA)
 NUMA concept :
NUMA placement : placement of processor & memory; done manually by the application or via
MPI (Message Passing Interface)
 Place the application on the correct node
 Two memory policies : Node Local [after Linux boot], Interleave [during kernel boot]
 cat /proc/<pid>/numa_maps
 numactl -s // show policy
 numactl --hardware
 numactl [ --interleave nodes ] [ --preferred node ] [ --membind nodes ] [ --cpunodebind nodes ] [ --physcpubind cpus ] [ --localalloc ] [--] command {arguments ...}
Ref : http://www.redhat.com/summit/2012/pdf/2012-DevDay-Lab-NUMA-Hacker.pdf
NUMA MANAGEMENT
 numactl --physcpubind=0,1,2,3 example_process
 numactl --physcpubind=0-3 example_process // same CPUs, written as a range
 numactl --cpunodebind=2 example_process // run on the CPUs of node 2
 numactl --physcpubind=0 --localalloc example_process
 numactl --membind=4 example_process
 numactl --cpunodebind=0 example_process // only execute the command on the CPUs of node 0
 numactl --cpunodebind=0 --membind=0,1 process // run process on node 0 with memory allocated on
nodes 0 and 1
 numactl --hardware
 cat /sys/devices/system/node/node*/numastat
 Allocation : $ watch -n1 numastat
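The first two commands on this slide name the same CPUs in two notations. The following is a small sketch that expands such a list; it only mimics the comma and range syntax used in these examples and is not numactl's own parser (other forms numactl may accept are not handled).

```python
def parse_cpu_list(spec: str):
    """Expand a CPU list such as '0-3' or '0,1,4' into a list of CPU numbers."""
    cpus = []
    for part in spec.split(","):
        if "-" in part:
            first, last = part.split("-")
            cpus.extend(range(int(first), int(last) + 1))
        else:
            cpus.append(int(part))
    return cpus


print(parse_cpu_list("0,1,2,3"))
print(parse_cpu_list("0-3"))
```

Both calls produce the same list, so the two --physcpubind forms bind the process to the same physical CPUs.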
KERNEL MEMORY ALLOCATORS
 Low-level page allocator :
 Buddy system for contiguous multi-page allocations
 Provides pages for
 in-kernel allocations (slab cache)
 vmalloc areas (kernel modules, multi-page data areas)
 page cache, anonymous user pages
 misc. other users
 Slab cache :
 Manages allocations of objects of the same type
 Large-scale users: inodes, dentries, block I/O, network ...
 kmalloc (generic allocator) implemented on top
 Tool : slabtop
PAGE FAULT HANDLING
 Hardware support :
 Accessing invalid pages causes 'page translation' check
 Writing to protected pages causes 'protection exception'
 Translation-exception identification provides address
 'Suppression on protection' facility essential!
 Linux kernel page fault handler :
 Determine address/access validity according to VMA
 Invalid accesses cause SIGSEGV delivery
 Valid accesses trigger: page-in, swap-in, copy-on-write
 Extra support for stack VMA: grows automatically
 Out-of-memory if overcommitted causes SIGBUS
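The validity check in the fault handler can be pictured as a lookup over the process's VMAs. The sketch below is a heavily simplified model, not kernel code: the VMA class and handle_fault function are hypothetical, and stack growth and the out-of-memory path are left out.

```python
from dataclasses import dataclass


@dataclass
class VMA:
    start: int        # first address of the mapping
    end: int          # one past the last address
    writable: bool    # whether the mapping allows writes


def handle_fault(vmas, addr: int, is_write: bool) -> str:
    """Decide the outcome of a page fault at addr."""
    for vma in vmas:
        if vma.start <= addr < vma.end:
            if is_write and not vma.writable:
                return "SIGSEGV"      # access not permitted by the VMA
            return "resolve"          # page-in, swap-in or copy-on-write
    return "SIGSEGV"                  # address is in no VMA at all


vmas = [VMA(0x1000, 0x2000, writable=False), VMA(0x8000, 0x9000, writable=True)]
print(handle_fault(vmas, 0x1800, is_write=False))
print(handle_fault(vmas, 0x1800, is_write=True))
print(handle_fault(vmas, 0x5000, is_write=False))
```

Only a fault inside a VMA with matching permissions is resolved; everything else ends with a signal to the process.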
TOOLS TO CHECK MEMORY USAGE
 Report paging statistics : sar -B
 Report memory utilization statistics : sar -r
 Report memory statistics : sar -R
 Report swap space utilization statistics : sar -S
 Current memory usage :
 free -m | -k | -g
 cat /proc/meminfo
 Memory allocation :
 cat /proc/buddyinfo
 VM memory allocation :
 pmap -x <PID>
 cat /proc/<pid>/status
 Display kernel slab cache & memory information in real time :
 slabtop
 vmstat
 ps
 top
 cat /proc/meminfo
 strace, gcore
MEMORY LEAK CHECK
 Usage check : historical sar report
 mtrace : malloc tracing function built into glibc.
 Valgrind :
 valgrind --tool=memcheck --leak-check=full --show-reachable=yes snmpd -f -Lo
ISSUES RELATED TO MEMORY
 TCP/IP communication delay : RH cluster broken
 High cache usage : slows down application / system
 Memory pressure : memory leak, application is not tuned properly
 Memory fragmentation : hugepages not used
 OOM killer kills application : memory pressure; the OOM killer is enabled
by default and picks its victim based on the badness value.
 Segmentation fault : the kernel reclaims in the normal/low memory
region; when no room is left for the kernel, a segmentation
fault is encountered.
 Faulty memory : hardware failure or circuit failure in a chip; needs
diagnosis and chip replacement
TROUBLESHOOTING MEMORY ISSUE
 Memory & swap usage test :
swap_tendency = mapped_ratio/2 + distress + vm_swappiness
mapped_ratio = % of physical memory in use
distress = how much trouble the kernel has freeing memory
vm_swappiness = default 60
swap_tendency >= 100 : eligible for swap
swap_tendency < 100 : reclaim from page cache
 Sysrq :
echo 1 > /proc/sys/kernel/sysrq
echo m > /proc/sysrq-trigger
 backtrace analysis
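The swap_tendency rule above is easy to try with numbers. This is a minimal sketch of the formula exactly as given on this slide; the input values in the example are invented, and the function names are hypothetical.

```python
def swap_tendency(mapped_ratio: float, distress: float, swappiness: int = 60) -> float:
    """swap_tendency = mapped_ratio/2 + distress + vm_swappiness."""
    return mapped_ratio / 2 + distress + swappiness


def eligible_for_swap(mapped_ratio: float, distress: float, swappiness: int = 60) -> bool:
    """True when swap_tendency reaches 100; otherwise page cache is reclaimed."""
    return swap_tendency(mapped_ratio, distress, swappiness) >= 100


print(swap_tendency(80, 0), eligible_for_swap(80, 0))
print(swap_tendency(40, 0), eligible_for_swap(40, 0))
```

With the default swappiness of 60 and no distress, mapped memory has to reach 80% before swapping becomes eligible; lowering swappiness pushes that point further out.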
TROUBLESHOOTING
 OOM messages investigation :
Messages :
 Oct 25 07:28:34 nldedip4k031 kernel: [87976.461588] [] oom_kill_process+0x5c/0x80
 Oct 25 07:28:34 nldedip4k031 kernel: [87976.461591] [] out_of_memory+0xc5/0x1c0
 Oct 25 07:28:34 nldedip4k031 kernel: [87976.461595] [] __alloc_pages_nodemask+0x72c/0x740
 Oct 25 07:28:34 nldedip4k031 kernel: [87976.461599] [] __get_free_pages+0x1c/0x30
 Oct 25 07:28:34 nldedip4k031 kernel: [87976.461602] [] get_zeroed_page+0x12/0x20
 Oct 25 07:28:34 nldedip4k031 kernel: [87976.461606] [] fill_read_buffer.isra.8+0xaa/0xd0
 Oct 25 07:28:34 nldedip4k031 kernel: [87976.461609] [] sysfs_read_file+0x7d/0x90
 Oct 25 07:28:34 nldedip4k031 kernel: [87976.461613] [] vfs_read+0x8c/0x160
 Oct 25 07:28:34 nldedip4k031 kernel: [87976.461616] [] ? fill_read_buffer.isra.8+0xd0/0xd0
 Oct 25 07:28:34 nldedip4k031 kernel: [87976.461619] [] sys_read+0x3d/0x70
 Oct 25 07:28:34 nldedip4k031 kernel: [87976.461624] [] sysenter_do_call+0x12/0x28
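A backtrace like the one above reads more easily once the function names are pulled out. The following is a small sketch that extracts them from log lines of this shape; the regular expression is an assumption based on the sample messages, and frames marked with '?' are skipped because the kernel flags them as unreliable.

```python
import re

# Matches "function+0xoffset/0xsize" at the end of a line, with an optional "? " marker.
FRAME = re.compile(r"(\? )?([\w.]+)\+0x[0-9a-f]+/0x[0-9a-f]+\s*$")


def call_chain(log_lines):
    """Return the function names of the reliable stack frames, in log order."""
    frames = []
    for line in log_lines:
        match = FRAME.search(line)
        if match and not match.group(1):
            frames.append(match.group(2))
    return frames


sample = [
    "Oct 25 07:28:34 nldedip4k031 kernel: [87976.461588] [] oom_kill_process+0x5c/0x80",
    "Oct 25 07:28:34 nldedip4k031 kernel: [87976.461616] [] ? fill_read_buffer.isra.8+0xd0/0xd0",
    "Oct 25 07:28:34 nldedip4k031 kernel: [87976.461619] [] sys_read+0x3d/0x70",
]
print(call_chain(sample))
```

Read from the bottom up, the trace above shows a read system call on a sysfs file (sys_read, vfs_read, sysfs_read_file) asking for a zeroed page, the page allocation failing in __alloc_pages_nodemask, and out_of_memory invoking oom_kill_process.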
Q/A
Ref :
https://www.kernel.org/
https://www.redhat.com/en
http://www.tldp.org/LDP/tlk/mm/memory.html
https://en.wikipedia.org/wiki/Virtual_memory
https://lwn.net/

Linux memory-management-kamal

  • 1. Linux Memory Management Kamal Maiti Sr. Linux System Engineer Amdocs DVCI, Pune, India
  • 2. AGENDA  Basic concept of computer  Hardware, firmware, driver, software, application  CPU, RAM, How RAM used  Moving Information within Computer  Primary & Other Memory,  Segment of RAM  Memory Mapping, Process Address Space  Page, Frame, Hugepage, MMU etc.  Virtual Memory, PageCache  Memory nodes, zones, lowmem  NUMA  Kernel Memory allocator  Pagefault Handling, Tools, Memory leak, Memory related issues  Hands-on Troubleshooting : sysrq, backtrace analysis, OOM messages investigation etc
  • 3. BASIC CONCEPTS OF COMPUTER HARDWARE  This model of the typical digital computer is often called the von Neumann computer.  Programs and data are stored in the same memory: primary memory CPU (Central Processing Unit) Input Units Output Units Primary Memory
  • 4. HARDWARE, FIRMWARE, DRIVER, SOFTWARE, APPLICATION Hardware : All computer devices like - Input, Output devices, Motherboard, mouse, keyboard Firmware : Vendor provided low level codes that interacts with hardware to get the output of instructions passed to device. Driver : On top of firmware, driver is used to interacts with firmware or hardware directly. Software/Application: which interacts with system calls to call kernel and kernel interacts with driver to get the output.
  • 5. CPU  The three major components of the CPU are: 1. Arithmetic Unit (Computations performed) Accumulator (Results of computations kept here) 2. Control Unit (Has two locations where numbers are kept) Instruction Register (Instruction placed here for analysis) Program Counter (Which instruction will be performed next?) 3. Instruction Decoding Unit (Decodes the instruction)  Motherboard: The place where most of the electronics including the CPU are mounted.
  • 6. RAM  Commonly known as random access memory, or just RAM  Holds instructions and data needed for programs that are currently running  RAM is usually a volatile type of memory  Contents of RAM are lost when power is turned off
  • 7. HOW RAM USED ? Memory is used to store:  i) instructions - > to execute a program  ii) data -> When the computer is doing any job, the data that have to be processed are stored in the primary memory. This data may come from an input device like keyboard or from a secondary storage device like a floppy disk.
  • 8. MOVING INFORMATION WITHIN THE COMPUTER
      • How do binary numerals move into, out of, and within the computer?
      • Information is moved about in bytes, or multiple bytes called words.
      • Words are the fundamental units of information.
      • The number of bits per word may vary per computer.
      • The word length for most large IBM computers is 32 bits.
  • 9. MOVING INFORMATION WITHIN THE COMPUTER …
      • Bits that compose a word are passed in parallel from place to place.
      • Ribbon cables:
          • Consist of several wires, molded together.
          • One wire for each bit of the word or byte.
          • Additional wires coordinate the activity of moving information.
          • Each wire sends information in the form of a voltage pulse.
  • 10. MOVING INFORMATION WITHIN THE COMPUTER …
      • Example of sending the word WOW over the ribbon cable
      • Voltage pulses corresponding to the ASCII codes would pass through the cable.
  • 11. PRIMARY MEMORY
      • Primary storage or memory: where the data and programs that are currently in operation or being accessed are stored during use.
      • Consists of electronic circuits: extremely fast and expensive.
      • Two types:
          • RAM (non-permanent)
              • Programs and data can be stored here for the computer's use.
              • Volatile: all information is lost once the computer shuts down.
          • ROM (permanent)
              • Contents do not change.
  • 12. OTHER MEMORY
      • ROM: read-only memory [used for storing video game software and in electronic musical instruments]. ROM is mostly used to hold firmware.
      • EPROM: Erasable Programmable Read-Only Memory
      • EEPROM: Electrically Erasable Programmable Read-Only Memory
      • Cache: location in RAM where data is stored for a certain amount of time so that it can be reused.
      • Registers: various flip-flop registers [RS, D, JK, shift etc.] hold information
      • Swap: disk space is used to accommodate the demand for more RAM.
  • 13. SEGMENT OF RAM
      • Low mem, high mem, Normal mem, DMA, DMA32
      • On a 32-bit architecture [DMA, Normal & HighMem], the address range for addressing RAM is 0x00000000 - 0xffffffff, i.e. 4,294,967,296 addresses (4 GB).
          The user space range: 0x00000000 - 0xbfffffff, or 3 GB
          The kernel space range: 0xc0000000 - 0xffffffff, or 1 GB
          Linux splits the 1 GB kernel space into 2 pieces: LOWMEM and HIGHMEM.
      • On a 64-bit machine [DMA, DMA32 & Normal]: Normal memory is available beyond 4 GB
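The 3 GB / 1 GB split described on this slide can be checked with a few lines of arithmetic. The sketch below is only an illustration: the boundaries come from the slide, while the function names `classify_address` and `range_size_gib` are made up for the example.

```python
# Boundaries as given on the slide for a 32-bit system (3 GB user / 1 GB kernel).
USER_START = 0x00000000
USER_END = 0xBFFFFFFF
KERNEL_START = 0xC0000000
KERNEL_END = 0xFFFFFFFF

GIB = 1024 ** 3


def classify_address(addr: int) -> str:
    """Return 'user' or 'kernel' for a 32-bit virtual address (hypothetical helper)."""
    if not USER_START <= addr <= KERNEL_END:
        raise ValueError("not a 32-bit address")
    return "user" if addr <= USER_END else "kernel"


def range_size_gib(start: int, end: int) -> float:
    """Size of an inclusive address range, in GiB."""
    return (end - start + 1) / GIB


print(classify_address(0xC0000000))            # first kernel-space address
print(range_size_gib(USER_START, USER_END))    # size of the user range
print(range_size_gib(KERNEL_START, KERNEL_END))
```

The ranges are inclusive, which is why the size adds 1 to the difference between the end and start addresses.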
  • 14. MEMORY MAPPING
      • Linux uses only 4 segments on a 32-bit arch:
          • 2 segments (code and data/stack) for KERNEL SPACE, from [0xC000 0000] (3 GB) to [0xFFFF FFFF] (4 GB)
          • 2 segments (code and data/stack) for USER SPACE, from [0] (0 GB) to [0xBFFF FFFF] (3 GB)
          See the virtual map: $ pmap <PID>; see the stack: $ pstack <PID>
      • Segmentation, Paging [to overcome the flaw in segmentation]:
          • small virtual pages are allocated to each process so that they fit in RAM without wasting it.
  • 15. PROCESS ADDRESS SPACE - 32-BIT ARCH
      • [Diagram, from 4 GB down to 0 GB: Kernel Space (4 GB to 3 GB, starting at 0xC0000000); User Space (3 GB to 0 GB) holding file name and environment, arguments, stack, unused memory, shared libs, heap, BSS [Block Started by Symbol] (_bss_start to _end), data (_edata), text/code (_etext), header. Address label in the diagram: 0x84000000]
      • Text/Code segment: contains the actual code
      • Data: contains global variables
      • BSS: contains uninitialized global variables
      • Heap: dynamic memory
      • Stack: collection of frames, one per function call
  • 16. PAGE & FRAME
      • Paging, Demand Paging, Swapping
      • Page tables [4 levels on 64-bit, 2 levels on 32-bit]: Page Global Directory, Page Upper Directory, Page Middle Directory, Page Table
      • Min page size: getconf -a | grep -i page
      • Life cycle of a page: active --> inactive list --> dirty --> clean
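To make the paging idea concrete, here is a rough sketch of how a 32-bit virtual address breaks down into table indices and a page offset. It assumes a 4 KiB page and the classic two-level 10/10/12-bit split of 32-bit x86 without PAE; those widths and the function names are assumptions for illustration, not taken from the slides.

```python
PAGE_SIZE = 4096                          # assumed 4 KiB page
PAGE_SHIFT = PAGE_SIZE.bit_length() - 1   # 12 bits of offset


def split_address(vaddr: int) -> tuple:
    """Split a virtual address into (virtual page number, offset within page)."""
    return vaddr >> PAGE_SHIFT, vaddr & (PAGE_SIZE - 1)


def two_level_indices(vaddr: int) -> tuple:
    """Assumed 10/10/12 split: (page directory index, page table index, offset)."""
    directory_index = (vaddr >> 22) & 0x3FF
    table_index = (vaddr >> 12) & 0x3FF
    offset = vaddr & 0xFFF
    return directory_index, table_index, offset


page, offset = split_address(0x08048123)
print(hex(page), hex(offset))
print(two_level_indices(0xC0000000))
```

With this split, the first kernel-space address 0xC0000000 falls in page directory entry 768, which is three quarters of the way through the 1024-entry directory, matching the 3 GB / 1 GB division.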
  • 17. SWAP, HUGE PAGE, MMU, TLB
      • SWAP: not all pages can fit in RAM, so data has to be moved to and from the storage disk
      • Hugepage: the default page is 4 KB, but large programs use big chunks of memory, hence larger pages are allowed. [sysctl -a | grep -i huge]
      • MMU/TLB: the MMU is responsible for translating logical addresses to physical addresses. The TLB is a buffer used by the MMU.
      • Active/Inactive regions [cat /proc/meminfo]
      • Shmem: shared memory area [ipcs -m]
      • Buddyinfo: view memory fragmentation/allocation [cat /proc/buddyinfo]
      • Cache: used for speeding up access; sync flushes it and forces the write to disk; bdflush does this in the background [flush-253:0 in RHEL 6]
          The buffer policy is first-in, first-out
          The cache policy is Least Recently Used [LRU]
          [$ vmstat -S M 1]
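One way to see why huge pages help large programs is to count how many pages (and therefore page-table entries and potential TLB entries) a region needs. The sketch below assumes a 4 KiB base page and a 2 MiB huge page, which is typical on x86-64 but is an assumption here; `pages_needed` is a hypothetical helper.

```python
KIB = 1024
MIB = 1024 * KIB
GIB = 1024 * MIB


def pages_needed(region_bytes: int, page_bytes: int) -> int:
    """Number of pages required to cover a region (rounded up)."""
    return -(-region_bytes // page_bytes)  # ceiling division


region = 1 * GIB
print(pages_needed(region, 4 * KIB))   # base pages
print(pages_needed(region, 2 * MIB))   # huge pages (assumed 2 MiB)
```

Covering 1 GiB takes 262144 base pages but only 512 huge pages, so far fewer translations compete for the TLB.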
  • 18. VIRTUAL MEMORY, HOW PROGRAM MAPS?
      • Executable text
      • Executable data
      • Heap space
      • Stack
      • Get the exact memory required by a process:
          $ pmap -x <pid>
          $ cat /proc/<pid>/status
  • 19. PAGE CACHE MEMORY CONTROL
      • vm.dirty_expire_centisecs=2000 // how old dirty data must be before it is eligible for writeback
      • vm.dirty_writeback_centisecs=400 // how long the flush daemon waits between wakeups
      • vm.dirty_background_ratio=5 // when this percentage of total RAM is filled with dirty pages, the pdflush/flush daemon starts writing dirty data to disk
      • vm.dirty_ratio=20 // when this percentage of total RAM is filled with dirty pages, the writing process itself starts writing data to disk
      • vfs_cache_pressure [100]: controls the tendency of the kernel to reclaim the memory used for caching directory and inode objects
      • swappiness [60]: controls how the kernel will use swap space.
      • Freeing caches:
          To free pagecache: echo 1 > /proc/sys/vm/drop_caches
          To free dentries and inodes: echo 2 > /proc/sys/vm/drop_caches
          To free pagecache, dentries and inodes: echo 3 > /proc/sys/vm/drop_caches
      • Cache writes are done by the kernel thread pdflush/bdflush; in RHEL 6 it is flush.
      • Life cycle of pages: active --> inactive list --> dirty --> clean
      Link: https://www.kernel.org/doc/Documentation/sysctl/vm.txt
  • 20. PHYSICAL MEMORY ALLOCATION LIMIT
      • CommitLimit: total memory that can be allocated, based on overcommit_ratio
      • Committed_AS: memory currently allocated
      • overcommit_memory: takes a value from 0 to 2
          0 = heuristic overcommit: available memory may be overcommitted, but obvious overcommits are refused (default)
          1 = always overcommit, with no overcommit handling
          2 = strict accounting: allocation is limited based on overcommit_ratio
      • overcommit_ratio: percentage of RAM counted when overcommit_memory is set to 2; default value 50
      Example: 4 GB RAM, 2 GB swap, overcommit_memory=2, overcommit_ratio=50, so CommitLimit = 2 + (4*50/100) = 2 + 2 = 4 GB
      Issue: Application failed to start due to shortage of memory, Needed to disable
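The CommitLimit arithmetic from the example on this slide can be reproduced directly. This is a simplified sketch of the formula as the slide states it (swap plus a percentage of RAM); the kernel's real calculation also accounts for huge pages, which is ignored here. The function names are hypothetical.

```python
def commit_limit_gb(ram_gb: float, swap_gb: float, overcommit_ratio: int = 50) -> float:
    """CommitLimit as given on the slide: swap + RAM * overcommit_ratio / 100."""
    return swap_gb + ram_gb * overcommit_ratio / 100


def allocation_allowed(committed_as_gb: float, request_gb: float, limit_gb: float) -> bool:
    """Under overcommit_memory=2, a request fits only if it stays within the limit."""
    return committed_as_gb + request_gb <= limit_gb


limit = commit_limit_gb(ram_gb=4, swap_gb=2, overcommit_ratio=50)
print(limit)                                   # the slide's example: 4 GB
print(allocation_allowed(3.5, 1.0, limit))     # would exceed the limit
```

This also illustrates the issue noted on the slide: with strict accounting, an application that asks for more than the remaining commit headroom fails to start even if physical RAM looks free.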
  • 21. WHY MEMORY CACHE IS REALLY REQUIRED
      Speeds up processing:
      • $ cat > XYZ
      • $ echo 3 > /proc/sys/vm/drop_caches
      • $ time cat XYZ // takes more time: read from disk
      • $ time cat XYZ // takes less time: served from cache
  • 22. MEMORY NODES, ZONES IN 32 BIT & 64 BIT
      • The zones below exist on 32-bit:
          • ZONE_DMA (0-16 MB)
          • ZONE_NORMAL (16 MB-896 MB)
          • ZONE_HIGHMEM (896 MB and above)
          HIGHMEM's lower zone is NORMAL+DMA; NORMAL's lower zone is DMA.
      • The zones below exist on 64-bit:
          • Normal: beyond 4 GB
          • DMA: up to 16 MB
          • DMA32: up to 4 GB
      • $ cat /proc/zoneinfo
      • $ cat /proc/pagetypeinfo
      • $ cat /proc/<pid>/numa_maps
      • $ cat /proc/buddyinfo
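The zone boundaries listed on this slide can be expressed as a small lookup. The sketch below maps a physical address to a zone name using only the limits stated above (16 MB, 896 MB, 4 GB); the function names are hypothetical, and real systems may place boundaries differently depending on architecture and configuration.

```python
MIB = 1024 * 1024
GIB = 1024 * MIB


def zone_32bit(phys_addr: int) -> str:
    """Zone for a physical address on 32-bit, per the slide's boundaries."""
    if phys_addr < 0:
        raise ValueError("address must be non-negative")
    if phys_addr < 16 * MIB:
        return "ZONE_DMA"
    if phys_addr < 896 * MIB:
        return "ZONE_NORMAL"
    return "ZONE_HIGHMEM"


def zone_64bit(phys_addr: int) -> str:
    """Zone for a physical address on 64-bit, per the slide's boundaries."""
    if phys_addr < 0:
        raise ValueError("address must be non-negative")
    if phys_addr < 16 * MIB:
        return "ZONE_DMA"
    if phys_addr < 4 * GIB:
        return "ZONE_DMA32"
    return "ZONE_NORMAL"


print(zone_32bit(900 * MIB))
print(zone_64bit(6 * GIB))
```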
  • 23. LOW MEMORY, ZONE_RECLAIM
      • "lowmem" often means NORMAL+DMA
      • "lowmem" is not present in RHEL 6, 64-bit
      • Reservation is controlled by: lowmem_reserve_ratio [DMA NORMAL HIGHMEM]
      • cat /proc/sys/vm/lowmem_reserve_ratio
          256 256 32 // (1/256)*100 % = 0.39% of the nearest zone is reserved
      • zone_reclaim_mode: how aggressively memory is reclaimed when a zone runs out of memory
          1 = Zone reclaim on
          2 = Zone reclaim writes dirty pages out
          4 = Zone reclaim swaps pages
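The two settings on this slide lend themselves to a quick calculation. The values of zone_reclaim_mode are bit flags that can be ORed together, and the reservation percentage follows the slide's (1/ratio)*100 arithmetic. The helper names below are hypothetical.

```python
# Flag meanings as listed on the slide; the values can be combined bitwise.
ZONE_RECLAIM_FLAGS = {
    1: "zone reclaim on",
    2: "zone reclaim writes dirty pages out",
    4: "zone reclaim swaps pages",
}


def decode_zone_reclaim_mode(mode: int) -> list:
    """Return the descriptions of all flags set in zone_reclaim_mode."""
    return [text for bit, text in ZONE_RECLAIM_FLAGS.items() if mode & bit]


def reserve_percent(ratio: int) -> float:
    """Percentage reserved for a lowmem_reserve_ratio entry: (1/ratio) * 100."""
    return 100 / ratio


print(decode_zone_reclaim_mode(3))
print(round(reserve_percent(256), 2))
```

A mode of 3, for example, means zone reclaim is on and may also write dirty pages out, but does not swap.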
  • 24. NON-UNIFORM MEMORY ACCESS (NUMA)
      • NUMA concept: NUMA placement is the placement of processor and memory; manual placement is done by the application, or through MPI (Message Passing Interface)
      • Place the application in the correct node
      • Two memory policies: Node Local [after Linux boot], Interleave [during kernel boot]
      • cat /proc/<pid>/numa_maps
      • numactl -s // show policy
      • numactl --hardware
      • numactl [ --interleave nodes ] [ --preferred node ] [ --membind nodes ] [ --cpunodebind nodes ] [ --physcpubind cpus ] [ --localalloc ] [--] command {arguments ...}
      Ref: http://www.redhat.com/summit/2012/pdf/2012-DevDay-Lab-NUMA-Hacker.pdf
  • 25. NUMA MANAGEMENT
      • numactl --physcpubind=0,1,2,3 example_process
      • numactl --physcpubind=0-3 example_process
      • numactl --cpunodebind=2 example_process // run on the CPUs of node 2
      • numactl --physcpubind=0 --localalloc example_process
      • numactl --membind=4 example_process
      • numactl --cpunodebind=0 example_process // only execute the command on the CPUs of node 0
      • numactl --cpunodebind=0 --membind=0,1 process // run the process on node 0 with memory allocated on nodes 0 and 1
      • numactl --hardware
      • cat /sys/devices/system/node/node*/numastat
      • Allocation: $ watch -n1 numastat
  • 26. KERNEL MEMORY ALLOCATORS
      • Low-level page allocator:
          • Buddy system for contiguous multi-page allocations
          • Provides pages for:
              • in-kernel allocations (slab cache)
              • vmalloc areas (kernel modules, multi-page data areas)
              • page cache, anonymous user pages
              • misc. other users
      • Slab cache:
          • Manages allocations of objects of the same type
          • Large-scale users: inodes, dentries, block I/O, network ...
          • kmalloc (generic allocator) implemented on top
      • Tool: slabtop
  • 27. PAGE FAULT HANDLING
      • Hardware support:
          • Accessing invalid pages causes 'page translation' check
          • Writing to protected pages causes 'protection exception'
          • Translation-exception identification provides address
          • 'Suppression on protection' facility essential!
      • Linux kernel page fault handler:
          • Determine address/access validity according to VMA
          • Invalid accesses cause SIGSEGV delivery
          • Valid accesses trigger: page-in, swap-in, copy-on-write
          • Extra support for stack VMA: grows automatically
          • Out-of-memory if overcommitted causes SIGBUS
  • 28. TOOLS TO CHECK MEMORY USAGE
      • Report paging statistics: sar -B
      • Report memory utilization statistics: sar -r
      • Report memory statistics: sar -R
      • Report swap space utilization statistics: sar -S
      • Current memory usage:
          • free -m (or -k, -g)
          • cat /proc/meminfo
      • Memory allocation:
          • cat /proc/buddyinfo
      • VM memory allocation:
          • pmap -x <PID>
          • cat /proc/<pid>/status
      • Display kernel slab cache & memory information in real time:
          • slabtop
          • vmstat
          • ps
          • top
          • cat /proc/meminfo
          • strace, gcore
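Several of the tools above read their numbers from /proc/meminfo, so it can be handy to parse that file directly. The sketch below parses text in the `Key:   value kB` layout that /proc/meminfo uses; the sample values are invented for illustration, and `parse_meminfo` is a hypothetical helper.

```python
# Illustrative sample only; the numbers are made up, not from a real system.
SAMPLE_MEMINFO = """MemTotal:        4046436 kB
MemFree:          512340 kB
Cached:          1843200 kB
SwapTotal:       2097148 kB
SwapFree:        2097148 kB
HugePages_Total:       0
"""


def parse_meminfo(text: str) -> dict:
    """Parse /proc/meminfo-style text into {field: integer value}."""
    info = {}
    for line in text.splitlines():
        if ":" not in line:
            continue
        key, rest = line.split(":", 1)
        fields = rest.split()
        if fields:
            info[key.strip()] = int(fields[0])  # the unit (kB), if any, is dropped
    return info


info = parse_meminfo(SAMPLE_MEMINFO)
print(info["MemTotal"] - info["MemFree"])  # memory not free, in kB
```

On a Linux machine the same function can be fed the real file, for example `parse_meminfo(open("/proc/meminfo").read())`.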
  • 29. MEMORY LEAK CHECK
      • Usage check: historical sar report
      • mtrace: C library (glibc) function for tracing memory allocations.
      • Valgrind:
          • valgrind --tool=memcheck --leak-check=full --show-reachable=yes snmpd -f -Lo
  • 30. ISSUES RELATED TO MEMORY
      • TCP/IP communication delay: RH cluster broken
      • High cache usage: slows down the application/system
      • Memory pressure: memory leak, application not tuned properly
      • Memory fragmentation: hugepage not used
      • OOM killer kills the application: memory pressure; OOM is enabled by default and kills based on the badness value.
      • Segmentation fault: kernel reclaims in the normal/low memory region, hence there is no room for the kernel and it encounters a segmentation fault.
      • Faulty memory: hardware failure or circuit failure in the chip; needs diagnosis and chip replacement
  • 31. TROUBLESHOOTING MEMORY ISSUE
      • Memory & swap usage test:
          swap_tendency = mapped_ratio/2 + distress + vm_swappiness
          mapped_ratio = % of physical memory in use
          distress = how much trouble the kernel is having freeing memory
          vm_swappiness = default 60
          swap_tendency >= 100: eligible for swap
          swap_tendency < 100: reclaim from page cache
      • Sysrq:
          echo 1 > /proc/sys/kernel/sysrq
          echo m > /proc/sysrq-trigger
      • Backtrace analysis
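The swap_tendency formula on this slide is simple enough to try with a few numbers. The sketch below implements it exactly as written above, with the threshold of 100; the function names and the sample inputs are hypothetical.

```python
def swap_tendency(mapped_ratio: float, distress: float, vm_swappiness: int = 60) -> float:
    """swap_tendency = mapped_ratio/2 + distress + vm_swappiness (as on the slide)."""
    return mapped_ratio / 2 + distress + vm_swappiness


def reclaim_target(tendency: float) -> str:
    """At 100 or above, pages are eligible for swap; below that, reclaim from page cache."""
    return "swap" if tendency >= 100 else "page cache"


for mapped in (40, 80):
    tendency = swap_tendency(mapped_ratio=mapped, distress=0)
    print(mapped, tendency, reclaim_target(tendency))
```

With the default swappiness of 60 and no distress, swapping only becomes eligible once 80% of physical memory is mapped, since 80/2 + 60 reaches the threshold.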
  • 32. TROUBLESHOOTING
      • OOM messages investigation:
      Messages:
          Oct 25 07:28:34 nldedip4k031 kernel: [87976.461588] [] oom_kill_process+0x5c/0x80
          Oct 25 07:28:34 nldedip4k031 kernel: [87976.461591] [] out_of_memory+0xc5/0x1c0
          Oct 25 07:28:34 nldedip4k031 kernel: [87976.461595] [] __alloc_pages_nodemask+0x72c/0x740
          Oct 25 07:28:34 nldedip4k031 kernel: [87976.461599] [] __get_free_pages+0x1c/0x30
          Oct 25 07:28:34 nldedip4k031 kernel: [87976.461602] [] get_zeroed_page+0x12/0x20
          Oct 25 07:28:34 nldedip4k031 kernel: [87976.461606] [] fill_read_buffer.isra.8+0xaa/0xd0
          Oct 25 07:28:34 nldedip4k031 kernel: [87976.461609] [] sysfs_read_file+0x7d/0x90
          Oct 25 07:28:34 nldedip4k031 kernel: [87976.461613] [] vfs_read+0x8c/0x160
          Oct 25 07:28:34 nldedip4k031 kernel: [87976.461616] [] ? fill_read_buffer.isra.8+0xd0/0xd0
          Oct 25 07:28:34 nldedip4k031 kernel: [87976.461619] [] sys_read+0x3d/0x70
          Oct 25 07:28:34 nldedip4k031 kernel: [87976.461624] [] sysenter_do_call+0x12/0x28
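When investigating OOM messages like these, it often helps to pull out just the call chain. The sketch below extracts `symbol+offset/size` frames from such log lines with a regular expression; the three sample lines are copied from the slide, while the pattern and the `parse_frames` helper are assumptions made for this example and may need adjusting for other kernel log formats. A leading `?` marks a frame the kernel could not verify as part of the real call chain.

```python
import re

# Matches the trailing "symbol+0xOFFSET/0xSIZE" part, with an optional "? " marker.
FRAME_RE = re.compile(
    r"\]\s+(\?\s+)?([A-Za-z_][\w.]*)\+0x([0-9a-f]+)/0x([0-9a-f]+)\s*$"
)

SAMPLE_LOG = """\
Oct 25 07:28:34 nldedip4k031 kernel: [87976.461588] [] oom_kill_process+0x5c/0x80
Oct 25 07:28:34 nldedip4k031 kernel: [87976.461591] [] out_of_memory+0xc5/0x1c0
Oct 25 07:28:34 nldedip4k031 kernel: [87976.461616] [] ? fill_read_buffer.isra.8+0xd0/0xd0
"""


def parse_frames(log_text: str) -> list:
    """Return one dict per stack frame found in the log text."""
    frames = []
    for line in log_text.splitlines():
        match = FRAME_RE.search(line)
        if match:
            frames.append({
                "symbol": match.group(2),
                "offset": int(match.group(3), 16),
                "size": int(match.group(4), 16),
                "uncertain": match.group(1) is not None,
            })
    return frames


for frame in parse_frames(SAMPLE_LOG):
    print(frame["symbol"], hex(frame["offset"]), frame["uncertain"])
```

Reading the frames from the bottom of the trace upward shows the path that led to the OOM kill: here a read of a sysfs file needed a zeroed page, the page allocation failed, and the allocator invoked out_of_memory.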