Memory management in Linux kernel

Main topics covered at the seminar:
Tasks and components of the memory management subsystem;
Hardware capabilities of the x86_64 platform;
How physical and virtual memory are described in the kernel;
The memory management subsystem API;
Reclaiming previously allocated memory;
Monitoring tools;
Memory Cgroups;
Compaction: defragmentation of physical memory.

Published in: Technology

  1. (no text)
  2. Memory management in Linux kernel
  3. Memory management tasks
       • Physical memory allocator
       • Physical memory management
       • Virtual memory allocator
       • PTE management
       • Memory allocator for kernel needs
  4. Memory management subsystem
       • >100K lines
       • Buddy allocator
       • Page replacement (“LRU” reclaim model)
       • PTE management
       • Slab/slob/slub kernel allocator
       • Pagecache/writeback/readahead/swap
       • Cgroup memory controller
       • Compaction
  5. Hardware
       • x86_64
       • Paging (MMU, TLB, ...)
       • 4KB, 2MB and 1GB pages
       • NUMA
       • 4-level page tables
       • Hardware referenced bit
  6. Physical memory description
       • Node (pg_data_t)
       • Zone (struct zone)
       • Page (struct page)
     $ cat /proc/zoneinfo | grep Node
     Node 0, zone DMA
     Node 0, zone DMA32
     Node 0, zone Normal
     Node 1, zone Normal
  7. Virtual memory description
       • Address space (struct mm_struct)
       • VM area (struct vm_area_struct)
     $ cat /proc/self/maps
     00400000-0040c000 r-xp 00000000 08:03 2359718 /usr/bin/cat
     0060b000-0060c000 r--p 0000b000 08:03 2359718 /usr/bin/cat
     0060c000-0060d000 rw-p 0000c000 08:03 2359718 /usr/bin/cat
     011a7000-011c8000 rw-p 00000000 00:00 0 [heap]
     7f4d072e5000-7f4d0d80e000 r--p 00000000 08:03 2369473 /usr/lib/locale/locale-archive
     7f4d0d80e000-7f4d0d9c2000 r-xp 00000000 08:03 2366682 /usr/lib64/libc-2.18.so
     7f4d0d9c2000-7f4d0dbc2000 ---p 001b4000 08:03 2366682 /usr/lib64/libc-2.18.so
     7f4d0dbc2000-7f4d0dbc6000 r--p 001b4000 08:03 2366682 /usr/lib64/libc-2.18.so
     ...
  8. File mappings
       • File mappings (struct address_space)
       • Radix tree with all resident pages
       • Pagecache
       • Major/minor pagefault
  9. Kernel API
       • __get_free_page()
       • kmalloc()/kfree()
       • vmalloc()
       • ...
  10. Userspace API
       • pagefault
       • mmap()/munmap()
       • brk()
       • mlock()/munlock()
       • fadvise(), madvise()
       • ...
  11. Memory reclaim
       • Normal/direct reclaim (free pool)
       • Per-node kswapd
       • Working set
       • Memory pressure
       • File memory vs anonymous memory
       • Swap
       • OOM
  12. “LRU” model
       • 5 doubly linked lists: inactive file, active file, inactive anon, active anon, unevictable
       • Referenced flag (PG_referenced) in the struct page flags
  13. List transition rules
       • mark_page_accessed():
           – unreferenced -> referenced
           – inactive && referenced -> active
       • shrink_inactive_list():
           – if (ptes referenced)
               • anonymous -> active
               • referenced -> active
               • (ptes referenced > 1) -> active (3.2)
               • (vm_flags & VM_EXEC) -> active (3.2)
               • set referenced
               • rotate
           – else
               • reclaim
       • shrink_active_list():
           – if referenced
               • file & VM_EXEC -> rotate
           – otherwise -> inactive
  14. Memory pressure balancing
       • nr_pages_to_scan = nr_pages/2^priority
       • priority = [12..0]: 1/4096, 1/2048, 1/1024, ...
       • swappiness
       • active > inactive
  15. Yasearch-specific problems & solutions
       • Working set > 1/2 available memory
       • Memory thrashing
       • promote_mapped_pages
       • file_inactive_ratio
  16. Monitoring & tools
       • top
       • vmtouch
       • /proc/vmstat
       • /proc/buddyinfo
       • /proc/slabinfo
       • perf top
       • oom-message in dmesg
  17. Demonstration
  18. Cgroups
       • Each cgroup has its own LRU lists
       • No common LRU (since 3.3)!
       • Common free pool(s)
       • Common kswapd thread(s)
       • Global reclaim vs target reclaim
  19. Memory controller
       • memory.limit_in_bytes
       • memory.soft_limit_in_bytes (will be deprecated)
       • memory.use_hierarchy
       • ...
  20. Monitoring
       • memory.usage_in_bytes
       • memory.max_usage_in_bytes
       • memory.stat
  21. Accounting
       • Each page belongs to one cgroup
       • The first cgroup to access a page becomes its owner
       • memory.move_charge_at_immigrate
  22. Yasearch-specific problems & solutions
       • memory.low_limit_in_bytes
       • First accessed = owner? mlock()? low_limit?
       • memory.recharge_on_pgfault
  23. Compaction
       • Migration of physical pages toward the top of the zone
       • https://lwn.net/Articles/368869
       • Broken in 3.3-3.7
       • Replacement for lumpy reclaim
       • Use perf top for problem diagnostics
  24. Thank you for your attention!
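
The node/zone hierarchy on slide 6 (Physical memory description) can be read back from /proc/zoneinfo programmatically. The following is a minimal sketch, not part of the talk: the function names `parse_zone_headers` and `zones_by_node` are hypothetical, and the sample text is the grep output shown on that slide.

```python
import re

# Matches header lines such as "Node 0, zone   DMA32" in /proc/zoneinfo.
ZONE_HEADER = re.compile(r"^Node\s+(\d+),\s+zone\s+(\S+)")


def parse_zone_headers(text):
    """Return (node_id, zone_name) pairs in the order they appear."""
    zones = []
    for line in text.splitlines():
        match = ZONE_HEADER.match(line)
        if match:
            zones.append((int(match.group(1)), match.group(2)))
    return zones


def zones_by_node(text):
    """Group zone names under their NUMA node id."""
    grouped = {}
    for node, zone in parse_zone_headers(text):
        grouped.setdefault(node, []).append(zone)
    return grouped


# Sample copied from the grep output on slide 6.
SAMPLE = """Node 0, zone DMA
Node 0, zone DMA32
Node 0, zone Normal
Node 1, zone Normal
"""

if __name__ == "__main__":
    try:
        with open("/proc/zoneinfo") as f:
            print(zones_by_node(f.read()))
    except OSError:
        # Not on Linux (or /proc unavailable): fall back to the slide sample.
        print(zones_by_node(SAMPLE))
```

On the two-node machine from the slide, node 0 carries the DMA, DMA32 and Normal zones while node 1 has only Normal.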
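
Each line of /proc/self/maps on slide 7 (Virtual memory description) corresponds to one VM area. As an illustration only, the sketch below splits such a line into its fields; the `Mapping` tuple and the function names are hypothetical helpers, and the test lines are copied from the slide.

```python
from collections import namedtuple

# One VM area as shown in /proc/<pid>/maps (field names are our own choice).
Mapping = namedtuple("Mapping", "start end perms offset dev inode path")


def parse_maps_line(line):
    """Split one /proc/<pid>/maps line into its fields."""
    # The path is optional and may contain spaces, so split at most 5 times.
    fields = line.split(None, 5)
    addr, perms, offset, dev, inode = fields[:5]
    path = fields[5].strip() if len(fields) > 5 else ""
    start, end = (int(part, 16) for part in addr.split("-"))
    return Mapping(start, end, perms, int(offset, 16), dev, int(inode), path)


def mapping_size(mapping):
    """Size of the VM area in bytes."""
    return mapping.end - mapping.start


if __name__ == "__main__":
    heap = parse_maps_line("011a7000-011c8000 rw-p 00000000 00:00 0 [heap]")
    print(heap.path, mapping_size(heap) // 4096, "pages of 4KB")
```

Anonymous areas such as the heap have device 00:00 and inode 0, while file-backed areas carry the inode of the mapped file.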
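
The mmap()/munmap() pair and the page fault entry on slide 10 (Userspace API) can be tried from a script. This sketch uses Python's standard `mmap` module rather than the C calls named on the slide; the function name `touch_anonymous_pages` is hypothetical. It assumes that the kernel populates anonymous pages lazily, so the first write to each page is what brings it in.

```python
import mmap


def touch_anonymous_pages(num_pages):
    """Map anonymous memory, write one byte per page, return the bytes read back."""
    length = num_pages * mmap.PAGESIZE
    region = mmap.mmap(-1, length)  # fileno -1 requests an anonymous mapping
    try:
        for page in range(num_pages):
            # First touch of each page; the page is expected to be faulted in here.
            region[page * mmap.PAGESIZE] = page % 256
        return bytes(region[page * mmap.PAGESIZE] for page in range(num_pages))
    finally:
        region.close()  # releases the mapping, the counterpart of munmap()


if __name__ == "__main__":
    print(touch_anonymous_pages(4))
```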
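
The two mark_page_accessed() rules on slide 13 (List transition rules) amount to a small state machine. The toy model below is a sketch for illustration, not kernel code: the `Page` class is hypothetical, and clearing the referenced flag on promotion is a detail the slide does not show.

```python
from dataclasses import dataclass


@dataclass
class Page:
    """Toy stand-in for the two bits of page state used by the rules."""
    active: bool = False      # True when the page sits on an active list
    referenced: bool = False  # software referenced flag


def mark_page_accessed(page):
    """Apply the two transition rules from the slide to a toy page."""
    if not page.active and page.referenced:
        # inactive && referenced -> active
        page.active = True
        # Assumption beyond the slide: the flag is cleared on promotion.
        page.referenced = False
    elif not page.referenced:
        # unreferenced -> referenced
        page.referenced = True
    return page


if __name__ == "__main__":
    page = Page()
    for access in range(1, 4):
        mark_page_accessed(page)
        print(access, page)
```

In this model a page needs two accesses to move from the inactive list to the active list, which keeps pages that were touched only once from crowding out the working set.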
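
The formula on slide 14 (Memory pressure balancing), nr_pages_to_scan = nr_pages/2^priority, is easy to tabulate. The sketch below assumes integer division by a power of two (a right shift); the names `pages_to_scan` and `scan_schedule` are hypothetical, and the list size in the demo is an arbitrary example value.

```python
DEF_PRIORITY = 12  # starting priority, from the [12..0] range on the slide


def pages_to_scan(nr_pages, priority):
    """Number of pages scanned at a given priority: nr_pages / 2^priority."""
    if not 0 <= priority <= DEF_PRIORITY:
        raise ValueError("priority must be between 0 and 12")
    return nr_pages >> priority


def scan_schedule(nr_pages):
    """(priority, pages) pairs as reclaim escalates from priority 12 down to 0."""
    return [(p, pages_to_scan(nr_pages, p)) for p in range(DEF_PRIORITY, -1, -1)]


if __name__ == "__main__":
    # Example: an LRU list holding 2^20 pages (4GB worth of 4KB pages).
    for priority, pages in scan_schedule(1 << 20):
        print(priority, pages)
```

Each drop in priority doubles the scanned share of the list, from 1/4096 at priority 12 to the whole list at priority 0.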
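
/proc/buddyinfo, listed on slide 16 (Monitoring & tools), shows per zone how many free blocks the buddy allocator holds at each order, which is also a quick way to judge whether compaction (slide 23) has work to do. A minimal parsing sketch follows; the function names are hypothetical and the sample line contains made-up counts, not measurements from the talk.

```python
def parse_buddyinfo_line(line):
    """Return (node, zone, counts) where counts[k] is the number of free order-k blocks."""
    head, _, tail = line.partition("zone")
    node = int(head.split()[1].rstrip(","))
    fields = tail.split()
    return node, fields[0], [int(count) for count in fields[1:]]


def free_pages(counts):
    """Total free pages: an order-k block spans 2^k pages."""
    return sum(count << order for order, count in enumerate(counts))


def largest_free_order(counts):
    """Highest order that still has a free block, or -1 if nothing is free."""
    orders = [order for order, count in enumerate(counts) if count > 0]
    return max(orders) if orders else -1


# Hypothetical sample line in the /proc/buddyinfo layout (orders 0..10).
SAMPLE = "Node 0, zone      DMA      1      1      1      0      2      1      1      0      1      1      3"

if __name__ == "__main__":
    node, zone, counts = parse_buddyinfo_line(SAMPLE)
    print(node, zone, free_pages(counts), largest_free_order(counts))
```

A zone with plenty of free pages but a low largest free order is fragmented: high-order allocations can fail there even though memory is nominally available.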
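
The control files on slides 19 and 20 (memory.limit_in_bytes, memory.usage_in_bytes, memory.stat) are plain text and simple to poll. The sketch below is an illustration under assumptions: the mount point /sys/fs/cgroup/memory and the group name `mygroup` are placeholders for wherever the memory controller is mounted on a given system, the helper names are hypothetical, and the numbers in the sample are invented.

```python
import os


def parse_stat(text):
    """Parse memory.stat content ('key value' per line) into a dict of ints."""
    stats = {}
    for line in text.splitlines():
        if line.strip():
            key, value = line.split()
            stats[key] = int(value)
    return stats


def usage_ratio(usage_text, limit_text):
    """Fraction of the limit in use, from the raw contents of the two files."""
    return int(usage_text.strip()) / int(limit_text.strip())


def read_control_file(cgroup_dir, name):
    """Read one control file, e.g. 'memory.usage_in_bytes', as text."""
    with open(os.path.join(cgroup_dir, name)) as f:
        return f.read()


if __name__ == "__main__":
    # Placeholder path: adjust to the actual mount point and group name.
    group = "/sys/fs/cgroup/memory/mygroup"
    try:
        usage = read_control_file(group, "memory.usage_in_bytes")
        limit = read_control_file(group, "memory.limit_in_bytes")
        print("usage/limit:", usage_ratio(usage, limit))
        print(parse_stat(read_control_file(group, "memory.stat")))
    except OSError:
        # Invented values, used only when the cgroup is not present.
        print(parse_stat("cache 4096\nrss 8192\n"))
```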