Memory management



These are my basic pointers on what is important when starting out with memory management.

Published in: Technology


  1. 1. Basics of memory management By Frankie Onuonga Twitter: FOnuonga
  2. 2. Describing the physical memory
     • Memory is not always laid out uniformly, especially in servers.
     • This is the concept of non-uniform memory access (NUMA).
     • Memory is arranged in banks, and the cost of accessing a bank depends on its distance from the processor.
     • Memory that needs quick access is therefore placed near the processor or the device cards that use it.
  3. 3. Nodes
     • Each bank is called a node and is represented by struct pglist_data.
     • Even on UMA systems it is referenced by its typedef, pg_data_t.
     • Every node is kept on a NULL-terminated list called pgdat_list, where each node is linked to the next through pg_data_t->node_next.
     • On UMA there is only one node, called contig_page_data.
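The node list described above can be walked like any singly linked list. The following is a minimal user-space sketch, not kernel code: the struct keeps only two fields of the real pg_data_t, and add_node() and count_nodes() are hypothetical helpers made up for the illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-in for the kernel's pg_data_t; only two fields are kept. */
typedef struct pglist_data {
    int node_id;
    struct pglist_data *node_next;  /* next node, NULL at the end of the list */
} pg_data_t;

/* Push a node on the front of the list (hypothetical helper). */
static void add_node(pg_data_t **list, pg_data_t *node)
{
    assert(list != NULL && node != NULL);
    node->node_next = *list;
    *list = node;
}

/* Count the nodes by following node_next until NULL (hypothetical helper). */
static int count_nodes(const pg_data_t *list)
{
    int n = 0;
    for (const pg_data_t *pgdat = list; pgdat != NULL; pgdat = pgdat->node_next)
        n++;
    return n;
}
```

On a UMA machine the list would hold a single node, so count_nodes() would return 1.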
  4. 4. Zones
     • A zone is a smaller region of a node that represents a range of memory.
     • It is represented by struct zone_struct, typedef'd to zone_t, and on x86 each zone is one of:
       – ZONE_DMA: the first 16 MiB of memory
       – ZONE_NORMAL: 16 MiB to 896 MiB
       – ZONE_HIGHMEM: 896 MiB upwards
     • ZONE_NORMAL is mapped directly by the kernel into the upper region of the linear address space.
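The zone boundaries above amount to a simple range check on a physical address. This sketch assumes the x86 boundaries from the slide; zone_for_paddr() is a hypothetical helper, not a kernel function.

```c
#include <assert.h>
#include <stdint.h>

enum zone_type { ZONE_DMA, ZONE_NORMAL, ZONE_HIGHMEM };

#define MiB (UINT64_C(1) << 20)

/* Hypothetical helper: pick the zone that covers a physical address,
 * using the x86 boundaries quoted on the slide (16 MiB and 896 MiB). */
static enum zone_type zone_for_paddr(uint64_t paddr)
{
    if (paddr < 16 * MiB)
        return ZONE_DMA;
    if (paddr < 896 * MiB)
        return ZONE_NORMAL;
    return ZONE_HIGHMEM;
}
```

For example, an address at 1 MiB falls in ZONE_DMA, one at 512 MiB in ZONE_NORMAL, and one at 2 GiB in ZONE_HIGHMEM.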
  5. 5.
     • Page frames are fixed-size chunks of physical memory.
     • Each one is represented by a struct page, and all of these structs are kept in the global mem_map array, which is stored at the beginning of ZONE_NORMAL or, on machines with little memory, just after the area reserved for the loaded kernel image.
  6. 6. Relationship diagram
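  7. 7. Nodes
     • Linux uses a node-local allocation policy: memory is allocated from the node closest to the running CPU, which keeps access cost down.
     • Because processes tend to keep running on the same CPU, they are likely to keep using memory from the same node.
     • The node structure on the next slide shows the fields involved.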
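  8. 8.
     typedef struct pglist_data {
         zone_t node_zones[MAX_NR_ZONES];            /* zones of this node */
         zonelist_t node_zonelists[GFP_ZONEMASK+1];  /* order in which zones are tried */
         int nr_zones;                               /* number of zones in this node */
         struct page *node_mem_map;                  /* first struct page of this node */
         unsigned long *valid_addr_bitmap;           /* bitmap of holes in the node */
         struct bootmem_data *bdata;                 /* boot memory allocator data */
         unsigned long node_start_paddr;             /* starting physical address */
         unsigned long node_start_mapnr;             /* page offset within mem_map */
         unsigned long node_size;                    /* total pages in the node */
         int node_id;                                /* node ID, starting at 0 */
         struct pglist_data *node_next;              /* next node on the list */
     } pg_data_t;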
  9. 9.
     • All nodes in the system are maintained on a list called pgdat_list.
     • Nodes are placed on this list as they are initialized by the init_bootmem_core() function.
  10. 10. Zones
     • Each zone is described by a struct zone_struct.
     • It keeps information such as page usage statistics, free area information and locks, and is declared in /mmzone.h
  11. 11. Zone watermarks
     • Each zone has three page watermarks: pages_low, pages_min and pages_high.
     • When the number of free pages falls to pages_low, the pageout daemon kswapd is woken up to free pages.
     • If pressure is high and free pages fall to pages_min, the allocator frees pages synchronously, which is sometimes referred to as the direct-reclaim path.
     • Once pages_high pages are free, kswapd goes back to sleep.
     • pages_min is calculated by free_area_init_core() during memory initialization and is based on a ratio to the size of the zone in pages.
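To make the watermark arithmetic concrete, here is a rough sketch. It assumes the 2.4-era defaults as I recall them (a ratio of 128, clamped between 20 and 255 pages, with pages_low and pages_high at two and three times pages_min); treat those constants as assumptions and check them against your kernel source. The struct, zone_watermarks() and reclaim_for() are hypothetical names.

```c
#include <assert.h>

/* Assumed 2.4-era defaults; verify against mm/page_alloc.c for your kernel. */
#define ZONE_BALANCE_RATIO 128UL
#define ZONE_BALANCE_MIN   20UL
#define ZONE_BALANCE_MAX   255UL

struct watermarks {
    unsigned long pages_min;
    unsigned long pages_low;
    unsigned long pages_high;
};

/* Derive the three watermarks from the zone size in pages. */
static struct watermarks zone_watermarks(unsigned long zone_pages)
{
    unsigned long mask = zone_pages / ZONE_BALANCE_RATIO;

    if (mask < ZONE_BALANCE_MIN)
        mask = ZONE_BALANCE_MIN;
    else if (mask > ZONE_BALANCE_MAX)
        mask = ZONE_BALANCE_MAX;

    struct watermarks w = { mask, mask * 2, mask * 3 };
    return w;
}

enum reclaim_action { RECLAIM_NONE, RECLAIM_WAKE_KSWAPD, RECLAIM_DIRECT };

/* Decide what the allocator would do for a given number of free pages. */
static enum reclaim_action reclaim_for(struct watermarks w, unsigned long free_pages)
{
    assert(w.pages_min <= w.pages_low && w.pages_low <= w.pages_high);
    if (free_pages <= w.pages_min)
        return RECLAIM_DIRECT;       /* caller frees pages synchronously */
    if (free_pages <= w.pages_low)
        return RECLAIM_WAKE_KSWAPD;  /* background reclaim */
    return RECLAIM_NONE;
}
```

Under these assumptions a 16 MiB zone of 4096 pages gets watermarks of 32, 64 and 96 pages.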
  12. 12. Calculating the sizes of zones
     • The sizes of all zones are calculated during setup_memory().
     • min_low_pfn is the first page frame after _end, which is the end of the loaded kernel image, and it is stored as a variable in mm/bootmem.c.
     • The last page frame is calculated differently depending on the architecture. On x86, max_low_pfn is calculated by find_max_low_pfn() and marks the end of ZONE_NORMAL.
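The page frame numbers (PFNs) mentioned above are physical addresses divided by the page size. A small sketch of the rounding involved, assuming 4 KiB pages as on x86; pfn_down() and pfn_up() are written here as plain functions and are only modelled on the kernel's PFN_DOWN/PFN_UP macros.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SHIFT 12                          /* 4 KiB pages, as on x86 */
#define PAGE_SIZE  (UINT64_C(1) << PAGE_SHIFT)

/* Frame that contains paddr (rounds down). */
static unsigned long pfn_down(uint64_t paddr)
{
    return (unsigned long)(paddr >> PAGE_SHIFT);
}

/* First frame that starts at or after paddr (rounds up). This is the kind
 * of rounding needed to find the first free frame after the kernel image. */
static unsigned long pfn_up(uint64_t paddr)
{
    assert(paddr <= UINT64_MAX - (PAGE_SIZE - 1));  /* avoid overflow */
    return (unsigned long)((paddr + PAGE_SIZE - 1) >> PAGE_SHIFT);
}
```

With these helpers, the 896 MiB boundary that ends ZONE_NORMAL corresponds to frame 229376.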
  13. 13. Zone wait queue table
     • While I/O is taking place on a page, the page is locked so that nobody accesses inconsistent data.
     • A process that wants the page has to wait until access is allowed by calling wait_on_page(), which puts it on a wait queue stored in zone_t.
     • When the I/O is done the page is unlocked with UnlockPage() and the processes waiting in wait_on_page() are woken up.
  14. 14.
     • The wait queues are kept in a hash table in the zone, zone_t->wait_table. The table is sized to keep collisions low, and its size is stored in zone_t->wait_table_size.
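As an illustration of how such a table might be sized and indexed, here is a sketch. The sizing follows my recollection of the 2.4-era approach (one queue per 256 pages, rounded up to a power of two and capped at 4096); those constants are assumptions. wait_bucket() uses a generic multiplicative hash chosen for the example and is not the kernel's hash function.

```c
#include <assert.h>
#include <stdint.h>

#define PAGES_PER_WAITQUEUE 256UL   /* assumed: one queue per 256 pages */
#define MAX_WAIT_TABLE      4096UL  /* assumed upper bound on the table */

/* Number of wait queues for a zone: a power of two, at least 1. */
static unsigned long wait_table_size(unsigned long pages)
{
    unsigned long size = 1;

    pages /= PAGES_PER_WAITQUEUE;
    while (size < pages)
        size <<= 1;
    return size < MAX_WAIT_TABLE ? size : MAX_WAIT_TABLE;
}

/* Hypothetical bucket choice: multiplicative hash of the page frame number.
 * Pages that land in the same bucket share one wait queue. */
static unsigned long wait_bucket(unsigned long pfn, unsigned long table_size)
{
    /* table_size must be a power of two so the mask below works */
    assert(table_size != 0 && (table_size & (table_size - 1)) == 0);
    uint32_t h = (uint32_t)pfn * 2654435761u;
    return (h >> 16) & (table_size - 1);
}
```

Because several pages share a queue, a wake-up on one queue may wake processes that were waiting for a different page; that is the price of not having one queue per page.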
  15. 15. Zone initialization
     • Zones are initialized after the kernel page tables have been fully set up by paging_init().
     • The objective is to determine which parameters to pass to free_area_init() on UMA or free_area_init_node() on NUMA.
     • On UMA the only parameter required is zones_size.
  16. 16. Other parameters
     • nid is the logical identifier of the node whose zones are being initialized.
     • pgdat is the node's pg_data_t that is being initialized. On UMA this is contig_page_data.
     • pmap is set later by free_area_init_core() to point to the beginning of the local mem_map array allocated for the node. It is ignored on NUMA.
       – On UMA this pointer is the global mem_map variable, which is initialized at this point.
  17. 17.
     • zones_size is an array containing the size of each zone in pages.
     • zone_start_paddr is the starting physical address of the first zone.
     • zone_holes is an array containing the total size of the memory holes in each zone.
     • The core function free_area_init_core() is responsible for filling in each zone_t with the relevant information and for allocating the mem_map array for the node.
  18. 18. Initializing mem_map
     • This is done during system startup in one of two ways.
     • On NUMA the global mem_map is treated as a virtual array starting at PAGE_OFFSET.
     • free_area_init_node() is called for each active node in the system, and it allocates the portion of this array that belongs to the node being initialized.
     • On UMA systems free_area_init() uses contig_page_data as the node and the global mem_map as the local mem_map for this node.
  19. 19.
     • The core function, free_area_init_core(), allocates a local mem_map for the node being initialized.
     • The memory for the array is allocated from the boot memory allocator with alloc_bootmem_node(). On UMA this newly allocated memory becomes the global mem_map, but it is slightly different for NUMA.
  20. 20.
     • NUMA architectures allocate the memory for lmem_map within their own memory node. The global mem_map never gets explicitly allocated, but instead is set to PAGE_OFFSET, where it is treated as a virtual array.
     • The address of the local map is stored in pg_data_t->node_mem_map, which exists somewhere within the virtual mem_map.
     • For each zone that exists in the node, the address within the virtual mem_map for the zone is stored in zone_t->zone_mem_map.
     • All the rest of the code then treats mem_map as a real array, because only valid regions within it will be used by nodes.
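The relationship between a node's local map and the global array can be pictured as a window into one big array. The sketch below is a user-space simplification that backs the global array with real, contiguous memory, which the NUMA case described above does not do. struct node and node_page() are hypothetical; only the field names are borrowed from pg_data_t.

```c
#include <assert.h>
#include <stddef.h>

struct page {
    unsigned long flags;
};

/* Hypothetical, cut-down node descriptor. */
struct node {
    struct page *node_mem_map;       /* start of this node's local map */
    unsigned long node_start_mapnr;  /* index of its first page in the global map */
    unsigned long node_size;         /* number of pages in the node */
};

/* Look up the struct page for a global page index that belongs to node n. */
static struct page *node_page(const struct node *n, unsigned long mapnr)
{
    assert(mapnr >= n->node_start_mapnr);
    assert(mapnr - n->node_start_mapnr < n->node_size);
    return n->node_mem_map + (mapnr - n->node_start_mapnr);
}
```

If two nodes split an eight-entry array in half, page index 5 resolves to the second entry of the second node's local map, which is the same struct page as entry 5 of the global array.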
  21. 21. Pages
     • Every physical page frame in the system has an associated struct page that keeps track of its status.
  22. 22. Mapping pages to zones
     • Until the 2.4 kernel, a struct page stored a reference to its zone in page->zone. This was considered wasteful, because even a small pointer consumes a lot of memory when thousands of struct pages exist.
     • The zone field was then removed. Instead, the top ZONE_SHIFT (8 on x86) bits of page->flags are used to determine the zone a page belongs to.
  23. 23.
     • First a zone_table of zones is set up, which is declared in mm/page_alloc.c:
       – zone_t *zone_table[MAX_NR_ZONES*MAX_NR_NODES];
       – EXPORT_SYMBOL(zone_table);
     • The export makes the table accessible to loadable modules.
     • The table is treated like a multidimensional array.
     • During free_area_init_core() all pages in a node are initialized.
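The idea of keeping a table index in the top bits of page->flags can be sketched as follows. This is an illustration under assumptions, not the kernel's code: ZONE_BITS, MAX_NR_NODES and the two helper functions are made up for the example, and the shift is computed from the width of unsigned long on the machine that compiles it.

```c
#include <assert.h>
#include <limits.h>

#define MAX_NR_ZONES  3
#define MAX_NR_NODES  4   /* arbitrary value for this sketch */
#define ZONE_BITS     8   /* number of top bits reserved for the table index */
#define BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)
#define ZONE_SHIFT    (BITS_PER_LONG - ZONE_BITS)

typedef struct zone_struct {
    int node_id;
    int zone_idx;
} zone_t;

struct page {
    unsigned long flags;
};

/* Flat table used like a [node][zone] two-dimensional array. */
static zone_t *zone_table[MAX_NR_ZONES * MAX_NR_NODES];

/* Store the table index for (nid, zone_idx) in the top bits of flags. */
static void set_page_zone(struct page *page, unsigned long nid, unsigned long zone_idx)
{
    assert(nid < MAX_NR_NODES && zone_idx < MAX_NR_ZONES);
    unsigned long index = nid * MAX_NR_ZONES + zone_idx;

    page->flags &= ~(~0UL << ZONE_SHIFT);  /* clear the top ZONE_BITS bits */
    page->flags |= index << ZONE_SHIFT;    /* leave the low flag bits alone */
}

/* Recover the zone by shifting the index back down. */
static zone_t *page_zone(const struct page *page)
{
    return zone_table[page->flags >> ZONE_SHIFT];
}
```

The saving comes from replacing a full pointer in every struct page with a few bits that were otherwise unused.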
  24. 24. High memory
     • Because the address space the kernel can use directly is limited in size, the kernel at times has to use high memory.
     • On a 32-bit system there are two thresholds:
       – 4 GiB, the amount of memory that can be addressed with a 32-bit physical address, and
       – 64 GiB
  25. 25.
     • To access memory in the 1 GiB to 4 GiB range, the kernel uses kmap(), which temporarily maps pages from high memory into ZONE_NORMAL.
     • The second threshold comes from an extension Intel invented to allow more RAM to be used by adding four extra bits to physical addresses.
     • This is mostly theory: in practice Linux cannot access all of this memory because the virtual address space is still only 4 GiB.
     • Sorry to the guys who tried to malloc() all the RAM.
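Both thresholds follow from the number of address bits: 32 bits reach 4 GiB, and four extra bits multiply that by 16 to give 64 GiB. A tiny sketch of the arithmetic, where addressable_bytes() is a hypothetical helper:

```c
#include <assert.h>
#include <stdint.h>

#define GiB (UINT64_C(1) << 30)

/* Bytes reachable with the given number of address bits. */
static uint64_t addressable_bytes(unsigned bits)
{
    assert(bits < 64);  /* 1 << 64 would not fit in uint64_t */
    return UINT64_C(1) << bits;
}
```

Here addressable_bytes(32) equals 4 * GiB and addressable_bytes(36) equals 64 * GiB.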