This document describes a case study of a system suffering from a memory leak caused by a process that continuously requests more memory with malloc but never frees it. The process grows in both virtual size and resident (physical) memory, displacing other processes and degrading overall system performance. Analysis with the atop tool shows the process inflating over time until it consumes over 50% of physical memory, forcing heavy swapping that slows the entire system. After the leaking process is terminated, the system gradually recovers as other processes reclaim their swapped-out memory pages.
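A minimal sketch of the failure mode described (not the actual process from the case study): each iteration allocates and touches a block that is never freed, so both virtual and resident size grow until allocation fails or the system starts thrashing.

```c
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

int main(void) {
    for (;;) {
        char *block = malloc(1 << 20);    /* request 1 MiB */
        if (block == NULL)
            break;                        /* allocation may eventually fail */
        memset(block, 0xAA, 1 << 20);     /* touch the pages so they become resident */
        /* bug: the pointer is discarded without free(block),
           so both virtual and resident size grow every second */
        sleep(1);
    }
    return 0;
}
```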
This document provides an outline and overview of key concepts related to virtual memory, including:
- Demand paging loads pages into memory only when needed, reducing I/O compared to loading all pages upfront. Page faults trigger loading needed pages from disk.
- Copy-on-write sharing allows processes to initially share read-only pages, only copying when a page is written to avoid unnecessary duplication.
- When memory is full and a page is needed, page replacement selects a page in memory to swap to disk to free up space using algorithms like FIFO, LRU, etc. This avoids having to swap out the entire process.
- Memory-mapped files allow sharing memory between processes by mapping files or other shared objects into their address spaces.
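As a sketch of the final bullet, assuming a POSIX system (the file name and sizes are illustrative), two processes can share memory by mapping the same file with MAP_SHARED:

```c
#include <fcntl.h>
#include <sys/mman.h>
#include <unistd.h>

int main(void) {
    int fd = open("shared.dat", O_RDWR | O_CREAT, 0644);
    if (fd < 0) return 1;
    if (ftruncate(fd, 4096) != 0) return 1;   /* one page backing the mapping */

    /* MAP_SHARED makes stores visible to every process mapping the file */
    char *p = mmap(NULL, 4096, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    if (p == MAP_FAILED) return 1;

    p[0] = 'x';              /* another process mapping shared.dat sees this */
    munmap(p, 4096);
    close(fd);
    return 0;
}
```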
This document reports on a project to develop frequency analysis software with both CPU and GPU implementations. The CPU version parses documents into a tree structure and uses multiple threads to count word frequencies; testing showed it was fast but limited to ASCII. The GPU version improves performance by offloading counting to GPU threads, develops techniques to eliminate contention, and achieves speedups over the CPU version while supporting UTF-8 encoding and content filtering. Metrics show the GPU version is faster, processing a 128MB XML file in around 4.3 seconds on average compared to 4.9 seconds for the CPU version.
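The report's code is not reproduced here, but the core counting step it describes can be sketched in plain C (single-threaded, ASCII-only, with a fixed-size open-addressing table assumed large enough; all names are illustrative):

```c
#include <ctype.h>
#include <stdio.h>
#include <string.h>

#define TABLE 65536
static char words[TABLE][32];
static int  counts[TABLE];

static void count_word(const char *w) {
    unsigned h = 5381;                        /* djb2 string hash */
    for (const char *s = w; *s; s++) h = h * 33 + (unsigned char)*s;
    for (unsigned i = h % TABLE; ; i = (i + 1) % TABLE) {  /* linear probing */
        if (counts[i] == 0) { strncpy(words[i], w, 31); counts[i] = 1; return; }
        if (strcmp(words[i], w) == 0) { counts[i]++; return; }
    }
}

int main(void) {
    char buf[32]; int n = 0, c;
    while ((c = getchar()) != EOF) {
        if (isalpha(c) && n < 31) buf[n++] = (char)tolower(c);
        else if (n) { buf[n] = '\0'; count_word(buf); n = 0; }
    }
    if (n) { buf[n] = '\0'; count_word(buf); }
    for (int i = 0; i < TABLE; i++)
        if (counts[i]) printf("%s %d\n", words[i], counts[i]);
    return 0;
}
```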
Disk and terminal drivers allow processes to communicate with peripheral devices like disks, terminals, printers and networks. A disk driver translates file system addresses to specific disk sectors. A terminal driver controls data transmission to and from terminals and processes special characters through line disciplines. Device drivers are essential for all input/output in the UNIX operating system.
The document discusses memory management techniques in UNIX, including swapping, demand paging, and how they are implemented in Intel Pentium hardware. Swapping involves copying processes from memory to disk swap space to free up memory. Demand paging only loads pages into memory when accessed. The Pentium supports segmentation and paging, where logical addresses are translated to linear then physical addresses using segment descriptors, page directories and tables.
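To make the paging half of that translation concrete, here is a minimal sketch of how a 32-bit linear address is split on the Pentium with 4 KiB pages (the example address is arbitrary):

```c
#include <stdint.h>
#include <stdio.h>

int main(void) {
    uint32_t linear = 0x12345678;                 /* example linear address */
    uint32_t dir    = (linear >> 22) & 0x3FF;     /* bits 31..22: page-directory index */
    uint32_t table  = (linear >> 12) & 0x3FF;     /* bits 21..12: page-table index */
    uint32_t offset =  linear        & 0xFFF;     /* bits 11..0:  offset in the 4 KiB page */
    printf("dir=%u table=%u offset=0x%03x\n", dir, table, offset);
    /* the hardware uses dir to index the page directory referenced by CR3,
       then table to index that page table, and adds offset to the frame base */
    return 0;
}
```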
This document discusses the key components of Windows threads. It explains that a Windows thread consists of three main components: the thread kernel object, stack, and Thread Environment Block (TEB). The thread kernel object contains context and statistical information about the thread. A thread has both a user-mode and kernel-mode stack. The TEB stores thread-specific data like exception handling information. The document also discusses thread states, scheduler queues, and how the OS schedules threads for execution every 20 milliseconds by examining threads in the ready queue.
DULO: an effective buffer cache management scheme to exploit both temporal and spatial locality
This document proposes using disk layout information stored in a block table to improve disk performance. The block table tracks access times of disk blocks through their logical block numbers (LBNs) in a multi-level structure similar to a page table. Comparing timestamps of neighboring entries allows identification of sequential access patterns. Although the table itself consumes memory, it can be efficiently reclaimed by removing entries whose timestamps fall below a threshold. This approach aims to better inform caching policies about the differing costs of sequential versus random access, improving disk I/O.
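A sketch of how such a block table might be organized, following the page-table analogy (the structure sizes and the adjacent-timestamp rule are illustrative assumptions, not the paper's exact design):

```c
#include <stdint.h>
#include <stdlib.h>

#define L1_SIZE 1024            /* top-level directory entries */
#define L2_SIZE 1024            /* timestamps per leaf, indexed by low LBN bits */

typedef struct { uint64_t stamp[L2_SIZE]; } Leaf;
static Leaf *dir[L1_SIZE];      /* leaves allocated lazily, like page tables */

static void record_access(uint64_t lbn, uint64_t now) {
    Leaf **slot = &dir[(lbn / L2_SIZE) % L1_SIZE];
    if (*slot == NULL) *slot = calloc(1, sizeof(Leaf));  /* assume success */
    (*slot)->stamp[lbn % L2_SIZE] = now;
}

/* neighboring LBNs touched at adjacent times suggest a sequential run;
   runs crossing a leaf boundary are ignored in this simplification */
static int looks_sequential(uint64_t lbn) {
    Leaf *leaf = dir[(lbn / L2_SIZE) % L1_SIZE];
    uint64_t i = lbn % L2_SIZE;
    return leaf && i > 0 && leaf->stamp[i] && leaf->stamp[i - 1] &&
           leaf->stamp[i] - leaf->stamp[i - 1] <= 1;
}
```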
This document discusses several memory management techniques:
1. Contiguous allocation allocates processes to contiguous regions of memory but can lead to fragmentation.
2. Paging divides logical memory into fixed-size pages and physical memory into frames of the same size, using per-process page tables to map virtual to physical addresses and reduce fragmentation. Translation lookaside buffers (TLBs) speed address translation (a small translation sketch follows this list).
3. Segmentation divides processes into logical segments and uses segment tables to map segments to physical addresses. It provides a modular view of memory but external fragmentation remains an issue.
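A minimal sketch of the translation path from point 2 above, with a tiny direct-mapped TLB (all sizes are illustrative; a small single-level page table covering 4 MiB is assumed):

```c
#include <stdint.h>

#define PAGE_BITS 12                     /* 4 KiB pages */
#define TLB_SLOTS 16

static uint32_t page_table[1024];        /* VPN -> frame number; toy 4 MiB space */
static struct { uint32_t vpn, frame; int valid; } tlb[TLB_SLOTS];

uint32_t translate(uint32_t vaddr) {
    uint32_t vpn = vaddr >> PAGE_BITS;
    uint32_t off = vaddr & ((1u << PAGE_BITS) - 1);
    int slot = vpn % TLB_SLOTS;
    if (tlb[slot].valid && tlb[slot].vpn == vpn)          /* TLB hit: no table walk */
        return (tlb[slot].frame << PAGE_BITS) | off;
    uint32_t frame = page_table[vpn & 1023];              /* TLB miss: walk the table */
    tlb[slot].vpn = vpn; tlb[slot].frame = frame; tlb[slot].valid = 1;
    return (frame << PAGE_BITS) | off;
}
```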
Swapping is the process of exchanging memory pages between main memory and secondary storage, such as a hard disk. Several forms of swapping are described. When memory becomes full, inactive processes are swapped out to disk to free up space and are swapped back in when needed. The first UNIX systems constantly monitored free memory and swapped processes out to disk when it fell below a threshold. On Linux, swap space is used when RAM is full: inactive memory pages are moved to the swap file to free up space. The swap cache avoids race conditions when processes access pages being swapped by collecting shared pages that have been copied to swap space.
Virtual memory allows processes to access memory addresses that exceed the amount of physical memory available. When a process references a memory page that is not in RAM, a page fault occurs which brings the missing page into memory from disk. Page replacement algorithms are used to determine which page to remove from RAM to make room for the faulting page. The working set model aims to keep the active pages used by each process in memory to reduce thrashing, which occurs when the total memory demand exceeds the available RAM.
The document discusses cache design and organization. It describes how caches work, sitting between the CPU and main memory to provide fast access to frequently used data. The key aspects covered include cache size, block size, mapping techniques, replacement algorithms, write policies, and the evolution of cache hierarchies in processors like the Pentium IV with multiple levels of on-chip and off-chip caches.
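To make the mapping discussion concrete, a direct-mapped cache lookup splits an address into tag, index, and offset; a minimal sketch with illustrative sizes:

```c
#include <stdint.h>
#include <stdio.h>

#define BLOCK_BITS 5            /* 32-byte blocks */
#define INDEX_BITS 7            /* 128 lines -> 4 KiB cache */

int main(void) {
    uint32_t addr   = 0xDEADBEEF;                         /* example address */
    uint32_t offset =  addr                & ((1u << BLOCK_BITS) - 1);
    uint32_t index  = (addr >> BLOCK_BITS) & ((1u << INDEX_BITS) - 1);
    uint32_t tag    =  addr >> (BLOCK_BITS + INDEX_BITS);
    /* a hit requires the line at 'index' to be valid and to store this 'tag' */
    printf("tag=0x%x index=%u offset=%u\n", tag, index, offset);
    return 0;
}
```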
This document discusses the structure and components of disk drives and operating system disk management. It covers topics like disk formatting, logical block addressing, zones and rotational speeds on disks, and different disk scheduling algorithms like FCFS, SSTF, SCAN, C-SCAN, and LOOK. It also summarizes file systems, swap space management, and networking protocols in operating systems.
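As a sketch of one policy named here, a single SCAN (elevator) pass can be simulated as follows; the request queue and head position are textbook-style example values, not from this document:

```c
#include <stdio.h>
#include <stdlib.h>

static int cmp(const void *a, const void *b) { return *(const int *)a - *(const int *)b; }

int main(void) {
    int req[] = {98, 183, 37, 122, 14, 124, 65, 67};
    int n = sizeof req / sizeof req[0], head = 53;
    qsort(req, n, sizeof req[0], cmp);
    /* service requests above the head moving outward, then sweep back down */
    for (int i = 0; i < n; i++) if (req[i] >= head) printf("%d ", req[i]);
    for (int i = n - 1; i >= 0; i--) if (req[i] < head) printf("%d ", req[i]);
    putchar('\n');
    return 0;
}
```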
This document summarizes key characteristics of cache memory including location, capacity, access methods, performance, and organization. It discusses the memory hierarchy from registers to external memory. Common cache mapping techniques like direct mapping, associative mapping, and set associative mapping are explained. The document also covers cache design considerations such as replacement algorithms and write policies.
This document discusses different memory management techniques including:
1. Contiguous allocation allocates processes to contiguous regions of memory but can lead to fragmentation. Paging and segmentation address this by allowing non-contiguous allocation.
2. Paging maps logical addresses to physical frames through a page table. It supports non-contiguous allocation but has translation overhead that is reduced using translation lookaside buffers.
3. Segmentation divides memory into logical segments and uses a segment table to map logical to physical addresses. It matches the user's view of memory but external fragmentation remained an issue until combined with paging.
Blosc: Sending Data from Memory to CPU (and back) Faster than Memcpy, by Francesc Alted (PyData)
Blosc is a library that can compress and decompress data faster than memcpy by taking advantage of the large performance gap between CPU and memory. It works by dividing data into blocks that fit CPU caches, using multithreading, and SIMD instructions. Real-world tests show Blosc can compress genomics data 15-40x smaller with similar access speeds. Blosc is used in libraries like Bloscpack and BLZ to provide efficient compressed containers for large NumPy and other datasets while maintaining close to uncompressed performance.
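A minimal sketch of the library's basic usage (blosc_compress and blosc_decompress are the public C-Blosc API; the data, compression level, and thread count here are illustrative):

```c
#include <blosc.h>
#include <stdio.h>

#define N (1000 * 1000)

int main(void) {
    static float src[N], back[N];
    static char packed[sizeof src + BLOSC_MAX_OVERHEAD];
    for (int i = 0; i < N; i++) src[i] = (float)i;  /* highly compressible data */

    blosc_init();
    blosc_set_nthreads(4);                          /* multithreading, as described */
    int csize = blosc_compress(5, BLOSC_SHUFFLE, sizeof(float),
                               sizeof src, src, packed, sizeof packed);
    if (csize > 0) {
        printf("compressed %zu -> %d bytes\n", sizeof src, csize);
        blosc_decompress(packed, back, sizeof back);
    }
    blosc_destroy();
    return 0;
}
```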
The Google File System was designed by Google to store and manage large files across thousands of commodity servers. It uses a single master to manage metadata and track file locations across chunkservers. Chunks are replicated for reliability and placed across racks to improve bandwidth utilization. The system provides high throughput for concurrent reads and writes through leases to maintain consistency and pipelining of data flows. Logs and replication are used to provide fault tolerance against server failures.
This document discusses virtual memory management. It begins with background on virtual memory and how it allows logical address spaces to be larger than physical memory. It describes demand paging, where pages are only loaded into memory when needed rather than all at once. Copy-on-write is explained as a way to share pages between processes initially. When memory runs out, page replacement algorithms are used to select pages to remove from memory. Optimizations like faster swap space I/O and demand paging from files rather than swapping are also covered.
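Copy-on-write as described can be observed with fork() on a POSIX system; a minimal sketch (the kernel, not this code, does the page sharing and copying):

```c
#include <stdio.h>
#include <stdlib.h>
#include <sys/wait.h>
#include <unistd.h>

int main(void) {
    int *value = malloc(sizeof *value);
    *value = 1;
    pid_t pid = fork();            /* child initially shares the parent's pages */
    if (pid == 0) {
        *value = 2;                /* this write triggers a private copy of the page */
        printf("child sees %d\n", *value);
        exit(0);
    }
    wait(NULL);
    printf("parent still sees %d\n", *value);   /* unaffected by the child's write */
    free(value);
    return 0;
}
```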
This document discusses memory management techniques used in operating systems, including:
- Base and limit registers that define the logical address space and protect memory accesses.
- Address binding from source code to executable addresses at different stages.
- The memory management unit (MMU) that maps virtual to physical addresses using base/limit registers (a conceptual sketch follows this list).
- Segmentation architecture that divides memory into logical segments like code, data, stack, heap.
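A conceptual sketch of the base-and-limit check from the list above (not real MMU code; the register values are illustrative):

```c
#include <stdint.h>
#include <stdio.h>

/* illustrative relocation: physical = base + logical, valid only below limit */
static uint32_t base = 0x40000, limit = 0x10000;

static uint32_t mmu_map(uint32_t logical) {
    if (logical >= limit) {
        fprintf(stderr, "trap: addressing error at 0x%x\n", logical);
        return 0;                    /* real hardware would raise a fault */
    }
    return base + logical;
}

int main(void) {
    printf("0x1234 -> 0x%x\n", mmu_map(0x1234));
    mmu_map(0x20000);                /* beyond the limit: trapped */
    return 0;
}
```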
Paging and Segmentation in Operating System, by Raj Mohan
The document discusses different types of memory used in computers including physical memory, logical memory, and virtual memory. It describes how virtual memory uses paging and segmentation techniques to allow programs to access more memory than is physically available. Paging divides memory into fixed-size pages that can be swapped between RAM and secondary storage, while segmentation divides memory into variable-length, protected segments. The combination of paging and segmentation provides memory protection and efficient use of available RAM.
Paging is a memory management scheme that allows the physical address space of a process to be non-contiguous. The logical memory is divided into pages of a fixed size, while physical memory is divided into frames of the same size. When accessing a memory location, the CPU generates a page number and page offset. The page number is used to index into a page table stored in main memory to map the logical page to a physical frame. A Translation Lookaside Buffer (TLB) cache is used to improve performance by caching recent page table lookups.
The document discusses paging and monolithic kernels. Paging is a memory management technique that moves pages between main memory and secondary storage. It allows for faster access to data by copying pages from storage to memory when needed. A monolithic kernel is an operating system architecture where the entire OS operates in kernel space for efficient communication but makes the system difficult to maintain and debug.
Virtual memory allows programs to execute without requiring their entire address space to be resident in physical memory. It uses virtual addresses that are translated to physical addresses by the hardware. This translation occurs via page tables managed by the operating system. When a virtual address is accessed, its virtual page number is used as an index into the page table to obtain the corresponding physical page frame number. If the page is not in memory, a page fault occurs and the OS handles loading it from disk. Paging partitions both physical and virtual memory into fixed-sized pages to address fragmentation issues. Segmentation further partitions the virtual address space into logical segments. Hardware support for segmentation involves a segment table containing base/limit pairs for each segment. Translation lookaside buffers cache recent translations so that most accesses avoid a full page-table walk.
Memory management is the act of managing computer memory. The essential requirement of memory management is to provide ways to dynamically allocate portions of memory to programs at their request and to free them for reuse when no longer needed. This is critical to any advanced computer system where more than a single process might be underway at any time.
Deterministic Memory Abstraction and Supporting Multicore System Architecture, by Heechul Yun
Presentation slides of the following paper at ECRTS'18.
Farzad Farshchi, Prathap Kumar Valsan, Renato Mancuso, Heechul Yun. "Deterministic Memory Abstraction and Supporting Multicore System Architecture." Euromicro Conference on Real-Time Systems (ECRTS), 2018
Cache vs virtual memory (ppt, without animation), by Animesh Jain
Cache memory is a small, fast memory between the CPU and RAM that stores frequently accessed data to reduce memory access time. Virtual memory is a memory management technique that uses RAM and hard disk space to give programs virtual address spaces larger than physical RAM. Cache memory exists as a hardware component, whereas virtual memory is a combined software and hardware concept using RAM, the hard disk, and memory mappings. The key difference is that cache memory improves memory access time using dedicated hardware, while virtual memory provides larger, isolated memory spaces by simulating memory with RAM and disk space managed by the operating system.
Virtual memory allows processes to have a virtual address space larger than physical memory. When a process accesses a page not in memory, a page fault occurs which is handled by reading the requested page from disk into a free frame. Page replacement algorithms select a victim page to replace when there are no free frames, such as FIFO which replaces the oldest page. The optimal page replacement algorithm OPT replaces the page that will not be used for the longest time in the future but cannot be implemented as it requires knowing the future.
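A minimal simulation of the FIFO policy described here; the reference string and frame count are arbitrary:

```c
#include <stdio.h>

#define FRAMES 3

int main(void) {
    int refs[] = {7, 0, 1, 2, 0, 3, 0, 4, 2, 3, 0, 3, 2};
    int n = sizeof refs / sizeof refs[0];
    int frames[FRAMES] = {-1, -1, -1}, next = 0, faults = 0;

    for (int i = 0; i < n; i++) {
        int hit = 0;
        for (int f = 0; f < FRAMES; f++) if (frames[f] == refs[i]) hit = 1;
        if (!hit) {
            frames[next] = refs[i];          /* evict the oldest resident page */
            next = (next + 1) % FRAMES;      /* FIFO order via a circular cursor */
            faults++;
        }
    }
    printf("%d page faults\n", faults);
    return 0;
}
```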
The document discusses the differences between user programs and the kernel. It explains that the kernel runs in supervisor mode and manages system resources, loading and allocating memory for user programs. It also covers virtual memory systems, how they use page tables to map virtual to physical addresses, and how the page fault handler works to load pages from disk when needed. Finally, it discusses file systems and techniques used like block buffer caches to reduce disk access and improve performance.
Virtual memory is a memory management technique that allows programs to access memory addresses beyond their actual physical RAM size. It maps virtual addresses to physical addresses stored in RAM or on a hard disk using page tables and a translation process. When a program requests a page not in RAM, a page fault occurs and the OS moves a page from disk to RAM, suspending the program until the page is loaded. Page replacement algorithms like LRU then select pages to remove from RAM and write to disk when RAM is full to make space for new pages. This allows for larger memory sizes, more efficient memory usage, and multitasking.
Virtual memory allows processes to access more memory than is physically available by storing unused process memory on disk. When a process attempts to access a "page" of memory that is not in RAM, a page fault occurs which brings the page in from disk. Several page replacement algorithms determine which page to remove from RAM when a page fault occurs and no page frames are available. The optimal choice would be to replace the page not used for the longest time in the future, but this is not possible in practice, so algorithms approximate it by predicting which pages will not be used soon.
This document is a research report on the virtual memory management of Linux. It discusses several key aspects of Linux's virtual memory system:
1) Linux uses a page replacement algorithm based on a clock algorithm, which cycles through physical memory pages checking for recent access using hardware-supported reference bits. This approximates an LRU replacement strategy (a sketch of the clock algorithm follows this list).
2) Techniques like demand paging, copy-on-write, and memory mapping are used to improve efficiency. Only accessed pages are loaded into memory.
3) The report focuses on page replacement algorithms and swapping/caching technology, and identifies some problems with virtual memory management.
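A minimal sketch of the clock (second-chance) scan described in point 1, assuming a toy array of frames and hypothetical reference-bit values:

```c
#include <stdio.h>

#define NFRAMES 4
static int refbit[NFRAMES] = {1, 0, 1, 0};    /* hypothetical hardware reference bits */
static int hand = 0;                          /* the clock hand */

/* sweep until a frame with a clear reference bit is found */
static int pick_victim(void) {
    for (;;) {
        if (refbit[hand] == 0) {
            int victim = hand;
            hand = (hand + 1) % NFRAMES;
            return victim;
        }
        refbit[hand] = 0;                     /* second chance: clear and move on */
        hand = (hand + 1) % NFRAMES;
    }
}

int main(void) {
    int v = pick_victim();
    printf("evict frame %d\n", v);            /* frame 1, given the bits above */
    return 0;
}
```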
Traditional Unix tools report surprisingly little free memory on Linux systems after some time because memory is used for disk caching. The disk cache in RAM is reported as "cached" and is not counted as free, but it does not hurt performance since it can be quickly reclaimed. By default, 32-bit x86 Linux kernels can only use memory up to about 880MB, but recompiling the kernel allows using all RAM on machines with 1GB or more. There are also options to tune how aggressively Linux will swap out processes versus freeing disk cache.
The document discusses various memory management techniques used by operating systems, including memory allocation, paging, segmentation, virtual memory, and page replacement algorithms. It provides details on how operating systems manage processes in memory using techniques like memory mapping, context switching, swapping, and fragmentation handling. Address translation mechanisms involving logical, physical, and virtual addresses are also summarized, along with common page replacement algorithms like FIFO, LRU, and OPT and how they work.
Virtual memory is a memory management capability of an OS that uses hardware and software to allow a computer to compensate for physical memory shortages by temporarily transferring data from random access memory (RAM) to disk storage.
Process' Virtual Address Space in GNU/Linux, by Varun Mahajan
The document discusses the virtual address space of a process in GNU/Linux. It explains that a process has both a user space and kernel space in virtual memory. The process' virtual address space contains text, data, and shared library segments. Functions like brk, sbrk, mmap, malloc, and free are used to allocate and free memory in the data segment to grow the process heap.
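A minimal sketch of watching the heap grow as described, assuming Linux/glibc (the break may move by more than requested, since the allocator pads its first extension):

```c
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

int main(void) {
    void *before = sbrk(0);              /* current program break (end of heap) */
    char *p = malloc(4096);              /* small request: typically served via brk */
    void *after = sbrk(0);
    printf("break before=%p after=%p (moved %ld bytes)\n",
           before, after, (long)((char *)after - (char *)before));
    free(p);
    return 0;
}
```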
Virtual memory allows processes to have a virtual address space larger than physical memory by storing portions of processes (pages) in secondary storage like a hard drive when not actively being used. When a process attempts to access a page not in memory, a page fault occurs which triggers paging the needed page back into memory from secondary storage. Demand paging loads pages on demand at the time they are accessed rather than all at once. When a page fault occurs and no free frames are available, a page replacement algorithm must select a page to remove from memory to make room. Common page replacement algorithms discussed include FIFO, OPT, and LRU which aims to approximate OPT by tracking recency of page usage.
Fetch policy determines when pages are brought into main memory: demand paging brings pages in only when referenced, while prepaging brings in additional pages that may never be referenced. The TLB caches page table entries to map virtual to physical addresses; a TLB miss costs roughly as much as an instruction or data cache miss. For page replacement when no free frames exist, a page replacement algorithm selects a victim frame to swap out and free.
ECE/CS 472/572 Final Exam Project: Remember to check the errata (.docx), by tidwellveronique
ECE/CS 472/572 Final Exam Project
Remember to check the errata section (at the very bottom of the page) for updates.
Your submission should be comprised of two items: a .pdf file containing your written report and a .tar file containing a directory structure with your C or C++ source code. Your grade will be reduced if you do not follow the submission instructions.
All written reports (for both 472 and 572 students) must be composed in MS Word, LaTeX, or some other word processor and submitted as a PDF file.
Please take the time to read this entire document. If you have questions there is a high likelihood that another section of the document provides answers.
Introduction
In this final project you will implement a cache simulator. Your simulator will be configurable and will be able to handle caches with varying capacities, block sizes, levels of associativity, replacement policies, and write policies. The simulator will operate on trace files that indicate memory access properties. All input files to your simulator will follow a specific structure so that you can parse the contents and use the information to set the properties of your simulator.
After execution is finished, your simulator will generate an output file containing information on the number of cache misses, hits, and miss evictions (i.e. the number of block replacements). In addition, the file will also record the total number of (simulated) clock cycles used during the simulation. Lastly, the file will indicate how many read and write operations were requested by the CPU.
It is important to note that your simulator is required to make several significant assumptions for the sake of simplicity.
1. You do not have to simulate the actual data contents. We simply pretend that we copied data from main memory and keep track of the hypothetical time that would have elapsed.
2. Accessing a sub-portion of a cache block takes the exact same time as it would require to access the entire block. Imagine that you are working with a cache that uses a 32 byte block size and has an access time of 15 clock cycles. Reading a 32 byte block from this cache will require 15 clock cycles. However, the same amount of time is required to read 1 byte from the cache.
3. In this project assume that main memory RAM is always accessed in units of 8 bytes (i.e. 64 bits at a time).
When accessing main memory, it's expensive to access the first unit. However, DDR memory typically includes buffering, which means that the RAM can provide access to successive memory (in 8-byte chunks) with minimal overhead. In this project we assume an overhead of 1 additional clock cycle per contiguous unit.
For example, suppose that it costs 255 clock cycles to access the first unit from main memory. Based on our assumption, it would cost only 257 clock cycles to access 24 bytes (three 8-byte units) of memory; a small sketch of this calculation follows the list.
4. Assume that all caches utilize a "fetch-on-write" scheme if a miss occurs on a Store operation. This means that you must always fetch the block into the cache before performing the write.
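A small sketch of the timing rule from assumption 3, reproducing the 255 + 2 = 257 example above:

```c
#include <stdio.h>

/* cycles to read 'bytes' from RAM: the first 8-byte unit costs 'first_cost',
   each additional contiguous 8-byte unit costs 1 extra cycle */
unsigned ram_cycles(unsigned bytes, unsigned first_cost) {
    unsigned units = (bytes + 7) / 8;            /* round up to whole units */
    return first_cost + (units - 1);
}

int main(void) {
    printf("%u\n", ram_cycles(24, 255));         /* 3 units: 255 + 2 = 257 */
    return 0;
}
```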
The document discusses different memory management techniques used in operating systems. It begins with an overview of processes entering memory from an input queue. It then covers binding of instructions and data to memory at compile time, load time, or execution time. Key concepts discussed include logical vs physical addresses, the memory management unit (MMU), dynamic loading and linking, overlays, swapping, contiguous allocation, paging using page tables and frames, and fragmentation. Hierarchical paging, hashed page tables, and inverted page tables are also summarized.
This document discusses user-level memory management in Linux programming. It describes the different memory segments of a Linux process including the code, data, BSS, heap and stack segments. It explains how programs can allocate and free dynamic memory at runtime using library calls like malloc(), calloc(), realloc() and free(), as well as system calls like brk() and sbrk(). Examples of allocating, changing the size, and freeing memory are also provided.
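A small sketch of the library calls this summary names (malloc, realloc, free; brk/sbrk are shown in the earlier sketch); the sizes are arbitrary:

```c
#include <stdio.h>
#include <stdlib.h>

int main(void) {
    int *a = malloc(4 * sizeof *a);          /* heap block for 4 ints */
    if (!a) return 1;
    for (int i = 0; i < 4; i++) a[i] = i;

    int *b = realloc(a, 8 * sizeof *b);      /* grow in place or move the block */
    if (!b) { free(a); return 1; }
    for (int i = 4; i < 8; i++) b[i] = i;

    for (int i = 0; i < 8; i++) printf("%d ", b[i]);
    putchar('\n');
    free(b);                                 /* return the block for reuse */
    return 0;
}
```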
Virtual memory allows programs to access memory beyond the available physical memory by storing inactive memory pages on disk. This separation of physical and virtual memory makes programming easier and allows memory to be shared between processes. When a program requests a page not in memory, demand paging loads it from disk. If no frames are available, page replacement writes the contents of an unused frame to disk to free it for the requested page. Thrashing occurs when too many pages are continuously replaced from disk due to insufficient memory, degrading system performance.
Virtual memory allows a process to have a logical address space that is larger than physical memory by paging portions of memory to disk as needed. When a process accesses a page that is not in memory, a page fault occurs which brings the needed page into memory from disk. If no free frames are available, a page replacement algorithm must select a page to remove from memory and write to disk to make space for the new page. Common page replacement algorithms aim to minimize page faults by selecting pages that are least recently used.
Trusted Execution Environment for Decentralized Process MiningLucaBarbaro3
Presentation of the paper "Trusted Execution Environment for Decentralized Process Mining" given during the CAiSE 2024 Conference in Cyprus on June 7, 2024.
Azure API Management to expose backend services securely
Case study with atop: memory leakage
Gerlof Langeveld & Jan Christiaan van Winkel
www.atoptool.nl
– May 2010 –
This document describes the analysis of a slow system suffering from a process with a memory
leakage. Such a process regularly requests more dynamic memory with the subroutine malloc,
while the programmer has “forgotten” to free that memory again. In this way the process
grows virtually as well as physically. Mainly through the physical growth (the process' resident set size
increases), the process inflates like a balloon and pushes other processes out of main memory. Instead
of a healthy system where processes reach a proper balance in their memory consumption, the total
system performance may degrade because of just one process that leaks memory.
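As an illustration, the following minimal C sketch behaves like such a leaker (this is hypothetical and not the source of the program analyzed below; the chunk size and the delay are arbitrary):

#include <stdlib.h>
#include <string.h>
#include <unistd.h>

#define CHUNK (8 * 1024 * 1024)        /* allocate 8 MiB per iteration */

int main(void)
{
    for (;;) {
        char *p = malloc(CHUNK);       /* grows the virtual size */
        if (p == NULL)
            break;                     /* allocation failed: give up */
        memset(p, 0, CHUNK);           /* referencing every page also
                                          grows the resident set size */
        sleep(1);                      /* p is never freed: the leak */
    }
    return 0;
}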
Notice that the Linux kernel does not limit the physical memory consumption of processes. Every
process, whether running under root identity or non-root identity, can grow without limit. In the last
section of this case study, some suggestions are given to decrease the influence of leaking processes on
your overall system performance.
In order to be able to interpret the figures produced by atop, basic knowledge of Linux memory
management is required. The next sections describe the utilization of physical memory as well as the
impact of virtual memory, before focusing on the details of the case itself.
Introduction to physical memory
The physical memory (RAM) of your system is subdivided into equal-sized portions, called memory
pages. The size of a memory page depends on the CPU architecture and the settings issued by the
operating system. Let's assume for this article that the size of a memory page is 4 KiB.
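If needed, the actual page size can be queried at run time with the standard sysconf call; a minimal sketch:

#include <stdio.h>
#include <unistd.h>

int main(void)
{
    /* on most x86 Linux systems this prints 4096 */
    printf("page size: %ld bytes\n", sysconf(_SC_PAGESIZE));
    return 0;
}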
At the moment that the system is booted, the compressed kernel image – known as the file
/boot/vmlinuz-.... – is loaded and decompressed in memory. This static part of the kernel is
loaded somewhere at the beginning of the RAM memory.
The running kernel, however, requires more space, e.g. for the administration of processes, open files
and network sockets, but also to load dynamically loadable modules. Therefore, the kernel dynamically
allocates memory using so-called “slab caches”, in short slab. When this dynamically allocated kernel
memory is not needed any more (removal of the process administration when a process exits, unloading
of a loaded module, ...), the kernel might free this memory, which means that the slab space will shrink
again. Notice that all pages in use by the kernel are memory resident and will never be swapped.
Apart from the kernel, processes also require physical pages for their text (code), static data and stack.
The physical space consumed by a process is called the “Resident Set Size”, in short RSS. How a page
becomes part of the RSS will be discussed in the next section.
The remaining part of the physical memory – after the kernel and processes have taken their share – is
mainly used for the page cache. The page cache keeps as much data as possible from the disks
(filesystems) in memory in order to improve the access speed to disk data. The page cache consists of
two parts: the part where the data blocks of files are stored and the part where the metadata blocks
(superblocks, inodes, bitmaps, ...) of filesystems are stored. The latter part is called the “buffer cache”.
Most tools (like free, top and atop) show two separate values for these parts, “cached” and
“buffer” respectively. The sum of these two values is the total size of the page cache.
The size of the page cache varies. If there is plenty free memory, the page cache will grow and if there
is a lack of memory, the page cache will shrink again.
Finally, the kernel keeps a pool of free pages, so that it can fulfill a request for a new page straight away
from stock. When the number of free pages drops below a particular threshold, pages that are currently
occupied are freed and added to the free page pool. Such a page can be retrieved from a process (the
current page contents might have to be swapped to swap space first) or it can be stolen from the page
cache (the current page contents might have to be flushed to the filesystem first). In the first case, the
RSS of the concerning process shrinks; in the second case, the size of the page cache shrinks.
Even the slab might shrink in case of a lack of free pages, since it too contains data that is only meant
to speed up certain mechanisms and can be shrunk under memory pressure. An example is the
incore inode cache, which contains the inodes of files that are currently open, but also the inodes of files
that have recently been open but are currently closed. That last category is kept in memory just in case
such a file is opened again in the near future (which saves another inode retrieval from disk). If needed,
however, the incore inodes of closed files can be removed. Another example is the directory name
cache (dentry cache), which holds the names of recently accessed files and directories. The dentry cache is
meant to speed up pathname resolution by avoiding accesses to disk. In case of memory pressure,
the least-recently accessed names might be removed to shrink the slab.
In the output of atop, the sizes of the memory components that have just been discussed can be found:
In the line labeled MEM, the size of the RAM memory is shown (tot), the memory that is currently
free (free), the size of the page cache (cache+buff) and the size of the dynamically allocated
kernel memory (slab).
The RSS of the processes can be found in the process list (the lower part of the screen):
The memory details (subcommand 'm') show the current RSS per process in the column RSIZE and (as
a percentage of the total memory installed) in the column MEM. The physical growth of the process
during the last interval is shown in the column RGROW.
Introduction to virtual memory
When a new program is activated, the kernel constructs a virtual address space for the new process. This
virtual address space describes all memory that the process could possibly use. For a starting process,
the size of the virtual space is mainly determined by the text (T) and
data (D) pages in the executable file, with a few additional pages for
the process' stack.
Notice that the process does not consume any physical memory yet
during its early startup stage.
The illustration shows an executable file with 32KiB (8 pages) of text
and 32KiB (8 pages) of static data. The kernel has just built a virtual
address space for the pages of the executable file and 4 additional
pages (e.g. for stack).
After the virtual address space has been built, the kernel fills the program counter register of the CPU
with the address of the first instruction to be executed. The CPU tries to fetch this instruction, but
notices that the concerning text page is not in memory. Therefore the
CPU generates a fault (trap). The fault handling routine of the kernel
will load the requested text page from the executable file and restart
the process at the same point. Now the CPU is able to fetch the first
instruction and executes it. When a branch is made to an instruction
that lies in another text page, that page will be loaded via fault
handling as well. The kernel will load a data page as soon as a
reference is made to a static variable. In this way, any page can be
physically loaded into memory at its first reference. Pages that are not
referenced at all will not be loaded into memory.
The illustration shows that the process has a virtual size of 80KiB (20
pages) and a physical size (RSS) of 16KiB (4 pages). Obviously, the
physical size is always a subset of the virtual size and can never be
larger than that.
When the process allocates memory dynamically (with the
subroutine malloc), the requested space initially only extends
the process' virtual address space. Only when the process really refers
to a page in the dynamic area is that page physically created and
filled with binary zeroes.
The illustration shows that the first malloc'ed area has a virtual size of
48KiB (12 pages) while only 16KiB (4 pages) have been physically created
by a reference. The pages in the second and third malloc'ed spaces
have not all been referenced either.
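The behaviour in the illustration can be mimicked with a small C sketch (assuming 4 KiB pages; the sizes match the illustration and are not taken from a real program):

#include <stdlib.h>
#include <unistd.h>

int main(void)
{
    size_t size = 12 * 4096;           /* 48 KiB: 12 pages */
    char *p = malloc(size);            /* only the virtual size grows */
    if (p == NULL)
        return 1;

    for (size_t i = 0; i < 4 * 4096; i += 4096)
        p[i] = 1;                      /* touch 4 of the 12 pages: only
                                          these become resident */
    pause();                           /* keep the process alive so atop
                                          can show VSIZE versus RSIZE */
    return 0;
}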
In the process list shown by atop, information can be found about the virtual address space:
The memory details (subcommand 'm') show the current virtual size per process in the column VSIZE.
The virtual growth of a process during the last interval is shown in the column VGROW.
For the process simpress.bin with pid 3456, a virtual growth of 1628KiB is shown (probably caused
by issuing a malloc) while the resident growth is 1596KiB. The process chrome with pid 29272 has not
grown virtually (0KiB) but has, during the last interval, referenced pages (60 pages of 4KiB) that had
been allocated virtually during earlier intervals. The process Xorg with pid 1530 has only grown
virtually (6116KiB) but has not referenced any of the new pages (yet).
The process firefox with pid 4680 has freed malloc'ed space with a virtual size of 272KiB. Apparently,
this also implied the release of 272KiB of resident space.
Case study: A quiet system
The first atop snapshot was taken when a program that leaks memory has just been started, called
lekker (Dutch for “leaker”). In this snapshot we see a system with net 3.8GiB of physical memory of
which 186MiB is free and more than 1GiB in the page cache (MEM line: cache + buff). The
kernel has dynamically allocated 273MiB slab. Thus, some 2.3GiB is in use by application processes.
Swap space (8GiB) is almost unused (so far...). We can see that the lekker process has grown
156MiB (virtual) during the last interval, which was also made resident by really referencing the
allocated space. For now, there is no reason to be worried.
Please take notice of the six upload processes: they have allocated 256MiB each (virtual and
resident). We can also see multiple chrome processes that may have a large virtual size, but only a
small portion of it is resident: the cumulative virtual size is 4.1GiB (1+1+0.6+1.5 GiB), of which
229MiB (103+52+44+30 MiB) is resident. The process simpress.bin has made less than 10% of
its virtual footprint (1.1GiB) resident (91MiB). Also firefox has a relatively small portion of its
virtual footprint resident. A lot of this virtual space will be shared, not only for the same executable
file (the 4 chrome processes share at least the same code), but also for shared library code used by all
processes. So far the system has “promised” 3.4GiB of virtual memory (MEM line, vmcom) out of a
total limit of 9.9GiB (vmlim, which is the size of the swap space plus half of the physical memory).
Case study: It's getting a bit busy...
One snapshot, twenty seconds later: lekker has grown another 150MiB (virtual and resident). That
could largely be claimed from the free space, but not entirely. The number of free pages in stock
is getting very low, so the kernel tries to free memory elsewhere. We can see that the first victims
are the page cache and the slab: they both have to shrink.
The MEM line is displayed in cyan because the amount of memory that can quickly be claimed is small
(free plus the page cache). The processes are not yet in the danger zone because no pages are swapped
out (PAG line, swout). In fact, firefox has even physically referenced another 892KiB (a
small amount compared to the 156MiB that lekker got).
We can see that chrome has shrunk by 24KiB (6 pages). We later found out that this was caused by
malloc'ed memory pages that were freed, followed by a new malloc of 6 pages without these pages
being referenced again. Hence the virtual and resident size at first shrank by 6 pages, after which only
the virtual size grew again by 6 pages. On balance, the resident size shrank without a change of the
virtual size.
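This effect can be reproduced deterministically with mmap, which malloc uses internally for larger allocations (a sketch assuming 4 KiB pages; chrome's real allocation pattern is of course more involved):

#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

int main(void)
{
    size_t len = 6 * 4096;             /* 6 pages of 4 KiB */
    char *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED)
        return 1;
    memset(p, 1, len);                 /* referenced: virtual and resident */
    munmap(p, len);                    /* both shrink by 6 pages */
    p = mmap(NULL, len, PROT_READ | PROT_WRITE,
             MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    (void)p;                           /* virtual grows again, but nothing
                                          is referenced: on balance the
                                          resident size is 6 pages smaller */
    pause();
    return 0;
}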
Case study: The kernel gets worried...
Four snapshots (of 20 seconds each) later, we see that lekker has an insatiable hunger: it has grown more
than 600MiB (virtual and resident) since the previous screenshot, which is about 150MiB per 20
seconds.
The page cache has been shrunk, as has the slab (e.g. in-core inodes and directory entries). Processes
weren't spared either. They didn't shrink virtually, but some of their resident pages were taken away by
the kernel (negative RGROW). Because the worried (but not yet desperate) kernel is looking hard for
memory to free, more and more pages are checked by the page scanner: 30712 pages were verified to
see if they are candidates to be removed from memory (PAG line, scan). If a modified page has to be
removed from memory, it first has to be saved to the swap space. In this snapshot, 322
pages were written to the swap disk (PAG line, swout). This resulted in 322 writes to the swap space
logical volume (vg00-lvswap) that were combined into about 60 writes to the physical disk (sda).
Because so many pages were swapped out in a short time, the PAG line is displayed in red.
However, processes don't sit still, and some of the pages that were swapped out will be referenced
again. These pages are read back in, which happened 91 times (PAG line, swin). Fortunately atop itself
makes all its pages resident at startup and locks them in memory, thus preventing them from being
swapped out (which would make its measurements unreliable).
Case study: The kernel gets desperate, as well as the users...
The memory-leaking process lekker cannot be stopped. We fast-forward 5 minutes:
By now, lekker has grown to a virtual size of 2.8GiB of which 2GiB is resident (more than 50% of
the physical memory of the system). In the past 20 seconds, lekker has tried to get hold of 128MiB
more virtual memory but has only been able to make 34MiB resident. We know from the past that
lekker tries to make all its virtual memory resident as soon as it can, so we can conclude that the
kernel is very busy swapping out pages. The page cache has already been minimized, as well as the
inode cache and directory entry cache (part of the slab). Obviously the processes will also have to
“donate” physical memory. One of the upload processes (PID 30926) is even donating 19MiB. We
can see that some of the upload processes had to give back quite a lot more physical memory
(RSIZE): the six of them had 256MiB each, of which they now have only about two thirds left.
The system is swapping out heavily (32135 pages in the last 20 seconds) but is also swapping in (8440
pages). Because of this, the PAG line is displayed in red. The disk is very busy (the DSK as well as
LVM lines are red). The average service times of requests for the logical volumes that are not related to
swapping (lvhome and lvusr) are getting longer because the requests to those areas are swamped by
requests to swap space (lvswap). Although a relatively small number of requests are related to lvusr
and lvhome, these logical volumes are busy 75% and 92% of the time respectively. The system feels
extremely slow now; time to get rid of the leaking process...
Case study: Relief...
Five minutes later, the big spender lekker has finished and is thus no longer using memory:
However, the effect of lekker as a memory hog will be noticeable for quite a while. We can see that the
upload processes are slowly referencing their swapped-out pages again, resulting in renewed resident
growth. Because there is an ocean of free space (1.8GiB), nothing is swapped out any more and hardly
any scanning takes place (PAG line, scan).
We see a lot of major page faults (MAJFLT) for processes: references to virtual pages that are retrieved
from disk. Either they were swapped out and now have to be swapped in, or they are read from the
executable file. The minor page faults (MINFLT) are references to virtual pages that can be made
resident without loading the page from disk: pages that need to be filled with zeroes (e.g. for malloc'ed
space) or pages that were “accidentally” still available in the free page pool.
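A process can inspect its own page-fault counters, the same quantities atop reports as MAJFLT and MINFLT, with the standard getrusage call; a minimal sketch:

#include <stdio.h>
#include <sys/resource.h>

int main(void)
{
    struct rusage ru;

    /* RUSAGE_SELF: statistics of the calling process itself */
    if (getrusage(RUSAGE_SELF, &ru) == 0)
        printf("minor faults: %ld, major faults: %ld\n",
               ru.ru_minflt, ru.ru_majflt);
    return 0;
}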
The disk is still very busy retrieving the virtual pages that are referenced again and need to be swapped
in (swin is 7384, which corresponds to the reads for the logical volume lvswap). Therefore the DSK and
some LVM lines are shown in red. The physical disk sda is mainly busy due to the requests for the logical
volume lvswap. However, this also slows down the requests issued for the other logical volumes. One
request to lvtmp even takes 738ms!
Case study: Life's almost good again...
More than seven minutes later, we can see that the system is almost tranquil again:
There is far less disk I/O and certainly not all disk I/O is related to swapping any more. Processes (like
upload) still do not have all their resident memory back, because they simply haven't touched all of
their virtual pages since the “storm” passed. Probably many of these pages were only used during the
initialization phase and will never be referenced again. In that sense, a small breeze may even help to
clean up a dusty memory; a storming leaker like lekker, however, is better avoided...
Possible solutions for memory leakage
From the case study, it is clear that only one misbehaving process can cause a heavy performance
degradation for the entire system. The most obvious solution is to solve the memory leakage in the
guilty program and take care that every malloc'ed area is sooner or later freed again. However, in
practice this might not be a trivial task since the leaking program will often be part of a third-party
application.
Suppose that a real solution is not possible (for the time being); it should then at least be possible to
prevent the leaking process from bothering other processes, so that it preferably only harms its own
performance. This can be done by limiting the resident memory that the leaking process is allowed to
consume: not allowing the balloon to expand without limit (pushing out the others), but putting a bowl
around it that redirects the superfluous expansion outside the box...
The good news is: there is a standard ulimit value to limit the resident memory of a process.
$ ulimit -a
....
max memory size (kbytes, -m) unlimited
The default value is “unlimited”. The command ulimit can be used to set a limit on the resident
memory consumption of the shell and the processes started by this shell:
$ ulimit -m 409600
$ lekker &
The bad news, however, is that this method only works with kernel version 2.4; with kernel version 2.6
the value is accepted but ignored.
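For completeness: under the hood ulimit -m sets the RLIMIT_RSS resource limit, which a program can also set for itself; a sketch (the 400 MiB value is arbitrary, and as said, a 2.6 kernel accepts but ignores this limit):

#include <stdio.h>
#include <sys/resource.h>

int main(void)
{
    /* cap the resident set size at 400 MiB; honoured by 2.4 kernels,
       accepted but ignored by 2.6 kernels */
    struct rlimit rl = { 400UL * 1024 * 1024, 400UL * 1024 * 1024 };

    if (setrlimit(RLIMIT_RSS, &rl) != 0)
        perror("setrlimit");

    /* ... start or exec the leaking program here ... */
    return 0;
}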
But there is other good news (without related bad news this time):
In the current 2.6 kernels a new mechanism has been introduced called control groups (cgroups). Via
cgroups it is possible to partition a set of processes (threads) and specify certain resource limits for
such a partition (container). A cgroup can be created for all kinds of resources, including memory. It is
beyond the scope of this document to go into detail about cgroups, but a small example can already
illustrate the power of this mechanism.
Cgroups are implemented via a filesystem module, so first of all the virtual cgroup filesystem (with the
option “memory”) should be mounted on an arbitrary directory. This mount has to be done only once
after boot, so it is best to specify it in your /etc/fstab file:
# mkdir /cgroups/memo
# mount -t cgroup -o memory none /cgroups/memo
To define a new memory cgroup for the leaking process(es):
1. Create a subdirectory below the mount point of the virtual cgroup filesystem:
# mkdir /cgroups/memo/leakers
At the moment that you create a subdirectory, it is magically filled with all kinds of pseudo-files
and subdirectories that can be used to control the properties of this cgroup.
2. One of the “files” in the newly created subdirectory is called memory.limit_in_bytes
and can be used to set the total memory limit for all processes that will run in this cgroup:
# echo 420M > /cgroups/memo/leakers/memory.limit_in_bytes
3. Another “file” in the newly created directory is called tasks and can be used to specify the
IDs of the processes/threads that must be part of the cgroup. When you assign a process to a cgroup,
its descendants (started from then on) will also be part of that cgroup. Suppose that the leaking
process lekker runs with PID 2627; it can be assigned to the cgroup leakers as follows:
# echo 2627 > /cgroups/memo/leakers/tasks
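Instead of assigning the PID by hand, a (cooperating) process could also place itself in the cgroup at startup by writing its own PID to the tasks pseudo-file; a sketch, assuming the mount point /cgroups/memo used above:

#include <stdio.h>
#include <unistd.h>

int main(void)
{
    FILE *f = fopen("/cgroups/memo/leakers/tasks", "w");

    if (f != NULL) {
        fprintf(f, "%d\n", (int)getpid());   /* join the leakers cgroup */
        fclose(f);
    }
    /* ... from here on, the memory consumption of this process (and of
       its future children) is accounted to the leakers cgroup ... */
    return 0;
}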
Now the leaking process cannot use more than 420MiB of resident memory. When it runs, atop might
show the following output:
The line labeled MEM shows that 1.9GiB memory is free. On the other hand, the line labeled PAG
shows that a lot of pages have been swapped out.
The process lekker has already grown to 3.6GiB virtual memory (VSIZE), but it only uses 374MiB
resident memory (RSIZE). During the last sample, the process has even grown 132MiB virtually
(VGROW), but it has shrunk by 26MiB physically (RGROW). More importantly, the other
processes are no longer harmed by the leaking process: their resident growth is not negative.
The leakage is not fixed, but it is under control...