The document discusses memory management techniques used in operating systems. It covers basic concepts like swapping, virtual memory, and page replacement algorithms. It then discusses specific techniques like segmentation, base and limit registers, bitmaps, and linked lists for tracking memory usage. Paging is explained in detail, including how virtual addresses are translated to physical addresses using page tables. The goal of these techniques is to make the limited, real-world memory appear as large, fast, and non-volatile as possible.
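The address translation described above can be sketched directly: a virtual address is split into a page number and an offset, the page number is looked up in the page table, and the resulting frame number is recombined with the offset. A minimal sketch (the page size, table contents, and function name are illustrative, not from the document):

```python
PAGE_SIZE = 4096  # 4 KiB pages, so the offset occupies the low 12 bits

# Illustrative page table: virtual page number -> physical frame number
page_table = {0: 5, 1: 2, 2: 7}

def translate(virtual_addr):
    """Translate a virtual address to a physical one via the page table."""
    page = virtual_addr // PAGE_SIZE   # high bits select the page
    offset = virtual_addr % PAGE_SIZE  # low bits carry over unchanged
    if page not in page_table:
        raise KeyError(f"page fault: page {page} not resident")
    frame = page_table[page]
    return frame * PAGE_SIZE + offset

# Virtual address 4100 = page 1, offset 4 -> frame 2, physical 8196
print(translate(4100))
```

A real MMU does this in hardware on every memory reference, which is why the page table layout matters so much for performance.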
This document discusses common mistakes made when implementing Oracle Exadata systems. It describes improperly sized SGAs which can hurt performance on data warehouses. It also discusses issues like not using huge pages, over or under use of indexing, too much parallelization, selecting the wrong disk types, failing to patch systems, and not implementing tools like Automatic Service Request and exachk. The document provides guidance on optimizing these areas to get the best performance from Exadata.
Hadoop can effectively utilize many-core systems with large amounts of processing power and storage. The author tested a Hadoop cluster on a personal supercomputer with two nodes, each containing 48 cores, 256GB RAM, and 64TB of storage connected by 40Gb Infiniband. Testing showed the clustered configuration completed a 100GB Terasort in 241 seconds, significantly faster than comparable Amazon clusters. While Hadoop works well on a single fat node, distributing data and tasks across clustered nodes provides even better performance for large workloads.
On X86 systems, using an Unbreakable Enterprise Kernel (UEK) is recommended over other enterprise distributions as it provides better hardware support, security patches, and testing from the larger Linux community. Key configuration recommendations include enabling maximum CPU performance in BIOS, using memory types validated by Oracle, ensuring proper NUMA and CPU frequency settings, and installing only Oracle-validated packages to avoid issues. Monitoring tools like top, iostat, sar and ksar help identify any CPU, memory, disk or I/O bottlenecks.
The document describes two types of people who dominate technology: 1) those who understand what they do not manage, and 2) those who manage what they do not understand. It argues that it is important for technology leaders to have a strong understanding of the systems and areas they manage.
QNAP Multimedia Server and Player
The Turbo NAS is a media center with DLNA/UPnP support. Using DLNA/UPnP digital devices such as AV receivers, Smart TVs, the Sony PS3, the Microsoft Xbox 360, and Hi-Fi systems, you can view photos, play videos, listen to music stored on the Turbo NAS, and even stream Internet radio stations from all over the world.
QAirplay
Available in the App Center, QAirplay allows you to stream your media content directly to a TV via AirPlay-enabled devices such as Apple TV. The media content streams straight to the Apple TV, bypassing your mobile device and saving its battery.
Stream music on iTunes
With the iTunes server, you can share and play MP3 files through the iTunes player installed on any Mac® or Windows® PC on the home network.
Revisiting CephFS MDS and mClock QoS Scheduler
Yongseok Oh
This document presents CephFS performance scalability and evaluation results. Specifically, it addresses technical issues such as multi-core scalability, cache size, static pinning, recovery, and QoS.
Hadoop is an open-source framework that processes large datasets in a distributed manner across commodity hardware. It uses a distributed file system (HDFS) and MapReduce programming model to store and process data. Hadoop is highly scalable, fault-tolerant, and reliable. It can handle data volumes and variety including structured, semi-structured and unstructured data.
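The MapReduce model that summary refers to can be illustrated without a cluster: a map phase emits key/value pairs, a shuffle groups them by key, and a reduce phase aggregates each group. A toy single-process word-count sketch (Hadoop runs these phases distributed across nodes; this only mirrors the data flow):

```python
from collections import defaultdict

def map_phase(lines):
    # Map: emit (word, 1) for every word, as a Hadoop mapper would
    for line in lines:
        for word in line.split():
            yield word.lower(), 1

def shuffle(pairs):
    # Shuffle: group all values by key (the framework does this in Hadoop)
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(groups):
    # Reduce: sum the counts for each word
    return {word: sum(counts) for word, counts in groups.items()}

lines = ["hadoop stores data", "hadoop processes data"]
print(reduce_phase(shuffle(map_phase(lines))))
```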
Deep Dive on Amazon EBS Elastic Volumes - March 2017 AWS Online Tech Talks
Amazon Web Services
This document provides an overview of Amazon Elastic Block Storage (EBS) elastic volumes, which allow customers to modify existing EBS volumes non-disruptively. The key capabilities include increasing volume size, changing volume types, and increasing or decreasing provisioned IOPS on io1 volumes. The modifications can be made using the AWS Management Console, command line interface, or SDKs. During a modification, volume performance is between the original and target characteristics. Customers should monitor the modification state and may need to extend the file system if increasing volume size. EBS elastic volumes provide an easy way to modify volumes without downtime or performance impact.
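The non-disruptive modification described above is a single API call against EC2. A hedged sketch using boto3 (the volume ID and target values are placeholders; the call itself needs AWS credentials, so it is shown in a comment rather than executed):

```python
def build_modify_volume_args(volume_id, size_gib=None, volume_type=None, iops=None):
    """Assemble keyword arguments for EC2's ModifyVolume API.

    Only the attributes being changed are included, matching how the
    console, CLI, and SDKs expose elastic-volume modifications.
    """
    args = {"VolumeId": volume_id}
    if size_gib is not None:
        args["Size"] = size_gib           # size can only be increased
    if volume_type is not None:
        args["VolumeType"] = volume_type  # e.g. "gp2" -> "io1"
    if iops is not None:
        args["Iops"] = iops               # provisioned IOPS, for io1 volumes
    return args

args = build_modify_volume_args("vol-0123456789abcdef0",
                                size_gib=500, volume_type="io1", iops=10000)
print(args)
# With credentials configured, the call would be:
#   import boto3
#   boto3.client("ec2").modify_volume(**args)
# then poll describe_volumes_modifications() to monitor the modification state,
# and extend the file system if the size was increased.
```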
Learn tips and techniques that will improve the performance of your applications and databases running on Amazon EC2 instance storage and/or Amazon Elastic Block Store (EBS). This advanced session discusses when to use HI1, HS1, and Amazon EBS. We will share an "under the hood" view to tune the performance of your Elastic Block Store and best practices for running workloads on Amazon EBS, such as relational databases (MySQL, Oracle, SQL Server, postgres) and NoSQL data stores, such as MongoDB and Riak.
Amazon Elastic Block Store (Amazon EBS) provides persistent block level storage volumes for use with Amazon EC2 instances. In this technical session, we conduct a detailed analysis of the differences among the three types of Amazon EBS block storage: General Purpose (SSD), Provisioned IOPS (SSD), and Magnetic. We discuss how to maximize Amazon EBS performance, with a special eye towards low-latency, high-throughput applications like databases. We discuss Amazon EBS encryption and share best practices for Amazon EBS snapshot management. Throughout, we share tips for success.
Speakers:
Tom Maddox, AWS Solutions Architect
Building the Perfect SharePoint 2010 Farm - MS Days Bulgaria 2012
Michael Noel
This document discusses best practices for building a highly available and optimized SharePoint 2010 farm. It covers farm architecture including recommended server roles and sizing. It also discusses virtualization options and performance monitoring considerations. The document outlines strategies for data management including content database distribution, remote BLOB storage, SQL database optimization, and maintenance plans. Finally, it compares high availability and disaster recovery options for SQL Server like AlwaysOn availability groups and failover clustering.
The document provides tips for optimizing PostgreSQL performance on hardware and configuration settings. It recommends starting with hard drive optimization using RAID 1 or RAID 10 configurations on an SSD or SAS drive array. It also recommends optimizing memory settings like shared_buffers, work_mem and maintenance_work_mem as well as I/O settings like checkpoint_timeout. The document emphasizes the importance of hardware specifications and configuration tuning to improve PostgreSQL performance.
VMworld 2013: Just Because You Could, Doesn't Mean You Should: Lessons Learne...
VMworld
This document provides an overview and best practices for storage technologies. It discusses factors that affect storage performance like interconnect bandwidth versus IOPS and command sizing. It covers tiering strategies and when auto-tiering may not be effective. It also discusses SSDs versus spinning disks, large VMDK and VMFS support, thin provisioning at the VM and LUN level, and architecting storage for failure including individual component failure, temporary and permanent site loss. It provides examples of how to implement a low-cost disaster recovery site using inexpensive hardware.
This document discusses performance improvements to the Lustre parallel file system in versions 2.5 through large I/O patches, metadata improvements, and metadata scaling with distributed namespace (DNE). It summarizes evaluations showing improved throughput from 4MB RPC, reduced degradation with large numbers of threads using SSDs over NL-SAS, high random read performance from SSD pools, and significant metadata performance gains in Lustre 2.4 from DNE allowing nearly linear scaling. Key requirements for next-generation storage include extreme IOPS, tiered architectures using local flash with parallel file systems, and reducing infrastructure needs while maintaining throughput.
Yfrog uses HBase as its scalable database backend to store and serve 250 million photos from over 60 million monthly users across 4 HBase clusters ranging from 50TB to 1PB in size. The authors provide best practices for configuring and monitoring HBase, including using smaller commodity servers, tuning JVM garbage collection, monitoring metrics like thread usage and disk I/O, and implementing caching and replication for high performance and reliability. Following these practices has allowed Yfrog's HBase deployment to run smoothly and efficiently.
The document discusses several key factors for optimizing HBase performance including:
1. Reads and writes compete for disk, network, and thread resources so they can cause bottlenecks.
2. Memory allocation needs to balance space for memstores, block caching, and Java heap usage.
3. The write-ahead log can be a major bottleneck and increasing its size or number of logs can improve write performance.
4. Flushes and compactions need to be tuned to avoid premature flushes causing "compaction storms".
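The memory-balance point (item 2) can be made concrete: a region server's Java heap is split between the memstores and the block cache, and the two fractions must leave headroom for everything else on the heap. A sketch using the commonly cited default fractions of 0.4 each (the fractions and the 0.8 ceiling are stated here as assumptions about typical HBase configuration, not taken from the document):

```python
def heap_budget(heap_gib, memstore_frac=0.4, blockcache_frac=0.4):
    """Split a region server heap between memstores, block cache, and headroom.

    HBase rejects configurations where the two fractions sum above 0.8,
    precisely to preserve working room for the rest of the heap.
    """
    if memstore_frac + blockcache_frac > 0.8:
        raise ValueError("memstore + block cache fractions must not exceed 0.8")
    memstore = heap_gib * memstore_frac
    blockcache = heap_gib * blockcache_frac
    headroom = heap_gib - memstore - blockcache
    return {"memstore_gib": memstore,
            "blockcache_gib": blockcache,
            "headroom_gib": headroom}

# 32 GiB heap: ~12.8 GiB memstores, ~12.8 GiB block cache, ~6.4 GiB headroom
print(heap_budget(32))
```

Write-heavy workloads shift the balance toward memstores; read-heavy workloads toward the block cache.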
This document summarizes key aspects of mass storage systems used in operating systems. It describes the physical structure of magnetic disks including platters, seek time, and rotational latency. It discusses various disk bus interfaces and performance characteristics. It then covers disk scheduling algorithms like FCFS, SSTF, SCAN, C-SCAN, and C-LOOK. The document also discusses disk management by the operating system including formatting, partitioning and file systems. It briefly introduces solid-state disks, magnetic tape, storage arrays, storage area networks and network attached storage.
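Of the scheduling algorithms that summary lists, SSTF (shortest seek time first) is easy to sketch: repeatedly service the pending request whose cylinder is closest to the current head position. A minimal sketch using a classic textbook request queue:

```python
def sstf(head, requests):
    """Order disk requests by shortest seek time first.

    Greedy: always service the pending cylinder nearest the current head.
    """
    pending = list(requests)
    order, total_seek = [], 0
    while pending:
        nearest = min(pending, key=lambda cyl: abs(cyl - head))
        total_seek += abs(nearest - head)
        head = nearest
        pending.remove(nearest)
        order.append(nearest)
    return order, total_seek

# Head at cylinder 53 with a classic textbook queue
order, seek = sstf(53, [98, 183, 37, 122, 14, 124, 65, 67])
print(order, seek)  # total head movement: 236 cylinders
```

SSTF minimizes each individual seek but, unlike SCAN or C-SCAN, can starve requests far from the head under sustained load.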
Srihitha Technologies provides IBM-AIX Online Training in Ameerpet by real time Experts. For more information about IBM-AIX online training in Ameerpet call 9885144200 / 9394799566.
Maximizing EC2 and Elastic Block Store Disk Performance (STG302) | AWS re:Inv...
Amazon Web Services
This document discusses optimizing performance for EC2 instances and EBS volumes. It provides guidance on provisioning IOPS for different types of storage workloads and database software. The key recommendations are to use EBS-optimized instances with Provisioned IOPS (PIOPS) volumes for random I/O workloads like databases, size volumes appropriately based on the needed IOPS and throughput, and architect for consistent low latency by adjusting the queue depth.
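The sizing advice above reduces to simple arithmetic: the IOPS target times the I/O size gives required throughput, and Little's law relates IOPS and per-I/O latency to the queue depth needed to sustain them. A back-of-envelope sketch (the specific numbers are illustrative):

```python
def throughput_mib_s(iops, io_size_kib):
    """Throughput implied by an IOPS target at a given I/O size."""
    return iops * io_size_kib / 1024

def queue_depth(iops, latency_ms):
    """Little's law: concurrency needed to sustain IOPS at a given latency."""
    return iops * latency_ms / 1000

# 4,000 provisioned IOPS at 16 KiB per I/O implies 62.5 MiB/s
print(throughput_mib_s(4000, 16))
# Sustaining 4,000 IOPS at ~2 ms per I/O needs a queue depth of ~8
print(queue_depth(4000, 2))
```

Running the same arithmetic backwards shows why too shallow a queue caps achievable IOPS regardless of what was provisioned.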
Srihitha Technologies provides IBM AIX Training in Ameerpet by real time Experts. For more information about IBM AIX training in Ameerpet call 9394799566 / 9290641808.
RAID controllers use multiple physical disks that appear as a single logical drive. RAID levels 0, 1, 5 are commonly used. RAID 0 stripes data across disks for speed but has no redundancy. RAID 1 mirrors data onto two disks for redundancy but is expensive. RAID 5 stripes data across disks and uses parity for redundancy, avoiding bottlenecks of RAID 4. Larger RAID groups can implement dual distributed parity for fault tolerance from two drive failures. Nesting RAID levels can boost performance by combining redundancy with RAID 0 striping. Rebuilding failed drives uses parity calculation with XOR to reconstruct lost data.
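The XOR rebuild mentioned at the end can be demonstrated directly: parity is the XOR of the data blocks, and XOR-ing the surviving blocks with the parity reconstructs the lost one. A byte-level sketch:

```python
from functools import reduce

def xor_blocks(*blocks):
    """XOR equal-length byte blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, col) for col in zip(*blocks))

d1, d2, d3 = b"AAAA", b"BBBB", b"CCCC"
parity = xor_blocks(d1, d2, d3)  # written to the parity stripe

# The disk holding d2 fails: rebuild it from the survivors plus parity
rebuilt = xor_blocks(d1, d3, parity)
print(rebuilt == d2)  # True
```

This works because XOR is its own inverse: any single missing block drops out of the equation when the remaining blocks and the parity are combined.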
Amazon Elastic Block Store (Amazon EBS) provides persistent block level storage volumes for use with Amazon EC2 instances. In this technical session, we discuss how to maximize Amazon EBS performance, with a special eye toward low-latency, high-throughput applications like databases. We explain how to monitor your application and share real-world examples.
Windows Server 2012 R2 Software-Defined Storage
Aidan Finn
In this presentation I taught attendees how to build a Scale-Out File Server (SOFS) using Windows Server 2012 R2, JBODs, Storage Spaces, Failover Clustering, and SMB 3.0 Networking, suitable for storing application data such as Hyper-V and SQL Server.
PostgreSQL Portland Performance Practice Project - Database Test 2 Filesystem...
Mark Wong
Fifth presentation in a speaker series sponsored by the Portland State University Computer Science Department. The series covers PostgreSQL performance with an OLTP (on-line transaction processing) workload called Database Test 2 (DBT-2). This presentation goes through results of different hardware RAID configurations to show why it is important to test your own hardware: it might be performing in a way you don't expect.
The document summarizes a presentation on optimizing Linux, Windows, and Firebird for heavy workloads. It describes two customer implementations using Firebird - a medical company with 17 departments and over 700 daily users, and a repair services company with over 500 daily users. It discusses tuning the operating system, hardware, CPU, RAM, I/O, network, and Firebird configuration to improve performance under heavy loads. Specific recommendations are provided for Linux and Windows configuration.
Pharmaceutical care is a cooperative process for responsibly providing drug therapy to individual patients, with the goal of improving the patient's health by identifying, preventing, and resolving medication-related problems. The pharmacist plays several roles: caregiver within the medication service, effective and safe decision-maker about medications, communicator between prescriber and patient, resource manager, and researcher grounded in scientific evidence.
Autoimmune diseases occur when the immune system attacks the body's own healthy cells and tissues. These diseases include rheumatoid arthritis, systemic lupus erythematosus, and multiple sclerosis. The document analyzes autoimmune diseases as part of a physiology course at the Universidad Técnica de Ambato, Ecuador.
The document provides the lyrics to the song "Wonderful Tonight" by Eric Clapton. The song is about a couple getting ready to go to a party, where the man tells the woman she looks wonderful. At the party everyone looks at the beautiful woman. After the party, the man feels tired and the woman drives him home and helps him to bed, where he tells her she was wonderful that night. The document recommends presenting the new vocabulary words from the song to students, having them sing along, and then making worksheets.
7 slides, 9th grade, 2011.
Wood: the main building material of ancient Slavic dwellings
Layout of the dwellings
Improvements to log-cabin construction
Distinctive features of house building
Conclusion
The document covers a teacher evaluation that is 75% assessed by the instructor and 25% assessed by peers. It also mentions tools such as Canva, blogs, and LinkedIn, and assertiveness techniques. It includes questions about the duties of a nursing care assistant (TCAE) in a clinic, what a root canal treatment is, the difference between a stomatologist and a dentist, and the types of prostheses that can be fitted in the mouth.
Luminance is an artificial intelligence platform that analyzes large amounts of legal documents and data for due diligence purposes. It understands language like humans and can process large datasets much faster than people. Luminance provides an overview of a company by identifying potential risks and anomalies without being specifically instructed to look for them. It revolutionizes due diligence by making the review process more efficient and productive.
BDO Indonesia provides payroll services with over 36 years of experience in audit and assurance services. They understand the complexities of managing payroll and can deliver payroll on time every time. Their solutions include tax calculation, leave and benefits management, reports, and integration with relevant third parties. They have a dedicated team of directors and professionals with expertise across various industries to ensure efficient delivery of payroll services.
This document discusses various concepts related to educational technology. It begins by defining adaptive learning as using computers to adapt educational content based on student responses. It then defines virtual classrooms, MOOCs, synchronous vs asynchronous learning, blended learning, flipped classrooms, self-directed learning, and learning management systems. For each concept, it provides a brief definition or description. The document serves to outline and explain several key terms and models used in educational technology.
Memory management in operating system | Paging | Virtual memory
Shivam Mitra
This document discusses memory management techniques in operating systems. It begins by covering contiguous memory allocation approaches like fixed and variable partitioning. It then discusses non-contiguous techniques like paging and segmentation. Key concepts covered include logical vs physical addresses, page tables, translation lookaside buffers, demand paging, and virtual memory. The document provides examples and links to detailed video explanations of these important OS memory management topics.
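The translation lookaside buffer mentioned above is just a small cache in front of the page table: a hit skips the page-table walk entirely. A minimal sketch counting hits and misses (the structure, capacity, and replacement policy here are illustrative):

```python
from collections import OrderedDict

class TLB:
    """Tiny fully associative TLB with LRU replacement (illustrative)."""
    def __init__(self, capacity=4):
        self.capacity = capacity
        self.entries = OrderedDict()  # page number -> frame number
        self.hits = self.misses = 0

    def lookup(self, page, page_table):
        if page in self.entries:
            self.hits += 1
            self.entries.move_to_end(page)    # refresh LRU position
            return self.entries[page]
        self.misses += 1
        frame = page_table[page]              # fall back to a page-table walk
        if len(self.entries) >= self.capacity:
            self.entries.popitem(last=False)  # evict least recently used
        self.entries[page] = frame
        return frame

page_table = {p: p + 100 for p in range(16)}
tlb = TLB()
for page in [0, 1, 0, 2, 0, 3]:  # repeated pages hit after the first access
    tlb.lookup(page, page_table)
print(tlb.hits, tlb.misses)
```

Locality of reference is what makes a handful of TLB entries cover the vast majority of references in real programs.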
The document discusses best practices for deploying MongoDB including sizing hardware with sufficient memory, CPU and I/O; using an appropriate operating system and filesystem; installing and upgrading MongoDB; ensuring durability with replication and backups; implementing security, monitoring performance with tools, and considerations for deploying on Amazon EC2.
- The document provides guidance on deploying MongoDB including sizing hardware, installing and upgrading MongoDB, configuration considerations for EC2, security, backups, durability, scaling out, and monitoring. Key aspects discussed are profiling and indexing queries for performance, allocating sufficient memory, CPU and disk I/O, using 64-bit OSes, ext4/XFS filesystems, upgrading to even version numbers, and replicating for high availability and backups.
This document provides guidance on considerations for deploying MongoDB in production environments. It covers sizing hardware requirements for memory, CPU and I/O; installing and configuring MongoDB; using MongoDB on Amazon EC2; implementing security, backups, and durability; upgrading MongoDB versions; scaling deployments by sharding; and monitoring MongoDB performance.
The document provides guidance on deploying MongoDB in production environments. It discusses sizing hardware requirements for memory, CPU, and disk I/O. It also covers installing and upgrading MongoDB, considerations for cloud platforms like EC2, security, backups, durability, scaling out, and monitoring. The focus is on performance optimization and ensuring data integrity and high availability.
In this presentation I talk about various topics related to Memory Management in SQL Server such as:
1. Memory Manager: Windows NT
a. Virtual memory
i. Address Space Layout
ii. Virtual Memory Manager
iii. 32-bit Virtual Addresses
iv. Address Translation
b. Memory Pool
c. 4GT Tuning
i. /3GB Switch (Two slides)
ii. Effects of /3GB Tuning
iii. /USERVA Switch
d. PAE
i. Using /3GB & PAE together
e. AWE
f. 32-bit vs 64-bit Virtual Memory
2. Memory Manager: SQLOS
a. SQLOS
i. Memory Management
ii. Scheduling
iii. Exception handling
b. NUMA (Non-Uniform Memory Architecture)
c. BP and MTL ?
d. Memory Types
e. Memory Pressure
Accelerating HBase with NVMe and Bucket CacheNicolas Poggi
on-Volatile-Memory express (NVMe) standard promises and order of magnitude faster storage than regular SSDs, while at the same time being more economical than regular RAM on TB/$. This talk evaluates the use cases and benefits of NVMe drives for its use in Big Data clusters with HBase and Hadoop HDFS.
First, we benchmark the different drives using system level tools (FIO) to get maximum expected values for each different device type and set expectations. Second, we explore the different options and use cases of HBase storage and benchmark the different setups. And finally, we evaluate the speedups obtained by the NVMe technology for the different Big Data use cases from the YCSB benchmark.
In summary, while the NVMe drives show up to 8x speedup in best case scenarios, testing the cost-efficiency of new device technologies is not straightforward in Big Data, where we need to overcome system level caching to measure the maximum benefits.
Current HDFS Namenode stores all of its metadata in RAM. This has allowed Hadoop clusters to scale to 100K concurrent tasks. However, the memory limits the total number of files that a single NameNode can store. While Federation allows one to create multiple volumes with additional Namenodes, there is a need to scale a single namespace and also to store multiple namespaces in a single Namenode.
This talk describes a project that removes the space limits while maintaining similar performance by caching only the working set or hot metadata in Namenode memory. We believe this approach will be very effective because the subset of files that is frequently accessed is much smaller than the full set of files stored in HDFS.
In this talk we will describe our overall approach and give details of our implementation along with some early performance numbers.
Speaker: Lin Xiao, PhD student at Carnegie Mellon University, intern at Hortonworks
This document discusses new graphics APIs like DX12 and Vulkan that aim to provide lower overhead and more direct hardware access compared to earlier APIs. It covers topics like increased parallelism, explicit memory management using descriptor sets and pipelines, and best practices like batching draw calls and using multiple asynchronous queues. Overall, the new APIs allow more explicit control over GPU hardware for improved performance but require following optimization best practices around areas like parallelism, memory usage, and command batching.
This document discusses security issues with microcontrollers (uCs). uCs are commonly used in devices like cars, medical implants, and infrastructure systems. The Atmel AVR8 uC architecture is examined in depth. Issues discussed include weak randomness for crypto, race conditions due to lack of concurrency controls, buffer overflows enabled by the Harvard architecture separating code and data, and attacks involving NULL pointers, uninitialized memory, and dereferencing memory beyond physical limits. Exploiting these issues could allow taking control of uC firmware to potentially manipulate connected devices and systems.
Caches are used in many layers of applications that we develop today, holding data inside or outside of your runtime environment, or even distributed across multiple platforms in data fabrics. However, considerable performance gains can often be realized by configuring the deployment platform/environment and coding your application to take advantage of the properties of CPU caches.
In this talk, we will explore what CPU caches are, how they work and how to measure your JVM-based application data usage to utilize them for maximum efficiency. We will discuss the future of CPU caches in a many-core world, as well as advancements that will soon arrive such as HP's Memristor.
The document discusses the organization and types of system memory in a PC. It describes how the first 640KB of memory is called conventional memory and is available for programs to use. It also explains different types of additional memory areas like extended memory and cache memory, as well as different types of RAM like DRAM, SRAM, and variations of DRAM.
This document discusses SQL Server memory management. It begins by explaining physical memory, virtual address space, and how the virtual memory manager joins physical memory and virtual address space. It then defines key terminology related to memory usage. The document outlines the three levels of SQL Server memory - memory nodes, memory clerks and caches, and memory objects. It provides details on various memory-related aspects like the plan cache, query memory, minimum and maximum server memory settings, and changes in SQL Server 2012. Memory-related dynamic management views and performance counters are also referenced.
This document discusses various strategies for backing up MongoDB data to keep it safe. It recommends:
1. Using mongodump for simple backups that can restore quickly but may be inconsistent.
2. Setting up replication for high availability, but also using mongodump for backups and testing restore processes.
3. Taking snapshots of the data files for consistent backups, but this requires downtime and gaps can occur between snapshots.
4. Using the oplog for incremental, continuous backups to avoid gaps without downtime using tools like the Wordnik Admin Tools. Testing backups is strongly recommended.
Application Profiling for Memory and Performancepradeepfn
This document discusses application profiling for memory and performance. It explains that as concurrency increases, throughput initially increases but contention can then reduce performance. The key resources that can cause contention are CPU, memory, disk I/O, and network I/O. Various tools like JProfiler and JConsole can measure and diagnose contention. Common issues uncovered by profiling include memory leaks, deadlocks, and permgen errors. Profiling is important to optimize applications for production use.
Designing for performance.
Performance balance.
single processor and Multi core processor.
Usage of Processors.
Usage of single processor and Multi core Processor.
Processing Techniques.
Moors law
Operating systems use main memory management techniques like paging and segmentation to allocate memory to processes efficiently. Paging divides both logical and physical memory into fixed-size pages. It uses a page table to map logical page numbers to physical frame numbers. This allows processes to be allocated non-contiguous physical frames. A translation lookaside buffer (TLB) caches recent page translations to improve performance by avoiding slow accesses to the page table in memory. Protection bits and valid/invalid bits ensure processes only access their allocated memory regions.
Application Profiling for Memory and PerformanceWSO2
This document discusses application profiling for memory and performance. It describes how to measure contention in CPU, memory, disk I/O, and network I/O as concurrency increases. It recommends performance tuning by identifying bottlenecks and shifting them through parameter tweaking and code profiling. Common issues like classcast exceptions, permgen errors, deadlocks, nullpointers, and outofmemoryexceptions can be addressed through profiling tools like JProfiler, Eclipse Memory Analyzer, and JConsole. The document provides examples of how WSO2 uses profiling to optimize products like the Identity Server for low-memory environments and Raspberry Pi clusters.
Leveraging Generative AI to Drive Nonprofit InnovationTechSoup
In this webinar, participants learned how to utilize Generative AI to streamline operations and elevate member engagement. Amazon Web Service experts provided a customer specific use cases and dived into low/no-code tools that are quick and easy to deploy through Amazon Web Service (AWS.)
Temple of Asclepius in Thrace. Excavation resultsKrassimira Luka
The temple and the sanctuary around were dedicated to Asklepios Zmidrenus. This name has been known since 1875 when an inscription dedicated to him was discovered in Rome. The inscription is dated in 227 AD and was left by soldiers originating from the city of Philippopolis (modern Plovdiv).
Strategies for Effective Upskilling is a presentation by Chinwendu Peace in a Your Skill Boost Masterclass organisation by the Excellence Foundation for South Sudan on 08th and 09th June 2024 from 1 PM to 3 PM on each day.
Main Java[All of the Base Concepts}.docxadhitya5119
This is part 1 of my Java Learning Journey. This Contains Custom methods, classes, constructors, packages, multithreading , try- catch block, finally block and more.
it describes the bony anatomy including the femoral head , acetabulum, labrum . also discusses the capsule , ligaments . muscle that act on the hip joint and the range of motion are outlined. factors affecting hip joint stability and weight transmission through the joint are summarized.
Walmart Business+ and Spark Good for Nonprofits.pdfTechSoup
"Learn about all the ways Walmart supports nonprofit organizations.
You will hear from Liz Willett, the Head of Nonprofits, and hear about what Walmart is doing to help nonprofits, including Walmart Business and Spark Good. Walmart Business+ is a new offer for nonprofits that offers discounts and also streamlines nonprofits order and expense tracking, saving time and money.
The webinar may also give some examples on how nonprofits can best leverage Walmart Business+.
The event will cover the following::
Walmart Business + (https://business.walmart.com/plus) is a new shopping experience for nonprofits, schools, and local business customers that connects an exclusive online shopping experience to stores. Benefits include free delivery and shipping, a 'Spend Analytics” feature, special discounts, deals and tax-exempt shopping.
Special TechSoup offer for a free 180 days membership, and up to $150 in discounts on eligible orders.
Spark Good (walmart.com/sparkgood) is a charitable platform that enables nonprofits to receive donations directly from customers and associates.
Answers about how you can do more with Walmart!"
Communicating effectively and consistently with students can help them feel at ease during their learning experience and provide the instructor with a communication trail to track the course's progress. This workshop will take you through constructing an engaging course container to facilitate effective communication.
3. In an ideal world…
✦ The ideal world has memory that is
• Very large
• Very fast
• Non-volatile (doesn’t go away when power is turned off)
✦ The real world has memory that is:
• Very large
• Very fast
• Affordable!
Pick any two…
✦ Memory management goal: make the real world look as much like the ideal world as possible
4. Memory hierarchy
✦ What is the memory hierarchy?
• Different levels of memory
• Some are small & fast
• Others are large & slow
✦ What levels are usually included?
• Cache: small amount of fast, expensive memory
- L1 (level 1) cache: usually on the CPU chip
- L2: may be on or off chip
- L3 cache: off-chip, made of SRAM
• Main memory: medium-speed, medium-price memory (DRAM)
• Disk: many gigabytes of slow, cheap, non-volatile storage
✦ Memory manager handles the memory hierarchy
5. Basic memory management
✦ Components include
• Operating system (perhaps with device drivers)
• Single process
✦ Goal: lay these out in memory
• Memory protection may not be an issue (only one program)
• Flexibility may still be useful (allow OS changes, etc.)
✦ No swapping or paging
[Figure: three simple layouts over addresses 0–0xFFFF: (a) OS in RAM at the bottom with the user program above it; (b) OS in ROM at the top with the user program below; (c) device drivers in ROM at the top, user program in RAM in the middle, OS in RAM at the bottom]
6. Fixed partitions: multiple programs
✦ Fixed memory partitions
• Divide memory into fixed spaces
• Assign a process to a space when it’s free
✦ Mechanisms
• Separate input queues for each partition
• Single input queue: better ability to optimize CPU usage
[Figure: memory with the OS at 0–100K and Partitions 1–4 at 100–500K, 500–600K, 600–700K, and 700–900K, shown once with a separate input queue per partition and once with a single input queue]
7. How many processes are enough?
✦ Several memory partitions (fixed or variable size)
✦ Lots of processes wanting to use the CPU
✦ Tradeoff
• More processes utilize the CPU better
• Fewer processes use less memory (cheaper!)
✦ How many processes do we need to keep the CPU fully utilized?
• This will help determine how much memory we need
• Is this still relevant with memory costing $15/GB?
8. Modeling multiprogramming
✦ More I/O wait means less processor utilization
• At 20% I/O wait, 3–4 processes fully utilize the CPU
• At 80% I/O wait, even 10 processes aren’t enough
✦ This means that the OS should have more processes if they’re I/O bound
✦ More processes ⇒ memory management & protection more important!
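The utilization model behind these numbers is the standard one: if each process waits for I/O a fraction p of the time and the waits are independent, all n processes are waiting simultaneously with probability p^n, so CPU utilization is 1 − p^n. A minimal sketch in Python (the function name is mine):

```python
def cpu_utilization(io_wait_fraction, nprocs):
    """Probability the CPU is busy: 1 - p^n, where p is the
    fraction of time each process spends waiting for I/O."""
    return 1 - io_wait_fraction ** nprocs

# At 20% I/O wait, 3-4 processes already keep the CPU nearly saturated
print(cpu_utilization(0.2, 4))   # ~0.998
# At 80% I/O wait, even 10 processes leave the CPU idle ~11% of the time
print(cpu_utilization(0.8, 10))  # ~0.893
```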
9. Multiprogrammed system performance
✦ Arrival and work requirements of 4 jobs
✦ CPU utilization for 1–4 jobs with 80% I/O wait
✦ Sequence of events as jobs arrive and finish
• Numbers show amount of CPU time jobs get in each interval
• More processes ⇒ better utilization, less time per process

Job   Arrival time   CPU time needed
1     10:00          4
2     10:10          3
3     10:15          2
4     10:20          2

Jobs running   1      2      3      4
CPU idle       0.80   0.64   0.51   0.41
CPU busy       0.20   0.36   0.49   0.59
CPU/process    0.20   0.18   0.16   0.15

[Figure: timeline from t=0 with events at t = 10, 15, 20, 22, 27.6, 28.2, and 31.7 as jobs 1–4 arrive and finish]
10. Memory and multiprogramming
✦ Memory needs two things for multiprogramming
• Relocation
• Protection
✦ The OS cannot be certain where a program will be loaded in memory
• Variables and procedures can’t use absolute locations in memory
• Several ways to guarantee this
✦ The OS must keep processes’ memory separate
• Protect a process from other processes reading or modifying its memory
• Protect a process from modifying its own memory in undesirable ways (such as writing to program code)
11. Base and limit registers
✦ Special CPU registers: base & limit
• Access to the registers limited to system mode
• Registers contain
- Base: start of the process’s memory partition
- Limit: length of the process’s memory partition
✦ Address generation
• Physical address: location in actual memory
• Logical address: location from the process’s point of view
• Physical address = base + logical address
• Logical address larger than limit ⇒ error
✦ Example: base = 0x9000, limit = 0x2000
• Logical address 0x1204 → physical address 0x1204 + 0x9000 = 0xa204
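The base-and-limit scheme above can be sketched in a few lines of Python (a hypothetical helper; real hardware performs the comparison and addition on every access):

```python
def translate(logical, base, limit):
    """Base-and-limit translation: physical = base + logical,
    with a protection check against the partition length."""
    if logical >= limit:
        raise MemoryError("logical address exceeds limit")
    return base + logical

# The slide's example: base 0x9000, limit 0x2000
print(hex(translate(0x1204, base=0x9000, limit=0x2000)))  # 0xa204
```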
12. Swapping
✦ Memory allocation changes as
• Processes come into memory
• Processes leave memory
- Swapped to disk
- Complete execution
✦ Gray regions are unused memory
[Figure: seven snapshots of memory above the OS as processes A, B, C, and D come and go, leaving gray holes]
13. Swapping: leaving room to grow
✦ Need to allow for programs to grow
• Allocate more memory for data
• Larger stack
✦ Handled by allocating more space than is necessary at the start
• Inefficient: wastes memory that’s not currently in use
• What if the process requests too much memory?
[Figure: processes A and B above the OS, each laid out as code, data, room to grow, then stack, with the free space between data and stack allowing growth]
14. Tracking memory usage: bitmaps
✦ Keep track of free / allocated memory regions with a bitmap
• One bit in map corresponds to a fixed-size region of memory
• Bitmap is a constant size for a given amount of memory regardless of how much is allocated at a particular time
✦ Chunk size determines efficiency
• At 1 bit/4KB chunk, we need just 256 bits (32 bytes) per MB of memory
• For smaller chunks, we need more memory for the bitmap
• Can be difficult to find large contiguous free areas in bitmap
[Figure: 32 memory regions holding processes A–D with free gaps, and the corresponding bitmap 11111100 00111000 01111111 11111000]
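Finding a contiguous free area in a bitmap means scanning for a run of 0 bits, which is why large allocations can be slow. A simple linear-scan sketch (function and variable names are mine):

```python
def find_free_run(bitmap, nchunks):
    """Return the index of the first run of nchunks free (0) bits,
    or -1 if no large enough free area exists."""
    run, start = 0, 0
    for i, bit in enumerate(bitmap):
        if bit == 0:
            if run == 0:
                start = i
            run += 1
            if run == nchunks:
                return start
        else:
            run = 0
    return -1

# The bitmap from the slide: 11111100 00111000 01111111 11111000
bits = [int(b) for b in "11111100001110000111111111111000"]
print(find_free_run(bits, 4))  # 6: the 4-chunk hole after the first region
print(find_free_run(bits, 5))  # -1: no free run of 5 chunks exists
```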
15. Tracking memory usage: linked lists
✦ Keep track of free / allocated memory regions with a linked list
• Each entry in the list corresponds to a contiguous region of memory
• Entry can indicate either allocated or free (and, optionally, owning process)
• May have separate lists for free and allocated areas
✦ Efficient if chunks are large
• Fixed-size representation for each region
• More regions → more space needed for free lists
[Figure: the same 32 regions as a linked list of (owner, start, length) entries: A 0 6, hole 6 4, B 10 3, hole 13 4, C 17 9, D 26 3, hole 29 3]
16. Allocating memory
✦ Search through region list to find a large enough space
✦ Suppose there are several choices: which one to use?
• First fit: the first suitable hole on the list
• Next fit: the first suitable hole after the previously allocated hole
• Best fit: the smallest hole that is larger than the desired region (wastes least space?)
• Worst fit: the largest available hole (leaves largest fragment)
✦ Option: maintain separate queues for different-size holes
[Figure: a free list of (start, length) holes (6,5) (19,14) (52,25) (102,30) (135,16) (202,10) (302,20) (350,30) (411,19) (510,3), annotated with where first fit places 20 blocks, next fit 12 blocks, best fit 13 blocks, and worst fit 15 blocks]
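The four placement strategies can be sketched over a free list of (start, length) holes; a hypothetical Python sketch using the hole list from the figure above:

```python
def first_fit(holes, size):
    """Index of the first hole big enough, or None."""
    for i, (_, length) in enumerate(holes):
        if length >= size:
            return i
    return None

def next_fit(holes, size, last):
    """Like first fit, but resume scanning after the last allocation."""
    n = len(holes)
    for j in range(1, n + 1):
        i = (last + j) % n
        if holes[i][1] >= size:
            return i
    return None

def best_fit(holes, size):
    """Smallest hole that still fits (wastes the least space?)."""
    fits = [i for i, (_, l) in enumerate(holes) if l >= size]
    return min(fits, key=lambda i: holes[i][1], default=None)

def worst_fit(holes, size):
    """Largest available hole (leaves the largest fragment)."""
    fits = [i for i, (_, l) in enumerate(holes) if l >= size]
    return max(fits, key=lambda i: holes[i][1], default=None)

holes = [(6, 5), (19, 14), (52, 25), (102, 30), (135, 16),
         (202, 10), (302, 20), (350, 30), (411, 19), (510, 3)]
print(first_fit(holes, 20))  # 2: the 25-block hole at 52 is first to fit
print(best_fit(holes, 13))   # 1: the 14-block hole at 19 wastes least
print(worst_fit(holes, 15))  # 3: the 30-block hole at 102 is largest
```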
17. Freeing memory
✦ Allocation structures must be updated when memory is freed
✦ Easy with bitmaps: just set the appropriate bits in the bitmap
✦ Linked lists: modify adjacent elements as needed
• Merge adjacent free regions into a single region
• May involve merging two regions with the just-freed area
[Figure: the four cases of freeing region X, depending on whether its left and right neighbors are allocated or free; adjacent holes are merged with X into a single free region]
18. Knuth's observations
✦ 50% rule: if the number of processes in memory is n, the mean number of holes is n/2
• In equilibrium, half of the operations above are allocations and half are deallocations; on average, one hole per two processes
✦ Unused memory rule: (n/2)·k·s = m − n·s
• m: total memory
• s: average size of a process
• k·s: average size of a hole
• Fraction of memory wasted: k/(k+2)
✦ Overhead of paging: (process size × page entry size)/page size + page size/2
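The paging-overhead expression can be minimized with calculus: for process size s, page-entry size e, and page size p, overhead(p) = s·e/p + p/2 is smallest where its derivative is zero, which gives p = √(2se). A quick check in Python (function names are mine):

```python
import math

def paging_overhead(s, e, p):
    """Page-table space (s/p entries of e bytes each) plus the
    expected internal fragmentation (half a page, on average)."""
    return s * e / p + p / 2

def optimal_page_size(s, e):
    """Setting d/dp (s*e/p + p/2) = -s*e/p**2 + 1/2 = 0 gives p = sqrt(2*s*e)."""
    return math.sqrt(2 * s * e)

# A 1 MB process with 8-byte page table entries
print(optimal_page_size(2**20, 8))  # 4096.0: a 4 KB page minimizes overhead
```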
19. Some interesting results
• If n is the number of allocated areas, then n/2 is the number of holes for “simple” allocation algorithms (not buddy!) in equilibrium
• Dynamic storage allocation strategies that never relocate reserved blocks cannot guarantee memory efficiency!
- With blocks of sizes 1 and 2, we can run out of memory even when only 2/3rds full
• 23 seats in a row, groups of 1 and 2 arrive: do we ever need to split a pair to seat everyone? Not if no more than 16 people are present
- Solution: no single is given seat 2, 5, 8, …, 20
• Not possible with 22 seats and no more than 14 present
20. Limitations of swapping
✦ Problems with swapping
• Process must fit into physical memory (impossible to run larger processes)
• Memory becomes fragmented
- External fragmentation: lots of small free areas
- Compaction needed to reassemble larger free areas
• Processes are either in memory or on disk: half and half doesn’t do any good
✦ Overlays solved the first problem
• Bring in pieces of the process over time (typically data)
• Still doesn’t solve the problem of fragmentation or partially resident processes
21. Virtual memory
✦ Basic idea: allow the OS to hand out more memory than exists on the system
✦ Keep recently used stuff in physical memory
✦ Move less recently used stuff to disk
✦ Keep all of this hidden from processes
• Processes still see an address space from 0 to max_address
• Movement of information to and from disk handled by the OS without process help
✦ Virtual memory (VM) especially helpful in multiprogrammed systems
• CPU schedules process B while process A waits for its memory to be retrieved from disk
22. Virtual and physical addresses
✦ Program uses virtual addresses
• Addresses local to the process
• Hardware translates virtual address to physical address
✦ Translation done by the Memory Management Unit (MMU)
• Usually on the same chip as the CPU
• Only physical addresses leave the CPU/MMU chip
✦ Physical memory indexed by physical addresses
[Figure: virtual addresses flow from the CPU to the MMU on the CPU chip; physical addresses travel on the bus to memory and the disk controller]
23. Paging and page tables
✦ Virtual addresses mapped to physical addresses
• Unit of mapping is called a page
• All addresses in the same virtual page are in the same physical page
• Page table entry (PTE) contains translation for a single page
✦ Table translates virtual page number to physical page number
• Not all virtual memory has a physical page
• Not every physical page need be used
✦ Example: 64 KB virtual memory, 16 KB physical memory

Virtual page   Physical page
60–64K         X
56–60K         X
52–56K         0
48–52K         X
44–48K         X
40–44K         X
36–40K         3
32–36K         X
28–32K         X
24–28K         X
20–24K         X
16–20K         1
12–16K         2
8–12K          X
4–8K           X
0–4K           X
(X = no physical page assigned)
24. What’s in a page table entry?
✦ Each entry in the page table contains
• Valid bit: set if this logical page number has a corresponding physical frame in memory
- If not valid, remainder of PTE is irrelevant
• Page frame number: page in physical memory
• Referenced bit: set if data on the page has been accessed
• Dirty (modified) bit: set if data on the page has been modified
• Protection information
[Figure: PTE layout, the page frame number followed by protection bits, D (dirty), R (referenced), and V (valid)]
25. Mapping logical addresses to physical addresses
✦ Split address from CPU into two pieces
• Page number (p)
• Page offset (d)
✦ Page number
• Index into page table
• Page table contains base address of page in physical memory
✦ Page offset
• Added to base address to get actual physical memory address
✦ Page size = 2^d bytes
✦ Example: 4 KB (= 4096-byte) pages, 32-bit logical addresses
• 2^d = 4096 ⇒ d = 12 offset bits
• 32 − 12 = 20 page-number bits
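Putting the split and the table lookup together, a one-level translation might look like this in Python (the dict-based page table and the names are illustrative only):

```python
PAGE_BITS = 12                      # 4 KB pages -> 12 offset bits
PAGE_MASK = (1 << PAGE_BITS) - 1

def translate(vaddr, page_table):
    """Map a virtual address to a physical one via a page table
    taking page number p to frame number f (missing = not mapped)."""
    p = vaddr >> PAGE_BITS          # page number: high 20 bits
    d = vaddr & PAGE_MASK           # page offset: low 12 bits
    f = page_table.get(p)
    if f is None:
        raise LookupError(f"page fault on page {p:#x}")
    return (f << PAGE_BITS) | d     # frame base + offset

# Page 0x1 mapped to frame 0x9: virtual 0x1204 -> physical 0x9204
print(hex(translate(0x1204, {0x1: 0x9})))  # 0x9204
```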
26. Address translation architecture
[Figure: the CPU emits a logical address split into page number p and offset d; entry p of the page table supplies frame number f; the physical address presented to memory is f concatenated with d]
28. Two-level page tables
✦ Problem: page tables can be too large
• 2^32 bytes in 4KB pages ⇒ 1 million PTEs
✦ Solution: use multi-level page tables
• “Page size” in first page table is large (megabytes)
• PTE marked invalid in first page table needs no 2nd level page table
✦ 1st level page table has pointers to 2nd level page tables
✦ 2nd level page table has actual physical page numbers in it
[Figure: a level-1 page table whose entries point to level-2 page tables, whose entries in turn point to pages in memory]
29. More on two-level page tables
✦ Tradeoffs between 1st and 2nd level page table sizes
• Total number of bits indexing 1st and 2nd level table is constant for a given page size and logical address length
• Tradeoff between number of bits indexing 1st and number indexing 2nd level tables
- More bits in 1st level: fine granularity at 2nd level
- Fewer bits in 1st level: maybe less wasted space?
✦ All addresses in table are physical addresses
✦ Protection bits kept in 2nd level table
30. Two-level paging: example
✦ System characteristics
• 8 KB pages
• 32-bit logical address divided into 13-bit page offset, 19-bit page number
✦ Page number divided into:
• 10-bit 1st level index (p1)
• 9-bit 2nd level index (p2)
✦ Logical address looks like this:
| p1 = 10 bits | p2 = 9 bits | offset = 13 bits |
• p1 is an index into the 1st level page table
• p2 is an index into the 2nd level page table pointed to by p1
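Extracting the three fields for this example is just shifting and masking; a sketch under the slide's 10/9/13-bit split (names are mine):

```python
OFFSET_BITS, P2_BITS = 13, 9        # 8 KB pages, 512-entry 2nd level tables

def split(addr):
    """Split a 32-bit logical address into (p1, p2, offset)."""
    offset = addr & ((1 << OFFSET_BITS) - 1)
    p2 = (addr >> OFFSET_BITS) & ((1 << P2_BITS) - 1)
    p1 = addr >> (OFFSET_BITS + P2_BITS)
    return p1, p2, offset

# Round trip: build an address from known fields and split it again
addr = (3 << 22) | (5 << 13) | 7
print(split(addr))  # (3, 5, 7)
```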
32. Implementing page tables in hardware
✦ Page table resides in main (physical) memory
✦ CPU uses special registers for paging
• Page table base register (PTBR) points to the page table
• Page table length register (PTLR) contains length of page table: restricts maximum legal logical address
✦ Translating an address requires two memory accesses
• First access reads page table entry (PTE)
• Second access reads the data / instruction from memory
✦ Reduce number of memory accesses
• Can’t avoid second access (we need the value from memory)
• Eliminate first access by keeping a hardware cache (called a translation lookaside buffer or TLB) of recently used page table entries
33. Translation Lookaside Buffer (TLB)
✦ Search the TLB for the desired logical page number
• Search entries in parallel
• Use standard cache techniques
✦ If desired logical page number is found, get frame number from TLB
✦ If desired logical page number isn’t found
• Get frame number from page table in memory
• Replace an entry in the TLB with the logical & physical page numbers from this reference

Example TLB
Logical page #   Physical frame #
8                3
(unused)
2                1
3                0
12               12
29               6
22               11
7                4
34. Handling TLB misses
✦ If PTE isn’t found in TLB, OS needs to do the lookup in the page table
✦ Lookup can be done in hardware or software
✦ Hardware TLB replacement
• CPU hardware does page table lookup
• Can be faster than software
• Less flexible than software, and more complex hardware
✦ Software TLB replacement
• OS gets TLB exception
• Exception handler does page table lookup & places the result into the TLB
• Program continues after return from exception
• Larger TLB (lower miss rate) can make this feasible
35. How long do memory accesses take?
✦ Assume the following times:
• TLB lookup time = a (often zero: overlapped in CPU)
• Memory access time = m
✦ Hit ratio (h) is the percentage of time that a logical page number is found in the TLB
• Larger TLB usually means higher h
• TLB structure can affect h as well
✦ Effective access time (an average) is calculated as:
• EAT = (m + a)h + (m + m + a)(1 − h)
• EAT = a + (2 − h)m
✦ Interpretation
• A reference always requires the TLB lookup and 1 memory access
• TLB misses also require an additional memory reference
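The EAT formula is easy to evaluate; a small sketch with times in arbitrary units (function name is mine):

```python
def eat(h, m, a=0):
    """Effective access time: (m+a)h + (2m+a)(1-h),
    which simplifies to a + (2-h)m."""
    return a + (2 - h) * m

print(eat(1.0, 100))      # 100.0: every reference hits the TLB
print(eat(0.0, 100))      # 200.0: every reference also reads the page table
print(eat(0.98, 100, 1))  # ~103: a high hit ratio nearly hides the page table
```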
36. Inverted page table
✦ Reduce page table size further: keep one entry for each frame in memory
• Alternative: merge tables for pages in memory and on disk
✦ PTE contains
• Virtual address pointing to this frame
• Information about the process that owns this page
✦ Search page table by
• Hashing the virtual page number and process ID
• Starting at the entry corresponding to the hash result
• Searching until either the entry is found or a limit is reached
✦ Page frame number is index of PTE
✦ Improve performance by using more advanced hashing algorithms
37. Inverted page table architecture
One-to-one correspondence between page table entries and pages in
memory
39. Page replacement algorithms
✦ Page fault forces a choice
• No room for new page (steady state)
• Which page must be removed to make room for an incoming page?
✦ How is a page removed from physical memory?
• If the page is unmodified, simply overwrite it: a copy already exists on disk
• If the page has been modified, it must be written back to disk: prefer unmodified pages?
✦ Better not to choose an often-used page
• It’ll probably need to be brought back in soon
40. Optimal page replacement algorithm
✦ What’s the best we can possibly do?
• Assume perfect knowledge of the future
• Not realizable in practice (usually)
• Useful for comparison: if another algorithm is within 5% of optimal, not much more can be done…
✦ Algorithm: replace the page that will be used furthest in the future
• Only works if we know the whole sequence!
• Can be approximated by running the program twice
- Once to generate the reference trace
- Once (or more) to apply the optimal algorithm
✦ Nice, but not achievable in real systems!
41. Not-recently-used (NRU) algorithm
✦ Each page has a reference bit and a dirty bit
• Bits are set when page is referenced and/or modified
✦ Pages are classified into four classes
• 0: not referenced, not dirty
• 1: not referenced, dirty
• 2: referenced, not dirty
• 3: referenced, dirty
✦ Clear reference bit for all pages periodically
• Can’t clear dirty bit: needed to indicate which pages need to be flushed to disk
• Class 1 contains dirty pages where the reference bit has been cleared
✦ Algorithm: remove a page from the lowest non-empty class
• Select a page at random from that class
✦ Easy to understand and implement
✦ Performance adequate (though not optimal)
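NRU's class computation is just 2·R + D; a minimal victim-selection sketch (the dict layout and names are mine):

```python
import random

def nru_victim(pages):
    """pages maps name -> (referenced, dirty).  Evict a random
    page from the lowest non-empty class, where class = 2*R + D."""
    lowest = min(2 * r + d for r, d in pages.values())
    return random.choice(
        [name for name, (r, d) in pages.items() if 2 * r + d == lowest])

# 'b' is the only page in class 1 (not referenced, dirty), so it is evicted
print(nru_victim({'a': (1, 1), 'b': (0, 1), 'c': (1, 0)}))  # b
```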
42. First-In, First-Out (FIFO) algorithm
✦ Maintain a linked list of all pages
• Maintain the order in which they entered memory
✦ Page at front of list replaced
✦ Advantage: (really) easy to implement
✦ Disadvantage: the page in memory the longest may be often used
• This algorithm forces pages out regardless of usage
• Usage may be helpful in determining which pages to keep
43. Second chance page replacement
✦ Modify FIFO to avoid throwing out heavily used pages
• If reference bit is 0, throw the page out
• If reference bit is 1
- Reset the reference bit to 0
- Move page to the tail of the list
- Continue search for a free page
✦ Still easy to implement, and better than plain FIFO
[Figure: pages A–H in a FIFO list with load times t = 0, 4, 8, 15, 21, 22, 29, 30; referenced page A gets a second chance and moves to the tail at t = 32]
44. Clock algorithm
✦ Same functionality as second chance
✦ Simpler implementation
• “Clock” hand points to next page to replace
• If R = 0, replace page
• If R = 1, set R = 0 and advance the clock hand
✦ Continue until a page with R = 0 is found
• This may involve going all the way around the clock…
[Figure: pages A–H arranged in a circle with their load times; the clock hand points at the next replacement candidate]
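A sketch of the clock sweep in Python (the frame representation is mine; real kernels keep the R bits in the page table entries):

```python
def clock_select(frames, hand):
    """frames: list of [page, R] around the clock face.
    Returns (victim_index, new_hand), clearing R bits as the hand passes."""
    while True:
        page, r = frames[hand]
        if r == 0:
            return hand, (hand + 1) % len(frames)   # found a victim
        frames[hand][1] = 0                         # give a second chance
        hand = (hand + 1) % len(frames)

frames = [['A', 1], ['B', 0], ['C', 1]]
victim, hand = clock_select(frames, 0)
print(frames[victim][0])  # B: A's R bit was cleared, then B had R = 0
```

If every page has R = 1, the hand clears all the bits on its first pass and replaces the page it started at on the second, so the loop always terminates.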
45. Least Recently Used (LRU)
✦ Assume pages used recently will be used again soon
• Throw out the page that has been unused for the longest time
✦ Must keep a linked list of pages
• Most recently used at front, least at rear
• Update this list on every memory reference!
- This can be somewhat slow: hardware has to update a linked list on every reference!
✦ Alternatively, keep a counter in each page table entry
• Global counter increments with each CPU cycle
• Copy global counter to PTE counter on a reference to the page
• For replacement, evict the page with the lowest counter value
46. Simulating LRU in software
✦ Few computers have the necessary hardware to implement full LRU
• Linked-list method impractical in hardware
• Counter-based method could be done, but it’s slow to find the desired page
✦ Approximate LRU with Not Frequently Used (NFU) algorithm
• At each clock interrupt, scan through page table
• If R = 1 for a page, add one to its counter value
• On replacement, pick the page with the lowest counter value
✦ Problem: no notion of age, so pages with high counter values will tend to keep them!
47. Aging replacement algorithm
✦ Reduce counter values over time
• Divide by two every clock tick (use right shift)
• More weight given to more recent references!
✦ Select page to be evicted by finding the lowest counter value
✦ Algorithm is:
• Every clock tick, shift all counters right by 1 bit
• On reference, set leftmost bit of a counter (can be done by copying the reference bit to the counter at the clock tick)

Counter values per tick (leftmost bit set when the page was referenced that tick):
Page   Tick 0     Tick 1     Tick 2     Tick 3     Tick 4
0      10000000   11000000   11100000   01110000   10111000
1      00000000   10000000   01000000   00100000   00010000
2      10000000   01000000   00100000   10010000   01001000
3      00000000   00000000   00000000   10000000   01000000
4      10000000   01000000   10100000   11010000   01101000
5      10000000   11000000   01100000   10110000   11011000
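One tick of the aging algorithm can be sketched in a few lines, reproducing the Page 0 row of the table above (the data layout and names are mine):

```python
def tick(counters, referenced):
    """Age every 8-bit counter by one clock tick: shift right,
    then copy the reference bit into the leftmost position."""
    for page in counters:
        counters[page] >>= 1
        if page in referenced:
            counters[page] |= 0b10000000
    return counters

# Page 0 is referenced on ticks 0, 1, 2, and 4 (not on tick 3)
c = {0: 0}
for refs in [{0}, {0}, {0}, set(), {0}]:
    tick(c, refs)
print(f"{c[0]:08b}")  # 10111000, matching the table's tick 4 value
```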
48. Working set
✦ Demand paging: bring a page into memory when it’s requested by the process
✦ How many pages are needed?
• Could be all of them, but not likely
• Instead, processes reference a small set of pages at any given time: locality of reference
• Set of pages can be different for different processes or even at different times in the running of a single process
✦ Set of pages used by a process in a given interval of time is called the working set
• If entire working set is in memory, no page faults!
• If insufficient space for working set, thrashing may occur
• Goal: keep most of the working set in memory to minimize the number of page faults suffered by a process
49. How big is the working set?
[Figure: plot of working set size w(k,t) as a function of k]
✦ Working set is the set of pages used by the k most
recent memory references
✦ w(k,t) is the size of the working set at time t
✦ Working set may change over time
• Size of working set can change over time as well…
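A direct reading of w(k,t): the number of distinct pages among the k most recent references at time t. The reference string here is made up for illustration.

```python
# Working set of the k references ending at time t (0-indexed).

def working_set(refs, k, t):
    """Set of distinct pages touched by the k most recent references."""
    return set(refs[max(0, t - k + 1): t + 1])

refs = [1, 2, 1, 3, 2, 2, 1, 4]
print(len(working_set(refs, k=3, t=4)))  # refs 1,3,2 -> 3 pages
print(len(working_set(refs, k=3, t=5)))  # refs 3,2,2 -> 2 pages
# For fixed t, w(k,t) can only grow (or stay flat) as k increases:
print(len(working_set(refs, k=8, t=7)))  # all 4 pages
```

The same process has different working sets at different times, which is why the slide stresses that the set changes over the run.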
51. Page replacement algorithms:
summary
Algorithm                   Comment
OPT (Optimal)               Not implementable, but useful as a benchmark
NRU (Not Recently Used)     Crude
FIFO (First-In, First-Out)  Might throw out useful pages
Second chance               Big improvement over FIFO
Clock                       Better implementation of second chance
LRU (Least Recently Used)   Excellent, but hard to implement exactly
NFU (Not Frequently Used)   Poor approximation to LRU
Aging                       Good approximation to LRU, inefficient to implement
Working Set                 Somewhat expensive to implement
WSClock                     Implementable version of Working Set
52. Modeling page replacement
algorithms
✦ Goal: provide quantitative analysis (or simulation)
showing which algorithms do better
• Workload (page reference string) is important:
different strings may favor different algorithms
• Show tradeoffs between algorithms
✦ Compare algorithms to one another
✦ Model parameters within an algorithm
• Number of available physical pages
• Number of bits for aging
53. How is modeling done?
✦ Generate a list of references
• Artificial (made up)
• Trace a real workload (set of processes)
✦ Use an array (or other structure) to track the pages in physical
memory at any given time
• May keep other information per page to help simulate the algorithm
(modification time, time when paged in, etc.)
✦ Run through references, applying the replacement algorithm
✦ Example: FIFO replacement on reference string 0 1 2 3 0 1 4 0 1 2 3 4
• Page replacements marked F below (highlighted in yellow on the original slide)

Page referenced  0  1  2  3  0  1  4  0  1  2  3  4
Youngest page    0  1  2  3  0  1  4  4  4  2  3  3
                    0  1  2  3  0  1  1  1  4  2  2
Oldest page            0  1  2  3  0  0  0  1  4  4
Fault?           F  F  F  F  F  F  F  .  .  F  F  .
54. Belady’s anomaly
✦ Reduce the number of page faults by supplying more
memory
• Use previous reference string and FIFO algorithm
• Add another page to physical memory (total 4 pages)
✦ More page faults (10 vs. 9), not fewer!
• This is called Belady’s anomaly
• Adding more pages shouldn’t result in worse performance!
✦ Motivated the study of paging algorithms
Page referenced  0  1  2  3  0  1  4  0  1  2  3  4
Youngest page    0  1  2  3  3  3  4  0  1  2  3  4
                    0  1  2  2  2  3  4  0  1  2  3
                       0  1  1  1  2  3  4  0  1  2
Oldest page               0  0  0  1  2  3  4  0  1
Fault?           F  F  F  F  .  .  F  F  F  F  F  F
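The two tables can be reproduced with a few lines of simulation; this sketch counts faults for the same reference string with 3 and 4 frames, exhibiting Belady's anomaly.

```python
# FIFO page replacement: evict the page that has been resident longest.
from collections import deque

def fifo_faults(refs, frames):
    memory, faults = deque(), 0
    for page in refs:
        if page not in memory:
            faults += 1
            if len(memory) == frames:
                memory.popleft()      # evict the oldest page
            memory.append(page)       # newly loaded page is youngest
    return faults

refs = [0, 1, 2, 3, 0, 1, 4, 0, 1, 2, 3, 4]
print(fifo_faults(refs, 3))  # 9
print(fifo_faults(refs, 4))  # 10 -- more memory, more faults
```

With a stack algorithm such as LRU, adding a frame can never increase the fault count; FIFO offers no such guarantee, which is what this string demonstrates.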
55. Modeling more replacement
algorithms
✦ Paging system characterized by:
• Reference string of executing process
• Page replacement algorithm
• Number of page frames available in physical memory
(m)
✦ Model this by keeping track of all n pages
referenced in array M
• Top part of M has m pages in memory
• Bottom part of M has n-m pages stored on disk
✦ Page replacement occurs when page moves
from top to bottom
• Top and bottom parts may be rearranged without
causing movement between memory and disk
57. Stack algorithms
✦ LRU is an example of a stack algorithm
✦ For stack algorithms
• Any page in memory with m physical pages is also in
memory with m+1 physical pages
• Increasing memory size is guaranteed to reduce (or at
least not increase) the number of page faults
✦ Stack algorithms do not suffer from Belady’s anomaly
✦ Distance of a reference == position of the page in the
stack before the reference was made
• Distance is ∞ if no reference had been made before
• Distance depends on reference string and paging
algorithm: might be different for LRU and optimal (both
stack algorithms)
58. Predicting page fault rates using
distance
✦ Distance can be used to predict page fault rates
✦ Make a single pass over the reference string to
generate the distance string on-the-fly
✦ Keep an array of counts
• Entry j counts the number of times distance j occurs
in the distance string
✦ The number of page faults for a memory of size
m is the sum of the counts for j>m
• This can be done in a single pass!
• Makes for fast simulations of page replacement
algorithms
✦ This is why virtual memory theorists like stack
algorithms!
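The single-pass technique above can be sketched for LRU (a stack algorithm): compute the distance string once, then the fault count for any memory size m is the number of distances exceeding m. The short reference string is made up for illustration.

```python
# Distance string for LRU: a page's distance is its position in the
# LRU stack just before the reference (infinity if never seen).

def lru_distances(refs):
    stack, distances = [], []
    for page in refs:
        if page in stack:
            d = stack.index(page) + 1   # position before the reference
            stack.remove(page)
        else:
            d = float("inf")            # no earlier reference
        stack.insert(0, page)           # referenced page moves to top
        distances.append(d)
    return distances

def faults_for(distances, m):
    # A reference faults iff its distance exceeds the frame count m.
    return sum(1 for d in distances if d > m)

refs = [0, 1, 2, 0, 1, 3, 0]
dist = lru_distances(refs)
print(faults_for(dist, 2))  # 7 faults with 2 frames
print(faults_for(dist, 3))  # 4 faults with 3 frames
```

One pass over the reference string thus yields the fault rate for every memory size at once, which is why virtual memory theorists like stack algorithms.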
59. Local vs. global allocation policies
✦ What is the pool of pages eligible to be replaced?
• Pages belonging to the process needing a new page
• All pages in the system
✦ Local allocation: replace a page from this process
• May be more "fair": penalize processes that replace many pages
• Can lead to poor performance: some processes need more pages than others
✦ Global allocation: replace a page from any process
[Figure: pages of processes A, B, and C with their last access times; local allocation evicts the oldest page belonging to the faulting process (A4), while global allocation may evict the oldest page in the entire system]
60. Page fault rate vs. allocated
frames
✦ Local allocation may be more "fair"
• Don't penalize other processes for high page fault rate
✦ Global allocation is better for overall system performance
• Take page frames from processes that don't need them as much
• Reduce the overall page fault rate (even though rate for a
single process may go up)
[Figure: page faults/second vs. number of page frames assigned; the fault rate falls from high to low as a process is given more frames]
61. Control overall page fault rate
✦ Despite good designs, system may still thrash
✦ Most (or all) processes have high page fault rate
• Some processes need more memory, …
• but no processes need less memory (and could give
some up)
✦ Problem: no way to reduce page fault rate
✦ Solution: reduce the number of processes competing for
memory
• Swap one or more to disk, divide up pages they held
• Reconsider degree of multiprogramming
62. How big should a page be?
✦ Smaller pages have advantages
• Less internal fragmentation
• Better fit for various data structures, code sections
• Less unused physical memory (some pages have 20
useful bytes and the rest isn’t needed currently)
✦ Larger pages are better because
• Less overhead to keep track of them
- Smaller page tables
- TLB can point to more memory (same number of
pages, but more memory per page)
- Faster paging algorithms (fewer table entries to look
through)
• More efficient to transfer larger pages to and from disk
63. Separate I & D address spaces
✦ One user address space for both data & code
• Simpler
• Code/data separation harder to enforce
• More address space?
✦ One address space for data, another for code
• Code & data separated
• More complex in hardware
• Less flexible
• CPU must handle instructions & data differently
✦ MINIX does the latter
[Figure: a single combined address space from 0 to 2^32-1 holding code and data, vs. separate instruction and data address spaces]
64. Sharing pages
✦ Processes can share pages
• Entries in page tables point to the same physical
page frame
• Easier to do with code: no problems with modification
✦ Virtual addresses in different processes can be…
• The same: easier to exchange pointers, keep data
structures consistent
• Different: may be easier to actually implement
- Not a problem if there are only a few shared regions
- Can be very difficult if many processes share
regions with each other
65. When are dirty pages written to
disk?
✦ On demand (when they’re replaced)
• Fewest writes to disk
• Slower: replacement takes twice as long (must wait
for disk write and disk read)
✦ Periodically (in the background)
• Background process scans through page tables,
writes out dirty pages that are pretty old
✦ Background process also keeps a list of pages
ready for replacement
• Page faults handled faster: no need to find space on
demand
• Cleaner may use the same structures discussed
earlier (clock, etc.)
66. Implementation issues
✦ Four times when OS involved with paging
✦ Process creation
• Determine program size
• Create page table
✦ During process execution
• Reset the MMU for new process
• Flush the TLB (or reload it from saved state)
✦ Page fault time
• Determine virtual address causing fault
• Swap target page out, needed page in
✦ Process termination time
• Release page table
• Return pages to the free pool
67. How is a page fault handled?
✦ Hardware causes a page fault
✦ General registers saved (as on every exception)
✦ OS determines which virtual page needed
• Actual fault address in a special register
• Address of faulting instruction in register
- Page fault was in fetching instruction, or
- Page fault was in fetching operands for instruction
- OS must figure out which…
✦ OS checks validity of address
✦ Process killed if address was illegal
✦ OS finds a place to put new page frame
✦ If frame selected for replacement is dirty, write it out to disk
✦ OS requests the new page from disk
✦ Page tables updated
✦ Faulting instruction backed up so it can be restarted
✦ Faulting process scheduled
✦ Registers restored
✦ Program continues
68. Backing up an instruction
✦ Problem: page fault happens in the middle of
instruction execution
• Some changes may have already happened
• Others may be waiting for VM to be fixed
✦ Solution: undo all of the changes made by the
instruction
• Restart instruction from the beginning
• This is easier on some architectures than others
✦ Example: LW R1, 12(R2)
• Page fault in fetching instruction: nothing to undo
• Page fault in getting value at 12(R2): restart instruction
✦ Example: ADD (Rd)+,(Rs1)+,(Rs2)+
• Page fault in writing to (Rd): may have to undo an awful
lot…
69. Locking pages in memory
✦ Virtual memory and I/O occasionally interact
✦ P1 issues call for read from device into buffer
• While it’s waiting for I/O, P2 runs
• P2 has a page fault
• P1’s I/O buffer might be chosen to be paged out
- This can create a problem because an I/O device is
going to write to the buffer on P1’s behalf
✦ Solution: allow some pages to be locked into
memory
• Locked pages are immune from being replaced
• Pages only stay locked for (relatively) short periods
70. Storing pages on disk
✦ Pages removed from memory are stored on disk
✦ Where are they placed?
• Static swap area: easier to code, less flexible
• Dynamically allocated space: more flexible, harder to locate a page
- Dynamic placement often uses a special file (managed by the file system)
to hold pages
✦ Need to keep track of which pages are where within the on-disk
storage
[Figure: page table entries map some pages to main memory and others to disk, either into a static swap area (left) or into a dynamically allocated swap file managed by the file system (right)]
71. Separating policy and mechanism
✦ Mechanism for page replacement has to be in
kernel
• Modifying page tables
• Reading and writing page table entries
✦ Policy for deciding which pages to replace could be
in user space
• More flexibility
[Figure: external pager protocol. 1. Page fault reaches the kernel fault handler; 2. fault handler tells the user-space external pager which page is needed; 3. external pager requests the page; 4. page arrives; 5. external pager hands the page to the fault handler ("Here is the page!"); 6. fault handler asks the MMU handler to map in the page]
72. Why use segmentation?
✦ Different "units" in a single virtual address space
• Each unit can grow
• How can they be kept apart?
• Example: symbol table is out of space
✦ Solution: segmentation
• Give each unit its own address space
[Figure: one virtual address space containing the call stack, constants, source text, and symbol table; the region allocated to the symbol table has run out of room]
73. Using segments
✦ Each region of the process has its own segment
✦ Each segment can start at 0
• Addresses within the segment relative to the segment
start
✦ Virtual addresses are <segment #, offset within
segment>
[Figure: four independent segments, each starting at 0K: segment 0 holds the symbol table, segment 1 the source text, segment 2 the constants, segment 3 the call stack]
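A minimal sketch of translating a <segment #, offset> virtual address with per-segment base and limit checks. The segment numbers, bases, and limits here are made up for illustration.

```python
# Hypothetical segment table: each segment starts at virtual offset 0
# and maps to a base address with a size limit.
SEGMENTS = {
    0: {"base": 0x00000, "limit": 0x5000},  # e.g. symbol table
    1: {"base": 0x80000, "limit": 0x4000},  # e.g. source text
}

def translate(segment, offset):
    """Map <segment, offset> to a linear address, checking the limit."""
    desc = SEGMENTS[segment]
    if offset >= desc["limit"]:
        raise MemoryError("segmentation fault: offset beyond limit")
    return desc["base"] + offset

print(hex(translate(1, 0x0123)))  # 0x80123
```

Because every segment starts at 0, each unit can grow independently up to its limit without colliding with the others.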
74. Paging vs. segmentation
What?                                       Paging                 Segmentation
Need the programmer know about it?          No                     Yes
How many linear address spaces?             One                    Many
More addresses than physical memory?        Yes                    Yes
Separate protection for different objects?  Not really             Yes
Variable-sized objects handled with ease?   No                     Yes
Is sharing easy?                            No                     Yes
Why use it?                                 More address space     Break programs into
                                            without buying more    logical pieces that are
                                            memory                 handled separately
78. Memory management in the
Pentium
✦ Memory composed of segments
• Segment pointed to by segment descriptor
• Segment selector used to identify descriptor
✦ Segment descriptor describes segment
• Base virtual address
• Size
• Protection
• Code / data
79. Converting segment to linear
address
✦ Selector identifies segment descriptor
• Limited number of selectors available in the CPU
✦ Offset added to segment's base address
✦ Result is a virtual address that will be translated by paging
[Figure: the selector picks a segment descriptor holding the base, limit, and other info; the base is added to the offset to form a 32-bit linear address]
80. Translating virtual to physical
addresses
✦ Pentium uses two-level page tables
• Top level is called a "page directory" (1024 entries)
• Second level is called a "page table" (1024 entries each)
• 4 KB pages
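The field widths above can be checked with a short sketch: a 32-bit linear address splits into a 10-bit page directory index, a 10-bit page table index, and a 12-bit offset within the 4 KB page. The sample address is made up for illustration.

```python
# Split a 32-bit linear address per the two-level Pentium scheme:
# 10 bits directory index, 10 bits table index, 12 bits page offset.

def split_linear(addr):
    dir_index   = (addr >> 22) & 0x3FF   # top 10 bits -> page directory
    table_index = (addr >> 12) & 0x3FF   # next 10 bits -> page table
    offset      = addr & 0xFFF           # low 12 bits -> byte in page
    return dir_index, table_index, offset

print(split_linear(0x12345678))  # (72, 837, 1656)
```

Note that 1024 directory entries x 1024 table entries x 4 KB pages covers exactly the 4 GB (2^32 byte) linear address space.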