Virtual Memory (GalvinNotes, 9th Ed.)
Chapter 9: Virtual Memory
Chapter Objectives
 To describe the benefits of a virtual memory system.
 To explain the concepts of demand paging, page-replacement algorithms, and allocation of page frames.
 To discuss the principles of the working-set model.
 To examine the relationship between shared memory and memory-mapped files.
 To explore how kernel memory is managed.
Outline
 Background (About preceding sections, the concept of a process not having all of its pages in memory, the virtual memory concept, virtual address space, shared memory using virtual memory)
 Demand Paging:
o Basic concepts
o Performance of Demand Paging
 Copy-on-Write
 Page Replacement:
o Basic Page Replacement
o FIFO Page Replacement
o Optimal Page Replacement
o LRU Page Replacement (Algorithms: Additional-Reference-Bits, Second-Chance, Enhanced Second-Chance, Counting-
based, Page-Buffering, Applications and Page Replacement)
 Allocation of Frames:
o Minimum number of frames
o Allocation Algorithms
o Global vs Local Allocation
o Non-Uniform Memory Access
 Thrashing:
o Cause of Thrashing
o Locality Model
o Working-Set Model
o Page Fault Frequency
 Memory-Mapped Files:
o Basic Mechanism
o Shared Memory in the Win32 API
o Memory-Mapped I/O
 Allocating Kernel Memory:
o Buddy system
o Slab Allocation
 Other Considerations: Prepaging, Page size, TLB Reach, Inverted Page Tables, Program Structure, I/O Interlock and Page
Locking
 OS examples (Optional): Windows, Solaris
Content
 Preceding sections talked about how to avoid memory fragmentation by breaking a process's memory requirements down into smaller bites (pages), and storing the pages non-contiguously in memory.
 Most real processes do not need all their pages, or at least not all at once, for several reasons: Error-handling code is not needed unless that specific error occurs, some of which are quite rare. Arrays are often over-sized for worst-case scenarios, and only a small fraction of the arrays are actually used in practice. Certain features of certain programs are rarely used, such as the routine to balance the federal budget. (Me thinks this holds the key to the larger-than-physical virtual memory concept.)
 The ability to load only the portions of processes that are actually needed (and only when they are needed) has several benefits: Programs can be written for a much larger address space (virtual memory space) than physically exists on the computer. Because each process uses only a fraction of its total address space, there is more memory left for other programs, improving CPU utilization and system throughput. Less I/O is needed for swapping processes in and out of RAM, speeding things up. (Fig 9.1 shows the layout of VM.)
 Figure 9.2 shows the virtual address space, which is the programmer's logical view of process memory storage. The actual physical layout is controlled by the process's page table. Note that the address space shown in Figure 9.2 is sparse: a great hole in the middle of the address space is never used, unless the stack and/or the heap grow to fill the hole.
 Virtual memory also allows the sharing of files and memory by multiple processes, with several benefits: # System libraries can be shared by mapping them into the virtual address space of more than one process. # Processes can also share virtual memory by mapping the same block of memory to more than one process. # Process pages can be shared during a fork() system call, eliminating the need to copy all of the pages of the original (parent) process.
DEMAND PAGING
 The basic idea behind demand paging is that when a process is swapped in, its pages are not swapped in all at once. Rather, they are swapped in only when the process needs them (on demand). This is termed a lazy swapper.
 The basic idea behind paging is that when a process is swapped in, the pager only loads into memory those pages that it expects the process to need right away. Pages that are not loaded into memory are marked as invalid in the page table, using the invalid bit. (The rest of the page table entry may either be blank or contain information about where to find the swapped-out page on the hard drive.) If the process only ever accesses pages that are loaded in memory (memory-resident pages), then the process runs exactly as if all the pages were loaded into memory.
 On the other hand, if a page is needed that was not originally loaded up, then a page-fault trap is generated, which must be handled in a series of steps: The memory address requested is first checked, to make sure it was a valid memory request. If the reference was invalid, the process is terminated. Otherwise, the page must be paged in. A free frame is located, possibly from a free-frame list. A disk operation is scheduled to bring in the necessary page from disk. (This will usually block the process on an I/O wait, allowing some other process to use the CPU in the meantime.) When the I/O operation is complete, the process's page table is updated with the new frame number, and the invalid bit is changed to indicate that this is now a valid page reference. The instruction that caused the page fault must now be restarted from the beginning (as soon as this process gets another turn on the CPU).
 In an extreme case, NO pages are swapped in for a process until they are requested by page faults. This is known as pure demand paging.
 In theory each instruction could generate multiple page faults. In practice this is very rare, due to locality of reference, covered in section 9.6.1.
 The hardware necessary to support virtual memory is the same as for paging and swapping: a page table and secondary memory. (Swap space, whose allocation is discussed in chapter 12.)
 A crucial part of the process is that the instruction must be restarted from scratch once the desired page has been made available in memory. For most simple instructions this is not a major difficulty. However, there are some architectures that allow a single instruction to modify a fairly large block of data (which may span a page boundary), and if some of the data gets modified before the page fault occurs, this could cause problems. One solution is to access both ends of the block before executing the instruction, guaranteeing that the necessary pages get paged in before the instruction begins.
 Performance of Demand Paging: There are many steps that occur when servicing a page fault (see book for full details), and some of the steps are optional or variable. But just for the sake of discussion, suppose that a normal memory access requires 200 nanoseconds, and that servicing a page fault takes 8 milliseconds (8,000,000 nanoseconds, or 40,000 times a normal memory access). With a page-fault rate of p (on a scale from 0 to 1), the effective access time is now: (1 - p) * 200 + p * 8,000,000 = 200 + 7,999,800 * p, which clearly depends heavily on p! Even if only one access in 1,000 causes a page fault, the effective access time rises from 200 nanoseconds to 8.2 microseconds, a slowdown by a factor of 40. In order to keep the slowdown less than 10%, the page-fault rate must be less than 0.0000025, or one in 399,990 accesses.
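As a quick check of the arithmetic above, here is a minimal sketch in C that evaluates the effective-access-time formula for the two fault rates discussed (the 200 ns and 8 ms figures are the chapter's illustrative numbers, not constants from any real system):

    #include <stdio.h>

    int main(void) {
        const double mem_ns = 200.0;        /* normal memory access time */
        const double fault_ns = 8000000.0;  /* page-fault service time   */
        const double rates[] = { 0.001, 0.0000025 };

        for (int i = 0; i < 2; i++) {
            double p = rates[i];
            /* EAT = (1 - p) * mem + p * fault = mem + (fault - mem) * p */
            double eat = (1.0 - p) * mem_ns + p * fault_ns;
            printf("p = %.7f -> EAT = %.1f ns (slowdown %.2fx)\n",
                   p, eat, eat / mem_ns);
        }
        return 0;
    }

For p = 0.001 this prints an EAT of about 8,200 ns, matching the factor-of-40 slowdown above.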
 A subtlety is that swap space is faster to access than the regular file system, because it does not have to go through the whole directory structure. For this reason some systems will transfer an entire process from the file system to swap space before starting up the process, so that future paging all occurs from the (relatively) faster swap space.
 Some systems use demand paging directly from the file system for binary code (which never changes and hence never has to be written back out when paged out), and reserve the swap space for data segments that must be stored. This approach is used by both Solaris and BSD Unix.
Copy-on-Write:
 The idea behind a copy-on-write fork is that the pages for a parent process do not have to be actually copied for the child until one or the other of the processes changes the page. They can simply be shared between the two processes in the meantime, with a bit set indicating that the page needs to be copied if it ever gets written to. This is a reasonable approach, since the child process usually issues an exec() system call immediately after the fork (last line grey). Obviously only pages that can be modified even need to be labeled as copy-on-write; code segments can simply be shared. Pages used to satisfy copy-on-write duplications are typically allocated using zero-fill-on-demand, meaning that their previous contents are zeroed out before the copy proceeds.
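A minimal sketch of the usage pattern described above, using the standard POSIX fork()/exec() calls. The kernel, not the program, decides when pages are physically copied; this only shows the fork-then-exec pattern that makes copy-on-write pay off:

    #include <stdio.h>
    #include <stdlib.h>
    #include <unistd.h>
    #include <sys/wait.h>

    int main(void) {
        pid_t pid = fork();      /* child initially shares all parent pages (COW) */
        if (pid < 0) {
            perror("fork");
            exit(EXIT_FAILURE);
        }
        if (pid == 0) {
            /* Typical pattern: exec() right away, so almost no shared
             * pages are ever written to, and almost nothing is copied. */
            execlp("ls", "ls", "-l", (char *)NULL);
            perror("execlp");    /* only reached if exec fails */
            _exit(EXIT_FAILURE);
        }
        wait(NULL);              /* parent waits for the child */
        return 0;
    }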
 Some systems provide an alternative to the fork() system call called a virtual memory fork, vfork(). In this case the parent is suspended, and the child uses the parent's memory pages. This is very fast for process creation, but requires that the child not modify any of the shared memory pages before performing the exec() system call. (In essence this addresses the question of which process executes first after a call to fork, the parent or the child. With vfork, the parent is suspended, allowing the child to execute first until it calls exec(), sharing pages with the parent in the meantime.)
Page Replacement
 In order to make the most use of virtual memory, we load several processes into memory at the same time. Since we only load the pages that are actually needed by each process at any given time, there is room to load many more processes than if we had to load in the entire process.
 Memory is also needed for other purposes (such as I/O buffering), and if some process suddenly decides it needs more pages and there aren't any free frames available, then there are several possible solutions to consider:
o Adjust the memory used by I/O buffering, etc., to free up some frames for user processes. The decision of how to allocate memory for I/O versus user processes is a complex one, yielding different policies on different systems. (Some allocate a fixed amount for I/O, and others let the I/O system contend for memory along with everything else.)
o Put the process requesting more pages into a wait queue until some free frames become available.
o Swap some process out of memory completely, freeing up its page frames.
o Find some page in memory that isn't being used right now, and swap that page only out to disk, freeing up a frame that can be allocated to the process requesting it. This is known as page replacement, and is the most common solution. There are many different algorithms for page replacement, which is the subject of the remainder of this section.
Basic Page Replacement:
 The previously discussed page-fault processing assumed that there would be free frames available on the free-frame list. Now the page-fault handling must be modified to free up a frame if necessary, as follows:
1. Find the location of the desired page on the disk, either in swap space or in the file system.
2. Find a free frame:
a) If there is a free frame, use it.
b) If there is no free frame, use a page-replacement algorithm to select an existing frame to be replaced, known as the victim frame.
c) Write the victim frame to disk. Change all related page tables to indicate that this page is no longer in memory.
3. Read in the desired page and store it in the frame. Adjust all related page and frame tables to indicate the change.
4. Restart the process that was waiting for this page.
 Note that step 2c adds an extra disk write to the page-fault handling, effectively doubling the time required to process a page fault. This can be alleviated somewhat by assigning a modify bit, or dirty bit, to each page, indicating whether or not it has been changed since it was last loaded in from disk. If the dirty bit has not been set, then the page is unchanged, and does not need to be written out to disk. Otherwise the page write is required. It should come as no surprise that many page-replacement strategies specifically look for pages that do not have their dirty bit set, and preferentially select clean pages as victim pages. It should also be obvious that unmodifiable code pages never get their dirty bits set.
 There are two major requirements to implement a successful demand-paging system: we must develop a frame-allocation algorithm and a page-replacement algorithm. The former centers around how many frames are allocated to each process (and to other needs), and the latter deals with how to select a page for replacement when there are no free frames available. The overall goal in selecting and tuning these algorithms is to generate the fewest number of overall page faults. (Because disk access is so slow relative to memory access, even slight improvements to these algorithms can yield large improvements in overall system performance.)
 Algorithms are evaluated using a given string of memory accesses known as a reference string, which can be generated in one of (at least) three common ways:
o Randomly generated, either evenly distributed or with some distribution curve based on observed system behavior. This is the fastest and easiest approach, but may not reflect real performance well, as it ignores locality of reference.
o Specifically designed sequences. These are useful for illustrating the properties of comparative algorithms in published papers and textbooks (and also for homework and exam problems. :-) )
o Recorded memory references from a live system. This may be the best approach, but the amount of data collected can be enormous, on the order of a million addresses per second. The volume of collected data can be reduced by making two important observations:
 Only the page number that was accessed is relevant. The offset within that page does not affect paging operations.
 Successive accesses within the same page can be treated as a single page request, because all requests after the first are guaranteed to be page hits. (Since there are no intervening requests for other pages that could remove this page from the page table.)
**So for example, if pages were of size 100 bytes, then the sequence of address requests (0100, 0432, 0101, 0612, 0634, 0688, 0132, 0038, 0420) would reduce to page requests (1, 4, 1, 6, 1, 0, 4).
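A minimal sketch of that reduction in C, using the 100-byte page size from the example (real systems use power-of-two page sizes, so the division becomes a shift):

    #include <stdio.h>

    int main(void) {
        const int addrs[] = { 100, 432, 101, 612, 634, 688, 132, 38, 420 };
        const int n = sizeof(addrs) / sizeof(addrs[0]);
        const int page_size = 100;
        int last_page = -1;

        for (int i = 0; i < n; i++) {
            int page = addrs[i] / page_size;   /* drop the offset */
            if (page != last_page) {           /* collapse consecutive hits */
                printf("%d ", page);
                last_page = page;
            }
        }
        printf("\n");                          /* prints: 1 4 1 6 1 0 4 */
        return 0;
    }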
FIFO Page Replacement
 As new pages are brought in, they are added to the tail of a queue, and the page at the head of the queue is the next victim. In the following example, 20 page requests result in 15 page faults:
 Although FIFO is simple and easy, it is not always optimal, or even efficient. An interesting effect that can occur with FIFO is Belady's anomaly, in which increasing the number of frames available can actually increase the number of page faults that occur! Consider, for example, the following chart based on the page sequence (1, 2, 3, 4, 1, 2, 5, 1, 2, 3, 4, 5) and a varying number of available frames. Obviously the maximum number of faults is 12 (every request generates a fault), and the minimum number is 5 (each page loaded only once)...
 In the FIFO algorithm, whichever page has been in the frames the longest is the one that is cleared. Until Bélády's anomaly was demonstrated, it was believed that an increase in the number of page frames would always result in the same number or fewer page faults. Bélády, Nelson and Shedler constructed reference strings for which the FIFO page-replacement algorithm produced nearly twice as many page faults in a larger memory than in a smaller one (wiki).
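A minimal FIFO simulator in C that counts faults for the Belady sequence above; running it with 3 and then 4 frames reproduces the anomaly (9 faults with 3 frames, 10 with 4):

    #include <stdio.h>

    /* Count page faults for FIFO replacement with a given number of frames. */
    static int fifo_faults(const int *refs, int n, int nframes) {
        int frames[16];                 /* assumes nframes <= 16 */
        int head = 0, used = 0, faults = 0;

        for (int i = 0; i < n; i++) {
            int hit = 0;
            for (int j = 0; j < used; j++)
                if (frames[j] == refs[i]) { hit = 1; break; }
            if (!hit) {
                faults++;
                if (used < nframes) {
                    frames[used++] = refs[i];     /* free frame available */
                } else {
                    frames[head] = refs[i];       /* evict oldest (queue head) */
                    head = (head + 1) % nframes;
                }
            }
        }
        return faults;
    }

    int main(void) {
        const int refs[] = { 1, 2, 3, 4, 1, 2, 5, 1, 2, 3, 4, 5 };
        const int n = sizeof(refs) / sizeof(refs[0]);
        for (int f = 3; f <= 4; f++)
            printf("%d frames -> %d faults\n", f, fifo_faults(refs, n, f));
        return 0;
    }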
Optimal Page Replacement
 The discovery of Belady's anomaly led to the search for an optimal page-replacement algorithm, which is simply the one that yields the lowest possible page-fault rate, and which does not suffer from Belady's anomaly.
 Such an algorithm does exist, and is called OPT or MIN. This algorithm is simply "Replace the page that will not be used for the longest time in the future." (www.youtube.com/watch?v=XmdgDHhx0fg explains it clearly: look ahead into the sequence to see which number won't be required for the longest period, and page out that number.) FIFO can take 2-3 times more time than OPT/MIN.
 OPT cannot be implemented in practice, because it requires foretelling the future, but it makes a nice benchmark for the comparison and evaluation of real proposed new algorithms.
 In practice most page-replacement algorithms try to approximate OPT by predicting (estimating) in one fashion or another what page will not be used for the longest period of time. The basis of FIFO is the prediction that the page that was brought in the longest time ago is the one that will not be needed again for the longest future time, but as we shall see, there are many other prediction methods, all striving to match the performance of OPT.
LRU Page Replacement
 The prediction behind LRU, the Least Recently Used algorithm, is that the page that has not been used in the longest time is the one that will not be used again in the near future. (Note the distinction between FIFO and LRU: the former looks at the oldest load time, and the latter looks at the oldest use time.) Some view LRU as analogous to OPT, except looking backwards in time instead of forwards. (OPT has the interesting property that for any reference string S and its reverse R, OPT will generate the same number of page faults for S and for R. It turns out that LRU has this same property.) Figure 9.15 illustrates LRU for our sample string, yielding 12 page faults (as compared to 15 for FIFO and 9 for OPT).
 LRU is considered a good replacement policy, and is often used. The problem is how exactly to implement it. There are two simple approaches commonly used, with a counter-based sketch shown after this list:
o Counters: Every memory access increments a counter, and the current value of this counter is stored in the page table entry for that page. Then finding the LRU page involves simply searching the table for the page with the smallest counter value. Note that overflow of the counter must be considered.
o Stack: Another approach is to use a stack, and whenever a page is accessed, pull that page from the middle of the stack and place it on top. The LRU page will always be at the bottom of the stack. Because this requires removing objects from the middle of the stack, a doubly linked list is the recommended data structure (last line grey).
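A minimal sketch of the counter approach in C: a global "clock" of memory accesses is stamped into each resident page's entry on every reference, and the victim is the page with the smallest stamp. (A real system would maintain the stamp in hardware; here it is simulated in software.)

    #include <stdio.h>

    #define NFRAMES 3

    int main(void) {
        const int refs[] = { 7, 0, 1, 2, 0, 3, 0, 4, 2, 3 };
        const int n = sizeof(refs) / sizeof(refs[0]);
        int page[NFRAMES], used = 0, faults = 0;
        unsigned long stamp[NFRAMES];
        unsigned long clock = 0;               /* counts every memory access */

        for (int i = 0; i < n; i++) {
            clock++;
            int slot = -1;
            for (int j = 0; j < used; j++)
                if (page[j] == refs[i]) { slot = j; break; }
            if (slot < 0) {                    /* page fault */
                faults++;
                if (used < NFRAMES) {
                    slot = used++;
                } else {
                    slot = 0;                  /* victim = smallest stamp (LRU) */
                    for (int j = 1; j < NFRAMES; j++)
                        if (stamp[j] < stamp[slot]) slot = j;
                }
                page[slot] = refs[i];
            }
            stamp[slot] = clock;               /* record most recent use */
        }
        printf("%d faults\n", faults);
        return 0;
    }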
 Both implementations of LRU require hardware support, either for incrementing the counter or for managing the stack, as these operations must be performed for every memory access.
 Neither LRU nor OPT exhibits Belady's anomaly. Both belong to a class of page-replacement algorithms called stack algorithms, which can never exhibit Belady's anomaly. A stack algorithm is one in which the pages kept in memory for a frame set of size N will always be a subset of the pages kept for a frame size of N + 1. In the case of LRU (and particularly the stack implementation thereof), the top N pages of the stack will be the same for all frame set sizes of N or anything larger.
 LRU-Approximation Page Replacement: Full implementation of LRU requires hardware support, and few systems provide the full hardware support necessary. However, many systems offer some degree of HW support, enough to approximate LRU fairly well. (In the absence of ANY hardware support, FIFO might be the best available choice.) In particular, many systems provide a reference bit for every entry in a page table, which is set any time that page is accessed. Initially all bits are set to zero, and they can also all be cleared at any time. One bit of precision is enough to distinguish pages that have been accessed since the last clear from those that have not, but does not provide any finer grain of detail.
 Additional-Reference-Bits Algorithm: Finer grain is possible by storing the most recent 8 reference bits for each page in an 8-bit byte in the page table entry, which is interpreted as an unsigned int. At periodic intervals (clock interrupts), the OS takes over, and right-shifts each of the reference bytes by one bit. The high-order (leftmost) bit is then filled in with the current value of the reference bit, and the reference bits are cleared. At any given time, the page with the smallest value for the reference byte is the LRU page. Obviously the specific number of bits used and the frequency with which the reference byte is updated are adjustable, and are tuned to give the fastest performance on a given hardware platform.
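A minimal sketch of one aging step in C. The reference bits would really be set by the hardware; treating them as plain values in an array is an assumption of the simulation:

    #include <stdio.h>

    #define NPAGES 4

    int main(void) {
        unsigned char history[NPAGES] = { 0 };          /* 8-bit reference history */
        unsigned char ref_bit[NPAGES] = { 1, 0, 1, 1 }; /* hardware-set, simulated */

        /* One clock-interrupt step: shift history right, fold in reference bit. */
        for (int p = 0; p < NPAGES; p++) {
            history[p] = (unsigned char)((history[p] >> 1) | (ref_bit[p] << 7));
            ref_bit[p] = 0;                             /* clear for next interval */
        }

        /* The LRU page is the one with the smallest history value. */
        int lru = 0;
        for (int p = 1; p < NPAGES; p++)
            if (history[p] < history[lru]) lru = p;
        printf("LRU page: %d\n", lru);
        return 0;
    }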
 Second-Chance Algorithm: Imagine a pointer that moves continuously from the topmost frame to the bottom and then back again. If the pointer is at position X at a point in time, and that frame gets filled with a page from the page sequence provided, then the pointer moves to the next frame. A page's reference bit is set to 0 the first time it is paged in; any subsequent reference to that page sets its reference bit to 1. If the pointer is at a frame whose reference bit is 1, and the next reference is again to the same page as present in the current frame, the bit stays at 1 (it doesn't become 2!). A frame's content is cleaned and replaced only if the pointer is pointing to it and its reference bit is 0. If its reference bit is 1, then the next frame whose reference bit is 0 is replaced, but at the same time, the current frame's reference bit (which is currently 1) is set to zero before the pointer moves ahead to the next frame (http://www.mathcs.emory.edu/~cheung/Courses/355/Syllabus/9-virtual-mem/SC-replace.html). The book's figure is not clear at all for understanding, but it is nevertheless provided below.
 The second-chance algorithm (or clock algorithm) is essentially a FIFO, except the reference bit is used to give pages a second chance at staying in the page table. When a page must be replaced, the page table is scanned in a FIFO (circular queue) manner. If a page is found with its reference bit not set, then that page is selected as the next victim. If, however, the next page in the FIFO does have its reference bit set, then it is given a second chance: the reference bit is cleared, and the FIFO search continues. If some other page is found that did not have its reference bit set, then that page will be selected as the victim, and this page (the one being given the second chance) will be allowed to stay in the page table. If, however, there are no other pages that do not have their reference bit set (to put it simply, all have their bits set), then this page will be selected as the victim when the FIFO search circles back around to this page on the second pass. If all reference bits in the table are set, then second chance degrades to FIFO, but also requires a complete search of the table for every page replacement. As long as there are some pages whose reference bits are not set, then any page referenced frequently enough gets to stay in the page table indefinitely.
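A minimal sketch of clock victim selection in C; ref[] stands in for the hardware reference bits and the hand simply sweeps the frame array (a simulation, not a kernel implementation):

    #include <stdio.h>

    #define NFRAMES 4

    static int hand = 0;                 /* the clock hand, persists across calls */

    /* Sweep until a frame with reference bit 0 is found, clearing bits on the way. */
    static int pick_victim(int ref[]) {
        for (;;) {
            if (ref[hand] == 0) {
                int victim = hand;
                hand = (hand + 1) % NFRAMES;
                return victim;
            }
            ref[hand] = 0;               /* second chance: clear and move on */
            hand = (hand + 1) % NFRAMES;
        }
    }

    int main(void) {
        int ref[NFRAMES] = { 1, 1, 0, 1 };
        printf("victim frame: %d\n", pick_victim(ref));  /* prints 2 */
        return 0;
    }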
 Enhanced Second-Chance Algorithm: The enhanced second-chance algorithm looks at the reference bit and the modify bit (dirty bit) as an ordered pair, and classifies pages into one of four classes: (0, 0) - neither recently used nor modified. (0, 1) - not recently used, but modified. (1, 0) - recently used, but clean. (1, 1) - recently used and modified. This algorithm searches the page table in a circular fashion (in as many as four passes), looking for the first page it can find in the lowest numbered category. I.e. it first makes a pass looking for a (0, 0), and then if it can't find one, it makes another pass looking for a (0, 1), etc. The main difference between this algorithm and the previous one is the preference for replacing clean pages if possible.
 Counting-Based Page Replacement: There are several algorithms based on counting the number of references that have been made to a given page, such as: (A) Least Frequently Used, LFU - replace the page with the lowest reference count. A problem can occur if a page is used frequently initially and then not used any more, as the reference count remains high. A solution to this problem is to right-shift the counters periodically, yielding a time-decaying average reference count. (B) Most Frequently Used, MFU - replace the page with the highest reference count. The logic behind this idea is that pages that have already been referenced a lot have been in the system a long time, and we are probably done with them, whereas pages referenced only a few times have only recently been loaded, and we still need them. In general counting-based algorithms are not commonly used, as their implementation is expensive and they do not approximate OPT well.
 Page-Buffering Algorithms: There are a number of page-buffering algorithms that can be used in conjunction with the afore-mentioned algorithms, to improve overall performance and sometimes make up for inherent weaknesses in the hardware and/or the underlying page-replacement algorithms:
o Maintain a certain minimum number of free frames at all times. When a page fault occurs, go ahead and allocate one of the free frames from the free list first, to get the requesting process up and running again as quickly as possible, and then select a victim page to write to disk and free up a frame as a second step.
o Keep a list of modified pages, and when the I/O system is otherwise idle, have it write these pages out to disk, and then clear the modify bits, thereby increasing the chance of finding a "clean" page for the next potential victim.
o Keep a pool of free frames, but remember what page was in it before it was made free. Since the data in the page is not actually cleared out when the page is freed, it can be made an active page again without having to load in any new data from disk. This is useful when an algorithm mistakenly replaces a page that in fact is needed again soon.
 Some applications like database programs undertake their own memory management, overriding the general-purpose OS for data accessing and caching needs. They are often given a raw disk partition to work with, containing raw data blocks and no file-system structure.
Allocation of Frames
We said earlier that there were two important tasks in virtual memory management: a page-replacement strategy and a frame-allocation strategy. This section covers the second part of that pair.
 Minimum Number of Frames: The absolute minimum number of frames that a process must be allocated is dependent on system architecture, and corresponds to the worst-case scenario of the number of pages that could be touched by a single (machine) instruction. If an instruction (and its operands) spans a page boundary, then multiple pages could be needed just for the instruction fetch. Memory references in an instruction touch more pages, and if those memory locations can span page boundaries, then multiple pages could be needed for operand access also. The worst case involves indirect addressing, particularly where multiple levels of indirect addressing are allowed. Left unchecked, a pointer to a pointer to a pointer to a pointer to a . . . could theoretically touch every page in the virtual address space in a single machine instruction, requiring every virtual page to be loaded into physical memory simultaneously. For this reason architectures place a limit (say 16) on the number of levels of indirection allowed in an instruction, which is enforced with a counter initialized to the limit and decremented with every level of indirection in an instruction - if the counter reaches zero, then an "excessive indirection" trap occurs. This example would still require a minimum frame allocation of 17 per process.
 Allocation Algorithms:
o Equal Allocation - If there are m frames available and n processes to share them, each process gets m/n frames, and the leftovers are kept in a free-frame buffer pool.
o Proportional Allocation - Allocate the frames proportionally to the size of the process, relative to the total size of all processes. So if the size of process i is S_i, and S is the sum of all S_i, then the allocation for process P_i is a_i = m * S_i / S (a worked example follows this list). Variations on proportional allocation could consider the priority of processes rather than just their size. Obviously all allocations fluctuate over time as the number of available free frames, m, fluctuates, and all are also subject to the constraints of minimum allocation. (If the minimum allocations cannot be met, then processes must either be swapped out or not allowed to start until more free frames become available.)
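A quick worked instance of the a_i = m * S_i / S formula in C, using two illustrative process sizes of 10 and 127 pages and m = 62 frames:

    #include <stdio.h>

    int main(void) {
        const int size[] = { 10, 127 };    /* S_i: size of each process, in pages */
        const int n = 2;
        const int m = 62;                  /* total frames available */

        int total = 0;                     /* S = sum of all S_i */
        for (int i = 0; i < n; i++) total += size[i];

        /* a_i = m * S_i / S (integer truncation; leftovers go to a free pool) */
        for (int i = 0; i < n; i++)
            printf("P%d gets %d frames\n", i, m * size[i] / total);
        return 0;
    }

This prints 4 frames for the 10-page process and 57 for the 127-page process.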
 Global versus Local Allocation: One big question is whether frame allocation (page replacement) occurs on a local or global level. With local replacement, the number of pages allocated to a process is fixed, and page replacement occurs only amongst the pages allocated to this process. With global replacement, any page may be a potential victim, whether it currently belongs to the process seeking a free frame or not. Local page replacement allows processes to better control their own page-fault rates, and leads to more consistent performance of a given process over different system load levels. Global page replacement is overall more efficient, and is the more commonly used approach.
 Non-Uniform Memory Access (consolidates understanding): The above arguments all assume that all memory is equivalent, or at least has equivalent access times. This may not be the case in multiple-processor systems, especially where each CPU is physically located on a separate circuit board which also holds some portion of the overall system memory. In these systems, CPUs can access memory that is physically located on the same board much faster than the memory on the other boards. The basic solution is akin to processor affinity - at the same time that we try to schedule processes on the same CPU to minimize cache misses, we also try to allocate memory for those processes on the same boards, to minimize access times. The presence of threads complicates the picture, especially when the threads get loaded onto different processors. Solaris uses an lgroup as a solution, in a hierarchical fashion based on relative latency. For example, all processors and RAM on a single board would probably be in the same lgroup. Memory assignments are made within the same lgroup if possible, or to the next nearest lgroup otherwise. (Where "nearest" is defined as having the lowest access time.)
Thrashing
If a process cannot maintain its minimum required number of frames, then it must be swapped out, freeing up frames for other processes. This is an intermediate level of CPU scheduling. But what about a process that can keep its minimum, but cannot keep all of the frames that it is currently using on a regular basis? In this case, it is forced to page out pages that it will need again in the very near future, leading to large numbers of page faults. A process that is spending more time paging than executing is said to be thrashing.
 Cause of Thrashing: Early process-scheduling schemes would control the level of multiprogramming allowed based on CPU utilization, adding in more processes when CPU utilization was low. The problem is that when memory filled up and processes started spending lots of time waiting for their pages to page in, then CPU utilization would lower, causing the scheduler to add in even more processes and exacerbating the problem! Eventually the system would essentially grind to a halt. Local page-replacement policies can prevent one thrashing process from taking pages away from other processes, but it still tends to clog up the I/O queue, thereby slowing down any other process that needs to do even a little bit of paging (or any other I/O for that matter).
To prevent thrashing we must provide processes with as many frames as they really need "right now", but how do we know what that is? The locality model notes that processes typically access memory references in a given locality, making lots of references to the same general area of memory before moving periodically to a new locality, as shown in Figure 9.19 below. If we could just keep as many frames as are involved in the current locality, then page faulting would occur primarily on switches from one locality to another. (E.g. when one function exits and another is called.)
 Working-Set Model: The working-set model is based on the concept of locality, and defines a working-set window of length delta. Whatever pages are included in the most recent delta page references are said to be in the process's working-set window, and comprise its current working set, as illustrated in Figure 9.20. The selection of delta is critical to the success of the working-set model - if it is too small then it does not encompass all of the pages of the current locality, and if it is too large, then it encompasses pages that are no longer being frequently accessed. The total demand, D, is the sum of the sizes of the working sets for all processes. If D exceeds the total number of available frames, then at least one process is thrashing, because there are not enough frames available to satisfy its minimum working set. If D is significantly less than the currently available frames, then additional processes can be launched. The hard part of the working-set model is keeping track of what pages are in the current working set, since every reference adds one to the set and removes one older page. An approximation can be made using reference bits and a timer that goes off after a set interval of memory references: for example, suppose that we set the timer to go off after every 5,000 references (by any process), and we can store two additional historical reference bits in addition to the current reference bit. Every time the timer goes off, the current reference bit is copied to one of the two historical bits, and then cleared. If any of the three bits is set, then that page was referenced within the last 15,000 references, and is considered to be in that process's working set. Finer resolution can be achieved with more historical bits and a more frequent timer, at the expense of greater overhead.
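A minimal sketch in C that computes the exact working set of a made-up reference trace for a window of delta = 10, just to make the definition concrete (real kernels only approximate this, as described above):

    #include <stdio.h>

    #define DELTA 10      /* working-set window, in page references */
    #define MAXPAGE 8     /* pages numbered 0..MAXPAGE-1 (simulation bound) */

    int main(void) {
        const int refs[] = { 1, 2, 1, 5, 7, 7, 7, 7, 5, 1,   /* one locality */
                             3, 4, 4, 4, 3, 4, 3, 4, 4, 4 }; /* another one  */
        const int n = sizeof(refs) / sizeof(refs[0]);

        /* Working set at time t = distinct pages in the last DELTA references. */
        for (int t = DELTA; t <= n; t += DELTA) {
            int seen[MAXPAGE] = { 0 }, wss = 0;
            for (int i = t - DELTA; i < t; i++)
                if (!seen[refs[i]]) { seen[refs[i]] = 1; wss++; }
            printf("t=%2d  working-set size = %d\n", t, wss);
        }
        return 0;
    }

The first window yields a working set of {1, 2, 5, 7} (size 4); the second, {3, 4} (size 2), matching the idea that the working set shrinks and grows as the locality changes.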
 Page-Fault Frequency: A more direct approach is to recognize that what we really want to control is the page-fault rate, and to allocate frames based on this directly measurable value. If the page-fault rate exceeds a certain upper bound then that process needs more frames, and if it is below a given lower bound, then it can afford to give up some of its frames to other processes. (An Illinois professor supposes a page-replacement strategy could be devised that would select victim frames based on the process with the lowest current page-fault frequency.) Note that there is a direct relationship between the page-fault rate and the working set, as a process moves from one locality to another (unnumbered sidebar, 9th Ed).
Memory-Mapped Files
Rather than accessing data files directly via the file system with every file access, data files can be paged into memory the same as process files, resulting in much faster accesses (except of course when page faults occur). This is known as memory-mapping a file.
 Basic Mechanism: Basically a file is mapped to an address range within a process's virtual address space, and then paged in as needed using the ordinary demand-paging system. Note that file writes are made to the memory page frames, and are not immediately written out to disk. (This is the purpose of the "flush()" system call, which may also be needed for stdout in some cases.) This is also why it is important to "close()" a file when one is done writing to it - so that the data can be safely flushed out to disk and so that the memory frames can be freed up for other purposes. Some systems provide special system calls to memory-map files and use direct disk access otherwise. Other systems map the file to process address space if the special system calls are used and map the file to kernel address space otherwise, but do memory mapping in either case. File sharing is made possible by mapping the same file to the address space of more than one process, as shown in Figure 9.23 below. Copy-on-write is supported, and mutual exclusion techniques (chapter 6) may be needed to avoid synchronization problems.
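A minimal sketch of the mechanism using the POSIX mmap() call (error handling abbreviated; the file name "data.txt" is made up for illustration):

    #include <fcntl.h>
    #include <stdio.h>
    #include <string.h>
    #include <sys/mman.h>
    #include <sys/stat.h>
    #include <unistd.h>

    int main(void) {
        int fd = open("data.txt", O_RDWR);           /* hypothetical file */
        if (fd < 0) { perror("open"); return 1; }

        struct stat st;
        fstat(fd, &st);

        /* Map the whole file; reads/writes now go through page frames. */
        char *p = mmap(NULL, st.st_size, PROT_READ | PROT_WRITE,
                       MAP_SHARED, fd, 0);
        if (p == MAP_FAILED) { perror("mmap"); return 1; }

        if (st.st_size >= 5)
            memcpy(p, "HELLO", 5);                   /* a plain memory write */

        msync(p, st.st_size, MS_SYNC);               /* force dirty pages to disk */
        munmap(p, st.st_size);
        close(fd);
        return 0;
    }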
 Memory-Mapped I/O: All access to devices is done by writing into (or reading from) the device's registers. Normally this is done via special I/O instructions. For certain devices it makes sense to simply map the device's registers to addresses in the process's virtual address space, making device I/O as fast and simple as any other memory access. Video controller cards are a classic example of this. Serial and parallel devices can also use memory-mapped I/O, mapping the device registers to specific memory addresses known as I/O ports, e.g. 0xF8. Transferring a series of bytes must be done one at a time, moving only as fast as the I/O device is prepared to process the data, through one of two mechanisms:
o Programmed I/O (PIO), also known as polling - the CPU periodically checks the control bit on the device, to see if it is ready to handle another byte of data.
o Interrupt Driven - the device generates an interrupt when it either has another byte of data to deliver or is ready to receive another byte.
Allocating Kernel Memory
Previous discussions have centered on process memory, which can be conveniently broken up into page-sized chunks, where the only fragmentation that occurs is the average half-page lost to internal fragmentation for each process (segment). There is also additional memory allocated to the kernel, however, which cannot be so easily paged. Some of it is used for I/O buffering and direct access by devices, for example, and must therefore be contiguous and not affected by paging. Other memory is used for internal kernel data structures of various sizes, and since kernel memory is often locked (restricted from ever being swapped out), management of this resource must be done carefully to avoid internal fragmentation or other waste. (I.e. you would like the kernel to consume as little memory as possible, leaving as much as possible for user processes.) Accordingly there are several classic algorithms in place for allocating kernel memory structures.
 Buddy System: The buddy system allocates memory using a power-of-two allocator. Under this scheme, memory is always allocated as a power of 2 (4K, 8K, 16K, etc), rounding up to the next nearest power of two if necessary. Free lists are maintained for every size of block. If a block of the correct size is not currently available, then one is formed by splitting the next larger block in two, forming two matched buddies. (And if that larger size is not available, then the next largest available size is split, and so on, recursively.) One nice feature of the buddy system is that if the address of a block is exclusively ORed with the size of the block, the resulting address is the address of the buddy of the same size, which allows for fast and easy coalescing of free blocks back into larger blocks. When a block is freed, its buddy's address is calculated, and the free list for that size block is checked to see if the buddy is also free. If it is, then the two buddies are coalesced into one larger free block, and the process is repeated with successively larger free lists. See the (annotated) Figure 9.27 below for an example.
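A minimal illustration of the XOR trick in C: for a block of size 2^k whose offset within the arena is a multiple of 2^k, the buddy's offset is the same offset with bit k flipped (the offsets here are made up for the example):

    #include <stdio.h>

    int main(void) {
        /* Offsets of some 8 KB blocks within the allocation arena. */
        unsigned long offsets[] = { 0x0000, 0x2000, 0x4000, 0x6000 };
        unsigned long size = 0x2000;                 /* 8 KB block size */

        for (int i = 0; i < 4; i++) {
            unsigned long buddy = offsets[i] ^ size; /* flip the size bit */
            printf("block at 0x%04lx  buddy at 0x%04lx\n", offsets[i], buddy);
        }
        return 0;
    }

The output pairs 0x0000 with 0x2000 and 0x4000 with 0x6000, which is exactly the buddy pairing used for coalescing.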
 Slab Allocation: Slab allocation allocates memory to the kernel in chunks called slabs, consisting of one or more contiguous pages. The kernel then creates separate caches for each type of data structure it might need, from one or more slabs. Initially the caches are marked empty, and are marked full as they are used. New requests for space in the cache are first granted from empty or partially empty slabs, and if all slabs are full, then additional slabs are allocated. This essentially amounts to allocating space for arrays of structures, in large chunks suitable to the size of the structure being stored. For example, if a particular structure were 512 bytes long, space for them would be allocated in groups of 8 using 4K pages. If the structure were 3K, then space for 4 of them could be allocated at one time in a slab of 12K using three 4K pages. Benefits of slab allocation include lack of internal fragmentation and fast allocation of space for individual structures. Solaris uses slab allocation for the kernel and also for certain user-mode memory allocations. Linux used the buddy system prior to 2.2 and has used slab allocation since then.
Other Considerations:
 Prepaging: The basic idea behind prepaging is to predict the pages that will be needed in the near future, and page them in before they are actually requested. If a process was swapped out and we know what its working set was at the time, then when we swap it back in we can go ahead and page back in the entire working set, before the page faults actually occur. With small (data) files we can go ahead and prepage all of the pages at one time. Prepaging can be of benefit if the prediction is good and the pages are needed eventually, but slows the system down if the prediction is wrong.
 Page Size: There are quite a few trade-offs of small versus large page sizes: Small pages waste less memory due to internal fragmentation. Large pages require smaller page tables. For disk access, the latency and seek times greatly outweigh the actual data transfer times. This makes it much faster to transfer one large page of data than two or more smaller pages containing the same amount of data. Smaller pages match locality better, because we are not bringing in data that is not really needed. Small pages generate more page faults, with attendant overhead. The physical hardware may also play a part in determining page size. It is hard to determine an "optimal" page size for any given system. Current norms range from 4K to 4M, and tend towards larger page sizes as time passes.
 TLB Reach: TLB reach is defined as the amount of memory that can be reached by the pages listed in the TLB: the number of TLB entries times the page size. Ideally the working set would fit within the reach of the TLB. Increasing the size of the TLB is an obvious way of increasing TLB reach, but TLB memory is very expensive and also draws lots of power. Increasing page sizes increases TLB reach, but also leads to increased fragmentation loss. Some systems provide multiple page sizes to increase TLB reach while keeping fragmentation low. Multiple page sizes require that the TLB be managed by software, not hardware.
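A quick worked instance of that definition in C, with a hypothetical 64-entry TLB:

    #include <stdio.h>

    int main(void) {
        const unsigned long entries = 64;               /* hypothetical TLB size */
        const unsigned long page_4k = 4UL * 1024;
        const unsigned long page_4m = 4UL * 1024 * 1024;

        /* TLB reach = number of entries * page size */
        printf("reach with 4 KB pages: %lu KB\n", entries * page_4k / 1024);
        printf("reach with 4 MB pages: %lu MB\n", entries * page_4m / (1024 * 1024));
        return 0;
    }

With 4 KB pages the reach is 256 KB; with 4 MB pages it is 256 MB, which is why larger pages help.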
 Program Structure: Consider a pair of nested loops to access every element in a 1024 x 1024 two-dimensional array of 32-bit ints. Arrays in C are stored in row-major order, which means that each row of the array would occupy a page of memory (4 KB, assuming 4K pages). If the loops are nested so that the outer loop increments the row and the inner loop increments the column, then an entire row can be processed before the next page fault, yielding 1024 page faults total. On the other hand, if the loops are nested the other way, so that the program works down the columns instead of across the rows, then every access would be to a different page, yielding a new page fault for each access, or over a million page faults all together (see the sketch below). Be aware that different languages store their arrays differently. FORTRAN for example stores arrays in column-major format instead of row-major. This means that blind translation of code from one language to another may turn a fast program into a very slow one, strictly because of the extra page faults.
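The two loop orders from the paragraph above, side by side (each row is 1024 x 4 bytes = one 4 KB page, so under memory pressure the two orders differ enormously in paging behavior):

    static int data[1024][1024];   /* each row is exactly one 4 KB page */

    void fast_row_major(void) {
        for (int row = 0; row < 1024; row++)
            for (int col = 0; col < 1024; col++)
                data[row][col] = 0;        /* ~1024 page faults total */
    }

    void slow_column_major(void) {
        for (int col = 0; col < 1024; col++)
            for (int row = 0; row < 1024; row++)
                data[row][col] = 0;        /* up to 1024 * 1024 page faults */
    }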
 I/O Interlock and Page Locking: There are several occasions when it may be desirable to lock pages in memory, and not let them get paged out: Certain kernel operations cannot tolerate having their pages swapped out. If an I/O controller is doing direct memory access, it would be wrong to change pages in the middle of the I/O operation. In a priority-based scheduling system, low-priority jobs may need to wait quite a while before getting their turn on the CPU, and there is a danger of their pages being paged out before they get a chance to use them even once after paging them in. In this situation pages may be locked when they are paged in, until the process that requested them gets at least one turn on the CPU.
Operating-System Examples (Optional)
This section is only to consolidate your understanding and help revise the concepts in your mind with a real-life case study. Just read through it, no need to push yourself to memorize anything. Just map mentally what you learnt onto these real OS examples.
Windows:
 Windows uses demand paging with clustering, meaning it pages in multiple pages whenever a page fault occurs.
 The working-set minimum and maximum are normally set at 50 and 345 pages respectively. (Maximums can be exceeded in rare circumstances.)
 Free pages are maintained on a free list, with a minimum threshold indicating when there are enough free frames available.
 If a page fault occurs and the process is below its maximum, then additional pages are allocated. Otherwise some pages from this process must be replaced, using a local page-replacement algorithm.
 If the number of free frames falls below the allowable threshold, then working-set trimming occurs, taking frames away from any processes which are above their minimum, until all are at their minimums. Then additional frames can be allocated to processes that need them.
 The algorithm for selecting victim frames depends on the type of processor:
o On single-processor 80x86 systems, a variation of the clock (second chance) algorithm is used.
o On Alpha and multiprocessor systems, clearing the reference bits may require invalidating entries in the TLB on other processors, which is an expensive operation. In this case Windows uses a variation of FIFO.
Solaris:
 Solaris maintains a list of free pages, and allocates one to a faulting thread whenever a fault occurs. It is therefore imperative that a minimum amount of free memory be kept on hand at all times.
 Solaris has a parameter, lotsfree, usually set at 1/64 of total physical memory. Solaris checks 4 times per second to see if the free memory falls below this threshold, and if it does, then the pageout process is started.
 Pageout uses a variation of the clock (second chance) algorithm, with two hands rotating around through the frame table. The first hand clears the reference bits, and the second hand comes by afterwards and checks them. Any frame whose reference bit has not been reset before the second hand gets there gets paged out.
 The pageout method is adjustable by the distance between the two hands (the handspan), and the speed at which the hands move. For example, if the hands each check 100 frames per second, and the handspan is 1000 frames, then there would be a 10-second interval between the time when the leading hand clears the reference bits and the time when the trailing hand checks them.
 The speed of the hands is usually adjusted according to the amount of free memory, as shown below. Slowscan is usually set at 100 pages per second, and fastscan is usually set at the smaller of 1/2 of the total physical pages per second and 8192 pages per second.
 Solaris also maintains a cache of pages that have been reclaimed but which have not yet been overwritten, as opposed to the free list which only holds pages whose current contents are invalid. If one of the pages from the cache is needed before it gets moved to the free list, then it can be quickly recovered.
 Normally pageout runs 4 times per second to check if memory has fallen below lotsfree. However, if it falls below desfree, then pageout will run at 100 times per second in an attempt to keep at least desfree pages free. If it is unable to do this for a 30-second average, then Solaris begins swapping processes, starting preferably with processes that have been idle for a long time.
 If free memory falls below minfree, then pageout runs with every page fault.
 Recent releases of Solaris have enhanced the virtual memory management system, including recognizing pages from shared libraries, and protecting them from being paged out.
 Specifics:
 Linux-specific stuff
o XX
 Hardware-specific:
o XX
To be cleared
 Inverted Page Tables: Inverted page tables store one entry for each frame instead of one entry for each virtual page. This reduces the memory requirement for the page table, but loses the information needed to implement virtual memory paging. A solution is to keep a separate page table for each process, for virtual-memory-management purposes. These are kept on disk, and only paged in when a page fault occurs. (I.e. they are not referenced with every memory access the way a traditional page table would be.) - Grey and inadequate as of now
Q’s Later
 XXX
Glossary
Read Later
Further Reading
 Skipped: Shared Memory in the Win32 API (Memory-mapped files section. There's a figure there that says "Figure 9.26 Consumer reading from shared memory using the Win32 API")
Grey Areas
 XXX
CHEW
 Whether the logical page size is equal to the physical frame size (Yes!)
 Note that paging is like having a table of relocation registers, one for each page of the logical memory.
 Page table entries (frame numbers) are typically 32-bit numbers, allowing access to 2^32 physical page frames. If those frames are 4 KB in size each, that translates to 16 TB of addressable physical memory. (32 + 12 = 44 bits of physical address space.)
 One option is to use a set of registers for the page table. For example, the DEC PDP-11 uses 16-bit addressing and 8 KB pages, resulting in only 8 pages per process. (It takes 13 bits to address 8 KB of offset, leaving only 3 bits to define a page number.)
 On page 12 of the lecture, do the TLB math under "(Eighth Edition Version:)". Required.
 More on TLB
 Apropos page 10, second bullet point of the lecture: does it implicitly mean that the offset for both the page number and the frame number should be the same?
 Page 15: the VAX architecture divides 32-bit addresses into 4 equal-sized sections, and each page is 512 bytes, yielding an address form of 2 bits for the section, 21 bits for the page number, and 9 bits for the offset.
 What are the segmentation unit and paging unit?
 Can parts of a page table / page directory be swapped out too?
381 ccs chapter7_updated(1)
 
Distributed Operating System_3
Distributed Operating System_3Distributed Operating System_3
Distributed Operating System_3
 
Virtual memory management in Operating System
Virtual memory management in Operating SystemVirtual memory management in Operating System
Virtual memory management in Operating System
 
VirutualMemory.docx
VirutualMemory.docxVirutualMemory.docx
VirutualMemory.docx
 
Unit 2chapter 2 memory mgmt complete
Unit 2chapter 2  memory mgmt completeUnit 2chapter 2  memory mgmt complete
Unit 2chapter 2 memory mgmt complete
 
Virtual Memory
Virtual MemoryVirtual Memory
Virtual Memory
 
Virtual Memory Management
Virtual Memory ManagementVirtual Memory Management
Virtual Memory Management
 

More from marangburu42

Hol
HolHol
Write miss
Write missWrite miss
Write miss
marangburu42
 
Hennchthree 161102111515
Hennchthree 161102111515Hennchthree 161102111515
Hennchthree 161102111515
marangburu42
 
Hennchthree
HennchthreeHennchthree
Hennchthree
marangburu42
 
Hennchthree
HennchthreeHennchthree
Hennchthree
marangburu42
 
Sequential circuits
Sequential circuitsSequential circuits
Sequential circuits
marangburu42
 
Combinational circuits
Combinational circuitsCombinational circuits
Combinational circuits
marangburu42
 
Hennchthree 160912095304
Hennchthree 160912095304Hennchthree 160912095304
Hennchthree 160912095304
marangburu42
 
Sequential circuits
Sequential circuitsSequential circuits
Sequential circuits
marangburu42
 
Combinational circuits
Combinational circuitsCombinational circuits
Combinational circuits
marangburu42
 
Karnaugh mapping allaboutcircuits
Karnaugh mapping allaboutcircuitsKarnaugh mapping allaboutcircuits
Karnaugh mapping allaboutcircuits
marangburu42
 
Aac boolean formulae
Aac   boolean formulaeAac   boolean formulae
Aac boolean formulae
marangburu42
 
Io systems final
Io systems finalIo systems final
Io systems final
marangburu42
 
File system interfacefinal
File system interfacefinalFile system interfacefinal
File system interfacefinal
marangburu42
 
File systemimplementationfinal
File systemimplementationfinalFile systemimplementationfinal
File systemimplementationfinal
marangburu42
 
Mass storage structurefinal
Mass storage structurefinalMass storage structurefinal
Mass storage structurefinal
marangburu42
 
All aboutcircuits karnaugh maps
All aboutcircuits karnaugh mapsAll aboutcircuits karnaugh maps
All aboutcircuits karnaugh maps
marangburu42
 
Mainmemoryfinal 161019122029
Mainmemoryfinal 161019122029Mainmemoryfinal 161019122029
Mainmemoryfinal 161019122029
marangburu42
 
Mass storagestructure pre-final-formatting
Mass storagestructure pre-final-formattingMass storagestructure pre-final-formatting
Mass storagestructure pre-final-formatting
marangburu42
 
Mainmemoryfinalprefinal 160927115742
Mainmemoryfinalprefinal 160927115742Mainmemoryfinalprefinal 160927115742
Mainmemoryfinalprefinal 160927115742
marangburu42
 

More from marangburu42 (20)

Hol
HolHol
Hol
 
Write miss
Write missWrite miss
Write miss
 
Hennchthree 161102111515
Hennchthree 161102111515Hennchthree 161102111515
Hennchthree 161102111515
 
Hennchthree
HennchthreeHennchthree
Hennchthree
 
Hennchthree
HennchthreeHennchthree
Hennchthree
 
Sequential circuits
Sequential circuitsSequential circuits
Sequential circuits
 
Combinational circuits
Combinational circuitsCombinational circuits
Combinational circuits
 
Hennchthree 160912095304
Hennchthree 160912095304Hennchthree 160912095304
Hennchthree 160912095304
 
Sequential circuits
Sequential circuitsSequential circuits
Sequential circuits
 
Combinational circuits
Combinational circuitsCombinational circuits
Combinational circuits
 
Karnaugh mapping allaboutcircuits
Karnaugh mapping allaboutcircuitsKarnaugh mapping allaboutcircuits
Karnaugh mapping allaboutcircuits
 
Aac boolean formulae
Aac   boolean formulaeAac   boolean formulae
Aac boolean formulae
 
Io systems final
Io systems finalIo systems final
Io systems final
 
File system interfacefinal
File system interfacefinalFile system interfacefinal
File system interfacefinal
 
File systemimplementationfinal
File systemimplementationfinalFile systemimplementationfinal
File systemimplementationfinal
 
Mass storage structurefinal
Mass storage structurefinalMass storage structurefinal
Mass storage structurefinal
 
All aboutcircuits karnaugh maps
All aboutcircuits karnaugh mapsAll aboutcircuits karnaugh maps
All aboutcircuits karnaugh maps
 
Mainmemoryfinal 161019122029
Mainmemoryfinal 161019122029Mainmemoryfinal 161019122029
Mainmemoryfinal 161019122029
 
Mass storagestructure pre-final-formatting
Mass storagestructure pre-final-formattingMass storagestructure pre-final-formatting
Mass storagestructure pre-final-formatting
 
Mainmemoryfinalprefinal 160927115742
Mainmemoryfinalprefinal 160927115742Mainmemoryfinalprefinal 160927115742
Mainmemoryfinalprefinal 160927115742
 

Recently uploaded

Memory Rental Store - The Chase (Storyboard)
Memory Rental Store - The Chase (Storyboard)Memory Rental Store - The Chase (Storyboard)
Memory Rental Store - The Chase (Storyboard)
SuryaKalyan3
 
一比一原版(QUT毕业证)昆士兰科技大学毕业证成绩单如何办理
一比一原版(QUT毕业证)昆士兰科技大学毕业证成绩单如何办理一比一原版(QUT毕业证)昆士兰科技大学毕业证成绩单如何办理
一比一原版(QUT毕业证)昆士兰科技大学毕业证成绩单如何办理
zeyhe
 
Fed by curiosity and beauty - Remembering Myrsine Zorba
Fed by curiosity and beauty - Remembering Myrsine ZorbaFed by curiosity and beauty - Remembering Myrsine Zorba
Fed by curiosity and beauty - Remembering Myrsine Zorba
mariavlachoupt
 
2137ad - Characters that live in Merindol and are at the center of main stories
2137ad - Characters that live in Merindol and are at the center of main stories2137ad - Characters that live in Merindol and are at the center of main stories
2137ad - Characters that live in Merindol and are at the center of main stories
luforfor
 
2137ad Merindol Colony Interiors where refugee try to build a seemengly norm...
2137ad  Merindol Colony Interiors where refugee try to build a seemengly norm...2137ad  Merindol Colony Interiors where refugee try to build a seemengly norm...
2137ad Merindol Colony Interiors where refugee try to build a seemengly norm...
luforfor
 
Codes n Conventionss copy (2).pptx new new
Codes n Conventionss copy (2).pptx new newCodes n Conventionss copy (2).pptx new new
Codes n Conventionss copy (2).pptx new new
ZackSpencer3
 
IrishWritersCtrsPersonalEssaysMay29.pptx
IrishWritersCtrsPersonalEssaysMay29.pptxIrishWritersCtrsPersonalEssaysMay29.pptx
IrishWritersCtrsPersonalEssaysMay29.pptx
Aine Greaney Ellrott
 
一比一原版(UniSA毕业证)南澳大学毕业证成绩单如何办理
一比一原版(UniSA毕业证)南澳大学毕业证成绩单如何办理一比一原版(UniSA毕业证)南澳大学毕业证成绩单如何办理
一比一原版(UniSA毕业证)南澳大学毕业证成绩单如何办理
zeyhe
 
ashokathegreat project class 12 presentation
ashokathegreat project class 12 presentationashokathegreat project class 12 presentation
ashokathegreat project class 12 presentation
aditiyad2020
 
Inter-Dimensional Girl Boards Segment (Act 3)
Inter-Dimensional Girl Boards Segment (Act 3)Inter-Dimensional Girl Boards Segment (Act 3)
Inter-Dimensional Girl Boards Segment (Act 3)
CristianMestre
 
A Brief Introduction About Hadj Ounis
A Brief  Introduction  About  Hadj OunisA Brief  Introduction  About  Hadj Ounis
A Brief Introduction About Hadj Ounis
Hadj Ounis
 
一比一原版(GU毕业证)格里菲斯大学毕业证成绩单
一比一原版(GU毕业证)格里菲斯大学毕业证成绩单一比一原版(GU毕业证)格里菲斯大学毕业证成绩单
一比一原版(GU毕业证)格里菲斯大学毕业证成绩单
zvaywau
 
Memory Rental Store - The Ending(Storyboard)
Memory Rental Store - The Ending(Storyboard)Memory Rental Store - The Ending(Storyboard)
Memory Rental Store - The Ending(Storyboard)
SuryaKalyan3
 
一比一原版(DU毕业证)迪肯大学毕业证成绩单
一比一原版(DU毕业证)迪肯大学毕业证成绩单一比一原版(DU毕业证)迪肯大学毕业证成绩单
一比一原版(DU毕业证)迪肯大学毕业证成绩单
zvaywau
 
acting board rough title here lolaaaaaaa
acting board rough title here lolaaaaaaaacting board rough title here lolaaaaaaa
acting board rough title here lolaaaaaaa
angelicafronda7
 
一比一原版(qut毕业证)昆士兰科技大学毕业证如何办理
一比一原版(qut毕业证)昆士兰科技大学毕业证如何办理一比一原版(qut毕业证)昆士兰科技大学毕业证如何办理
一比一原版(qut毕业证)昆士兰科技大学毕业证如何办理
taqyed
 
The Last Polymath: Muntadher Saleh‎‎‎‎‎‎‎‎‎‎‎‎
The Last Polymath: Muntadher Saleh‎‎‎‎‎‎‎‎‎‎‎‎The Last Polymath: Muntadher Saleh‎‎‎‎‎‎‎‎‎‎‎‎
The Last Polymath: Muntadher Saleh‎‎‎‎‎‎‎‎‎‎‎‎
iraqartsandculture
 
ART FORMS OF KERALA: TRADITIONAL AND OTHERS
ART FORMS OF KERALA: TRADITIONAL AND OTHERSART FORMS OF KERALA: TRADITIONAL AND OTHERS
ART FORMS OF KERALA: TRADITIONAL AND OTHERS
Sandhya J.Nair
 
Caffeinated Pitch Bible- developed by Claire Wilson
Caffeinated Pitch Bible- developed by Claire WilsonCaffeinated Pitch Bible- developed by Claire Wilson
Caffeinated Pitch Bible- developed by Claire Wilson
ClaireWilson398082
 

Recently uploaded (19)

Memory Rental Store - The Chase (Storyboard)
Memory Rental Store - The Chase (Storyboard)Memory Rental Store - The Chase (Storyboard)
Memory Rental Store - The Chase (Storyboard)
 
一比一原版(QUT毕业证)昆士兰科技大学毕业证成绩单如何办理
一比一原版(QUT毕业证)昆士兰科技大学毕业证成绩单如何办理一比一原版(QUT毕业证)昆士兰科技大学毕业证成绩单如何办理
一比一原版(QUT毕业证)昆士兰科技大学毕业证成绩单如何办理
 
Fed by curiosity and beauty - Remembering Myrsine Zorba
Fed by curiosity and beauty - Remembering Myrsine ZorbaFed by curiosity and beauty - Remembering Myrsine Zorba
Fed by curiosity and beauty - Remembering Myrsine Zorba
 
2137ad - Characters that live in Merindol and are at the center of main stories
2137ad - Characters that live in Merindol and are at the center of main stories2137ad - Characters that live in Merindol and are at the center of main stories
2137ad - Characters that live in Merindol and are at the center of main stories
 
2137ad Merindol Colony Interiors where refugee try to build a seemengly norm...
2137ad  Merindol Colony Interiors where refugee try to build a seemengly norm...2137ad  Merindol Colony Interiors where refugee try to build a seemengly norm...
2137ad Merindol Colony Interiors where refugee try to build a seemengly norm...
 
Codes n Conventionss copy (2).pptx new new
Codes n Conventionss copy (2).pptx new newCodes n Conventionss copy (2).pptx new new
Codes n Conventionss copy (2).pptx new new
 
IrishWritersCtrsPersonalEssaysMay29.pptx
IrishWritersCtrsPersonalEssaysMay29.pptxIrishWritersCtrsPersonalEssaysMay29.pptx
IrishWritersCtrsPersonalEssaysMay29.pptx
 
一比一原版(UniSA毕业证)南澳大学毕业证成绩单如何办理
一比一原版(UniSA毕业证)南澳大学毕业证成绩单如何办理一比一原版(UniSA毕业证)南澳大学毕业证成绩单如何办理
一比一原版(UniSA毕业证)南澳大学毕业证成绩单如何办理
 
ashokathegreat project class 12 presentation
ashokathegreat project class 12 presentationashokathegreat project class 12 presentation
ashokathegreat project class 12 presentation
 
Inter-Dimensional Girl Boards Segment (Act 3)
Inter-Dimensional Girl Boards Segment (Act 3)Inter-Dimensional Girl Boards Segment (Act 3)
Inter-Dimensional Girl Boards Segment (Act 3)
 
A Brief Introduction About Hadj Ounis
A Brief  Introduction  About  Hadj OunisA Brief  Introduction  About  Hadj Ounis
A Brief Introduction About Hadj Ounis
 
一比一原版(GU毕业证)格里菲斯大学毕业证成绩单
一比一原版(GU毕业证)格里菲斯大学毕业证成绩单一比一原版(GU毕业证)格里菲斯大学毕业证成绩单
一比一原版(GU毕业证)格里菲斯大学毕业证成绩单
 
Memory Rental Store - The Ending(Storyboard)
Memory Rental Store - The Ending(Storyboard)Memory Rental Store - The Ending(Storyboard)
Memory Rental Store - The Ending(Storyboard)
 
一比一原版(DU毕业证)迪肯大学毕业证成绩单
一比一原版(DU毕业证)迪肯大学毕业证成绩单一比一原版(DU毕业证)迪肯大学毕业证成绩单
一比一原版(DU毕业证)迪肯大学毕业证成绩单
 
acting board rough title here lolaaaaaaa
acting board rough title here lolaaaaaaaacting board rough title here lolaaaaaaa
acting board rough title here lolaaaaaaa
 
一比一原版(qut毕业证)昆士兰科技大学毕业证如何办理
一比一原版(qut毕业证)昆士兰科技大学毕业证如何办理一比一原版(qut毕业证)昆士兰科技大学毕业证如何办理
一比一原版(qut毕业证)昆士兰科技大学毕业证如何办理
 
The Last Polymath: Muntadher Saleh‎‎‎‎‎‎‎‎‎‎‎‎
The Last Polymath: Muntadher Saleh‎‎‎‎‎‎‎‎‎‎‎‎The Last Polymath: Muntadher Saleh‎‎‎‎‎‎‎‎‎‎‎‎
The Last Polymath: Muntadher Saleh‎‎‎‎‎‎‎‎‎‎‎‎
 
ART FORMS OF KERALA: TRADITIONAL AND OTHERS
ART FORMS OF KERALA: TRADITIONAL AND OTHERSART FORMS OF KERALA: TRADITIONAL AND OTHERS
ART FORMS OF KERALA: TRADITIONAL AND OTHERS
 
Caffeinated Pitch Bible- developed by Claire Wilson
Caffeinated Pitch Bible- developed by Claire WilsonCaffeinated Pitch Bible- developed by Claire Wilson
Caffeinated Pitch Bible- developed by Claire Wilson
 

Virtualmemorypre final-formatting-161019022904

 The basic idea behind demand paging is that when a process is swapped in, its pages are not swapped in all at once. Rather, they are swapped in only when the process needs them (on demand). This is termed a lazy swapper.
 When a process is swapped in, the pager only loads into memory those pages that it expects the process to need right away. Pages that are not loaded into memory are marked as invalid in the page table, using the invalid bit. (The rest of the page-table entry may either be blank or contain information about where to find the swapped-out page on the hard drive.) If the process only ever accesses pages that are loaded in memory (memory-resident pages), then the process runs exactly as if all the pages were loaded into memory.
 On the other hand, if a page is needed that was not originally loaded up, then a page-fault trap is generated, which must be handled in a series of steps: The memory address requested is first checked, to make sure it was a valid memory request. If the reference was invalid, the process is terminated; otherwise, the page must be paged in. A free frame is located, possibly from a free-frame list, and a disk operation is scheduled to bring in the necessary page from disk. (This will usually block the process on an I/O wait, allowing some other process to use the CPU in the meantime.) When the I/O operation is complete, the process's page table is updated with the new frame number, and the invalid bit is changed to indicate that this is now a valid page reference. The instruction that caused the page fault must now be restarted from the beginning (as soon as this process gets another turn on the CPU). A minimal simulation of this valid-bit check follows.
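To make the sequence concrete, here is a small self-contained C sketch of pure demand paging: every page starts invalid and is "paged in" on first touch. The valid[] array stands in for the page table's valid/invalid bit; legality checks, frame management, and disk I/O are elided, so this is an illustration of the bookkeeping only.

#include <stdio.h>
#include <stdbool.h>

#define NPAGES 8

static bool valid[NPAGES];   /* the valid/invalid bit from the page table */
static int  faults = 0;

static void access_page(int p) {
    if (!valid[p]) {               /* page-fault trap */
        /* ...check that the reference is legal, find a free frame,
         * schedule the disk read, then update the page table... */
        valid[p] = true;           /* page is now memory resident */
        faults++;
        printf("page %d: fault, paged in\n", p);
    } else {
        printf("page %d: hit\n", p);
    }
}

int main(void) {
    int refs[] = {0, 1, 0, 2, 1, 3, 0};   /* arbitrary toy reference string */
    for (unsigned i = 0; i < sizeof refs / sizeof refs[0]; i++)
        access_page(refs[i]);
    printf("%d faults\n", faults);        /* 4: one per distinct page */
    return 0;
}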
 In an extreme case, NO pages are swapped in for a process until they are requested by page faults. This is known as pure demand paging.
 In theory each instruction could generate multiple page faults. In practice this is very rare, due to locality of reference, covered in section 9.6.1.
 The hardware necessary to support virtual memory is the same as for paging and swapping: a page table and secondary memory. (Swap space, whose allocation is discussed in chapter 12.)
 A crucial part of the process is that the instruction must be restarted from scratch once the desired page has been made available in memory. For most simple instructions this is not a major difficulty. However, some architectures allow a single instruction to modify a fairly large block of data (which may span a page boundary), and if some of the data gets modified before the page fault occurs, this could cause problems. One solution is to access both ends of the block before executing the instruction, guaranteeing that the necessary pages get paged in before the instruction begins.
 Performance of Demand Paging: There are many steps that occur when servicing a page fault (see book for full details), and some of the steps are optional or variable. But just for the sake of discussion, suppose that a normal memory access requires 200 nanoseconds, and that servicing a page fault takes 8 milliseconds (8,000,000 nanoseconds, or 40,000 times a normal memory access). With a page-fault rate of p (on a scale from 0 to 1), the effective access time is now: (1 - p) * 200 + p * 8,000,000 = 200 + 7,999,800 * p, which clearly depends heavily on p! Even if only one access in 1000 causes a page fault, the effective access time rises from 200 nanoseconds to 8.2 microseconds, a slowdown of a factor of 40. In order to keep the slowdown below 10%, the page-fault rate must be less than 0.0000025, or one fault in 399,990 accesses. (The small program below checks these numbers.)
 A subtlety is that swap space is faster to access than the regular file system, because it does not have to go through the whole directory structure. For this reason some systems will transfer an entire process from the file system to swap space before starting up the process, so that future paging all occurs from the (relatively) faster swap space.
 Some systems use demand paging directly from the file system for binary code (which never changes, and hence never needs to be written back out when paged out), and reserve the swap space for data segments that must be stored. This approach is used by both Solaris and BSD Unix.
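The effective-access-time formula is easy to sanity-check in code; this C program uses the 200 ns and 8 ms figures assumed in the text and reproduces the examples above.

#include <stdio.h>

/* effective access time in nanoseconds:
 * EAT = (1 - p) * memory_time + p * fault_time */
static double eat(double p) {
    const double mem_ns   = 200.0;       /* normal memory access      */
    const double fault_ns = 8000000.0;   /* 8 ms page-fault service   */
    return (1.0 - p) * mem_ns + p * fault_ns;
}

int main(void) {
    printf("p = 0         -> %.1f ns\n", eat(0.0));
    printf("p = 1/1000    -> %.1f ns\n", eat(0.001));      /* ~8199.8 ns */
    printf("p = 0.0000025 -> %.1f ns\n", eat(0.0000025));  /* ~220 ns, 10% over */
    return 0;
}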
Copy-on-Write:
 The idea behind a copy-on-write fork is that the pages of a parent process do not have to be actually copied for the child until one or the other of the processes changes a page. The pages can simply be shared between the two processes in the meantime, with a bit set indicating that a page needs to be copied if it ever gets written to. This is a reasonable approach, since the child process usually issues an exec() system call immediately after the fork. Obviously only pages that can be modified even need to be labeled as copy-on-write; code segments can simply be shared.
 Pages used to satisfy copy-on-write duplications are typically allocated using zero-fill-on-demand, meaning that their previous contents are zeroed out before the copy proceeds.
 Some systems provide an alternative to the fork() system call called a virtual memory fork, vfork(). In this case the parent is suspended, and the child uses the parent's memory pages. This is very fast for process creation, but requires that the child not modify any of the shared memory pages before performing the exec() system call. (In essence this addresses the question of which process executes first after a call to fork, the parent or the child. With vfork, the parent is suspended, allowing the child to execute first until it calls exec(), sharing pages with the parent in the meantime.) A short demonstration of a copy-on-write fork() follows.
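On systems whose fork() is implemented with copy-on-write (Linux and most modern Unixes), the child's write below triggers the page copy transparently; the program cannot observe the mechanism directly, only that parent and child end up with independent pages. A minimal sketch:

#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

int main(void) {
    static char buf[4096] = "parent data";   /* one page of writable data */
    pid_t pid = fork();                      /* pages shared copy-on-write */
    if (pid < 0) { perror("fork"); return 1; }
    if (pid == 0) {
        strcpy(buf, "child data");           /* first write: kernel copies the page */
        printf("child sees:  %s\n", buf);
        _exit(0);
    }
    wait(NULL);
    printf("parent sees: %s\n", buf);        /* unchanged: "parent data" */
    return 0;
}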
Page Replacement
 In order to make the most use of virtual memory, we load several processes into memory at the same time. Since we only load the pages that are actually needed by each process at any given time, there is room to load many more processes than if we had to load in each entire process.
 Memory is also needed for other purposes (such as I/O buffering), and if some process suddenly decides it needs more pages and there aren't any free frames available, then there are several possible solutions to consider:
o Adjust the memory used by I/O buffering, etc., to free up some frames for user processes. The decision of how to allocate memory for I/O versus user processes is a complex one, yielding different policies on different systems. (Some allocate a fixed amount for I/O, and others let the I/O system contend for memory along with everything else.)
o Put the process requesting more pages into a wait queue until some free frames become available.
o Swap some process out of memory completely, freeing up its page frames.
o Find some page in memory that isn't being used right now, and swap only that page out to disk, freeing up a frame that can be allocated to the process requesting it. This is known as page replacement, and is the most common solution. There are many different algorithms for page replacement, which is the subject of the remainder of this section.
Basic Page Replacement:
 The previously discussed page-fault processing assumed that there would be free frames available on the free-frame list. Now the page-fault handling must be modified to free up a frame if necessary, as follows: 1. Find the location of the desired page on the disk, either in swap space or in the file system. 2. Find a free frame: a) If there is a free frame, use it. b) If there is no free frame, use a page-replacement algorithm to select an existing frame to be replaced, known as the victim frame. c) Write the victim frame to disk, and change all related page tables to indicate that this page is no longer in memory. 3. Read in the desired page and store it in the frame. Adjust all related page and frame tables to indicate the change. 4. Restart the process that was waiting for this page.
 Note that step 2c adds an extra disk write to the page-fault handling, effectively doubling the time required to process a page fault. This can be alleviated somewhat by assigning a modify bit, or dirty bit, to each page, indicating whether or not it has been changed since it was last loaded in from disk. If the dirty bit has not been set, then the page is unchanged, and does not need to be written out to disk. Otherwise the page write is required. It should come as no surprise that many page-replacement strategies specifically look for pages that do not have their dirty bit set, and preferentially select clean pages as victim pages. It should also be obvious that unmodifiable code pages never get their dirty bits set.
 There are two major requirements to implement a successful demand-paging system: we must develop a frame-allocation algorithm and a page-replacement algorithm. The former centers around how many frames are allocated to each process (and to other needs), and the latter deals with how to select a page for replacement when there are no free frames available. The overall goal in selecting and tuning these algorithms is to generate the fewest number of overall page faults. (Because disk access is so slow relative to memory access, even slight improvements to these algorithms can yield large improvements in overall system performance.)
 Algorithms are evaluated using a given string of memory accesses known as a reference string, which can be generated in one of (at least) three common ways:
o Randomly generated, either evenly distributed or with some distribution curve based on observed system behavior. This is the fastest and easiest approach, but may not reflect real performance well, as it ignores locality of reference.
o Specifically designed sequences. These are useful for illustrating the properties of comparative algorithms in published papers and textbooks (and also for homework and exam problems).
o Recorded memory references from a live system. This may be the best approach, but the amount of data collected can be enormous, on the order of a million addresses per second. The volume of collected data can be reduced by making two important observations: only the page number that was accessed is relevant (the offset within that page does not affect paging operations), and successive accesses within the same page can be treated as a single page request, because all requests after the first are guaranteed to be page hits (since there are no intervening requests for other pages that could remove this page from the page table). So for example, if pages were of size 100 bytes, then the sequence of address requests (0100, 0432, 0101, 0612, 0634, 0688, 0132, 0038, 0420) would reduce to page requests (1, 4, 1, 6, 1, 0, 4), as the sketch below computes.
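The reduction just described (keep only page numbers, collapse consecutive repeats) is a two-line transformation; a small sketch using the text's 100-byte pages and address sequence:

#include <stdio.h>

int main(void) {
    int addrs[] = {100, 432, 101, 612, 634, 688, 132, 38, 420};
    int n = sizeof addrs / sizeof addrs[0];
    int last = -1;
    printf("reference string:");
    for (int i = 0; i < n; i++) {
        int page = addrs[i] / 100;   /* page size 100 bytes */
        if (page != last)            /* successive hits on one page collapse */
            printf(" %d", page);
        last = page;
    }
    printf("\n");                    /* prints: 1 4 1 6 1 0 4 */
    return 0;
}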
FIFO Page Replacement
 As new pages are brought in, they are added to the tail of a queue, and the page at the head of the queue is the next victim. In the book's running example, 20 page requests result in 15 page faults.
 Although FIFO is simple and easy, it is not always optimal, or even efficient. An interesting effect that can occur with FIFO is Belady's anomaly, in which increasing the number of frames available can actually increase the number of page faults that occur! Consider, for example, the page sequence (1, 2, 3, 4, 1, 2, 5, 1, 2, 3, 4, 5) with a varying number of available frames. Obviously the maximum number of faults is 12 (every request generates a fault), and the minimum number is 5 (each page loaded only once).
 In the FIFO algorithm, whichever page has been in the frames the longest is the one that is cleared. Until Bélády's anomaly was demonstrated, it was believed that an increase in the number of page frames would always result in the same number of page faults or fewer. Bélády, Nelson and Shedler constructed reference strings for which the FIFO page-replacement algorithm produced nearly twice as many page faults in a larger memory than in a smaller one (wiki). The simulator below reproduces the anomaly on the sequence above.
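Belady's anomaly is easy to reproduce: a FIFO simulator run on the reference string above yields 9 faults with three frames but 10 with four. A minimal sketch:

#include <stdio.h>

/* Count page faults for FIFO replacement with nframes frames. */
static int fifo_faults(const int *refs, int n, int nframes) {
    int frames[16], head = 0, used = 0, faults = 0;
    for (int i = 0; i < n; i++) {
        int hit = 0;
        for (int j = 0; j < used; j++)
            if (frames[j] == refs[i]) { hit = 1; break; }
        if (hit) continue;
        faults++;
        if (used < nframes) frames[used++] = refs[i];    /* fill a free frame */
        else { frames[head] = refs[i]; head = (head + 1) % nframes; } /* evict oldest */
    }
    return faults;
}

int main(void) {
    int refs[] = {1, 2, 3, 4, 1, 2, 5, 1, 2, 3, 4, 5};
    int n = sizeof refs / sizeof refs[0];
    for (int f = 3; f <= 4; f++)
        printf("%d frames -> %d faults\n", f, fifo_faults(refs, n, f));
    return 0;    /* prints 9 faults for 3 frames, 10 for 4 */
}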
Optimal Page Replacement
 The discovery of Belady's anomaly led to the search for an optimal page-replacement algorithm, which is simply the one that yields the lowest of all possible page-fault rates, and which does not suffer from Belady's anomaly.
 Such an algorithm does exist, and is called OPT or MIN. This algorithm is simply "Replace the page that will not be used for the longest time in the future." (www.youtube.com/watch?v=XmdgDHhx0fg explains it clearly: look ahead into the sequence to see which number won't be required for the longest period, and page out that number.) FIFO can produce 2-3 times more page faults than OPT/MIN.
 OPT cannot be implemented in practice, because it requires foretelling the future, but it makes a nice benchmark for the comparison and evaluation of real proposed new algorithms.
 In practice most page-replacement algorithms try to approximate OPT by predicting (estimating), in one fashion or another, which page will not be used for the longest period of time. The basis of FIFO is the prediction that the page that was brought in the longest time ago is the one that will not be needed again for the longest future time, but as we shall see, there are many other prediction methods, all striving to match the performance of OPT.
LRU Page Replacement
 The prediction behind LRU, the Least Recently Used algorithm, is that the page that has not been used in the longest time is the one that will not be used again in the near future. (Note the distinction between FIFO and LRU: the former looks at the oldest load time, and the latter looks at the oldest use time.) Some view LRU as analogous to OPT, except looking backwards in time instead of forwards. (OPT has the interesting property that for any reference string S and its reverse R, OPT will generate the same number of page faults for S and for R. It turns out that LRU has this same property.) Figure 9.15 illustrates LRU for our sample string, yielding 12 page faults (as compared to 15 for FIFO and 9 for OPT).
 LRU is considered a good replacement policy, and is often used. The problem is how exactly to implement it. There are two simple approaches commonly used:
o Counters: Every memory access increments a counter, and the current value of this counter is stored in the page-table entry for that page. Finding the LRU page then involves simply searching the table for the page with the smallest counter value. Note that overflow of the counter must be considered.
o Stack: Another approach is to use a stack, and whenever a page is accessed, pull that page from the middle of the stack and place it on the top. The LRU page will always be at the bottom of the stack. Because this requires removing objects from the middle of the stack, a doubly linked list is the recommended data structure.
 Both implementations of LRU require hardware support, either for incrementing the counter or for managing the stack, as these operations must be performed for every memory access.
 Neither LRU nor OPT exhibits Belady's anomaly. Both belong to a class of page-replacement algorithms called stack algorithms, which can never exhibit Belady's anomaly. A stack algorithm is one in which the pages kept in memory for a frame set of size N will always be a subset of the pages kept for a frame size of N + 1. In the case of LRU (and particularly the stack implementation thereof), the top N pages of the stack will be the same for all frame set sizes of N or anything larger. A counter-based LRU sketch follows.
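A sketch of the counter implementation: a global "clock" is stamped into a frame's entry on every access, and the victim is the frame with the smallest stamp. (Toy sizes; counter-overflow handling is omitted, as the text notes it must be considered. The reference string is the book's classic one, for which LRU with three frames gives the 12 faults quoted above.)

#include <stdio.h>

#define NFRAMES 3

static int  page_in[NFRAMES];          /* which page occupies each frame */
static long stamp[NFRAMES];            /* time-of-last-use per frame     */
static long clock_now = 0;
static int  used = 0, faults = 0;

static void touch(int page) {
    clock_now++;
    for (int i = 0; i < used; i++)
        if (page_in[i] == page) { stamp[i] = clock_now; return; }  /* hit */
    faults++;
    int victim = 0;
    if (used < NFRAMES) victim = used++;          /* free frame available */
    else for (int i = 1; i < NFRAMES; i++)        /* else smallest stamp  */
        if (stamp[i] < stamp[victim]) victim = i;
    page_in[victim] = page;
    stamp[victim] = clock_now;
}

int main(void) {
    int refs[] = {7, 0, 1, 2, 0, 3, 0, 4, 2, 3, 0, 3, 2, 1, 2, 0, 1, 7, 0, 1};
    for (unsigned i = 0; i < sizeof refs / sizeof refs[0]; i++)
        touch(refs[i]);
    printf("LRU faults: %d\n", faults);   /* 12 */
    return 0;
}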
 LRU-Approximation Page Replacement: Full implementation of LRU requires hardware support, and few systems provide the full hardware support necessary. However, many systems offer some degree of HW support, enough to approximate LRU fairly well. (In the absence of ANY hardware support, FIFO might be the best available choice.) In particular, many systems provide a reference bit for every entry in a page table, which is set any time that page is accessed. Initially all bits are set to zero, and they can also all be cleared at any time. One bit of precision is enough to distinguish pages that have been accessed since the last clear from those that have not, but does not provide any finer grain of detail.
 Additional-Reference-Bits Algorithm: Finer grain is possible by storing the most recent 8 reference bits for each page in an 8-bit byte in the page-table entry, which is interpreted as an unsigned int. At periodic intervals (clock interrupts), the OS takes over and right-shifts each of the reference bytes by one bit. The high-order (leftmost) bit is then filled in with the current value of the reference bit, and the reference bits are cleared. At any given time, the page with the smallest value for the reference byte is the LRU page. Obviously the specific number of bits used and the frequency with which the reference byte is updated are adjustable, and are tuned to give the fastest performance on a given hardware platform. The sketch below shows one aging tick.
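The additional-reference-bits scheme is one shift per page per timer tick. A sketch, assuming 8 history bits per page and a refbit[] array standing in for the hardware-maintained reference bits (both names are of course illustrative):

#include <stdio.h>

#define NPAGES 4

static unsigned char history[NPAGES];  /* 8 most recent reference bits */
static unsigned char refbit[NPAGES];   /* set by "hardware" on access  */

/* Run at each clock interrupt: shift history right, inject current bit. */
static void age_pages(void) {
    for (int i = 0; i < NPAGES; i++) {
        history[i] = (unsigned char)((history[i] >> 1) | (refbit[i] << 7));
        refbit[i] = 0;
    }
}

static int lru_page(void) {            /* smallest history value = LRU */
    int best = 0;
    for (int i = 1; i < NPAGES; i++)
        if (history[i] < history[best]) best = i;
    return best;
}

int main(void) {
    refbit[0] = 1; refbit[2] = 1; age_pages();   /* tick 1: pages 0,2 used */
    refbit[2] = 1;                 age_pages();  /* tick 2: only page 2    */
    for (int i = 0; i < NPAGES; i++)
        printf("page %d history %02x\n", i, history[i]);
    printf("LRU victim: page %d\n", lru_page()); /* a never-used page wins */
    return 0;
}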
 Second-Chance Algorithm: Imagine a pointer that moves continuously from the topmost frame to the bottom and then back again. If the pointer is at position X at a point in time, and that frame gets filled with a page from the page sequence provided, then the pointer moves on to the next frame. A page's reference bit is set to 0 the first time it is paged in; any further reference to that page sets its reference bit to 1. (If the pointer is at a frame whose reference bit is 1, and the next reference is again to the same page, the bit does not become 2.) A frame's content is cleaned out and replaced only if the pointer is pointing to it and its reference bit is 0. If its reference bit is 1, then the next frame whose reference bit is 0 is replaced, but at the same time the current frame's reference bit (which is currently 1) is set to zero before the pointer moves ahead to the next frame (http://www.mathcs.emory.edu/~cheung/Courses/355/Syllabus/9-virtual-mem/SC-replace.html).
 The second-chance algorithm (or clock algorithm) is essentially a FIFO, except the reference bit is used to give pages a second chance at staying in the page table. When a page must be replaced, the page table is scanned in a FIFO (circular queue) manner. If a page is found with its reference bit not set, then that page is selected as the next victim. If, however, the next page in the FIFO does have its reference bit set, then it is given a second chance: the reference bit is cleared, and the FIFO search continues. If some other page is found that did not have its reference bit set, then that page will be selected as the victim, and the page that was given the second chance will be allowed to stay in the page table. If, however, there are no other pages that do not have their reference bit set (to put it simply, all have their bits set), then this page will be selected as the victim when the FIFO search circles back around to it on the second pass. If all reference bits in the table are set, then second chance degrades to FIFO, but also requires a complete search of the table for every page replacement. As long as there are some pages whose reference bits are not set, any page referenced frequently enough gets to stay in the page table indefinitely. A sketch of the scan follows.
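A sketch of the circular scan under the rules stated above (new pages enter with reference bit 0, as in the Emory description; real implementations differ in when the bit is first set). The clearing of set bits during the sweep guarantees the while loop finds a victim within one full circuit:

#include <stdio.h>

#define NFRAMES 4

static int page_in[NFRAMES];
static int refbit[NFRAMES];
static int hand = 0, used = 0, faults = 0;

static void access_page(int page) {
    for (int i = 0; i < used; i++)
        if (page_in[i] == page) { refbit[i] = 1; return; }  /* hit: set bit */
    faults++;
    if (used < NFRAMES) {                    /* free frame: no replacement */
        page_in[used] = page; refbit[used] = 0; used++;
        return;
    }
    while (refbit[hand]) {                   /* second chance: clear, advance */
        refbit[hand] = 0;
        hand = (hand + 1) % NFRAMES;
    }
    page_in[hand] = page;                    /* victim had reference bit 0 */
    refbit[hand] = 0;
    hand = (hand + 1) % NFRAMES;
}

int main(void) {
    int refs[] = {1, 2, 3, 4, 1, 2, 5, 1, 2, 3, 4, 5};
    for (unsigned i = 0; i < sizeof refs / sizeof refs[0]; i++)
        access_page(refs[i]);
    printf("second-chance faults: %d\n", faults);
    return 0;
}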
 Enhanced Second-Chance Algorithm: The enhanced second-chance algorithm looks at the reference bit and the modify bit (dirty bit) as an ordered pair, and classifies pages into one of four classes: (0, 0) - neither recently used nor modified; (0, 1) - not recently used, but modified; (1, 0) - recently used, but clean; (1, 1) - recently used and modified. This algorithm searches the page table in a circular fashion (in as many as four passes), looking for the first page it can find in the lowest-numbered category. That is, it first makes a pass looking for a (0, 0), and then, if it can't find one, it makes another pass looking for a (0, 1), etc. The main difference between this algorithm and the previous one is the preference for replacing clean pages if possible.
 Counting-Based Page Replacement: There are several algorithms based on counting the number of references that have been made to a given page, such as: (A) Least Frequently Used, LFU - replace the page with the lowest reference count. A problem can occur if a page is used frequently initially and then not used any more, as the reference count remains high. A solution to this problem is to right-shift the counters periodically, yielding a time-decaying average reference count. (B) Most Frequently Used, MFU - replace the page with the highest reference count. The logic behind this idea is that pages that have already been referenced a lot have been in the system a long time and we are probably done with them, whereas pages referenced only a few times have only recently been loaded and we still need them. In general, counting-based algorithms are not commonly used, as their implementation is expensive and they do not approximate OPT well.
 Page-Buffering Algorithms: There are a number of page-buffering algorithms that can be used in conjunction with the afore-mentioned algorithms, to improve overall performance and sometimes make up for inherent weaknesses in the hardware and/or the underlying page-replacement algorithms:
o Maintain a certain minimum number of free frames at all times. When a page fault occurs, go ahead and allocate one of the free frames from the free list first, to get the requesting process up and running again as quickly as possible, and then select a victim page to write to disk and free up a frame as a second step.
o Keep a list of modified pages, and when the I/O system is otherwise idle, have it write these pages out to disk, and then clear the modify bits, thereby increasing the chance of finding a "clean" page for the next potential victim.
o Keep a pool of free frames, but remember what page was in each one before it was made free. Since the data in the page is not actually cleared out when the page is freed, it can be made an active page again without having to load in any new data from disk. This is useful when an algorithm mistakenly replaces a page that is in fact needed again soon.
 Some applications, such as database programs, undertake their own memory management, overriding the general-purpose OS for data accessing and caching needs. They are often given a raw disk partition to work with, containing raw data blocks and no file-system structure.
Allocation of Frames
We said earlier that there were two important tasks in virtual memory management: a page-replacement strategy and a frame-allocation strategy. This section covers the second part of that pair.
 Minimum Number of Frames: The absolute minimum number of frames that a process must be allocated is dependent on system architecture, and corresponds to the worst-case number of pages that could be touched by a single (machine) instruction. If an instruction (and its operands) spans a page boundary, then multiple pages could be needed just for the instruction fetch. Memory references in an instruction touch more pages, and if those memory locations can span page boundaries, then multiple pages could be needed for operand access also. The worst case involves indirect addressing, particularly where multiple levels of indirect addressing are allowed. Left unchecked, a pointer to a pointer to a pointer to a pointer to a . . . could theoretically touch every page in the virtual address space in a single machine instruction, requiring every virtual page to be loaded into physical memory simultaneously. For this reason architectures place a limit (say 16) on the number of levels of indirection allowed in an instruction, which is enforced with a counter initialized to the limit and decremented with every level of indirection in an instruction - if the counter reaches zero, then an "excessive indirection" trap occurs. This example would still require a minimum frame allocation of 17 per process.
 Allocation Algorithms (a worked example follows this list):
o Equal Allocation - If there are m frames available and n processes to share them, each process gets m/n frames, and the leftovers are kept in a free-frame buffer pool.
o Proportional Allocation - Allocate the frames proportionally to the size of the process, relative to the total size of all processes. So if the size of process i is S_i, and S is the sum of all S_i, then the allocation for process P_i is a_i = m * S_i / S. Variations on proportional allocation could consider the priority of each process rather than just its size. Obviously all allocations fluctuate over time as the number of available free frames, m, fluctuates, and all are also subject to the constraints of minimum allocation. (If the minimum allocations cannot be met, then processes must either be swapped out or not allowed to start until more free frames become available.)
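The proportional-allocation formula a_i = m * S_i / S in a few lines (the 62/10/127 numbers are illustrative; integer division leaves a frame or two over, which would go to the free pool):

#include <stdio.h>

int main(void) {
    int m = 62;                       /* free frames to hand out   */
    int s[] = {10, 127};              /* per-process sizes (pages) */
    int n = sizeof s / sizeof s[0], total = 0;
    for (int i = 0; i < n; i++) total += s[i];
    for (int i = 0; i < n; i++)       /* a_i = m * S_i / S         */
        printf("process %d gets %d frames\n", i, m * s[i] / total);
    return 0;                         /* 4 and 57; one frame left over */
}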
 Global versus Local Allocation: One big question is whether frame allocation (page replacement) occurs on a local or global level. With local replacement, the number of pages allocated to a process is fixed, and page replacement occurs only amongst the pages allocated to this process. With global replacement, any page may be a potential victim, whether it currently belongs to the process seeking a free frame or not. Local page replacement allows processes to better control their own page-fault rates, and leads to more consistent performance of a given process over different system load levels. Global page replacement is overall more efficient, and is the more commonly used approach.
 Non-Uniform Memory Access (consolidates understanding): The above arguments all assume that all memory is equivalent, or at least has equivalent access times. This may not be the case in multiple-processor systems, especially where each CPU is physically located on a separate circuit board which also holds some portion of the overall system memory. In such systems, CPUs can access memory that is physically located on the same board much faster than the memory on the other boards. The basic solution is akin to processor affinity - at the same time that we try to schedule processes on the same CPU to minimize cache misses, we also try to allocate memory for those processes on the same boards, to minimize access times. The presence of threads complicates the picture, especially when the threads get loaded onto different processors. Solaris uses an lgroup as a solution, in a hierarchical fashion based on relative latency. For example, all processors and RAM on a single board would probably be in the same lgroup. Memory assignments are made within the same lgroup if possible, or to the next nearest lgroup otherwise (where "nearest" is defined as having the lowest access time).
Thrashing
If a process cannot maintain its minimum required number of frames, then it must be swapped out, freeing up frames for other processes. This is an intermediate level of CPU scheduling. But what about a process that can keep its minimum, but cannot keep all of the frames that it is currently using on a regular basis? In this case, it is forced to page out pages that it will need again in the very near future, leading to large numbers of page faults. A process that is spending more time paging than executing is said to be thrashing.
 Cause of Thrashing: Early process-scheduling schemes would control the level of multiprogramming allowed based on CPU utilization, adding in more processes when CPU utilization was low. The problem is that when memory filled up and processes started spending lots of time waiting for their pages to page in, CPU utilization would lower, causing the scheduler to add in even more processes and exacerbating the problem! Eventually the system would essentially grind to a halt. Local page-replacement policies can prevent one thrashing process from taking pages away from other processes, but thrashing still tends to clog up the I/O queue, thereby slowing down any other process that needs to do even a little bit of paging (or any other I/O for that matter). To prevent thrashing we must provide processes with as many frames as they really need "right now", but how do we know what that is? The locality model notes that processes typically access memory in a given locality, making lots of references to the same general area of memory before moving periodically to a new locality, as shown in Figure 9.19. If we could just keep as many frames as are involved in the current locality, then page faulting would occur primarily on switches from one locality to another (e.g. when one function exits and another is called).
 Working-Set Model: The working-set model is based on the concept of locality, and defines a working-set window of length delta. Whatever pages are included in the most recent delta page references are said to be in the process's working-set window, and comprise its current working set, as illustrated in Figure 9.20. The selection of delta is critical to the success of the working-set model - if it is too small, then it does not encompass all of the pages of the current locality, and if it is too large, then it encompasses pages that are no longer being frequently accessed. The total demand, D, is the sum of the sizes of the working sets for all processes. If D exceeds the total number of available frames, then at least one process is thrashing, because there are not enough frames available to satisfy its minimum working set. If D is significantly less than the currently available frames, then additional processes can be launched. The hard part of the working-set model is keeping track of what pages are in the current working set, since every reference adds one to the set and removes one older page. An approximation can be made using reference bits and a timer that goes off after a set interval of memory references. For example, suppose that we set the timer to go off after every 5000 references (by any process), and we can store two additional historical reference bits in addition to the current reference bit. Every time the timer goes off, the current reference bit is copied to one of the two historical bits and then cleared. If any of the three bits is set, then that page was referenced within the last 15,000 references, and is considered to be in that process's working set. Finer resolution can be achieved with more historical bits and a more frequent timer, at the expense of greater overhead. (A small offline computation of working-set sizes appears below.)
 Page-Fault Frequency: A more direct approach is to recognize that what we really want to control is the page-fault rate, and to allocate frames based on this directly measurable value. If the page-fault rate exceeds a certain upper bound then that process needs more frames, and if it is below a given lower bound, then it can afford to give up some of its frames to other processes. (An Illinois professor suggests that a page-replacement strategy could be devised that would select victim frames based on the process with the lowest current page-fault frequency.) Note that there is a direct relationship between the page-fault rate and the working set, as a process moves from one locality to another (unnumbered sidebar, 9th Ed).
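The working set is just "the distinct pages among the last delta references"; computing it offline from a reference string makes the definition concrete (delta = 10 here, arbitrarily, and the string is made up):

#include <stdio.h>

#define DELTA 10
#define MAXPAGE 16

int main(void) {
    int refs[] = {1, 2, 1, 3, 4, 4, 4, 3, 4, 4, 1, 1, 2, 2, 2, 1, 2, 5, 5, 2};
    int n = sizeof refs / sizeof refs[0];
    for (int t = DELTA - 1; t < n; t++) {
        int seen[MAXPAGE] = {0}, ws = 0;
        for (int k = t - DELTA + 1; k <= t; k++)   /* window: last DELTA refs */
            if (!seen[refs[k]]) { seen[refs[k]] = 1; ws++; }
        printf("t=%2d working-set size = %d\n", t, ws);
    }
    return 0;
}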
Memory-Mapped Files
Rather than accessing data files directly via the file system with every file access, data files can be paged into memory the same way process pages are, resulting in much faster accesses (except of course when page faults occur). This is known as memory-mapping a file.
 Basic Mechanism: Basically a file is mapped to an address range within a process's virtual address space, and then paged in as needed using the ordinary demand-paging system. Note that file writes are made to the memory page frames, and are not immediately written out to disk. (This is the purpose of the "flush()" system call, which may also be needed for stdout in some cases.) This is also why it is important to "close()" a file when one is done writing to it - so that the data can be safely flushed out to disk, and so that the memory frames can be freed up for other purposes. Some systems provide special system calls to memory-map files and use direct disk access otherwise. Other systems map the file to process address space if the special system calls are used and map the file to kernel address space otherwise, but do memory mapping in either case. File sharing is made possible by mapping the same file to the address space of more than one process, as shown in Figure 9.23. Copy-on-write is supported, and mutual-exclusion techniques (chapter 6) may be needed to avoid synchronization problems. A minimal POSIX example follows.
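Memory-mapping a file on a POSIX system: after mmap() succeeds, the file's contents are read through ordinary pointers, with demand paging doing the I/O behind the scenes (and with MAP_SHARED plus PROT_WRITE, stores would go back to the file). A no-frills sketch:

#include <fcntl.h>
#include <stdio.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

int main(int argc, char **argv) {
    if (argc != 2) { fprintf(stderr, "usage: %s file\n", argv[0]); return 1; }
    int fd = open(argv[1], O_RDONLY);
    if (fd < 0) { perror("open"); return 1; }
    struct stat st;
    if (fstat(fd, &st) < 0) { perror("fstat"); return 1; }
    /* Map the whole file; pages are faulted in on first access. */
    char *p = mmap(NULL, st.st_size, PROT_READ, MAP_PRIVATE, fd, 0);
    if (p == MAP_FAILED) { perror("mmap"); return 1; }
    long newlines = 0;
    for (off_t i = 0; i < st.st_size; i++)   /* ordinary memory reads */
        if (p[i] == '\n') newlines++;
    printf("%ld lines\n", newlines);
    munmap(p, st.st_size);
    close(fd);
    return 0;
}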
 Memory-Mapped I/O: All access to devices is done by writing into (or reading from) the device's registers. Normally this is done via special I/O instructions. For certain devices it makes sense to simply map the device's registers to addresses in the process's virtual address space, making device I/O as fast and simple as any other memory access. Video controller cards are a classic example of this. Serial and parallel devices can also use memory-mapped I/O, mapping the device registers to specific memory addresses known as I/O ports, e.g. 0xF8. Transferring a series of bytes must be done one at a time, moving only as fast as the I/O device is prepared to process the data, through one of two mechanisms:
o Programmed I/O (PIO), also known as polling - the CPU periodically checks the control bit on the device, to see if it is ready to handle another byte of data.
o Interrupt Driven - the device generates an interrupt when it either has another byte of data to deliver or is ready to receive another byte.
Allocating Kernel Memory
Previous discussions have centered on process memory, which can be conveniently broken up into page-sized chunks, where the only fragmentation that occurs is the average half-page lost to internal fragmentation for each process (segment). There is also additional memory allocated to the kernel, however, which cannot be so easily paged. Some of it is used for I/O buffering and direct access by devices, for example, and must therefore be contiguous and not affected by paging. Other memory is used for internal kernel data structures of various sizes, and since kernel memory is often locked (restricted from ever being swapped out), management of this resource must be done carefully to avoid internal fragmentation or other waste. (That is, you would like the kernel to consume as little memory as possible, leaving as much as possible for user processes.) Accordingly there are several classic algorithms in place for allocating kernel memory structures.
 Buddy System: The buddy system allocates memory using a power-of-two allocator. Under this scheme, memory is always allocated as a power of 2 (4K, 8K, 16K, etc), rounding up to the next power of two if necessary. If a block of the correct size is not currently available, then one is formed by splitting the next larger block in two, forming two matched buddies. (And if that larger size is not available, then the next largest available size is split, and so on.) One nice feature of the buddy system is that if the address of a block is exclusively ORed with the size of the block, the resulting address is the address of the buddy of the same size, which allows for fast and easy coalescing of free blocks back into larger blocks (demonstrated below). Free lists are maintained for every size block. If the necessary block size is not available upon request, a free block from the next largest size is split into two buddies of the desired size (recursively splitting larger blocks if necessary). When a block is freed, its buddy's address is calculated, and the free list for that size block is checked to see if the buddy is also free. If it is, then the two buddies are coalesced into one larger free block, and the process is repeated with successively larger free lists. See the (annotated) Figure 9.27 for an example.
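The exclusive-OR trick: for a block of size 2^k whose offset within the managed region is a multiple of 2^k, buddy = offset ^ size. A tiny demonstration:

#include <stdio.h>

int main(void) {
    /* Offsets within the managed region; block sizes are powers of two. */
    unsigned offsets[] = {0, 4096, 16384};
    unsigned size = 4096;                       /* 4K blocks */
    for (int i = 0; i < 3; i++)
        printf("block at %6u, size %u -> buddy at %6u\n",
               offsets[i], size, offsets[i] ^ size);
    return 0;   /* buddies pair up: (0, 4096) and (16384, 20480) */
}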
 Slab Allocation: Slab allocation allocates memory to the kernel in chunks called slabs, consisting of one or more contiguous pages. The kernel then creates separate caches for each type of data structure it might need, from one or more slabs. Initially the caches are marked empty, and are marked full as they are used. New requests for space in a cache are first granted from empty or partially empty slabs, and if all slabs are full, then additional slabs are allocated. This essentially amounts to allocating space for arrays of structures, in large chunks suitable to the size of the structure being stored. For example, if a particular structure were 512 bytes long, space would be allocated in groups of 8 using 4K pages. If the structure were 3K, then space for 4 of them could be allocated at one time in a slab of 12K using three 4K pages. Benefits of slab allocation include a lack of internal fragmentation and fast allocation of space for individual structures. Solaris uses slab allocation for the kernel and also for certain user-mode memory allocations. Linux used the buddy system prior to 2.2 and has used slab allocation since then.
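A userspace toy with the flavor of a slab cache (this illustrates the idea only; it is not the kernel's actual slab API): one slab, a contiguous chunk, is carved into equal-sized objects threaded on a free list, so allocation and free are O(1) pointer pops and pushes with no per-object internal fragmentation.

#include <stdio.h>
#include <stdlib.h>

struct obj { struct obj *next; char payload[60]; };  /* fixed-size objects */

#define SLAB_OBJS 64

static struct obj *free_list;

static void slab_init(void) {
    struct obj *slab = malloc(SLAB_OBJS * sizeof(struct obj)); /* one "slab" */
    if (!slab) { perror("malloc"); exit(1); }
    for (int i = 0; i < SLAB_OBJS; i++) {       /* thread objects on free list */
        slab[i].next = free_list;
        free_list = &slab[i];
    }
}

static struct obj *cache_alloc(void) {          /* pop: O(1), no searching */
    struct obj *o = free_list;
    if (o) free_list = o->next;
    return o;
}

static void cache_free(struct obj *o) {         /* push back onto the list */
    o->next = free_list;
    free_list = o;
}

int main(void) {
    slab_init();
    struct obj *a = cache_alloc(), *b = cache_alloc();
    printf("a=%p b=%p\n", (void *)a, (void *)b);
    cache_free(a);
    cache_free(b);
    return 0;
}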
Other Considerations:
 Prepaging: The basic idea behind prepaging is to predict the pages that will be needed in the near future, and page them in before they are actually requested. If a process was swapped out and we know what its working set was at the time, then when we swap it back in we can go ahead and page the entire working set back in, before the page faults actually occur. With small (data) files we can go ahead and prepage all of the pages at one time. Prepaging can be of benefit if the prediction is good and the pages are needed eventually, but it slows the system down if the prediction is wrong.
 Page Size: There are quite a few trade-offs between small and large page sizes: Small pages waste less memory due to internal fragmentation. Large pages require smaller page tables. For disk access, the latency and seek times greatly outweigh the actual data transfer times, which makes it much faster to transfer one large page of data than two or more smaller pages containing the same amount of data. Smaller pages match locality better, because we are not bringing in data that is not really needed. Small pages generate more page faults, with their attendant overhead. The physical hardware may also play a part in determining page size. It is hard to determine an "optimal" page size for any given system. Current norms range from 4K to 4M, and tend towards larger page sizes as time passes.
 TLB Reach: TLB reach is defined as the amount of memory that can be reached by the pages listed in the TLB. Ideally the working set would fit within the reach of the TLB. Increasing the size of the TLB is an obvious way of increasing TLB reach, but TLB memory is very expensive and also draws lots of power. Increasing page sizes increases TLB reach, but also leads to increased fragmentation loss. Some systems provide multiple page sizes to increase TLB reach while keeping fragmentation low; multiple page sizes require that the TLB be managed by software, not hardware.
 Program Structure: Consider a pair of nested loops to access every element in a 1024 x 1024 two-dimensional array of 32-bit ints. Arrays in C are stored in row-major order, which means that each row of the array would occupy a page of memory. If the loops are nested so that the outer loop increments the row and the inner loop increments the column, then an entire row can be processed before the next page fault, yielding 1024 page faults total. On the other hand, if the loops are nested the other way, so that the program works down the columns instead of across the rows, then every access would be to a different page, yielding a new page fault for each access, or over a million page faults all together. Both loop orders are written out below.
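The two loop orders from the example, written out. Both compute the same sum; only the order in which pages are touched differs:

#include <stdio.h>

#define N 1024

int main(void) {
    /* 1024 x 1024 ints = 4 MB; static so the array lives in the data
       segment rather than blowing the stack. */
    static int data[N][N];
    long sum = 0;

    /* Row-major friendly: the inner loop walks consecutive addresses,
       touching each page many times before moving on (~1024 faults). */
    for (int row = 0; row < N; row++)
        for (int col = 0; col < N; col++)
            sum += data[row][col];

    /* Column order: each access lands on a different row, hence a
       different page - potentially a fault per access when memory is tight. */
    for (int col = 0; col < N; col++)
        for (int row = 0; row < N; row++)
            sum += data[row][col];

    printf("%ld\n", sum);
    return 0;
}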
 I/O Interlock and Page Locking: There are several occasions when it may be desirable to lock pages in memory and not let them get paged out: Certain kernel operations cannot tolerate having their pages swapped out. If an I/O controller is doing direct memory access, it would be wrong to change pages in the middle of the I/O operation. In a priority-based scheduling system, low-priority jobs may need to wait quite a while before getting their turn on the CPU, and there is a danger of their pages being paged out before they get a chance to use them even once after paging them in. In this situation pages may be locked when they are paged in, until the process that requested them gets at least one turn on the CPU.
Operating-System Examples (Optional)
This section is only to consolidate your understanding and help revise the concepts in your mind with a real-life case study. Just read through it; no need to push yourself to memorize anything. Just map mentally what you learnt onto these real OS examples.
Windows:
 Windows uses demand paging with clustering, meaning it pages in multiple pages whenever a page fault occurs.
 The working-set minimum and maximum are normally set at 50 and 345 pages respectively. (Maximums can be exceeded in rare circumstances.)
 Free pages are maintained on a free list, with a minimum threshold indicating when there are enough free frames available.
 If a page fault occurs and the process is below its maximum, then additional pages are allocated. Otherwise some pages from this process must be replaced, using a local page-replacement algorithm.
 If the number of free frames falls below the allowable threshold, then working-set trimming occurs, taking frames away from any processes which are above their minimum, until all are at their minimums. Then additional frames can be allocated to processes that need them.
 The algorithm for selecting victim frames depends on the type of processor:
o On single-processor 80x86 systems, a variation of the clock (second-chance) algorithm is used.
o On Alpha and multiprocessor systems, clearing the reference bits may require invalidating entries in the TLB on other processors, which is an expensive operation. In this case Windows uses a variation of FIFO.
Solaris:
 Solaris maintains a list of free pages, and allocates one to a faulting thread whenever a fault occurs. It is therefore imperative that a minimum amount of free memory be kept on hand at all times.
 Solaris has a parameter, lotsfree, usually set at 1/64 of total physical memory. Solaris checks 4 times per second to see if the free memory falls below this threshold, and if it does, then the pageout process is started.
 Pageout uses a variation of the clock (second-chance) algorithm, with two hands rotating around through the frame table. The first hand clears the reference bits, and the second hand comes by afterwards and checks them. Any frame whose reference bit has not been set again before the second hand gets there gets paged out. (A sketch of the two-handed scan appears below.)
 The pageout method is adjustable by the distance between the two hands (the handspan) and the speed at which the hands move. For example, if the hands each check 100 frames per second and the handspan is 1000 frames, then there would be a 10-second interval between the time when the leading hand clears the reference bits and the time when the trailing hand checks them.
 The speed of the hands is usually adjusted according to the amount of free memory. Slowscan is usually set at 100 pages per second, and fastscan is usually set at the smaller of half of the total physical pages per second and 8192 pages per second.
 Solaris also maintains a cache of pages that have been reclaimed but which have not yet been overwritten, as opposed to the free list, which only holds pages whose current contents are invalid. If one of the pages from the cache is needed before it gets moved to the free list, then it can be quickly recovered.
 Normally pageout runs 4 times per second to check if memory has fallen below lotsfree. However, if it falls below desfree, then pageout will run at 100 times per second in an attempt to keep at least desfree pages free. If it is unable to do this for a 30-second average, then Solaris begins swapping processes, starting preferably with processes that have been idle for a long time.
 If free memory falls below minfree, then pageout runs with every page fault.
 Recent releases of Solaris have enhanced the virtual memory management system, including recognizing pages from shared libraries and protecting them from being paged out.
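For intuition, here is a minimal C sketch of the two-handed clock scan. This is not Solaris source; the frame-table size, handspan value, and the page_out stub are all invented for illustration. The front hand clears reference bits, and the back hand, handspan frames behind, evicts frames whose bit is still clear when it arrives:

#include <stdbool.h>

#define NFRAMES  8192
#define HANDSPAN 1000   /* frames between the two hands (assumed value) */

static bool referenced[NFRAMES];   /* reference bits, set by "hardware" */
static int  evictions;             /* stands in for real page-out work */

static void page_out(int frame)
{
    (void)frame;
    evictions++;
}

/* One step of the two-handed clock: advance both hands by one frame. */
static void two_handed_clock_step(int *front)
{
    int back = (*front - HANDSPAN + NFRAMES) % NFRAMES;

    /* Front hand clears the reference bit; if the page is touched again
     * before the back hand arrives, the bit is set once more. */
    referenced[*front] = false;

    /* Back hand evicts frames that were not referenced in the interval. */
    if (!referenced[back])
        page_out(back);

    *front = (*front + 1) % NFRAMES;
}

int main(void)
{
    for (int f = 0; f < NFRAMES; f++)
        referenced[f] = true;      /* pretend every frame was recently used */

    int front = 0;
    for (int i = 0; i < NFRAMES + HANDSPAN; i++)
        two_handed_clock_step(&front);
    return 0;
}

With the scan rate and handspan as tunables, this structure gives exactly the behavior described above: a frame survives only if it is referenced during the window between the two hands.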
 Specifics:
 Linux-specific stuff:
o XX
 Hardware-specific:
o XX
To be cleared
 Inverted Page Tables: Inverted page tables store one entry for each frame instead of one entry for each virtual page. This reduces the memory requirement for the page table, but loses the information needed to implement virtual memory paging. A solution is to keep a separate page table for each process, for virtual memory management purposes. These are kept on disk, and only paged in when a page fault occurs (i.e. they are not referenced with every memory access the way a traditional page table would be). (Grey and inadequate as of now; a lookup sketch appears at the end of this section.)
Q's Later
 XXX
Glossary
Read Later
Further Reading
 Skipped: Shared Memory in the Win32 API (Memory-mapped files section. There's a figure there that says "Figure 9.26 Consumer reading from shared memory using the Win32 API")
Grey Areas
 XXX CHEW
 Whether the logical page size is equal to the physical frame size (Yes!)
 Note that paging is like having a table of relocation registers, one for each page of the logical memory.
 Page-table entries (frame numbers) are typically 32-bit numbers, allowing access to 2^32 physical page frames. If those frames are 4 KB in size each, that translates to 16 TB of addressable physical memory (32 + 12 = 44 bits of physical address space).
 One option is to use a set of registers for the page table. For example, the DEC PDP-11 uses 16-bit addressing and 8 KB pages, resulting in only 8 pages per process. (It takes 13 bits to address 8 KB of offset, leaving only 3 bits to define a page number.)
 On page 12 of the lecture, do the TLB math under "(Eighth Edition Version:)". Required.
 More on TLB.
 Apropos page 10, second bullet point of the lecture: does it implicitly mean that the offset for both the page number and the frame number should be the same?
 Page 15: The VAX architecture divides 32-bit addresses into 4 equal-sized sections, and each page is 512 bytes, yielding an address form of 2 bits of section number, 21 bits of page number, and 9 bits of offset. (4 sections need 2 bits; a 512-byte page needs 9 offset bits; 32 - 2 - 9 = 21 bits remain for the page number.)
 What are the segmentation unit and paging unit?
 Can parts of a page table / page directory be swapped out too?
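Apropos the inverted-page-table note under "To be cleared" above: a minimal C sketch of the lookup (the table size, entry layout, and linear search are assumptions for illustration; real designs hash on the pid/page pair rather than scanning):

#include <stdint.h>

#define NFRAMES 4096

/* One entry per physical frame: which (process, virtual page) owns it. */
struct ipt_entry {
    int32_t  pid;    /* -1 if the frame is free */
    uint32_t vpage;  /* virtual page number mapped into this frame */
};

static struct ipt_entry ipt[NFRAMES];

/* Search the inverted table for (pid, vpage). Returns the frame number
 * holding that page, or -1 to signal a page fault. The frame index
 * itself is the translation, which is why one entry per frame suffices. */
static int ipt_lookup(int32_t pid, uint32_t vpage)
{
    for (int frame = 0; frame < NFRAMES; frame++)
        if (ipt[frame].pid == pid && ipt[frame].vpage == vpage)
            return frame;
    return -1;   /* not resident: page fault */
}

int main(void)
{
    for (int f = 0; f < NFRAMES; f++)
        ipt[f].pid = -1;            /* all frames start free */
    ipt[7] = (struct ipt_entry){ .pid = 42, .vpage = 100 };
    return ipt_lookup(42, 100) == 7 ? 0 : 1;
}

The sketch also shows why the per-process page tables mentioned above are still needed: the inverted table answers "who owns this frame?" cheaply, but reconstructing "where are all of process P's pages?" from it requires a full scan.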