Operating Systems - Architecture
From the Operating Systems course (CMPSCI 377) at UMass Amherst, Fall 2007.

Published in: Technology, Education
Transcript

  • 1. Operating Systems CMPSCI 377 Architecture Emery Berger University of Massachusetts Amherst UNIVERSITY OF MASSACHUSETTS AMHERST • Department of Computer Science
  • 2. Architecture
      - Hardware support for applications & OS
      - Architecture basics & details
      - Focus on characteristics exposed to the application programmer / OS
  • 3. The Memory Hierarchy
      - Registers
      - Caches
      - Associativity
      - Misses
      - Locality
  • 4. Registers
      - Register = dedicated name for one word of memory managed by the CPU
      - General-purpose: “AX”, “BX”, “CX” on x86
      - Special-purpose:
          - “SP” = stack pointer
          - “FP” = frame pointer
          - “PC” = program counter
      - [Figure: stack diagram with SP and FP marking a frame that holds arg0, arg1, arg2]
  • 5. Registers
      - Register = dedicated name for one word of memory managed by the CPU
      - General-purpose: “AX”, “BX”, “CX” on x86
      - Special-purpose:
          - “SP” = stack pointer
          - “FP” = frame pointer
          - “PC” = program counter
      - Change processes: save current registers & load saved registers = context switch
      - [Figure: stack diagram with SP and FP marking a frame that holds arg0, arg1]
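The context switch described on slide 5 (save the current registers, load the saved ones) can be sketched as a toy model. This is only an illustration: the `CPU` and `Process` classes and the `context_switch` function are hypothetical names, and a real OS does this in assembly on hardware registers rather than on a Python dictionary.

```python
from dataclasses import dataclass, field


@dataclass
class Process:
    """Hypothetical process record holding a saved copy of the registers."""
    name: str
    saved: dict = field(default_factory=lambda: {"SP": 0, "FP": 0, "PC": 0})


class CPU:
    """Toy CPU with only the three special-purpose registers from the slide."""
    def __init__(self):
        self.regs = {"SP": 0, "FP": 0, "PC": 0}


def context_switch(cpu, current, nxt):
    """Save the running process's registers, then load the next process's."""
    current.saved = dict(cpu.regs)  # save current registers
    cpu.regs = dict(nxt.saved)      # load saved registers of the next process


cpu = CPU()
a = Process("a")
b = Process("b", {"SP": 100, "FP": 120, "PC": 7})
cpu.regs = {"SP": 1, "FP": 2, "PC": 3}  # state of "a" while it runs
context_switch(cpu, a, b)
print(cpu.regs, a.saved)
```

After the switch, the CPU holds the register values of `b`, and `a` can later resume from its saved copy.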
  • 6. Caches
      - Access to main memory: “expensive”
          - ~100 cycles (slow, relatively cheap)
      - Caches: small, fast, expensive memory
          - Hold recently-accessed data (D$) or instructions (I$)
      - Different sizes & locations
          - Level 1 (L1): on-chip, smallish
          - Level 2 (L2): on or next to chip, larger
          - Level 3 (L3): pretty large, on bus
      - Manage lines of memory (32-128 bytes)
  • 7. Memory Hierarchy
      - Higher = small, fast, more $, lower latency
      - Lower = large, slow, less $, higher latency
      - [Figure: levels from top to bottom, with load and evict arrows between them]
          - Registers: 1-cycle latency
          - L1 (D$, I$ separate): 2-cycle latency
          - L2 (D$, I$ unified): 7-cycle latency
          - RAM: 100-cycle latency
          - Disk: 40,000,000-cycle latency
          - Network: 200,000,000+ cycle latency
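To get a feel for why the upper levels matter, here is a rough sketch that estimates the expected cost of one memory access from the L1, L2, and RAM latencies on slide 7. The hit rates passed in are made-up assumptions, not figures from the slides, and the model is deliberately simple: each access pays only the latency of the level that finally serves it.

```python
# Latencies in cycles, taken from the memory hierarchy slide.
LATENCY = {"L1": 2, "L2": 7, "RAM": 100}


def expected_cycles(l1_hit_rate, l2_hit_rate):
    """Expected cycles per access under assumed (hypothetical) hit rates.

    l2_hit_rate is the fraction of L1 misses that hit in L2.
    """
    l1_miss = 1.0 - l1_hit_rate
    l2_miss = 1.0 - l2_hit_rate
    return (l1_hit_rate * LATENCY["L1"]
            + l1_miss * l2_hit_rate * LATENCY["L2"]
            + l1_miss * l2_miss * LATENCY["RAM"])


print(expected_cycles(0.9, 0.9))  # assumed 90% hit rate at each level
```

Even with 90% hit rates at both levels, the rare trips to RAM account for a noticeable share of the average in this model.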
  • 8. Cache Jargon
      - Cache initially cold
      - Accessing data initially misses
          - Fetch from lower level in hierarchy
          - Bring line into cache (populate cache)
          - Next access: hit
      - Once cache holds most-frequently used data: “warmed up”
      - Context switch implications?
  • 9. Cache Details
      - Ideal cache would be fully associative
          - That is, LRU (least-recently used) queue
          - Generally too expensive
      - Instead, partition memory addresses and put them into separate bins divided into ways
          - 1-way = direct-mapped
          - 2-way = 2 entries per bin
          - 4-way = 4 entries per bin, etc.
  • 10. Associativity Example
      - Hash memory addresses to different indices in the cache
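As a minimal sketch of the hashing step on slide 10, the function below maps a byte address to a bin using a simple modulo hash, the same kind of hash the later exercise uses (`x % 4`). The function name `locate` and the defaults of 64-byte lines and 4 bins are assumptions chosen for illustration; real hardware typically selects the bin from a fixed range of address bits.

```python
def locate(address, line_size=64, num_bins=4):
    """Split a byte address into (bin index, tag, offset within line).

    line_size and num_bins are illustrative assumptions, not slide values.
    """
    line_number = address // line_size  # which cache line the byte lives in
    bin_index = line_number % num_bins  # modulo hash picks the bin
    tag = line_number // num_bins       # distinguishes lines sharing a bin
    offset = address % line_size        # position of the byte in the line
    return bin_index, tag, offset


for addr in (0, 64, 70, 256):
    print(addr, locate(addr))
```

Addresses 0 and 256 land in the same bin with different tags, so in a 1-way (direct-mapped) cache they would evict each other, while a 2-way bin could hold both.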
  • 11. Miss Classification
      - First access = compulsory miss
          - Unavoidable without prefetching
      - Too many items in way = conflict miss
          - Avoidable if we had higher associativity
      - No space in cache = capacity miss
          - Avoidable if cache were larger
      - Invalidated = coherence miss
          - Avoidable if cache were unshared
  • 12. Exercise
      - Cache with 4 entries, 2-way associativity
      - Assume hash(x) = x % 4 (modulus)
      - How many misses?
          - # compulsory misses?
          - # conflict misses?
          - # capacity misses?
  • 13. Solution
      - Cache with 4 entries, 2-way associativity
      - Assume hash(x) = x % 4 (modulus)
      - How many misses?
          - # compulsory misses? 10
          - # conflict misses?
          - # capacity misses?
      - Access trace: 3 7 11 2 3 7 7 9 9 6 13 7 2 5 8 10
  • 14. Solution
      - Cache with 4 entries, 2-way associativity
      - Assume hash(x) = x % 4 (modulus)
      - How many misses?
          - # compulsory misses? 10
          - # conflict misses? 2
          - # capacity misses?
      - Access trace: 3 7 11 2 3 7 7 9 9 6 13 7 2 5 8 10
  • 15. Solution
      - Cache with 4 entries, 2-way associativity
      - Assume hash(x) = x % 4 (modulus)
      - How many misses?
          - # compulsory misses? 10
          - # conflict misses? 2
          - # capacity misses? 0
      - Access trace: 3 7 11 2 3 7 7 9 9 6 13 7 2 5 8 10
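The counts on the solution slides can be checked with a small simulation. This is a sketch under stated assumptions: it reads the cache as 4 bins selected by `x % 4`, each holding 2 entries with LRU replacement inside the bin, and it calls a miss a capacity miss only if a fully associative LRU cache of the same total size would also have missed. The slides do not spell out these details; this reading is simply one that reproduces the stated answers. The name `classify_misses` is made up for this example.

```python
from collections import OrderedDict


def classify_misses(trace, num_bins=4, ways=2):
    """Count compulsory, conflict, and capacity misses for a trace."""
    bins = [OrderedDict() for _ in range(num_bins)]  # per-bin LRU, oldest first
    full = OrderedDict()             # fully associative LRU reference cache
    capacity = num_bins * ways
    seen = set()                     # items referenced at least once
    counts = {"compulsory": 0, "conflict": 0, "capacity": 0, "hits": 0}
    for x in trace:
        # Update the fully associative reference cache first.
        in_full = x in full
        if in_full:
            full.move_to_end(x)
        else:
            if len(full) >= capacity:
                full.popitem(last=False)  # evict least recently used
            full[x] = True
        # Now the set-associative cache being measured.
        b = bins[x % num_bins]
        if x in b:
            b.move_to_end(x)
            counts["hits"] += 1
        else:
            if x not in seen:
                counts["compulsory"] += 1
            elif in_full:
                counts["conflict"] += 1   # only the bin was too small
            else:
                counts["capacity"] += 1   # whole cache was too small
            if len(b) >= ways:
                b.popitem(last=False)
            b[x] = True
        seen.add(x)
    return counts


trace = [3, 7, 11, 2, 3, 7, 7, 9, 9, 6, 13, 7, 2, 5, 8, 10]
print(classify_misses(trace))
```

Both conflict misses come from bin 3, where 3, 7, and 11 compete for two entries.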
  • 16. Locality
      - Locality = re-use of recently-used items
          - Temporal locality: re-use in time
          - Spatial locality: use of nearby items
              - In same cache line, same page (4K chunk)
      - Intuitively: greater locality = fewer misses
          - # misses depends on cache layout, # of levels, associativity…
          - Machine-specific
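A small simulation can illustrate the spatial locality point. The sketch below counts misses for two orders of visiting the same array elements, using a tiny fully associative LRU cache. Every parameter is an assumption chosen for illustration (4-byte elements, 64-byte lines, an 8-line cache, 1024 elements), and `count_misses` is a hypothetical helper, not anything from the course.

```python
from collections import OrderedDict


def count_misses(indices, element_size=4, line_size=64, cache_lines=8):
    """Count cache-line misses for a sequence of array indices."""
    cache = OrderedDict()  # LRU order, oldest first
    misses = 0
    for i in indices:
        line = (i * element_size) // line_size  # line holding element i
        if line in cache:
            cache.move_to_end(line)
        else:
            misses += 1
            if len(cache) >= cache_lines:
                cache.popitem(last=False)  # evict least recently used line
            cache[line] = True
    return misses


N = 1024
sequential = list(range(N))  # neighbours share a cache line
# Same elements, but each step jumps 16 elements (one full line) ahead.
strided = [i for start in range(16) for i in range(start, N, 16)]

print(count_misses(sequential), count_misses(strided))
```

Both orders touch exactly the same elements, yet in this model the sequential order misses once per line while the strided order misses on every access.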
  • 17. Quantifying Locality
      - Instead of counting misses, compute hit curve from LRU histogram
          - Assume perfect LRU cache
          - Ignore compulsory misses
      - [Figure: trace 3 7 7 2 3 7; LRU queue 7 3; histogram over queue positions 1 2 3 4 5 6]
  • 18. Quantifying Locality (same text as slide 17)
      - [Figure: trace 3 7 7 2 3 7; LRU queue 7 3; histogram over queue positions 1 2 3 4 5 6]
  • 19. Quantifying Locality (same text as slide 17)
      - [Figure: trace 3 7 7 2 3 7; LRU queue 2 7 3; histogram over queue positions 1 2 3 4 5 6]
  • 20. Quantifying Locality (same text as slide 17)
      - [Figure: trace 3 7 7 2 3 7; LRU queue 2 7 3; histogram over queue positions 1 2 3 4 5 6]
  • 21. Quantifying Locality (same text as slide 17)
      - [Figure: trace 3 7 7 2 3 7; LRU queue 3 2 7; histogram over queue positions 1 2 3 4 5 6]
  • 22. Quantifying Locality (same text as slide 17)
      - [Figure: trace 3 7 7 2 3 7; LRU queue 3 2 7; histogram over queue positions 1 2 3 4 5 6]
  • 23. Quantifying Locality
      - Instead of counting misses, compute hit curve from LRU histogram
          - Start with total misses on right hand side
          - Subtract histogram values
      - [Figure: values 1 1 3 3 3 3 at queue positions 1 2 3 4 5 6]
  • 24. Quantifying Locality
      - Instead of counting misses, compute hit curve from LRU histogram
          - Start with total misses on right hand side
          - Subtract histogram values
          - Normalize
      - [Figure: hit curve with values .3 .3 1 1 1 1; y-axis 0%, 33%, 67%, 100%; x-axis 1 2 3 4 5]
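The hit-curve construction on slides 17 to 24 can be sketched in a few lines. The code below keeps an LRU queue (most recent first), records the queue position of every re-used item in a histogram, ignores first-time accesses as compulsory misses, and then accumulates and normalizes the histogram. It sums from the left rather than subtracting from the right as the slide phrases it, which yields the same cumulative values. The function names are made up for this example.

```python
from collections import Counter


def lru_histogram(trace):
    """Histogram of LRU queue positions (1 = most recent) for re-used items."""
    queue = []        # most recently used item first
    hist = Counter()
    for x in trace:
        if x in queue:
            hist[queue.index(x) + 1] += 1  # position at which the hit occurs
            queue.remove(x)
        # First-time accesses are compulsory misses and are not counted.
        queue.insert(0, x)
    return hist


def hit_curve(trace, max_size):
    """Cumulative hits and hit fraction for LRU cache sizes 1..max_size."""
    hist = lru_histogram(trace)
    total = sum(hist.values())
    cumulative, running = [], 0
    for size in range(1, max_size + 1):
        running += hist[size]  # a cache of this size also catches these hits
        cumulative.append(running)
    fractions = [c / total if total else 0.0 for c in cumulative]
    return cumulative, fractions


cumulative, fractions = hit_curve([3, 7, 7, 2, 3, 7], 6)
print(cumulative)
print([round(f, 2) for f in fractions])
```

For the trace 3 7 7 2 3 7 this gives cumulative hits of 1 1 3 3 3 3 for cache sizes 1 to 6, matching the values shown on slide 23.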
  • 25. Hit Curve Exercise
      - Derive hit curve for following trace:
      - Trace: 3 5 4 2 8 3 6 9 9 6 13 7 2 5 8 10
  • 26. Hit Curve Exercise
      - Derive hit curve for following trace:
      - Trace: 3 5 4 2 8 3 6 9 9 6 13 7 2 5 8 10
      - [Figure: empty histogram over queue positions 1 2 3 4 5 6 7 8 9]
  • 27. Hit Curve Exercise
      - Derive hit curve for following trace:
      - Trace: 3 5 4 2 8 3 6 9 9 6 13 7 2 5 8 10
      - [Figure: values 1 2 2 2 3 3 4 5 6 at queue positions 1 2 3 4 5 6 7 8 9]
  • 28. Hit Curve Exercise
      - Derive hit curve for following trace:
      - [Figure: hit curve from values 1 2 2 2 3 3 4 5 6; y-axis 0%, 33%, 67%, 100%; x-axis 1 2 3 4 5 6 7 8 9]
  • 29. Important CPU Internals
      - Issues that affect performance
          - Pipelining
          - Branches & prediction
          - System calls (kernel crossings)
  • 30. Scalar architecture + memory…
      - Straight-up sequential execution
          - Fetch instruction
          - Decode it
          - Execute it
      - Problem: instruction or data miss in cache
          - Result: stall, everything stops
          - How long to wait for miss all the way to RAM?
  • 31. Superscalar architectures
      - Out-of-order processors
          - Pipeline of instructions in flight
          - Instead of stalling on load, guess!
              - Branch prediction
              - Value prediction
          - Predictors based on history, location in program
      - Speculatively execute instructions
          - Actual results checked asynchronously
          - If mispredicted, squash instructions
      - Accurate prediction = massive speedup
          - Hides latency of memory hierarchy
  • 32. Pipelining and Branches
      - Pipelining overlaps instructions to exploit parallelism, allowing the clock rate to be increased.
      - Branches cause bubbles in the pipeline, where some stages are left idle.
      - [Figure: pipeline stages (instruction fetch, instruction decode, execute, memory access, write back) with an unresolved branch]
  • 33. Branch Prediction
      - A branch predictor allows the processor to speculatively fetch and execute instructions down the predicted path.
      - [Figure: pipeline stages (instruction fetch, instruction decode, execute, memory access, write back) with speculative execution]
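The cost of pipeline bubbles can be estimated with a back-of-the-envelope model. The sketch below assumes the five stages from the figure, one instruction entering the pipeline per cycle, and a fixed number of idle cycles for every branch that stalls or is mispredicted. The 2-cycle bubble is an assumption, not a number from the slides, and real processors are considerably more complicated.

```python
def pipeline_cycles(num_instructions, stages=5, stalled_branches=0,
                    bubble_cycles=2):
    """Cycles to run a straight-line instruction stream through a pipeline.

    bubble_cycles is an assumed penalty per stalled or mispredicted branch.
    """
    if num_instructions == 0:
        return 0
    fill = stages                   # the first instruction needs every stage
    rest = num_instructions - 1     # afterwards, one completes per cycle
    return fill + rest + stalled_branches * bubble_cycles


def unpipelined_cycles(num_instructions, stages=5):
    """Each instruction runs all stages before the next one starts."""
    return num_instructions * stages


print(unpipelined_cycles(10), pipeline_cycles(10),
      pipeline_cycles(10, stalled_branches=3))
```

In this model, ten instructions take 50 cycles without pipelining and 14 with it, and three stalled branches push the pipelined figure to 20. Removing most of those bubbles is the gap that branch prediction aims to close.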
  • 34. Kernel Mode
      - Protects OS from users
          - kernel = English for nucleus
          - Think atom
      - Only privileged code executes in kernel
      - System call:
          - Enters kernel mode
          - Flushes pipeline, saves context
          - Executes code in kernel land
          - Returns to user mode, restoring context
              - Where we are in user land
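From a user program's point of view, a system call looks like an ordinary function call. As a small sketch, the Python `os` module exposes low-level wrappers such as `os.pipe`, `os.write`, and `os.read`; on typical systems each of these calls crosses into the kernel and returns, although the exact mapping to kernel entry points depends on the platform. The `demo` function name is made up for this example.

```python
import os


def demo():
    """Send a few bytes through a kernel pipe and read them back."""
    read_end, write_end = os.pipe()       # kernel creates the pipe
    try:
        os.write(write_end, b"hello")     # kernel copies bytes into the pipe
        return os.read(read_end, 5)       # kernel copies them back out
    finally:
        os.close(read_end)
        os.close(write_end)


print(demo())
```

Each of these calls pays the kernel-crossing cost listed on the slide, which is one reason programs tend to batch I/O instead of issuing many tiny reads and writes.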
  • 35. Timers & Interrupts
      - Need to respond to events periodically
          - Change executing processes
      - Quantum: time limit for process execution
      - Fairness: when timer goes off, interrupt
          - Current process stops
          - OS takes control through interrupt handler
          - Scheduler chooses next process
      - Interrupts also signal I/O events
          - Network packet arrival, disk read complete…
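The timer and quantum mechanism can be sketched as a tiny simulation. The slide only says that the scheduler chooses the next process; the round-robin policy used here is an assumption made for illustration, as are the `round_robin` name and the example burst times.

```python
from collections import deque


def round_robin(burst_times, quantum):
    """Simulate timer-driven preemption; return each process's finish time.

    burst_times maps a process name to the CPU time it still needs.
    """
    ready = deque(burst_times.items())  # ready queue of (name, remaining)
    clock = 0
    finish = {}
    while ready:
        name, remaining = ready.popleft()   # scheduler picks next process
        ran = min(quantum, remaining)       # run until done or timer fires
        clock += ran
        remaining -= ran
        if remaining > 0:
            ready.append((name, remaining))  # timer interrupt: preempted
        else:
            finish[name] = clock             # process completed
    return finish


print(round_robin({"A": 5, "B": 2, "C": 4}, quantum=2))
```

With a quantum of 2, the short process B finishes at time 4 rather than waiting behind all of A, which is the fairness effect the slide refers to.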
  • 36. To do
      - Read C/C++ notes for next week
      - First homework assigned next week
          - Language: C/C++
          - Will be due in 2 weeks
  • 37. The End
