Memory access control in multiprocessor for real-time system with mixed criticality

1,005 views

Published on

Proposed/implemented a memory bandwidth control mechanism on Linux kernel for Multi-core systems.

Described a timing analysis method based up on the proposed mechanism.

Published in: Technology, Business
0 Comments
1 Like
Statistics
Notes
  • Be the first to comment

No Downloads
Views
Total views
1,005
On SlideShare
0
From Embeds
0
Number of Embeds
8
Actions
Shares
0
Downloads
6
Comments
0
Likes
1
Embeds 0
No embeds

No notes for slide

Memory access control in multiprocessor for real-time system with mixed criticality

  1. 1. Memory Access Control inMultiprocessor for Real-time Systems with Mixed Criticality Heechul Yun+, Gang Yao+, Rodolfo Pellizzoni*, Marco Caccamo+, Lui Sha+ University of Illinois at Urbana and Champaign+ Univerity of Waterloo*
  2. 2. Multi-core Systems• Mainstream in smartphone – Dual/quad-core smartphones – More performance with less power Tegra 3 (4 cores)• Traditional embedded/real-time domains – Avionics companies are investigating [Nowotsch12] • 8 core P4080 processor from Freescale 2 [Nowotsch12] “Leveraging Multi-Core Computing Architectures in Avionics”, EDCC, 2012
  3. 3. Challenge Core1 Core2 Core3 Core4 System bus DRAM• Timing isolation is hard to achieve 3
  4. 4. Challenge Appl1 App2 App 3 App 4 Core1 Core2 Core3 Core4 System bus DRAM• Cores compete for shared HW resources – System bus, DRAM controller, shared cache, … 4
  5. 5. Effect of Memory Contention Run-time increase (%) 70% 60% 60% WCET increase 50% is unacceptable 40% 30%App.App membomb 20% 10% 0%Core Core 429.mcf 471.omnetpp 473.astar 433.milc 470.lbmShared system bus • Run-time increase due to contention Memory – Five SPEC2006 benchmarks – Compared to solo execution 5
  6. 6. Goal• Mechanism to control memory contention – Software based controller for COTS multi-core processors• Response time analysis accounting memory contention effect – Based on proposed software based controller 6
  7. 7. Outline• Motivation• Memory Access Control System• Response Time Analysis• Evaluation• Conclusion 7
  8. 8. System Architecture Core1 Core2 Core3 Core4 Memory bandwidth controllers (Part of OS) 20% 30% 10% 40% System bus DRAM• Assign memory bandwidth to each core using per-core memory bandwidth controller 8
  9. 9. Memory Bandwidth Controller• Periodic server for memory resource• Periodically monitor memory accesses of the core and control user specified bandwidth using OS scheduler – Monitoring can be efficiently done by using per-core hardware performance counter – Bandwidth = # memory accesses X avg. access time 9
  10. 10. Memory Bandwidth Controller • Period: 10 time unit, Budget: 2 memory accesses – memory access takes 1 time unit Enqueue tasks 2Budget 1 Task 0 10 20 Dequeue tasks Dequeue tasks computation 10 memory fetch
  11. 11. Outline• Motivation• Memory Bandwidth Control System• Response Time Analysis• Evaluation• Conclusion 11
  12. 12. System Model Critical core Interfering cores Core1 Core2 Core3 Core4 Memory bandwidth controller System bus DRAM• Cores are partitioned based on criticality• Critical core runs periodic real-time tasks with fixed priority scheduling algorithm• Interfering cores run non-critical workload and regulated with proposed memory access controller 12
  13. 13. Assumptions Critical core Interfering cores Core1 Core2 Core3 Core4 Memory bandwidth controller System bus DRAM• Private or partitioned last level cache (LLC)• Round-robin bus arbitration policy• Memory access latency is constant• 1 LLC miss = 1 DRAM access 13
  14. 14. Simple Case: One Interfering Core Critical Interfering Core Core Memory bandwidth controller System bus DRAM• Critical core - core under analysis• Interfering core – generating memory interference 14
  15. 15. Problem Formulation• For a given periodic real-time task set 𝑇 = {𝜏1 , 𝜏1 ,…, 𝜏 𝑛 } on a critical core• Problem: – Determine 𝑇 is schedulable on the critical core given memory access control budget Q and period P on the interfering core 15
  16. 16. Task Model CM computation memory fetch (cache stall) C time– C : WCET of a task on isolated core (no interference)– CM: number of last level cache misses (DRAM accesses)– L: stall time of single cache miss 16
  17. 17. Memory Interference Model– P : memory access controller period– Q: memory access time budget– αu(t): Linearized interfering memory traffic upper-bound 17
  18. 18. Background [Pellizzoni07]• Accounting Memory Interference – Cache bound: maximum interference time <= maximum number cache-accesses (CM) * L of the task under analysis – Traffic bound: maximum interference time <= maximum bus time requested by the interfering core Cache-bound Traffic bound 𝐶 : WCET account memory stall delay L: stall time of single cache miss 18 [Pellizzoni07] “Toward the predictable integration of real-time cots based systems,” RTSS, 2007
  19. 19. Classic Response Time Analysis 𝑅𝑖 𝑘𝑅 𝑖𝑘+1 = 𝐶 𝑖 + ∗ 𝐶𝑗 𝑇𝑗 𝑗<𝑖– Tasks are sorted in priority order • low index = high priority task– 𝐶 𝑖 : WCET of task i (in isolation w/o memory interference)– 𝑅 𝑖 : Response time of task i– 𝑇𝑗 : Period of task I 19
  20. 20. Extended Response Time Analysis 𝑅𝑖 𝑘 𝑅 𝑖𝑘+1 = 𝐶 𝑖 + ∗ 𝐶𝑗 + min 𝑁 𝑅 𝑘 ∗ 𝐿, 𝛼 𝑢 𝑅 𝑘 𝑇𝑗 𝑗<𝑖 𝑡 where 𝑁 𝑡 = 𝑗≤𝑖 ∗ 𝐶𝑀𝑗 𝑇𝑗 𝑢 𝑄 𝑄(𝑃 − 𝑄) 𝛼 𝑡 = 𝑡 +2 𝑃 𝑃 – 𝑁 𝑡 : aggregated cache misses over time t – 𝛼 𝑢 (𝑡): interfering memory traffic over time t – P: memory access control period – Q: memory access time budget• Proposed method achieves tighter response time than using 𝐶 20
  21. 21. Outline• Motivation• Memory Bandwidth Control System• Response Time Analysis• Evaluation• Conclusion 21
  22. 22. Linux Kernel Implementation• Extending CPU bandwidth reservation feature of group scheduler – Specify core and bandwidth (memory budget, period) • mkdir /sys/fs/cgroup/core3; cd /sys/fs/cgroup/core3 • echo 3 > cpuset.cpus  core 3 • echo 10000 > cpu.cfs_period_us  period • echo 500000 > cpu.cfs_quota_event  cache-misses budget – Added feature – Monitor memory usage at every scheduler tick and context switch 22
  23. 23. Experimental Platform Intel Core2Quad Core 0 Core 1 Core 2 Core 3 L1-I L1-D L1-I L1-D L1-I L1-D L1-I L1-D L2 Cache L2 Cache System Bus DRAM• Core 0,2 were disabled to simulate a private LLC system• Running a modified Linux 3.2 kernel – https://github.com/heechul/linux-sched-coreidle/tree/sched-3.2-throttle-v2 23
  24. 24. Synthetic TaskApp.App membombCore1 Core3Shared system bus Memory • Core under analysis runs a synthetic task with 50% memory bandwidth • Vary throttling budget of the interfering core from 0 to 100% • Two findings: (1) we can control interference, (2) analysis provide an upper bound (albeit still pessimistic) 24
  25. 25. H.264 Movie Playback• Cache-miss counts sampled over every 100ms• Some inaccuracy in regulation due to implementation limitation – Current version is improved accurate by using hardware overflow interrupt 25
  26. 26. Conclusion• Shared hardware resources in multi-core systems are big challenges for designing real-time systems• We proposed and implemented a mechanism to provide memory bandwidth reservation capability on COTS multi-core processors• We developed a response time analysis method using the proposed memory access control mechanism 26
  27. 27. Thank you. 27

×