Supporting Time-Sensitive
Applications on a Commodity OS
Ashvin Goel, Luca Abeni, Charles Krasic, Jim Snow
and Jonathan Walpole
Presenter: Namhyuk Ahn
Real-Time and Time-Sensitive Applications
§ Real-Time system
• Must meet an exact (or very nearly exact) deadline
• Failure to meet a deadline counts as a system failure
• e.g. space shuttle, aircraft
§ Time-sensitive applications
• Applications with real-time requirements
• e.g. media player
Distributed Shared Memory: Concepts and Systems 2
Motivation
§ Demand for time-sensitive applications on commodity OSes is increasing
§ However,
§ Most commodity OSes aim to maximize system throughput
• They consider neither real-time nor time-sensitive applications
§ Most real-time OSes focus on hard real-time scenarios
• System throughput is reduced and limited
§ Solution: a general-purpose OS should offer both properties
• Time Sensitive Linux (TS-Linux)
Problems of Time-Sensitive Applications
§ Time-sensitive applications require timely resource allocation
§ Minimizing kernel latency is the key issue
• Timer latency: caused by inaccurate timers
• Preemption latency: threads executing in the kernel cannot be preempted
• Scheduling latency: the time taken to schedule the event
Solutions
§ TSL addresses the three key sources of latency
• Timer latency → Accurate timer mechanism
► Firm timer
• Preemption latency → Responsive kernel
► Lock-breaking preemptible kernel
• Scheduling latency → Effective scheduling
► Proportion-period scheduler
► Priority-based scheduler
§ TSL is a version of Linux that can:
• Effectively run time-sensitive applications
• Effectively run throughput applications
• Be implemented without modifying existing applications
Timer Mechanism
§ Periodic timers
• The timer interrupts periodically; maximum timer latency equals the period
• Reducing latency (a shorter period) increases interrupt overhead
§ One-shot timers
• Interrupt only when needed
• Incur the cost of reprogramming the timer
• High accuracy but high interrupt overhead
§ Example: two tasks with periods of 5 ms and 7 ms, run for 35 ms
• Periodic timer with a period of 1 ms
► Maximum latency is 1 ms; 35 interrupts
• One-shot timer
► Relatively small latency; 11 interrupts (at 5 ms, 7 ms, 10 ms, 14 ms, ...)
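The interrupt counts in the example above can be checked with a short calculation. This is only an illustrative sketch: the function names are made up for this example, and it assumes the one-shot timer fires exactly once at each distinct task expiry time.

```python
def periodic_interrupts(tick_ms, duration_ms):
    """Number of interrupts fired by a periodic timer ticking every tick_ms."""
    return duration_ms // tick_ms


def one_shot_interrupts(periods_ms, duration_ms):
    """Distinct expiry times when the timer fires only at task expiries."""
    expiries = set()
    for period in periods_ms:
        # Each task expires at every multiple of its period.
        expiries.update(range(period, duration_ms + 1, period))
    return sorted(expiries)


periodic = periodic_interrupts(1, 35)
one_shot = one_shot_interrupts([5, 7], 35)
print(periodic, len(one_shot), one_shot)
```

The two tasks expire together at 35 ms, which is why the one-shot count is 11 rather than 12.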
Timer Mechanism
§ Soft timers
• One-shot timers have interrupt overhead
• Soft timers reduce the cost of context switches caused by interrupts
• But timer latency increases
§ Procedure
1. At certain points, the system reaches a “trigger state”
o End of a system call, exception handler, or interrupt handler
o CPU idle
2. This invokes the event handler; there is no extra overhead because the context switch has already happened
3. The soft timer checks for pending events without the cost of a hardware timer interrupt
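The procedure above can be sketched as a user-level simulation. The `SoftTimers` class and its method names are hypothetical and are not taken from TSL or any kernel; the sketch only shows that timers are checked when a trigger state happens to be reached, so the latency depends on when that occurs.

```python
import heapq


class SoftTimers:
    """Pending timers that are checked only at trigger states (no hardware interrupt)."""

    def __init__(self):
        self._pending = []  # min-heap of (expiry_ms, name)

    def add(self, expiry_ms, name):
        heapq.heappush(self._pending, (expiry_ms, name))

    def on_trigger_state(self, now_ms):
        """Called at a trigger state; returns (name, latency_ms) for each expired timer."""
        fired = []
        while self._pending and self._pending[0][0] <= now_ms:
            expiry_ms, name = heapq.heappop(self._pending)
            fired.append((name, now_ms - expiry_ms))
        return fired


timers = SoftTimers()
timers.add(5, "a")
timers.add(7, "b")
timers.add(20, "c")
early = timers.on_trigger_state(4)  # nothing has expired yet
late = timers.on_trigger_state(9)   # "a" and "b" fire, but late
print(early, late)
```

Timer "a" fires 4 ms late here because no trigger state was reached between 5 ms and 9 ms, which is the latency cost noted above.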
Accurate Timer Mechanism – Firm timer
§ Firm timers
• Combine the advantages of the three preceding timers
• Overshoot parameter: bridges one-shot and soft timers
• This avoids unnecessary interrupts
• High overshoot behaves like a soft timer; low overshoot like a one-shot timer
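[Figure: firm timer timeline (horizontal axis: time). Labels: one-shot timer expired, overshoot period, syscall, programmed one-shot interrupt.]
• If a syscall occurs within the overshoot period: the timer is dispatched and the one-shot timer is reprogrammed; the interrupt does not occur
• If no syscall occurs within the overshoot period, it behaves as a one-shot timer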
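The overshoot behavior can be sketched as a small decision function. This is a simplified model rather than the TSL implementation: `firm_timer_dispatch` is a hypothetical name, times are in milliseconds, and the hardware interrupt is assumed to be programmed for the expiry time plus the overshoot.

```python
def firm_timer_dispatch(expiry_ms, overshoot_ms, trigger_times_ms):
    """Return (dispatch_time_ms, used_hardware_interrupt) for a single timer."""
    interrupt_at = expiry_ms + overshoot_ms
    for t in sorted(trigger_times_ms):
        # A trigger state (e.g. a syscall) inside the overshoot window
        # dispatches the timer early, so the interrupt is not needed.
        if expiry_ms <= t < interrupt_at:
            return t, False
    # No trigger state arrived in time: fall back to the one-shot interrupt.
    return interrupt_at, True


soft_case = firm_timer_dispatch(10, 3, [4, 11, 20])
hard_case = firm_timer_dispatch(10, 3, [4, 20])
pure_one_shot = firm_timer_dispatch(10, 0, [10])
print(soft_case, hard_case, pure_one_shot)
```

With an overshoot of 0 the window is empty, so the function always uses the hardware interrupt, matching the "low overshoot = one-shot" end of the range.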
Responsive Kernel
§ The kernel is responsive when its non-preemptible sections are short
§ Cases where the scheduler cannot run:
• Interrupts are disabled (the thread has to wait in the ready queue)
• Another thread is executing a critical section in the kernel
§ So the length of non-preemptible sections matters
• But traditional commodity kernels disable preemption for the entire kernel
Fine-grained Preemptible Kernel
§ Explicit preemption
• Explicit insertion of preemption points inside the kernel
• The kernel explicitly yields the CPU to the scheduler when it reaches a preemption point
• Preemption points have to be placed manually
§ Anytime preemption
• Allows preemption whenever the kernel is not manipulating shared data structures
• Shared kernel data must be protected using mutexes or spinlocks
• High latency when spinlocks are held for a long time
Fine-grained Preemptible Kernel
§ Lock-breaking preemption
• Combines the two ideas above
• TSL uses lock-breaking preemption
• Splits long spinlock sections into multiple acquire-release segments
acq()
... // manipulate shared data
rel()
… // kernel is preemptible here; check whether the scheduler should run
reacq()
…
rel()
acq()
…
rel()
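The acquire-release pattern above can be imitated at user level. The sketch below uses Python's `threading.Lock` in place of a kernel spinlock, and `maybe_reschedule` is a hypothetical stand-in for the kernel's check of whether the scheduler should run; it is an analogy, not kernel code.

```python
import threading

lock = threading.Lock()
shared = []
preemption_points = 0


def maybe_reschedule():
    """Hypothetical stand-in for checking whether the scheduler should run."""
    global preemption_points
    preemption_points += 1


def long_update(items, chunk_size):
    """Append items to the shared list, dropping the lock between chunks."""
    for start in range(0, len(items), chunk_size):
        with lock:  # acq() ... rel()
            shared.extend(items[start:start + chunk_size])
        maybe_reschedule()  # the lock is free here, so preemption is safe


long_update(list(range(10)), chunk_size=4)
print(shared, preemption_points)
```

The shared data must be in a consistent state every time the lock is dropped, which is why break points cannot be inserted mechanically.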
Scheduling
§ Highest priority goes first
• Very simple
• But a misbehaving high-priority process can starve all other tasks
• Temporal protection issue:
► A misbehaving task must not be allowed to consume too much execution time
Scheduling
§ Proportion-period scheduling
• Each task is allocated a fixed proportion Q of the CPU in each period T
• Provides temporal protection
• Q and T can be adjusted by a feedback controller
• After executing for time Q, a task blocks until its next period T (or a non-TS task is scheduled)
[Figure: timeline (horizontal axis: time) of tasks T1 and T2 over a period T, with proportions Q of 2/3 and 1/3]
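A tick-based simulation can illustrate proportion-period scheduling. This is a sketch under stated assumptions, not the TSL scheduler: `simulate` is a hypothetical helper, time advances in unit ticks, T1 is assumed to hold the 2/3 share, and the first task with budget left is chosen to run.

```python
def simulate(tasks, total_ticks):
    """tasks: {name: (q_ticks, period_ticks)}; returns the task run at each tick."""
    budget = {}
    timeline = []
    for tick in range(total_ticks):
        for name, (q, period) in tasks.items():
            if tick % period == 0:
                budget[name] = q  # replenish Q at the start of each period
        runnable = [name for name in tasks if budget[name] > 0]
        if runnable:
            current = runnable[0]  # simple choice: first task with budget left
            budget[current] -= 1
        else:
            current = "idle"  # time left over for non-time-sensitive work
        timeline.append(current)
    return timeline


timeline = simulate({"T1": (2, 3), "T2": (1, 3)}, 6)
capped = simulate({"T1": (1, 4)}, 4)
print(timeline, capped)
```

In the second run T1 cannot exceed one tick per period however much it wants to run, which is the temporal protection property.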
Scheduling
§ TSL combines proportion-based and fixed-priority schedulers
• A fixed-priority scheduler doesn’t guarantee temporal protection
• Assign proportion Q with respect to fixed priority
• Sounds good, but there are a few exceptions
Scheduling
§ Exception case:
[Figure: timeline (horizontal axis: time) of three tasks: C1 (high priority), C2 (mid priority), and Server (low priority). Events shown: C1 scheduled, blocking call, server run, ready to run, C2 run.]
• C2 preempts the server: priority inversion for C1
Scheduling
§ Exception case:
[Figure: timeline (horizontal axis: time) of three tasks: C1 (high priority), C2 (mid priority), and Server (low priority). Events shown: C1 scheduled, blocking call, server run, ready to run.]
§ Highest locking priority (HLP):
• When a task acquires a resource, it gets the highest priority of any task that can acquire this resource
• e.g. the server task now runs at the priority of C1
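The HLP rule can be written as a small priority calculation. The numeric priorities, the `server_clients` list, and the function names are assumptions made for this sketch; a larger number means a higher priority.

```python
priorities = {"C1": 3, "C2": 2, "Server": 1}  # assumed values; larger = higher
server_clients = ["C1", "Server"]  # assumption: tasks that can acquire the resource


def locking_priority(resource_users):
    """Highest priority among the tasks that can acquire the resource."""
    return max(priorities[task] for task in resource_users)


def effective_priority(task, holds_resource):
    """Priority a task runs at, boosted while it holds the resource."""
    base = priorities[task]
    if holds_resource:
        return max(base, locking_priority(server_clients))
    return base


boosted = effective_priority("Server", holds_resource=True)
can_c2_preempt = priorities["C2"] > boosted
print(boosted, can_c2_preempt)
```

While the server holds the resource it runs at C1's priority, so the mid-priority C2 can no longer preempt it and the inversion from the previous slide is avoided.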
Evaluation
§ Setup:
• Modified version of Linux 2.4.16
• Plus firm timers, lock-breaking preemption, and the proportion-period scheduler
• 1.5 GHz Intel Pentium 4
• 512 MB RAM
Evaluation
§ Micro benchmarks
• Evaluate timer latency and preemption latency
• Measure how long a time-sensitive process actually sleeps when it calls nanosleep()
§ Timer latency
• Standard Linux: 10ms
• + firm timer: < 1ms
§ Preemption latency
• Evaluated while a number of system loads run in the background
• Linux + firm timer: > 5ms
• + lock-break: < 1ms
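The sleep-latency measurement can be approximated from user space. The sketch below is a rough analogue, not the benchmark used in the evaluation: it uses Python's `time.sleep` rather than calling nanosleep() directly, and `sleep_latencies` is a hypothetical helper. Results depend heavily on the machine, the OS, and the load.

```python
import time


def sleep_latencies(requested_s, samples):
    """Return actual-minus-requested sleep time for each sample, in seconds."""
    results = []
    for _ in range(samples):
        start = time.perf_counter()
        time.sleep(requested_s)
        elapsed = time.perf_counter() - start
        results.append(elapsed - requested_s)
    return results


latencies = sleep_latencies(0.001, 20)
print(f"max latency: {max(latencies) * 1e3:.3f} ms")
```

Running the same loop while a background load is active gives a feel for how timer and preemption latency show up in the measured sleep time.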
Evaluation
§ Evaluation on a real-world application
• Measure audio/video sync skew in Mplayer (media player)
§ Evaluated under three scenarios
1. Non-kernel CPU load
o A user-level stress test runs in the background
2. Kernel CPU load
o A large memory buffer is copied to a file (user to kernel space)
3. File-system load
o A large directory is copied
Evaluation
§ Non-kernel CPU load
Evaluation
§ Kernel CPU load
Evaluation
§ File-system load
Evaluation
§ Preemption overhead
• Memory access (accessing a 128 MB array, which produces page faults)
► TSL overhead: 0.42% (std 0.18%)
• Fork (creating 512 processes)
► TSL overhead: 0.53% (std 0.06%)
• File system (copying data from a 2 MB user buffer to an 8 MB file)
► TSL overhead: not significant
Evaluation
§ Firm timer overhead
Conclusion
§ TSL provides real-time support on a commodity OS through:
• Firm timer
• Responsive kernel
• Proportion-based scheduler
§ TSL can be used for both time-sensitive and throughput-oriented applications
Discussion topics
§ Rethinking real-time
• Requirements for an RT system
► What is needed to ensure exact (or near-exact) timing?
• Examples of time-sensitive applications that have to run on a commodity OS
• How can RT concepts be brought to a commodity OS efficiently?
Discussion topics
§ What features should we consider when building a distributed RTOS?