LCA14: LCA14-506: Comparative analysis of preemption vs preempt-rt

Resource: LCA14
Name: LCA14-506: Comparative analysis of preemption vs preempt-rt
Date: 07-03-2014
Speaker: Gary Robertson
Video: https://www.youtube.com/watch?v=QiguBicpB88
Website: http://www.linaro.org/
Linaro Connect: http://connect.linaro.org/
Slide: http://www.slideshare.net/linaroorg/lca14-lca14506-comparative-analysis-of-preemption-vs-preemptrt

    Presentation Transcript

    • LCA14-506: Comparative analysis of preemption vs preempt-rt
      Gary Robertson, LCA14, Macau
    • Overview of Topics Presented
      In this presentation we will illustrate the pros, cons, and latency characteristics of several Linux kernel preemption models, and provide some guidance in selecting an appropriate preemption model for a given category of application.
    • Overview of Topics - continued
      Questions we will address include:
      • Which preemption model provides the best throughput?
      • Which model offers the lowest average latencies?
      • Which model offers the lowest maximum latencies?
      • Which model offers the most predictable latencies?
      • How do load conditions impact the respective latency performance of the various models?
      • What impact does CPU frequency scaling or CPU sleep states have on latency performance?
      • What is the best model for a given application type?
    • Test Rationale and Methodology
      Our intent is to show relative trends between the preemption models under the same conditions, so the data presented were gathered as follows:
      • Each preemption model configuration was tested using identical tests running on the same InSignal Arndale board.
      • Cyclictest was run for two hours with a single thread executing at a SCHED_FIFO priority of 80 to realistically represent scheduling latency for a real-time process.
      • A cyclictest run was done with no system load, then another with an externally-applied ping flood, and another with back-to-back executions of hackbench to represent maximum system loading.
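The measurement idea behind cyclictest can be sketched in a few lines (a simplified illustration, not the actual cyclictest source; the interval and sample count here are arbitrary stand-ins for the two-hour runs described above):

```python
import statistics
import time

def measure_latency(interval_ns=1_000_000, samples=500):
    """Cyclictest-style loop: schedule an absolute wakeup time, sleep
    until it, and record how late the wakeup actually arrived."""
    lats_us = []
    next_ns = time.monotonic_ns()
    for _ in range(samples):
        next_ns += interval_ns
        delay_ns = next_ns - time.monotonic_ns()
        if delay_ns > 0:
            time.sleep(delay_ns / 1e9)       # sleep until the target wakeup time
        late_ns = max(0, time.monotonic_ns() - next_ns)
        lats_us.append(late_ns / 1000)       # record wakeup latency in usec
    return {
        "min": min(lats_us),
        "avg": statistics.fmean(lats_us),
        "max": max(lats_us),
        "stdev": statistics.pstdev(lats_us),
    }

stats = measure_latency(samples=200)
print("min {min:.1f} avg {avg:.1f} max {max:.1f} stdev {stdev:.2f} (usec)".format(**stats))
```

The real tool additionally pins the measurement thread to SCHED_FIFO priority 80 and uses clock_nanosleep() with TIMER_ABSTIME, which is what makes the kernel's preemption model, rather than the measurement loop itself, the dominant source of the observed jitter.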
    • Latency Impact of CPU Frequency Scaling
    • Tested Linux Preemption Models
      Only three Linux preemption models are really interesting for anything other than desktop use:
      • the Server preemption model provides optimal throughput for applications where latencies are not an issue
      • the Low Latency Desktop preemption model provides low average latencies for interactive and ‘soft real-time’ applications
      • the Full RT preemption model provides the highest level of latency determinism for ‘hard real-time’ applications
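In kernel configuration terms, these three models correspond to the following Kconfig choices (symbol names as used by 2014-era kernels; the options are mutually exclusive, and PREEMPT_RT_FULL exists only after applying the separately maintained preempt-rt patch set):

```
# Server model: no forced preemption of kernel code
CONFIG_PREEMPT_NONE=y

# Low Latency Desktop model: preemptible kernel
CONFIG_PREEMPT=y

# Full RT model: fully preemptible kernel (requires the preempt-rt patches)
CONFIG_PREEMPT_RT_FULL=y
```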
    • Server Preemption Model Latencies
      Cyclictest with no system load, CPU frequency scaling disabled
      Minimum Latency: 16 usec
      Average Latency: 24 usec
      Most Frequent Latency: 24 usec
      Maximum Latency: 572 usec
      Standard Deviation: 1.211041
      Almost all latencies fell between 20 usec and 28 usec. However, even at light loads, latencies out to 572 usec were observed. This is a consequence of all code paths through the kernel being non-preemptible.
    • Server Preemption Model Latencies
      Cyclictest with ping flood load, CPU frequency scaling disabled
      Minimum Latency: 15 usec
      Average Latency: 23 usec
      Most Frequent Latency: 24 usec
      Maximum Latency: 592 usec
      Standard Deviation: 1.580778
      Almost all latencies fell between 20 usec and 28 usec. Note, however, that much longer latencies continue to be observed due to the lack of any design efforts to avoid them. Also note that the maximum latency is already beginning to creep upwards.
    • Server Preemption Model Latencies
      Cyclictest with hackbench load, CPU frequency scaling disabled
      Minimum Latency: 17 usec
      Average Latency: 150655 usec
      Most Frequent Latency: 22 usec
      Maximum Latency: 2587753 usec
      Standard Deviation: 493977.9
      The majority of latencies were between 21 usec and 25 usec, gradually tapering off to single-digit frequencies at 204 usec. Note that the maximum latency is 4000 times longer than under no load! Note also the much lower frequency percentage for the peak occurrence. This means a larger percentage of the higher latencies were observed, and illustrates the serious degradation of latency determinism under load in a non-preemptible kernel where latency was not a primary design consideration.
    • Low Latency Desktop Model Latencies
      Cyclictest with no system load, CPU frequency scaling disabled
      Minimum Latency: 19 usec
      Average Latency: 28 usec
      Most Frequent Latency: 29 usec
      Maximum Latency: 57 usec
      Standard Deviation: 0.8698308
      The majority of latencies were between 28 usec and 31 usec, quickly tapering off to single-digit frequencies at 42 usec. Maximum latency was reduced tenfold under light loads vs. the Server model. This illustrates the significant improvement in latency performance under light loads with kernel preemption enabled.
    • Low Latency Desktop Model Latencies
      Cyclictest with ping flood, CPU frequency scaling disabled
      Minimum Latency: 18 usec
      Average Latency: 29 usec
      Most Frequent Latency: 29 usec
      Maximum Latency: 131 usec
      Standard Deviation: 1.79573
      The majority of latencies were between 28 usec and 32 usec, quickly tapering off to single-digit frequencies at 80 usec. The reduced range of observed latencies indicates improved latency performance and predictability at moderate loads versus the Server model. However, as the next slide shows, latency performance in this model degrades seriously under heavy load, making Full RT a better choice for latency performance under heavy load conditions.
    • Low Latency Desktop Model Latencies
      Cyclictest with hackbench, CPU frequency scaling disabled
      Minimum Latency: 19 usec
      Average Latency: 370606 usec
      Most Frequent Latency: 25 usec
      Maximum Latency: 4122148 usec
      Standard Deviation: 826092
      The majority of latencies were between 24 usec and 26 usec, gradually tapering off to single-digit frequencies at 105 usec. Note that the maximum latency was 70,000 times longer than with this model under no load! Maximum latencies were nearly double those of the Server model under heavy load, and latency predictability is low. This illustrates the combined impact under heavy load of increased context switches without addressing priority inversion or FIFO queueing disciplines.
    • Full RT Preemption Model Latencies
      Cyclictest with no system load, CPU frequency scaling disabled
      Minimum Latency: 19 usec
      Average Latency: 29 usec
      Most Frequent Latency: 29 usec
      Maximum Latency: 53 usec
      Standard Deviation: 1.031893
      The majority of latencies were between 29 usec and 31 usec, quickly tapering off to single-digit frequencies at 50 usec. Maximum latency was reduced tenfold under light loads vs. the Server model, illustrating the significant improvement in latency performance with kernel preemption enabled. Under light load, performance is very similar to that of the Low Latency Desktop model.
    • Full RT Preemption Model Latencies
      Cyclictest with ping flood, CPU frequency scaling disabled
      Minimum Latency: 19 usec
      Average Latency: 29 usec
      Most Frequent Latency: 30 usec
      Maximum Latency: 59 usec
      Standard Deviation: 2.698587
      The majority of latencies were between 29 usec and 31 usec, quickly tapering off to single-digit frequencies at 53 usec. The reduced range of observed latencies indicates improved latency performance and predictability at moderate loads versus the Server model. Note that even at moderate loads the maximum latencies are less than half the duration of those seen in the Low Latency Desktop model.
    • Full RT Preemption Model Latencies
      Cyclictest with hackbench load, CPU frequency scaling disabled
      Minimum Latency: 21 usec
      Average Latency: 29 usec
      Most Frequent Latency: 25 usec
      Maximum Latency: 156 usec
      Standard Deviation: 7.69571
      The majority of latencies were between 24 usec and 26 usec, with a second group peaking between 43 and 44 usec, and quickly tapering off to single-digit frequencies at 134 usec. Latency performance under heavy load is much better than in any of the other preemption models. With threaded interrupt handlers, priority inheritance, and priority-based queueing disciplines, the real-time process is still able to meet much tighter scheduling deadlines despite heavy activity from other lower-priority threads.
    • Comparative Latency Performance (three chart slides comparing the models under each load condition)
    • What Preemption Model Is Best for Me?
      • For applications in which throughput rather than latency is the primary consideration, opt for the Server model
      • If quality of service is important but missed latency deadlines will not result in catastrophic failures, opt for the Low Latency Desktop model and size the hardware capacity to keep loading moderate
      • For host environments for ‘zero overhead Linux’ (ODP, for example), Low Latency Desktop is a good choice
      • If latencies must be consistent even under high load conditions, Full RT may be required
      • For applications based on POSIX real-time scheduling and priority-based preemption, use Full RT for best results
    • Data References
      The test scripts, data files, and graphs used to provide reference data for this presentation may be accessed online at: http://people.linaro.org/~gary.robertson/LCA14
    • More about Linaro Connect: http://connect.linaro.org
      More about Linaro: http://www.linaro.org/about/
      More about Linaro engineering: http://www.linaro.org/engineering/
      Linaro members: www.linaro.org/members
    • Appendix A: Preemption Model Characteristics
    • Server Model Characteristics
      The Server preemption model lies at one extreme of the latency vs. throughput continuum. Pros include:
      • Simplicity, maturity, and robustness make this a very reliable platform
      • With no preemption, the reduced number of context switches minimizes system overhead and maximizes overall throughput
    • Server Model Characteristics
      Cons include:
      • The lack of preemption results in low average latencies under low loads but much higher latencies when the system is heavily loaded
      • The varying latencies imposed by different execution paths through the kernel result in a wide range of latency durations and low latency determinism
    • Low Latency Desktop Model Characteristics
      The Low Latency Desktop preemption model holds the middle ground in the latency vs. throughput continuum. Pros include:
      • Under low to moderate load, latency range and predictability are significantly improved vs. the Server model
      • This preemption model is supported as part of the mainstream kernel and tends to be less trouble-prone than Full RT preemption
    • Low Latency Desktop Model Characteristics
      Cons include:
      • The preemption of kernel operations and the increased number of context switches create increased overhead and reduced performance relative to the Server model
      • The preemption of kernel operations results in higher average latencies vs. the Server preemption model
      • This preemption model does not perform as well under heavy system loads as other models
    • Low Latency Desktop Model - continued
      The following software-induced latency sources remain problematic in the Low Latency Desktop preemption model:
      • Exceptions, software interrupts, and device service request interrupts execute outside of scheduler control
      • Most mutual exclusion locking primitives are subject to priority inversion
      • Shared resources use FIFO-based queueing disciplines, meaning high-priority threads may have to wait behind lower-priority threads for access to the resources
      These factors result in lower levels of latency determinism.
    • Full RT Model Characteristics
      The Full RT preemption model represents the latency-centric end of the latency vs. throughput continuum. It attempts to mitigate all the remaining software-induced sources of latency:
      • Handlers for exceptions, software interrupts, and device service request interrupts are encapsulated inside threads which are under scheduler control
      • Priority inheritance is added for most mutual exclusion locking primitives to prevent priority inversions
      • Shared resources use priority-based queueing disciplines so that the highest-priority thread always gets first access to the resources
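The difference between FIFO-based and priority-based queueing disciplines can be sketched with a toy model (purely illustrative; the real mechanism lives in the kernel's rtmutex code, and the task names and priorities below are invented):

```python
import heapq
from collections import deque

# Toy wait queue for a contended resource. Waiters are (priority, name)
# pairs; a higher number means higher priority, as with SCHED_FIFO.
# They arrive in this order: logger first, then rt-control, then ui.
waiters = [(10, "logger"), (80, "rt-control"), (40, "ui")]

# FIFO discipline (stock-kernel sleeping locks): arrival order wins,
# so the high-priority waiter queues behind lower-priority ones.
fifo = deque(name for _prio, name in waiters)
fifo_order = [fifo.popleft() for _ in range(len(fifo))]

# Priority discipline (Full RT rtmutex): highest priority wins.
# heapq is a min-heap, so negate the priority to pop the highest first.
heap = [(-prio, name) for prio, name in waiters]
heapq.heapify(heap)
prio_order = [heapq.heappop(heap)[1] for _ in range(len(heap))]

print("FIFO:", fifo_order)       # logger, rt-control, ui
print("Priority:", prio_order)   # rt-control, ui, logger
```

Under the FIFO discipline the priority-80 waiter gets the resource last; under the priority discipline it gets it first, which is exactly the behavior the hackbench results on the earlier slides reflect.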
    • Full RT Model Characteristics - continued
      The Full RT preemption model inevitably suffers from reduced overall throughput as a consequence of its efforts to maximize latency determinism:
      • Schedulable ‘threaded’ ISRs result in the highest levels of preemption and context switch overhead
      • Priority inheritance involves iterative logic to temporarily boost the priorities of all lock holders to equal that of the highest-priority lock waiter, which adds significant overhead to the locking primitive code
      • Priority-based queueing requires sorting the queue each time a new waiting thread is added
    • Full RT Model Characteristics - continued
      Pros include:
      • The most consistent and predictable latency performance available in any preemption model
      • The best support environment for creating priority-based multi-layered applications
      • The best hard real-time support available in a Linux environment
    • Full RT Model Characteristics - continued
      Cons include:
      • Full RT preemption is supported only with a separately maintained kernel patch set
      • The latest supported RT kernel version always lags behind mainstream development
      • Mainstream drivers, libraries, and applications may not always function properly in the Full RT environment
      • Poorly designed or written real-time threads may starve out threaded interrupt handlers