Frame latency evaluation: when simulation and analysis alone are not enough

904 views
839 views

Published on

This talk is about temporal verification in real-time communication systems. Early in the design cycle, the two main approaches for verifying timing constraints and dimensioning the networks are worst-case schedulability analysis and simulation. The aim of the talk is to demonstrate that both provide complementary results and that, most often, none of them alone is sufficient. In particular, it will be shown that response time distributions that can be derived from simulations cannot replace worst-case analysis. This will be done on automotive case-studies using RTaW analysis and simulation software tools.

Published in: Technology, Design
0 Comments
0 Likes
Statistics
Notes
  • Be the first to comment

  • Be the first to like this

No Downloads
Views
Total views
904
On SlideShare
0
From Embeds
0
Number of Embeds
2
Actions
Shares
0
Downloads
10
Comments
0
Likes
0
Embeds 0
No embeds

No notes for slide

Frame latency evaluation: when simulation and analysis alone are not enough

  1. 1. Frame latency evaluation: when simulation and analysis alone are not enough Nicolas Navet, INRIA / RTaW Aurélien Monot, LORIA / PSA Jörn Migge, RTaW WFCS 2010 – Industry Day Nancy, 19/05/2010
  2. 2. Outlook 1. Context: early design phases where only simulation and analysis are available 2. Goal: see how simulation and analysis compare and point out their pitfalls 3. Method: insight from experiments on Controller Area Network © 2010 RTaW / PSA / Loria - 2
  3. 3. Why timing verification is required – Verify that performance requirements are met: deadlines, jitters, throughput – Select the hardware / software components: optimize costs – Meet some certification level: e.g., avionics, railway systems, power plants, etc Timing models: trade‐off to be found between  accuracy / complexity / computing time © 2010 RTaW / PSA / Loria - 3
  4. 4. RTaW mission: help designers build truly safe and optimized systems – Activities: Model-Based Design, dependability, formal and temporal verification – Communications systems : CAN, AFDX, FlexRay, SpaceWire, industrial Ethernet, TTP, etc … – Verification techniques: schedulability analysis, network-calculus, model-checking and simulation – Domains: aerospace, automotive and industry at large In our experience, 2 cases for timing verification : Certification is mandatory (e.g., DO178B ‐ DAL A): well  accepted No certification : various practices / levels of acceptance    © 2010 RTaW / PSA / Loria - 4
  5. 5. Type A: no timing validation whatsoever (early in the V-cycle) Practice: Carry-over of existing (proven in use) systems with domain-specific rules: “The load on an automotive CAN network must not be higher than 30%” “A frame pending for transmission for more than 30ms is cancelled out” etc… Sub‐optimal design : e.g., does one really need 5 (or more) distinct CAN  buses in a car?!  Potentially unsafe design with problems that are hard to reproduce and are  costly to repair later … © 2010 RTaW / PSA / Loria - 5
  6. 6. Type B: simulation is enough, worst-case never occurs anyway! Practice: software simulations, then simulations with HiL (Hardware in the Loop) as the ECUs become available … Hardware resources (too?) well optimized Unsafe results because the worst‐case sometimes occurs (and may even last  for a long time, see preliminary results later in the presentation)  A question that remain mainly open in timing verification :  “How often does the worst‐case actually occur ? “ First, get some insight with experimentations … © 2010 RTaW / PSA / Loria - 6
  7. 7. Type C: analysis says the system is safe, so we are covered … Practice: use some black-box software that implements worst-case timing analysis and concludes about the feasibility of the system Sub‐optimal design sometimes because overestimations / pessimistic  assumptions add up  Potentially unsafe design : – software are error‐prone,  – not everything is accurately modeled – analytic models – especially unpublished complex ones – can be wrong © 2010 RTaW / PSA / Loria - 7
  8. 8. Experimental setup © 2010 RTaW / PSA / Loria - 8
  9. 9. CAN communication stack a simplified view Middleware 5ms Frame-packing task Waiting queue: - FIFO - Highest Priority First Requirements on 2 temporal verification: - OEM specific 1 handle 150+ frames ≠ waiting queue policy at CAN Controller the microcontroller level limited number of 9 6 8 transmission buffers handle frame offsets buffer Tx CAN Bus © 2010 RTaW / PSA / Loria - 9
  10. 10. Scheduling frames with offsets ?! Principle: desynchronize transmissions to avoid load peaks 0 10 15 Periods 0 5 5 20 ms 0 0 5 15 ms 10 ms 0 10 20 30 40 50 60 70 80 90 100 110 5 5 2,5 2,5 Periods 0 0 20 ms 15 ms 10 ms 0 10 20 30 40 50 60 70 80 90 100 110 Algorithms to decide offsets are based on arithmetical properties of the periods and size of the frame © 2010 RTaW / PSA / Loria - 10
  11. 11. Network configuration Network Controller Area Network 125 kbit/s Set of messages Automotive body network generated with NETCARBENCH [8] http://www.netcarbench.org # ECUs 15 # frames 145 Workload 50.5% Periods [50,2000ms] with distributions from an existing car Frame offsets Optimized with DOA algorithm [3] Inter-ECU offsets All offsets are possible (clock drifts, ECU reboots, ECU boot sequence depends on sleep mode, etc) ECU clock drifts 3 cases: no drift, ±1ppm, ±1000ppm © 2010 RTaW / PSA / Loria - 11
  12. 12. RTaW software used in the study WCRT is not overestimated! RTaW-Sim : Fine-Grained Simulation of NETCAR-Analyzer : Timing Analysis Controller Area Network with Fault-Injection and Resource Usage Optimization for Controller Area Network (© Inria/Inpl) Capabilities RTaW‐Sim freely available at http://www.realtimeatwork.com starting from June 2010  © 2010 RTaW / PSA / Loria - 12
  13. 13. On why we should not trust analytic models for worst-case frame latency evaluation © 2010 RTaW / PSA / Loria - 13
  14. 14. Types of results achievable with worst- Max buffer utilization case analysis Frame worst-case response times non-optimized solution Worst-case inter-ECU offsets optimized solution 1 and 2 lower-bound Frames by decreasing priority © 2010 RTaW / PSA / Loria - 14
  15. 15. Analytic models need to be fine-grained frame offsets overlooked here … Analysis Setup: - Frame offsets : DOA algorithm [3] - Worst-case results whatever clock drifts and inter-ECU offsets ≈ 120ms ! To the best of our knowledge, there are no (usable) published results on this  © 2010 RTaW / PSA / Loria - 15
  16. 16. Analytic model needs to be fine grained Frame waiting queue is FIFO on ECU1 the OEM does not know or software cannot handle it … Analysis Setup: - Frame offsets : DOA algorithm [3] - Worst-case results whatever clock drifts and inter-ECU offsets - FIFO waiting queue on ECU1 Many high‐priority frames are delayed here because  a single ECU (out of 15) has a FIFO queue … © 2010 RTaW / PSA / Loria - 16
  17. 17. There is a gap between WCRT analytic models and reality IMHO – Traffic is not always well caracterized and/or well modeled e.g. aperiodic traffic ?! see [5] for some solution – Implementation choices really matter and standards do not say everything, eg. Autosar drivers – Analytic models are often much simplified abstraction of reality – optimistic (=unsafe): FIFO queue, hardware limitations such as non-abortable transmissions [4,7], etc – overly pessimistic: e.g. overlooking frame offsets, aperiodic traffic modeled as sporadic, etc – Analytic models are prone to errors remember “ CAN analysis refuted, revisited, etc” [6] ?! Bottom line: do not blindly trust analytic models! Systems should be conceived so as to be analyzable in temporal domain © 2010 RTaW / PSA / Loria - 17
  18. 18. On why we should not trust simulation models for worst- case frame latency evaluation © 2010 RTaW / PSA / Loria - 18
  19. 19. Are simulation results (max) close to worst- case response times ? Well … Simulation Setup: - Random inter-ECU offsets ≈ 40ms ! - no ECU clock drift © 2010 RTaW / PSA / Loria - 19
  20. 20. Are simulation results (max) close to worst- case response times ? with clock drifts Simulation Setup: Analysis/sim ≈ 2 ! - Random inter-ECU offsets - Slow and fast clock drifts - Sim duration: vehicle lifetime Whatever you do, you have little chance with simulation to find the worst‐case! © 2010 RTaW / PSA / Loria - 20
  21. 21. Are simulation results (max) close to worst- case response times ? with clock drifts Sum of max (simulation ) / Sum of WCRT (analysis) 1 0,8 Simulation Setup: 10 lowest priority 0,6 frames - Same as 10 highest priority previous slide 0,4 frames all frames 0,2 0 ± 500 ppm drift ± 5000 ppm drift ± 50000 ppm drift ± 1 ppm drift ± 1000 ppm drift ± 10000 ppm drift Increasing the clock drift rate is not enough … © 2010 RTaW / PSA / Loria - 21
  22. 22. Knowing the analysis results – including here worst-case inter-ECU offsets for each frame - simulation becomes more useful © 2010 RTaW / PSA / Loria - 22
  23. 23. Simulation helps validate assumptions made, correctness and tightness of WCRT analysis Simulation Setup: - worst-case inter-ECU offsets for frame 61 given worst-case trajectory is by NETCAR-Analyzer simulated for this frame - no ECU clock drift Difference comes here from the blocking factor that is not explicitly simulated © 2010 RTaW / PSA / Loria - 23
  24. 24. How often does the worst-case occurs: very often on certain trajectories … Simulation Setup: Average value is “close” - worst-case inter-ECU to maximum on the offsets for frame 61 given worst-case trajectory! by NETCAR-Analyzer - no ECU clock drift © 2010 RTaW / PSA / Loria - 24
  25. 25. Distribution of response times for frame 61 with and without clock drifts Probability 0,5 0,45 no drift:only two large 0,4 values 0,35 drift - after 30mn 0,3 drift - after 20mn 0,25 drift - after 10mn 0,2 no drift 0,15 0,1 0,05 no drift 0 drift - after 10mn 11,75 12,25 drift - after 20mn 12,75 13,25 29,25 29,75 drift - after 30mn 30,25 30,75 Even with clock drift, unusually large response times occur during more than 30mn!  © 2010 RTaW / PSA / Loria - 25
  26. 26. Conclusion: in the context of dependability constrained systems … – Simulation is not enough and analytic models are usually much simplified, often pessimistic and sometimes even wrong – Simulating the worst-case trajectory (and neighbours): – helps to validate analytic models : latencies, buffer occupation, etc – tells us about how long we stay in the worst-case situation – Our ongoing work: how often does the worst-case actually occur? do we really need to dimension for the worst-case for a given a SIL level? – Application to CAN, AFDX and switched Ethernet in aerospace, power plant and automotive domains © 2010 RTaW / PSA / Loria - 26
  27. 27. References © 2010 RTaW / PSA / Loria - 27
  28. 28. References [1] N. Navet, F. Simonot-Lion, editors, The Automotive Embedded Systems Handbook, Industrial Information Technology series, CRC Press / Taylor and Francis, ISBN 978-0849380266, December 2008. [2] RealTime-at-Work (RTaW), A Fine-Grained Simulation of Controller Area Network with Fault-Injection Capabilities, freely available on RTaW web site: http://www.realtimeatwork.com, 2010. [3] M. Grenier, J. Goossens, N. Navet, "Near-Optimal Fixed Priority Preemptive Scheduling of Offset Free Systems", Proc. of the 14th International Conference on Network and Systems (RTNS'2006), Poitiers, France, May 30-31, 2006. [4] D. Khan, R. Bril, N. Navet, “Integrating Hardware Limitations in CAN Schedulability Analysis“, WiP at the 8th IEEE International Workshop on Factory Communication Systems (WFCS 2010), Nancy, France, May 2010. [5] D. Khan, N. Navet, B. Bavoux, J. Migge, “Aperiodic Traffic in Response Time Analyses with Adjustable Safety Level“, IEEE ETFA2009, Mallorca, Spain, September 22-26, 2009. [6] R. Davis, A. Burn, R. Bril, and J. Lukkien, “Controller Area Network (CAN) schedulability analysis: Refuted, revisited and revised”, Real-Time Systems, vol. 35, pp. 239–272, 2007. [7] M. D. Natale, “Evaluating message transmission times in Controller Area Networks without buffer preemption”, in 8th Brazilian Workshop on Real- Time Systems, 2006. [8] C. Braun, L. Havet, N. Navet, "NETCARBENCH: a benchmark for techniques and tools used in the design of automotive communication systems", Proc IFAC FeT 2007, Toulouse, France, November 7-9, 2007. © 2010 RTaW / PSA / Loria - 28
  29. 29. Questions / feedback ? Please get in touch at : nicolas.navet@realtimeatwork.com aurelien.monot@mpsa.com jorn.migge@realtimeatwork.com © 2010 RTaW / PSA / Loria - 29

×