Timing verification of automotive communication architecture using quantile estimation


Published on

Slides of a paper at ERTSS'2014 co-authored by Nicolas NAVET (University of Luxembourg), Shehnaz LOUVART (Renault), Jose VILLANUEVA (Renault), Sergio CAMPOY-MARTINEZ (Renault) and Jörn MIGGE (RealTime-at-Work). Early stage timing verification on CAN traditionally relies on simulation and schedulability analysis, also known as worst-case response time (WCRT) analysis. Despite recent progresses, the latter technique remains pessimistic especially in complex networking architectures with gateways and heterogeneous communication stacks. Indeed, there are practical cases where no exact WCRT analysis is available, and merely upper bounds on the response times can be derived, on the basis of which unnecessary conservative design choices may be made. Simulation, on the other hand, does not provide anyguarantees per se and, in the context of critical networks, should only be used along with an adequate methodology. In this paper, we argue for the use of quantiles of the response time distribution as performance
metrics providing an adjustable trade-off between safety and resource usage optimization. We discuss how the exact value of the quantile to consider should be chosen with regard to the criticality of the frames, and illustrate the approach on two typical automotive use-cases.

Published in: Technology
  • Be the first to comment

Timing verification of automotive communication architecture using quantile estimation

  1. 1. Timing verification of automotive communication architecture using quantile estimation Nicolas NAVET (Uni Lu), Shehnaz LOUVART (Renault), Jose VILLANUEVA (Renault), Sergio CAMPOY-MARTINEZ (Renault) and Jörn MIGGE (RealTime-at-Work). ERTSS’2014 - Toulouse, February 5-7, 2014. February 07, 2014
  2. 2. 1 Outline  Early-stage timing verification of wired automotive buses – CAN-based communication architectures Schedulability analysis versus simulation Performance metrics : the case for quantiles derived by simulation 2 typical automotive use-cases ERTSS'2014 07/02/2014 - 2
  3. 3. 2 Automotive communication architectures  Increased bandwidth requirements & timing constraints  More complex & heterogeneous architectures with black-box ECUs  Optimized CAN networks for higher bus loads: priorities, frame offsets, gateways, communication stacks, etc  Verification activity of higher importance today, higher load levels calls for more accurate verification models  no margin for errors  Main performance metrics: frame response time = communication latency ERTSS'2014 07/02/2014 - 3
  4. 4. Schedulability analysis “mathematic model of the worst-case possible situation” VS Schedulability analysis : Simulation “mathematic reproduces the “program that model of the worst-case possible situation” behavior of a system” max number of max number of instances that can instances arriving after accumulate at critical critical instants instants  Upper bounds on the perf. metrics  Safe if model is correct and assumptions met  Models close to real systems  Often pessimistic  overdimensioning  Fine grained information  Might be a gap between models and real systems!  unpredictably unsafe then  Worst-case response times are out of reach! Occasional deadline misses must be acceptable ERTSS'2014 07/02/2014 - 4
  5. 5. RTaW : “enable designers to build provably safe and optimized critical systems” – Simulation and schedulability analysis for networks and ECU CAN, CAN FD, Arinc825, Ethernet, FlexRay, AFDX, etc… – OEM customers: Renault, PSA, Eurocopter, Astrium, ABB − RTaW/Sim Starter edition can be downloaded from www.realtimeatwork.com − No black box software: all schedulability analysis that are implemented are published Used in this study RTaW-Sim  CAN simulator with schedulability analysis and configuration algorithms ERTSS'2014 07/02/2014 - 5
  6. 6. 2 Metrics for the evaluation of frame latencies: the case for quantiles
  7. 7. Frame response time distribution Upper-bound with schedulability analysis (actual) worst-case response time (WCRT) Probability Simulation max. Q1 Q2 Response time Easily observable events Testbed / Simulation Infrequent events Long Simulation Rare events Schedulability analysis Q1: pessimism of schedulability analysis ?! Q2: distance between simulation max. and WCRT ?! ERTSS'2014 07/02/2014 - 10
  8. 8. Using quantiles means accepting a controlled risk Quantile Qn: P[ response time > Qn ] < 10-n Probability Upper-bound with schedulability analysis Simulation max. Q4 Q5 Probability < 10-5 one frame every 100 000 Response time  No extrapolation here, won’t help to say anything about what is too rare to be in simulation traces ERTSS'2014 07/02/2014 - 11
  9. 9. Identifying both deadline and tolerable risks Probability Simulation max. Q4 deadline Q 5 Response time 1. 2. 3. 4. Identify frame deadline Decide the tolerable risk  target quantile Simulate “sufficiently” long If target quantile value is below deadline, performance objective is met ERTSS'2014 07/02/2014 - 12
  10. 10. 1) Quantiles vs average time between deadline misses Quantile One frame every … Mean time to failure Frame period = 10ms Mean time to failure Frame period = 500ms Q3 1 000 10 s 8mn 20s Q4 10 000 1mn 40s ≈ 1h 23mn Q5 100 000 ≈ 17mn ≈ 13h 53mn Q6 1000 000 ≈ 2h 46mn ≈ 5d 19h … … … Warning : successive failures in some cases might be temporally correlated, this must be assessed! Use of distributions of successive quantile overshoots, linear and non-linear dependency analysis ERTSS'2014 07/02/2014 - 13
  11. 11. 2) Determine the minimum simulation length time needed for quantile convergence  reasonable # of values: a few tens … Tool support can help here: e.g. numbers in gray should not be trusted ERTSS'2014 [RTaW-sim screenshot] Reasonable values for Q5 and Q6 (with periods <500ms) are obtained in a few hours of simulation (with a highspeed simulation engine) – e.g. 2 hours for a typical automotive setup 07/02/2014 - 14
  12. 12. 3 Typical use-cases of quantile-based performance evaluation
  13. 13. Use-case 1: OBD2 request through a gateway 50% load – 500kbit/s 40% load – 500kbit/s OBD2 request Simulated production delay response Conservative assumptions: FIFO, transmission errors [RTaW-sim screenshot] Time between the OBD2 request frame and reception of the first answer frame must not be greater than 50ms once every 1000 requests ERTSS'2014 07/02/2014 - 16
  14. 14. Use-case 1: OBD2 request through a gateway Time between the OBD2 request frame and reception of the first answer frame must not be greater than 50ms once every 1000 requests Metrics OBD response times Min Average Q3 Q4 Q5 Q6 Max 31.94 34.29 46.55 49.31 53.45 55.32 56.57 Q3 Q4 Response time distribution ERTSS'2014 07/02/2014 - 17
  15. 15. Use-case 2: end-to-end response time of a 10ms control frame 10ms T13 frame Functional level impact: less than 1 frame every 106 above deadline=10ms is acceptable Q6 = 8.9 max= 12.1 ERTSS'2014 07/02/2014 - 18
  16. 16. Concluding remarks 1 Timing verification techniques & tools should not be trusted blindly 2 Simulation is well suited to systems that requires timing guarantees but  Are not well amenable to schedulability analysis  Or can tolerate deadline misses with a controlled level of risk 3 Some methodological aspects  Determine quantile wrt criticality, and simulation length wrt to quantile  Simulator and models validation  High-performance simulation engine needed for higher quantiles ERTSS'2014 07/02/2014 - 19