Innovation day 2013 2.5 joris vanderschrick (verhaert) - embedded system development

2
When first-time-right embedded system developments need to become cost-effective
CONFIDENTIAL
Joris Vanderschrick
Business development embedded systems
joris.vanderschrick@verhaert.com
THEME 2: RISK MANAGEMENT IN INNOVATION

4
Risk based development methods
Cut
• Development phases
• Functional subsystems
Tangible
• Visualize
• Simulate
• Test
• Review
• Roadshow
Risk focus
• Criticalities
• Added value
• 360°
Early
• Rapid prototyping
• First time right
Options
• Backup
• Buffer
• Requirements vs. design
Reliability

5
Introduction
Reliability: The measure of a product’s ability to
…perform the specified function
…at the customer (within their use environment)
…over the desired lifetime

6
Customer satisfaction vs reliability
[Chart: customer satisfaction vs reliability; labelled values 6% and 17%]

7
Cost of changes (as a function of reliability)
Source: NJIT, Rishi R Persad

8
Objectives of a reliability approach
Objectives:
• Early identification of weak points in design to:
o Limit the risk/cost of modifications in production or deployment phase
o Reduce product failures/returns/recalls during the product lifecycle
o Improve time to market by early detection of weakness and flaws
o Minimize number of dead-on-arrivals
o Increase customer satisfaction

9
Different methods to define reliability
3 methods:
1. Theoretical Approach: Standards/Norms & Simulations
2. Pragmatic Approach: Accelerated Testing
3. Analytical Approach: Physics of Failure (PoF)

10
Example of a Verhaert approach
Step 1: Inventory & scoping
• Define subsystems
• Explore norms & guidelines
• FMECA, FTA, …
Step 2: Calculations & simulations
Step 3: Detailing of tests
Step 4: Implementation of test plan
• Execution
Theoretical approach
• Failure rate calculations following the selected norm/standard
• Simulations (mech., electr., SW, …)
Pragmatic approach
• HALT vs traditional testing
PoF approach
• FMMEA
• Define failure mechanisms
• Define acceleration factors
• Analytical models
• Optimize analytical models

11
Inventory & Scoping

12
Inventory
The definition of the reliability approach starts with an inventory per subsystem:
• System breakdown into subsystems, assemblies, subassemblies and components
• Typical systems: electronic & electrical systems, mechanical, hydraulic, process systems, …
• Which critical topics are relevant (FMECA: failure modes)?
• How will these critical topics be evaluated with respect to lifetime (via which norm or guideline)?
This involves an inventory of the norms or guidelines that are most relevant for the application or intended purpose.

13
How? Example: FMECA
RPN (Risk Priority Number) = Severity x Occurrence x Detection
The RPN can then be used to compare issues within the analysis and to prioritize problems for corrective action. The ratings are defined by:
• The main published standards for this type of analysis, such as SAE J1739, AIAG FMEA-3 and MIL-STD-1629A.
• Procedures that industries and companies have developed themselves to meet the specific requirements of their products/processes.
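A minimal sketch of how the RPN ranking could be computed. The 1 to 10 rating scales and the three failure modes are assumptions for illustration only; the actual scales come from the standard or company procedure that is selected.

```python
from dataclasses import dataclass


@dataclass
class FailureMode:
    name: str
    severity: int    # assumed scale: 1 (negligible) .. 10 (hazardous)
    occurrence: int  # assumed scale: 1 (remote) .. 10 (very frequent)
    detection: int   # assumed scale: 1 (almost certainly detected) .. 10 (undetectable)

    @property
    def rpn(self) -> int:
        # RPN = Severity x Occurrence x Detection
        return self.severity * self.occurrence * self.detection


# Hypothetical entries, not taken from a real analysis.
modes = [
    FailureMode("Connector corrosion", severity=5, occurrence=3, detection=4),
    FailureMode("Solder-joint crack", severity=8, occurrence=4, detection=6),
    FailureMode("Firmware watchdog reset", severity=6, occurrence=2, detection=2),
]

# Highest RPN first: these are the issues to address first.
ranked = sorted(modes, key=lambda m: m.rpn, reverse=True)
for m in ranked:
    print(f"{m.name}: RPN = {m.rpn}")
```

The ranking only prioritizes; it has value once each high-RPN item is tied to a corrective action and re-rated afterwards.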

14
Why use a FMECA
FMECA/FMEA is useful as a survey method to identify the effects of major failure modes in a system.
It can contribute to improved designs for products and processes, resulting in higher reliability, better quality, increased safety, enhanced customer satisfaction and reduced costs.
• Avoid time-consuming and costly design changes at a late stage in the development.
• The tool can also be used to establish and optimize maintenance plans, control plans and other quality assurance procedures.
• In addition, an FMEA or FMECA is often required to comply with safety and quality requirements, such as ISO 9001, QS 9000, ISO/TS 16949, ISO 13485, FDA, …
Remarks:
• Complex systems & processes make the task of defining a detailed FMEA/FMECA time-consuming.
• It assumes that the causes of problems are all single events in nature (combinations of events = 1 event).
• The process relies on the right participants, open communication & cooperation.
• Human error is sometimes overlooked.
It’s just a tool. Without a follow-up plan & actions, it will not improve the reliability of your system.

15
Scoping
Evaluation & definition of the appropriate calculation methods for the failure rate
• For the defined building blocks (subsystems) & specific parts, we analyze which norm or standard provides the best method for the evaluation & calculation of the failure rate.
Work packages
• 1.1. Preparation with AGFA
• Reliability of electronic components
• Reliability of mechanical parts
• General approach & study logic with respect to reliability design & production
• Software reliability
ECSS-E-ST-33-01C Space Mechanisms
o Scope of the standard: requirements applicable to the concept definition, design, analysis, development, production, test verification and operation of space mechanisms to meet the mission performance requirements

16
Step 2: Calculations & Simulations

17
MTBF, FIT calculations (Prediction Method)
To obtain high product reliability, reliability issues should be considered from the very beginning of the design phase. This leads to the concept of reliability prediction.
• MTBF: Mean Operating Time Between Failures
• FIT: Failures In Time, the number of failures per 10^9 device-hours
• The failure rate of the system is calculated by summing the failure rates of each component in each category (based on probability theory). This assumes that a failure of any component leads to a system failure.
• Constant failure rate → relevant for the useful lifetime
• Fault is repairable
• MIL-HDBK-217F is probably the most internationally recognized empirical prediction method.
• MIL-HDBK-217F offers two methods: Parts Count and Parts Stress
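The summation rule above can be sketched in a few lines. The component names and FIT values below are hypothetical placeholders, not handbook data; in a real prediction each value would follow from the selected norm or standard.

```python
# Hypothetical component failure rates in FIT (failures per 1e9 device-hours).
component_fit = {
    "microcontroller": 50.0,
    "dc_dc_converter": 120.0,
    "connector": 20.0,
    "capacitors_total": 10.0,
}

HOURS_PER_FIT_UNIT = 1e9

# Series assumption: any component failure is a system failure,
# so the system failure rate is the sum of the component rates.
system_fit = sum(component_fit.values())
system_lambda_per_hour = system_fit / HOURS_PER_FIT_UNIT

# With a constant failure rate, MTBF is the reciprocal of the failure rate.
mtbf_hours = 1.0 / system_lambda_per_hour

print(f"System failure rate: {system_fit:.0f} FIT")
print(f"MTBF: {mtbf_hours:,.0f} hours")
```

The result is only as good as the series assumption and the constant failure rate assumption, so it describes the useful-life period and says nothing about infant failures or wear-out.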

19
Simulations
FEA simulations
Finite element analysis (FEA), also called FEM analysis:
FEA consists of a computer model (2D, 3D) of a material or design that is stressed and analyzed for specific results.
It is used in new product design and in existing product refinement. A company can verify that a proposed design will perform to the client's specifications prior to manufacturing or construction.
What can you check at an early stage?
• Point, pressure, thermal, gravity, and centrifugal static loads
• Thermal loads from the solution of a heat transfer analysis
• Enforced displacements
• Heat flux and convection
• Point, pressure and gravity dynamic loads
Examples:
• Drop/shock
• Bending, load
• Vibration
• Thermal cross points
• …

20
Simulations
DESTECS (Design Support and Tooling for Dependable Embedded Control Software)
• Inspiration
o Use collaborative multidisciplinary design of embedded systems
o Rapid construction and evaluation of system models
o Evaluated on industrial applications
• Needed because embedded systems are facing
o More demanding requirements for reliability and fault tolerance
o Increasing distribution: more complex design possibilities → more fault scenarios

22
Conclusions
Advantages of empirical methods:
• Easy to use, and a lot of component models exist.
• Relatively good indicators of inherent reliability.
• Provide an approximation of field failure rates.
Disadvantages of empirical methods:
• Based on statistical data that is sometimes outdated.
• Not all components from new designs are described in the standard.
• Failure of the components is not always due to component-intrinsic mechanisms but can be caused by the system design.
Simulations
• Early validation of your system
• More and faster iterations
• Parallel hw & sw development
• Early full system validation and risk mitigation without hw
• Less real-life testing
(= the poor man’s approach)

Pragmatic Approach

24
Not Traditional Testing!!
• Traditional (QA) testing is done before product release but after the design & development phase (e.g. burn-in test, environmental testing, drop testing, shock & vibration testing, …)
• Many of today's products are capable of operating under extremes of environmental stress and for thousands of hours without failure. Traditional test methods are no longer sufficient to identify design weaknesses or validate life predictions.
Disadvantages
• Testing under operating conditions → takes too long
• Testing is costly (equipment, time, …)
• It tells you nothing about the reliability during useful life, only about infant failures (DOA).
• Too late in the NPD process → design corrections will be very expensive

25
Highly accelerated testing
HALT = Highly Accelerated Life Test
What?
• Highly accelerated life testing (HALT) techniques are important in uncovering many of the weak links of a product DURING THE DESIGN PHASE
• These discovery tests rapidly find weaknesses using accelerated stress conditions
• Stresses are applied in a controlled, incremental fashion while the unit under test is continuously monitored for failures
Why?
HALT reveals, in a matter of hours or days, product failure modes that traditional test methods can take weeks or even months to find, if at all.
The purpose of HALT is to determine the operating and destruct limits of a design, why those limits exist and what is required to increase those margins. HALT therefore stresses products beyond their design specifications.

26
Procedure?
• Using a test environment that is more severe than that experienced during normal equipment use.
• Done on early prototypes & different design concepts
Since higher stresses are used, accelerated testing must be approached with caution to avoid introducing
failure modes that will not be encountered in normal use. Accelerating factors used, either singly or in
combination, include:
• More frequent power cycling
• Higher vibration levels
• High humidity
• More severe temperature cycling
• Higher temperatures
‘It’s not a Pass/Fail test but a discovery process!’

27
Results
• Structural weaknesses
• Electronic weaknesses
• Component failures
• Component dislocation
• PCB delamination, via-cracking, …
• Solder failure
• Software failures due to component degradation
• Connector problems
• ...
• Information on product limits and product capabilities outside the limits
• Product weaknesses & design errors

28
Goals
HALT provides engineers with the opportunity to improve
product design, increasing its robustness and minimizing
possibility of costly warranty services and expensive
product recalls after release
Once the weaknesses of the product are uncovered and corrective actions taken, the limits of the
product are clearly understood and the operating margins have been extended as far as possible.
A much more mature product can be introduced much more quickly with a
higher degree of reliability.

29
Taking it a step further…
• Define the S-N curve (stress versus number of cycles to failure) for the specific failure mechanisms
• Use test data in a model that relates the reliability (or life) measured under high stress conditions to that expected under normal operation, to determine the length of life
• Accelerated test models relate the failure rate or the life of a component to a given stress, so that measurements taken during accelerated testing can be extrapolated back to the expected performance under normal operating conditions
→ Design for Reliability!!! → PoF

EXAMPLE: Central Heating sensor

32
Thermal cycle vs measurement errors
Goal:
• Required lifetime of the product = 10 years
• Verify the reliability of measurements with a HALT test setup
• Discover design weaknesses, improve & repeat the test
Setup:
• Acceleration: cycle 1x/day => 1x/hour
• Acceleration: min-max temperatures & high transients
• Statistically relevant number of test samples (one is not enough)
• Identify & measure the performance parameter(s)
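The effect of the first acceleration step (one cycle per hour instead of one per day) can be checked with simple arithmetic. This sketch only covers the cycle frequency; it assumes 365 days per year and ignores any extra acceleration from the more severe temperature range and transients.

```python
LIFETIME_YEARS = 10        # required lifetime from the goal above
FIELD_CYCLES_PER_DAY = 1   # field use: one thermal cycle per day
TEST_CYCLES_PER_HOUR = 1   # test: one thermal cycle per hour
DAYS_PER_YEAR = 365        # assumption: leap days ignored

# Number of thermal cycles the product sees over its required lifetime.
cycles_needed = LIFETIME_YEARS * DAYS_PER_YEAR * FIELD_CYCLES_PER_DAY

# Test duration needed to apply the same number of cycles.
test_hours = cycles_needed / TEST_CYCLES_PER_HOUR
test_days = test_hours / 24

# Ratio between field time and test time.
time_compression = (LIFETIME_YEARS * DAYS_PER_YEAR) / test_days

print(f"Cycles over lifetime: {cycles_needed}")
print(f"Test duration: {test_hours:.0f} h (about {test_days:.0f} days)")
print(f"Time compression: {time_compression:.0f}x")
```

Under these assumptions, frequency alone still leaves a test of roughly five months, which is why the setup also raises the stress level per cycle.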

35
Conclusions
• Upfront definition of evaluation criteria is important.
• Multiple failure modes
• Early failures
• Non-constant (random) failures
• Performance degradation over time: Quality of the measurements will degrade in time.
• Temperature induced (thermo-mechanical stress)

36
HALT vs Field & Traditional testing
Field testing
• Time-consuming
• Network
• Costly installations
• More spread on the test results
• Same test conditions cannot be guaranteed: difficult for quantitative comparison
HALT
• Faster results (accelerated stress)
• Correct & increase design reliability throughout the test procedures
• Control over test conditions
• Main costs: fabrication of samples, test setup, assembly, testing, …
Traditional testing
• Time-consuming (operational stress)
• Expensive setups
• Expensive corrective actions
• Too late in design cycle
• Only for infant failures (DOA)

PoF approach

38
Current approaches = not sufficient?
• Mostly only an FMECA is executed. It rarely identifies design issues because of its limited focus on the failure mechanism
• Incorporation of HALT and failure analysis (HALT is a test, not Design for Reliability (DfR); failure analysis comes too late)
• MTBF/MTTF calculations tend to assume that failures are random in nature
o Provides no motivation for failure avoidance
• Easy to manipulate numbers
o Tweaks are made to reach the desired MTBF
o E.g., quality factors for each component are modified
• Often misinterpreted
o A 50K hour MTBF does not mean no failures in 50K hours
Source: Loughborough University
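The misinterpretation of MTBF can be made concrete with a small calculation. This sketch assumes the constant failure rate (exponential) model that MTBF predictions rely on; the function name is illustrative.

```python
import math


def survival_probability(t_hours: float, mtbf_hours: float) -> float:
    """R(t) = exp(-t / MTBF), valid for a constant failure rate."""
    return math.exp(-t_hours / mtbf_hours)


MTBF_HOURS = 50_000.0

for t in (5_000.0, 25_000.0, 50_000.0):
    r = survival_probability(t, MTBF_HOURS)
    print(f"t = {t:>8,.0f} h: surviving fraction = {r:.3f}, "
          f"failed fraction = {1 - r:.3f}")
```

Under this model about 63% of the units are expected to have failed by the time the operating hours equal the MTBF, so a 50K hour MTBF is a rate figure, not a failure-free period.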
Alternative = Physics-of-Failure (PoF) principle:
The use of science (physics, chemistry, etc.) to capture an understanding of failure mechanisms and evaluate useful life under actual operating conditions

40
Focus on failure mechanisms
Failure Mode:
o The EFFECT by which a failure is OBSERVED, PERCEIVED or SENSED.
Failure Mechanism:
o The PROCESS (electrical, mechanical, physical, chemical, etc.) that causes failures.
FMMEA: Add failure mechanisms to FMEA

41
Example FMMEA
Center for Advanced Life Cycle Engineering (CALCE),
University of Maryland
Infusion Pump

42
Further break-down to PBA level
Failure site = CBGA IC broken-off from PCB
Failure Mode = Solder-joint fatigue
Failure effect: Solder-Joint crack
Solder joint = surface-mount solder attachment.
It provides the electrical interconnection & mechanical attachment of the electronic component to the PCB, but also the critical heat transfer between them.

43
Example: Solder-joint cracks
Failure mechanism: solder-joint fatigue by CTE mismatch
Caused by local thermal mismatches between the different material characteristics of the IC, the PCB and the solder itself = CTE (Coefficient of Thermal Expansion) mismatch.
Result: different thermal expansions, due to the thermal energy dissipated → stress on solder joints → fatigue
Fatigue leads to grain growth inside the solder → Result: cracks!

44
S-N curve of solder-joint fatigue
• For each failure mode an S-N curve can be defined
• Solder-joint fatigue = Function of Thermal strain vs N cycles to failure
Established out of:
• Test data
• Statistics
• FE simulation
• Physical modeling
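As an illustration of how test data could be turned into an S-N curve, the sketch below fits a power law N = C · S^(-b) by least squares on log-log data. The power-law form and the data points are assumptions for illustration; the slide does not specify the curve shape, and real solder-joint data would show scatter.

```python
import math


def fit_power_law(strains, cycles):
    """Least-squares fit of log(N) = log(C) - b * log(S); returns (C, b)."""
    xs = [math.log(s) for s in strains]
    ys = [math.log(n) for n in cycles]
    count = len(xs)
    mean_x = sum(xs) / count
    mean_y = sum(ys) / count
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return math.exp(intercept), -slope


def predict_cycles(strain, c, b):
    """Cycles to failure at a given strain for the fitted curve."""
    return c * strain ** (-b)


# Hypothetical test data: thermal strain range (%) vs cycles to failure.
strains = [0.5, 1.0, 2.0, 4.0]
cycles = [8000.0, 2000.0, 500.0, 125.0]

C, b = fit_power_law(strains, cycles)
print(f"Fitted curve: N = {C:.0f} * S^(-{b:.2f})")
print(f"Predicted cycles at 0.25% strain: {predict_cycles(0.25, C, b):.0f}")
```

Extrapolating below the tested strain range, as the last line does, is only defensible when the same failure mechanism is known to dominate there.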

45
Acceleration
Acceleration:
Thermal swings (dT) in the operational environment
→ accelerating the thermal strain
→ accelerating solder-joint fatigue
→ accelerating failure effect: solder-joint crack
Acceleration test:
Thermal cycling test requirements:
• Heat/cool rate limited (transient)
• Allow for minimal dwell times at extreme temperatures: time is essential.
• Materials set limits to temperature extremes
→ Establish acceleration factor = thermal strain (accelerated temp conditions) / thermal strain (normal temp conditions)
Acceleration model:
A mathematical model that extrapolates the number of cycles to failure under accelerated temperature conditions to the number of cycles to failure under operational temperature conditions
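One commonly used acceleration model for thermal cycling is a Coffin-Manson-type power law. The sketch below assumes that thermal strain is proportional to the temperature swing dT and that cycles to failure scale with the strain ratio raised to an exponent. The exponent and all numbers are hypothetical; in practice the exponent must be established per material and failure mechanism.

```python
def acceleration_factor(dt_test: float, dt_use: float, exponent: float) -> float:
    """AF = (dT_test / dT_use) ** exponent (Coffin-Manson-type power law)."""
    return (dt_test / dt_use) ** exponent


def cycles_to_failure_in_use(cycles_in_test: float, af: float) -> float:
    """Extrapolate cycles to failure from test conditions to use conditions."""
    return cycles_in_test * af


DT_TEST = 120.0        # hypothetical temperature swing in the test (K)
DT_USE = 40.0          # hypothetical temperature swing in operation (K)
EXPONENT = 2.0         # hypothetical exponent, to be determined per mechanism
CYCLES_IN_TEST = 1500  # hypothetical cycles to failure observed in the test

af = acceleration_factor(DT_TEST, DT_USE, EXPONENT)
n_use = cycles_to_failure_in_use(CYCLES_IN_TEST, af)

print(f"Acceleration factor: {af:.1f}")
print(f"Estimated cycles to failure in operation: {n_use:.0f}")
```

The extrapolation holds only while the accelerated test triggers the same failure mechanism as normal use, which is why the test requirements above limit heating and cooling rates and temperature extremes.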

46
Example: Solder-joint cracks
Establish test failure distribution and predict operational failure distribution
using the acceleration factors and the operational use of the product
Use test data in a model relating the reliability (or life) measured under high
stress conditions to that which is expected under normal operation to
determine length of life
[Figure: failure distributions at the test point and at the operation point]
47
Characteristics, benefits and limitations:
• Physics not statistics.
• The only way to predict long term wearout lifetime.
• Testing is in general done on specially designed test samples, not on the actual product.
• It is input for the design process. Can be established independently of the design cycle. Time-to-market!
• Requires profound understanding of technologies used in the product and the wearout physics
involved.
• Limitation: Establishing the S-N curves and acceleration factors is a tedious, time-consuming and
expensive job with a lot of pitfalls. Therefore, for many relevant failure mechanisms S-N or
acceleration factor information is not available.
• Still subject of scientific research.

48
General Conclusion

49
[Figure: reliability methods (calculations MTBF/FIT, simulations, HALT, FMECA, FMMEA, PoF, traditional testing, field testing) positioned against reliability and cost for design changes]

50
[Figure: reliability methods (calculations MTBF/FIT, simulations, HALT, PoF, traditional testing, field testing, FMECA, FMMEA) positioned over the product life phases: infant failures and useful life (normal operation)]

52
VERHAERT MASTERS IN INNOVATION®
Headquarters
Hogenakkerhoekstraat 21
9150 Kruibeke (B)
tel +32 (0)3 250 19 00
fax +32 (0)3 254 10 08
ezine@verhaert.com
More at www.verhaert.com
VERHAERT MASTERS IN INNOVATION®
Netherlands
ESIC European Space Innovation Centre
Kapteynstraat 1
2201 BB Noordwijk (NL)
Tel: +31 (0)618 12 19 19
derk.schneemann@verhaert.com
More at www.verhaert.com
MASTERS IN INNOVATION® is a platform set up by VERHAERT to train, stimulate and incubate
you as an innovator.
We provide an extensive training program with different tracks and covering critical areas of new
products and business innovation.
Furthermore we manage the VERHAERT venturing program and organize our Innovation Day, an
annual conference on best practices and insights on new products & business innovation.
