Design for Testability DfT Seminar

Published in: Technology, Business

  1. 1. Test Engineering. Courtesy of Patrick D.T. O’Connor, 62 Whitney Drive, Stevenage, Herts. SG1 4BJ, UK. www.pat-oconnor.co.uk, www.pat-oconnor.co.uk/testengineering/htm, pat@pat-oconnor.co.uk, pdtoconnor@ieee.org
  2. 2. Test Engineering. Outline (day 1): 1. Introduction 2. Stress, strength, failure of materials 3. Stress, strength, failure of electronics 4. Variation and reliability 5. Design analysis 6. Development test principles
  3. 3. Test Engineering. Outline (day 2): 7. Materials and systems test 8. Electronics test 9. Software 10. Manufacturing test 11. Testing in service 12. Data collection and analysis 13. Laws, regulations, standards 14. Managing test
  4. 4. Test Engineering. Why test? • Design uncertainty • Manufacturing • Variation • Maintenance • Regulations • Contracts
  5. 5. Test Engineering. Causes of failure: • Design inherently incapable • Variation (parameters, environments) • Wearout • Other time-dependent mechanisms • Sneaks • Errors. We must know them all!
  6. 6. Test Engineering. How to test? • Test to succeed/test to fail? • Accelerated test • Systems and components • Technologies • Processes • Analysis and simulation
  7. 7. Test Engineering. Testing tales: • “Our engineers are paid to design right” • “Trains don’t need testing” • Ship engine for a locomotive? • We always have done this test • The telecoms system • MIL-STD-883 IC burn-in test • “Don’t overstress” • Too much test?
  8. 8. Test Engineering. Development test principles: • Failure costs exceed costs of test to detect & remove (Deming) • Failure-free design: selection, training, teams, leadership • Optimise test programme • Test adds value!
  9. 9. Test Engineering. Development test costs: • Test articles (“UUT”) • People × time • Facilities • Delay to market • Downstream opportunities (warranty, fixes, reputation, etc.)
  10. 10. Test Engineering. Management aspects: • Design capability/risks • Markets, competition • Product environment, life • Suppliers • Regulations • Manufacturing, service
  11. 11. FAILURE CAUSES: MECHANICAL • Maximum stress, fracture • Stress cycling, fatigue, creep (vibration, temperature cycle) • Wear • Corrosion • Manufacture • Variation • Other (leaks, backlash, friction, ...)
  12. 12. MATERIAL STRESS, STRENGTH, FAILURE. Properties: • Strength/elasticity (Hooke’s Law): stress (σ) = Young’s modulus (E) × strain (ε) • Yield strength, ultimate tensile strength (UTS) • Toughness/brittleness (resistance to fracture: energy/volume) • Crack growth (Griffith’s Law)
  13. 13. MATERIAL STRESS, STRENGTH, FAILURE. Hooke’s Law. [Figure 2.1: Material behaviour in tensile stress. Stress σ vs. strain ε, marking the elastic region, yield point, plastic region and fracture.]
  14. 14. MATERIAL STRESS, STRENGTH, FAILURE. [Figure 2.2: Tensile stress/strain behaviour of different materials (generalised). Stress σ (MPa) vs. strain ε (%). Brittle: cast iron, ceramics, glass. Tough: kevlar, steels, alloys (Al, Ti, etc.). Ductile: plastics, copper, solder.]
  15. 15. FINITE ELEMENT ANALYSIS (MECHANICAL STRESS) (MSC)
  16. 16. MECHANICAL FAILURE CAUSES • Shock overload: constant failure/hazard rate (CFR/CHR) (load-strength analysis) • Strength deterioration: increasing failure/hazard rate (IFR/IHR), durability
  17. 17. CAUSES OF STRENGTH DETERIORATION • Fatigue (cyclic stress: vibration, handling, temperature cycling) • Creep (high temperature + mech. stress) • Wear (parts moving in contact: connectors) • Corrosion (electrolytic, contamination, ...) • etc.
  18. 18. FATIGUE: S-N CURVE. [Figure: stress S vs. cycles to failure N (log scale, 1 to 100000), marking the UTS and the fatigue limit.]
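  19. 19. FATIGUE: MINER’S RULE. $\frac{M_1}{n_1} + \frac{M_2}{n_2} + \dots + \frac{M_k}{n_k} = 1$ (in the usual statement of the rule, each numerator is the number of cycles applied at stress level i and each denominator is the number of cycles to failure at that level)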
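The damage sum in Miner's rule can be sketched in a few lines of Python. This is only an illustration, assuming the usual reading of the rule (cycles applied at each stress level divided by cycles to failure at that level, with failure expected when the sum reaches 1). The function name and the cycle counts are hypothetical.

```python
def miner_damage(applied_cycles, cycles_to_failure):
    """Return the cumulative damage fraction sum(applied_i / to_failure_i)."""
    if len(applied_cycles) != len(cycles_to_failure):
        raise ValueError("need one cycles-to-failure value per stress level")
    return sum(m / n for m, n in zip(applied_cycles, cycles_to_failure))


# Hypothetical duty cycle: three stress levels, each with its own S-N life.
applied = [20_000, 5_000, 1_000]
to_failure = [100_000, 20_000, 5_000]

damage = miner_damage(applied, to_failure)
print(f"damage fraction = {damage:.2f}")          # 0.20 + 0.25 + 0.20
print("fatigue life used up" if damage >= 1 else "life remaining")
```

A damage fraction below 1 means the rule predicts remaining fatigue life; the cycles-to-failure values would come from the material's S-N curve.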
  20. 20. “CLASSIC” FATIGUE FAILURE. [Figure: fracture surface showing the initiating crack or damage, crack growth rings and the granular fracture surface.]
  21. 21. DESIGN AGAINST FATIGUE • Reduce mech. stress concentrations (FEA) • Provide support for heavy components, connectors, etc. • Minimise thermal gradients • Know material fatigue properties, particularly solder! • Design for safe life • Design for fail-safe • Design for inspection & test
  22. 22. VIBRATION. Leads to: • Fatigue • Wear • Loosening • Leaks • Noise
  23. 23. VIBRATION. Measures: • Frequency (Hz) • Displacement (m) • Velocity (m/s) • Acceleration (peak) (m/s² or gₙ) • Damping (reduces amplitude) • Noise, vibration and harshness (NVH)
  24. 24. VIBRATION: WATERFALL PLOT. [Figure 2.5: Waterfall plot of vibration data]
  25. 25. TEMPERATURE EFFECTS • Expansion/contraction (TCE) • Softening, weakening, melting (metals, some plastics) • Charring (plastics, organics) • Drying/condensation/freezing • Other physical/chemical (Arrhenius’ Law) • Viscosity change, lubricant loss • Interactions (corrosion, …)
  26. 26. WEAR MECHANISMS • Adhesive • Fretting • Abrasive • Cavitation/erosion • Corrosive
  27. 27. WEAR REDUCTION • Examine • Test/analyse • Lubricate (oils, MoS₂, ...) • Surface treatment (PTFE, …) • Stress reduction (mech, temp, vibration) • Material change (e.g. non-abrasive)
  28. 28. CORROSION • Ferrous alloys (rust) • Non-ferrous: Al, Mg • Chemical • Electrolytic
  29. 29. PREVENTING CORROSION • Material selection • Surface protection: anodising, plating (Cr, Sn, ...), painting, lubricating • Environmental protection (seals, desiccants)
  30. 30. OTHER MECHANICAL FAILURE MECHANISMS • Backlash (wear?) • Adjustments • Leaks • Loosening (fasteners): wear? maintenance? • etc.
  31. 31. MATERIAL SELECTION FOR RELIABILITY/DURABILITY • Metals: corrosion, protection, fatigue • Plastics, rubbers: chemical, temperature stability, UV sensitivity • Ceramics: fracture toughness • Composites: impact strength, delamination, erosion
  32. 32. Electrical/Electronics Stress, Strength & Failure • Component selection • Stress derating (electrical, thermal) • EMI, EMC, ESD • Parameter variation • Connectors • Mechanical
  33. 33. Stress Effects • Current: temperature rise, drift • Voltage: current/overstress (EOS); arcing, corona discharge • Power (W = I²R) • Temperature
  34. 34. Arrhenius’ Law. $\lambda = K \exp\left[-\frac{E}{kT}\right]$ or $\lambda = K \exp\left[-\frac{A}{T}\right]$, where E = activation energy (0.3 - 1.5 eV), k = Boltzmann’s constant (8.62 × 10⁻⁵ eV K⁻¹) and T = absolute temperature (K)
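A short Python sketch of how the Arrhenius relation is typically used to compare a test temperature with a use temperature. The acceleration factor below is simply the ratio of the two rates, so the constant K cancels. The function names, the 0.7 eV activation energy and the temperatures are illustrative assumptions, not values from the seminar, and the model only applies where a single thermally activated mechanism dominates.

```python
import math

BOLTZMANN_EV_PER_K = 8.617e-5  # Boltzmann's constant in eV/K


def arrhenius_rate(k_const, activation_energy_ev, temp_kelvin):
    """Rate lambda = K * exp(-E / (k * T)) with T in kelvin."""
    return k_const * math.exp(-activation_energy_ev / (BOLTZMANN_EV_PER_K * temp_kelvin))


def acceleration_factor(activation_energy_ev, use_temp_c, test_temp_c):
    """Ratio rate(test) / rate(use); temperatures given in degrees Celsius."""
    t_use = use_temp_c + 273.15
    t_test = test_temp_c + 273.15
    exponent = (activation_energy_ev / BOLTZMANN_EV_PER_K) * (1.0 / t_use - 1.0 / t_test)
    return math.exp(exponent)


# Hypothetical example: 0.7 eV mechanism, 25 C in use, 125 C on test.
af = acceleration_factor(0.7, 25.0, 125.0)
print(f"acceleration factor = {af:.0f}")
```

The result is very sensitive to the assumed activation energy, which is one reason predictions of this kind carry large uncertainty.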
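  35. 35. Temperature Effect on Reliability. [Figure: failure rate λ vs. temperature T (°C), comparing the MIL217/Bellcore prediction with reality; temperature axis marked at 20, rated (85/125) and 200?]
  36. 36. Drift Characteristics: Carbon Resistor, +70°C. [Figure: change in R (%), 0 to −1.5, vs. time (h × 1000), up to 2.0, with curves for 50% PSR and 100% PSR.]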
  37. 37. Semiconductor Device Construction Features • Si preparation • Diffusion • Passivation* • Metallization* • Glassivation • Connection • Packaging (* multilayer)
  38. 38. Semiconductor Device Technologies • ASIC • Mixed signal (analog/digital/RF) • III-V (GaAs, InP) • Power (transistors, thyristors, GTO, IGBT) • Microwave (MMIC)
  39. 39. Microcircuit Mounting and Connection • DIP in PTH • Flat pack / SOIC • Surface mounting − Leadless chip carrier (LCC) − Pin grid array (PGA)/ball grid array (BGA) − Chip scale packaging (CSP) − Tape automated bonding (TAB) • IC sockets (DIP, LCC)
  40. 40. Semiconductor Device Failure Mechanisms 1. Die related • Crystal structure / impurity • Diffusion / masking • Passivation / dielectric breakdown (TDDB) • Electromigration • Passivation • Latch-up • Slow trapping, hot carriers, alpha particle • External: ESD / EOS / EMP
  41. 41. Semiconductor Device Failure Mechanisms 2. Package related • Adhesion • Bonding • Impurity / corrosion / inclusions • Hermeticity • Solderability
  42. 42. Passive Device Failure Mechanisms 1. Resistors (fixed) • Parameter drift • Open circuit • Noise 2. Resistors (variable) • As above plus: • Mechanical failure • Contact failure • Seal failure
  43. 43. Passive Device Failure Mechanisms 3. Capacitors • Short circuit (dielectric breakdown) • Open circuit (high V) • Leakage (wet types) • Wire bond failure (open circuit)
  44. 44. Passive Device Failure Mechanisms 4. Interconnections • PCB − ball bonds − track cracks (opens) − through hole opens − shorts • Wire/ribbon − breaks (fatigue, damage) − solder attach • Intermittents
  45. 45. Solder. Major contributor to failures! (SMT, BGA, >10K joints/board) • Inadequate wetting (contamination, oxidation) • Insufficient time (“second drop”) • Fatigue • Creep
  46. 46. Insulation • Damaged, cut, chafed, trapped, … • Overheated • Aged, embrittled • Eaten (rodents)
  47. 47. System/Circuit Problems • Distortion • Jitter • Timing • Interference/compatibility (“noise”) (EMI/EMC) • Intermittents/no fault found (NFF)
  48. 48. EMI: Problems • High frequencies (MHz - GHz) (VHF-UHF!) • Close spacing (SMT, narrow tracks) • ASICs, mixed signals (digital, RF) • New regulations (UL, CE, etc.) • Lack of knowledge (designers, managers) • Basic EDA does not simulate
  49. 49. EMI Sources (internal) • Current loops (Lenz’s Law: reduce loop area) • Signal noise (components, conductors) • Ground noise
  50. 50. EMI Sources (external) • ESD • Switched inductive loads • Supply transients • Other systems (motors, radars, computers, peripherals)
  51. 51. EMI Protection • Shielding − Faraday shield − Coax cables • Circuit protection − Capacitive (decoupling) − Inductive − Opto-couplers − Filters, regulators (on PCB)
  52. 52. Electrical Overstress/Electrostatic Damage (EOS/ESD) • ICs ARE VULNERABLE!! • People generate 1 - 5 kV / 50 - 100 μJ • EOS / ESD can kill ICs • It can also do GBH • On-chip protection
  53. 53. EOS/ESD Protection • Connector separation for different voltage levels • Decoupling of ICs • Isolation (opto-couplers) • Handling / packaging / bonding • On-chip protection
  54. 54. Probability Distributions. Histogram and probability density function (pdf). [Figure: histogram with the pdf f(x) plotted against x.]
  55. 55. Normal Distribution. [Figure: probability vs. variable; x-axis from −4 to +4 standard deviations (s) about the mean.]
  56. 56. “Natural” Variation • Constant in time. Past = Future • “Normal” distribution function (mean, standard deviation) • “Made by God”
  57. 57. Normal (Gaussian) Distribution • Central Limit Theorem • Symmetrical about mean/median μ • Standard deviation (SD) σ. Variance = σ² • Proportion lying within ±nσ of the mean: n = 1: 68%; n = 2: 95%; n = 3: 99.7%; n = 6: 99.9999998%
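The ±nσ proportions for a normal distribution can be checked with the error function from the Python standard library. This is a small sketch; the function name is an assumption made for illustration.

```python
import math


def fraction_within(n_sigma):
    """Fraction of a normal population lying within +/- n_sigma of the mean."""
    return math.erf(n_sigma / math.sqrt(2.0))


for n in (1, 2, 3, 6):
    print(f"+/-{n} sigma: {100.0 * fraction_within(n):.7f} %")
```

These figures hold only if the variation really is normal, including far out in the tails.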
  58. 58. Variation in Engineering • Not “normal” • Not constant in time. Past NOT = Future • Selection effects • Often deterministic (V = IR, F = ma) • Sometimes due to failures, errors, ... • Occasionally catastrophic (discontinuous, e.g. fatigue) • “Made by man”
  59. 59. Curtailed Distribution. [Figure: probability vs. variable; x-axis from −4 to +4 standard deviations (s) about the mean.]
  60. 60. Effect of Selection. [Figure: probability vs. parameter; x-axis from −10% to +10% about nominal.]
  61. 61. Skewed Distribution. [Figure: probability vs. variable.]
  62. 62. Bimodal Distribution (typical human mortality). [Figure: probability of death at this age vs. age (years, 10 to 110).]
  63. 63. Normal Distributions? [Figure: four distributions (1 to 4) with the same mean and SD, plotted from −nσ to +nσ about the mean (from Shewhart).]
  64. 64. Weibull Distribution. $R = \exp\left[-(t/\mu)^{\beta}\right]$, where μ = characteristic life and β = shape parameter (slope): β = 1: CHR; β < 1: DHR; β > 1: IHR. If failure-free life = γ, replace t with (t − γ)
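A minimal Python sketch of the Weibull reliability function, including the optional failure-free life γ. The hazard-rate function uses the standard Weibull hazard expression, which is not shown on the slide but follows from the reliability function. The function names and the numbers are illustrative assumptions.

```python
import math


def weibull_reliability(t, mu, beta, gamma=0.0):
    """R(t) = exp(-((t - gamma) / mu) ** beta); R = 1 before the failure-free life."""
    if t <= gamma:
        return 1.0
    return math.exp(-(((t - gamma) / mu) ** beta))


def weibull_hazard(t, mu, beta, gamma=0.0):
    """Hazard rate h(t) = (beta / mu) * ((t - gamma) / mu) ** (beta - 1)."""
    if t <= gamma:
        return 0.0
    return (beta / mu) * ((t - gamma) / mu) ** (beta - 1.0)


# Hypothetical characteristic life of 1000 h, three shape parameters.
for beta in (0.5, 1.0, 3.0):
    r = weibull_reliability(500.0, 1000.0, beta)
    h_early = weibull_hazard(100.0, 1000.0, beta)
    h_late = weibull_hazard(900.0, 1000.0, beta)
    print(f"beta={beta}: R(500h)={r:.3f}, hazard 100h={h_early:.5f}, 900h={h_late:.5f}")
```

At t = μ the reliability is exp(−1), about 0.37, whatever the shape parameter, which is why μ is called the characteristic life.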
  65. 65. Distributed load and strength. [Figure: probability vs. value for load (L) and strength (S) distributions in four cases: a. non-overlapping distributions; b. overlapping distributions: wide strength variation (low LR); c. curtailed strength distribution; d. overlapping distributions: wide load distribution (high LR).]
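  66. 66. Distributed Load & Strength. For normally distributed load L and strength S: safety margin $SM = \frac{\bar{S} - \bar{L}}{\sqrt{\sigma_S^2 + \sigma_L^2}}$; loading roughness $LR = \frac{\sigma_L}{\sqrt{\sigma_S^2 + \sigma_L^2}}$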
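A small Python sketch of the load-strength calculation, assuming the usual definitions: safety margin SM = (mean strength − mean load) / sqrt(σ_S² + σ_L²) and loading roughness LR = σ_L / sqrt(σ_S² + σ_L²). For independent, normally distributed load and strength, the probability that strength exceeds load in a single load application is Φ(SM). The function names and the numbers are hypothetical.

```python
import math


def safety_margin(mean_s, sd_s, mean_l, sd_l):
    """SM = (mean strength - mean load) / sqrt(sd_s^2 + sd_l^2)."""
    return (mean_s - mean_l) / math.sqrt(sd_s ** 2 + sd_l ** 2)


def loading_roughness(sd_s, sd_l):
    """LR = sd_l / sqrt(sd_s^2 + sd_l^2); near 1 means load variation dominates."""
    return sd_l / math.sqrt(sd_s ** 2 + sd_l ** 2)


def single_application_reliability(sm):
    """P(strength > load) = Phi(SM) for independent normal load and strength."""
    return 0.5 * (1.0 + math.erf(sm / math.sqrt(2.0)))


# Hypothetical values in MPa: strength 500 +/- 40, load 350 +/- 30.
sm = safety_margin(500.0, 40.0, 350.0, 30.0)
lr = loading_roughness(40.0, 30.0)
print(f"SM = {sm:.2f}, LR = {lr:.2f}, R = {single_application_reliability(sm):.5f}")
```

Because the result depends on the far tails of both distributions, it is best read as a comparison aid rather than a precise prediction.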
  67. 67. Time-dependent load and strength. [Figure: strength and load vs. time/load cycles (log scale), with time t′ marked.]
  68. 68. Strength v. specification (time dependent). [Figure 6.3: Strength vs. specification (time-dependent). Probability distributions of strength relative to the specification as time increases, showing the probability of failing at max. specified stress.]
  69. 69. Summary of High Reliability Design Principles • Determine most likely distributions of load and strength • Evaluate SM for intrinsic reliability • Determine protection methods (load limit, derate, screen, QC) • Analyse strength degradation modes • Test to corroborate, analyse results • Correct or control (redesign, safe life, maintenance, ...)
  70. 70. Multiple Variations. Traditional method: • Test effect of one variable at a time • Cannot test interactions
  71. 71. Statistical Design of Experiments (DoE) • Test all variables simultaneously • Randomisation • Analysis of variance (ANOVA): 1. Determines effects of all variables 2. Determines effects of all interactions (R. A. Fisher, 1926)
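The difference between one-variable-at-a-time testing and a designed experiment can be illustrated with a two-factor, two-level full factorial. The Python sketch below only estimates the main effects and the interaction effect; it does not perform a full ANOVA. The factor names A and B and all response values are hypothetical.

```python
# Each run: (level of A, level of B, measured response), levels coded -1 / +1.
RUNS = [
    (-1, -1, 20.0),
    (+1, -1, 30.0),
    (-1, +1, 25.0),
    (+1, +1, 45.0),
]


def effect(contrast):
    """Mean response where the contrast is +1 minus mean response where it is -1."""
    high = [y for a, b, y in RUNS if contrast(a, b) > 0]
    low = [y for a, b, y in RUNS if contrast(a, b) < 0]
    return sum(high) / len(high) - sum(low) / len(low)


effect_a = effect(lambda a, b: a)        # main effect of A
effect_b = effect(lambda a, b: b)        # main effect of B
effect_ab = effect(lambda a, b: a * b)   # A x B interaction

print(f"A: {effect_a}, B: {effect_b}, AxB: {effect_ab}")
```

Varying A and then B separately from the (−1, −1) baseline would use only three of the four runs and could not reveal the interaction term.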
  72. 72. Genichi Taguchi • “Loss to Society” • System design • Parameter design • Tolerance design • Control & noise factors • Orthogonal arrays • Brainstorm
  73. 73. DoE: Engineering Aspects • Statistical v. engineering significance • Randomisation • Cost effectiveness • Confirmation • SPC • CAE • Nonlinearity • Management
  74. 74. Confidence and Risk • s-confidence = probability that population parameter lies between “confidence limits” • Bigger sample, narrower confidence limits • Risk = (1 − confidence) (probability that parameter lies outside confidence limits) • s-confidence vs. engineering confidence
  75. 75. Statistical, Scientific and Engineering Confidence • Statistical test (binomial): items tested, 0 failures: 0, 1, 10, 20; 80% s-confidence that R >: 0, 0.90, 0.98, 0.99. Data is entirely statistical, no prior knowledge • Scientific test: items dropped, all fall: 0, 1, 10, 20; confidence that all will fall: 1, 1, 1, 1. Information is deterministic • Engineering: can range from deterministic to statistical
  76. 76. Measures of Reliability • Failure rate (FR) (λ) • Hazard rate (HR, for non-repairable items) (λ) • Mean time between failures (MTBF) (M)* • Mean time to failure (MTTF) (M)* • Durability (failure-free life; FR = 0) • Reliability R = probability of no failures in time t: $R = e^{-\lambda t} = e^{-t/M}$ * (* for constant failure/hazard rate)
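The constant-failure-rate relation R = e^(−λt) = e^(−t/M) is easy to evaluate directly. The Python sketch below is a minimal illustration; the function name and the MTBF value are assumptions.

```python
import math


def reliability_cfr(t, mtbf):
    """Probability of no failure in time t for a constant failure rate 1 / mtbf."""
    return math.exp(-t / mtbf)


# Hypothetical item with an MTBF of 1000 h.
for hours in (100.0, 500.0, 1000.0):
    print(f"R({hours:.0f} h) = {reliability_cfr(hours, 1000.0):.3f}")
```

Operating for a time equal to the MTBF gives a survival probability of only about 37%, so an MTBF figure should not be read as a failure-free period.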
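  77. 77. Patterns of Failure: The Bathtub Curve. [Figure: failure rate vs. time t, with the total curve formed from DFR (weak items), CFR and IFR (wearout) components; phases marked as infant mortality, useful life and wearout.]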
  78. 78. Variation: summary • Variation is seldom (never?) “normal” • Most important variation is in the tails – Less data – More uncertain – Conventional stats most misleading • Variation can change over time • Interaction effects • Variation made by people • Most engineering education maths only
  79. 79. Development Test Principles. Categories of test: • Functional (design proving/proof of principle) • Reliability/durability • Contractual/safety/regulatory • Test and evaluation (T&E) • Beta testing
  80. 80. Development Test Principles. Fill “uncertainty gap” • Performance/safety: – demonstrate success – perform once • Reliability/durability: – test to fail – accelerated tests • Variation: – Taguchi/statistical experiments – Multiple tests?
  81. 81. Development Test Principles • Components, systems, interfaces • Software • External suppliers • FRACAS • Integrated test programme
  82. 82. Development Test Principles. Test economics: major driver of development cost & time, BUT: • Failure costs increase during project phases (×10 rule: design, development, production, service) • Failure-free design is cheaper! (experience, training, integrated engineering, design analysis)
  83. 83. Development Test Principles: Strength v. Specification. [Figure: probability distribution of strength (stress to fail) relative to the specification; L marked.]
  84. 84. Development Test Principles: Strength v. Specification (transient & permanent failures). [Figure: probability distributions of transient and permanent failure strength (stress to fail) relative to the specification.]
  85. 85. Development Test Principles: Strength v. Specification (time dependent). [Figure: probability distribution of strength (stress to fail) relative to the specification, changing with time.]
  86. 86. Development Test Principles • Failures are often due to combined stresses/strengths (uncertain) • Failures are often influenced by interactions (uncertain) • Failures often time-dependent (uncertain) • Causes of service failures can be shown by different test stresses, e.g. – vibration/temperature cycle – high frequency/low frequency
  87. 87. Development Test Principles. Fundamental principle: increase (combined) stresses to cause failures, then use information to make product stronger. Limits: • Technology (e.g. solder melt) • Test capability • Economic
  88. 88. Development Test Principles. Testing at “representative” stresses, and hoping for no failures, is ineffective and a waste of resources. Examples: • Engines on test beds • Cars on test tracks • “Simulated” environmental test (MIL-STD-781, MIL-STD-810, etc.)
  89. 89. Development Test Principles. Environments (1): • All relevant environments • Combined environments (CERT) • User • Environmental simulation?
  90. 90. Development Test Principles. Environments (2): • Thermal • Thermal fatigue (switching) • Vibration • Shock • Humidity • Power supply/load • Transients (ESD, EOS) • Pollution, corrosion • People, other animals • Etc.
  91. 91. Development Test Principles. Accelerated stress test: • Miner’s Law for fatigue (mech, thermal) • Arrhenius Law for thermal acceleration? • Step-stress testing • Failure modes relevant, not stress levels!
  92. 92. Development Test Principles. Highly accelerated life test (HALT) (1): • Highly accelerated combined stresses (temperature, cycling, multi-axis vibration, others...) • Step stress to discover transient and permanent limits • Time compression: orders of magnitude • Developed by Gregg Hobbs
  93. 93. Development Test Principles. HALT (2): • Special chambers, facilities (QualMark, Thermotron, Screening Systems, TEAM, ...) • Savings: time, space, energy • Optimise manufacturing screens (HASS) • Similar approaches: – Highly accelerated stress test (HAST) – Stress-induced failure test (STRIFE) – Failure mode verification test (FMVT ® Entela) – Etc.
  94. 94. HALT Philosophy (1). Stress limits. [Figure: combined-stress axis marked, in order, with lower destruct limit, lower operating limit, product spec., upper operating limit and upper destruct limit.] • High stresses = small samples!
  95. 95. HASS Philosophy. [Figure: combined-stress axis marked with lower destruct limit, lower operating limit, product spec., upper operating limit and upper destruct limit, showing the precipitation screen and the detection screen.]
  96. 96. HALT/HASS Philosophy (2). [Figure: stress (S) vs. cycles to fail (log N), marking the HALT/HASS, ESS and in-use stress levels.]
  97. 97. Accelerated Test Approach (TE p105) 1. What failures might occur in service? (FMEA, etc.) 2. List/analyse stresses, combinations. 3. Plan how to apply. 4. Apply single stresses, step increases to failure. 5. Analyse failure, strengthen design. 6. Iterate 4 & 5 to fundamental limits. 7. Repeat with combined stresses. 8. Iterate 5 & 6.
  98. 98. Accelerated Test Approach. Examples: • Mechanical (rotating, engines, etc.) – Old lubricants, filters – Low fluid levels (oil, coolant) – Out-of-balance • Electro-mech (printers, etc.) – Temp, vib, power V level, humidity, ... – Misalign shafts, etc. – Out-of-spec. materials (paper, friction, ...) • Electronic components/packages, etc. – Temp, vib (high frequencies), etc. – Use vibration transducers (speaker coils?)
  99. 99. Accelerated Test Approach. Questions (TE p109): • How many to test? As many as practicable/economic. • Can reliability (MTBF, durability) be measured? NO! It will be increased! • How do we know if failure on test could occur in service? Analyse, use experience, THINK! • Product will see no vibration in service. Why vibrate on test? Vibration on test can stimulate failures caused by temp. cycle, handling, etc. in service, QUICKLY! • Is the principle limited to temp, vib, elec stress? Not at all. Apply to fluid systems, mech tolerances, etc.
  100. 100. HALT/HASS Payoffs • Robust designs + capable processes = high reliability • Reduced test time and cost • Feedback to design: reduce “uncertainty gap” on future products • Continuous improvement (“kaizen”) of design capability (products, processes)
  101. 101. Accelerated Test or DoE?
       Important variables, effects, etc. | DoE/HALT?
       Parameters: electrical, dimensions, etc. | DoE
       Effects on measured performance parameters, yields | DoE
       Stress: temperature, vibration, etc. | HALT
       Effects on reliability/durability | HALT
       Several uncertain variables | DoE
       Not enough items available for DoE | HALT
       Not enough time available for DoE | HALT
  102. 102. Circuit Test Principles: Analog • DC: current, potential, resistance (AVO), capacitance, ... • AC: current, potential, impedance, waveforms, ... • Signals: waveforms, gain, distortion, jitter, ...
  103. 103. Circuit Test Principles: Digital. “Stuck at” faults (SA0, SA1). [Figure: 2-input AND gate with inputs A and B and output O.] Truth table for 2-input AND gate (A B O): 0 0 0; 0 1 0; 1 0 0; 1 1 1. Test vectors: 4 (combinational logic)
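Stuck-at fault detection on the 2-input AND gate can be sketched as a tiny fault simulation in Python: a test vector detects a fault when the faulty gate's output differs from the good gate's output. The function names and the way faults are encoded are assumptions made for this illustration.

```python
from itertools import product


def and_gate(a, b, fault=None):
    """2-input AND gate; fault is None or (node, stuck_value), node in 'A', 'B', 'O'."""
    if fault is not None:
        node, value = fault
        if node == "A":
            a = value
        elif node == "B":
            b = value
    out = a & b
    if fault is not None and fault[0] == "O":
        out = fault[1]
    return out


VECTORS = list(product((0, 1), repeat=2))                     # all 4 input vectors
FAULTS = [(node, value) for node in "ABO" for value in (0, 1)]  # SA0/SA1 on each node


def detecting_vectors(fault):
    """Vectors whose output differs between the good and the faulty gate."""
    return [v for v in VECTORS if and_gate(*v) != and_gate(*v, fault=fault)]


for node, value in FAULTS:
    print(f"{node} SA{value}: detected by {detecting_vectors((node, value))}")
```

For this gate the vectors (0, 1), (1, 0) and (1, 1) between them detect every single stuck-at fault; applying all four vectors is still trivial here, but exhaustive testing grows as 2^n with the number of inputs.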
  104. 104. Circuit Test Principles: Digital. Logic classes: • Combinational: outputs follow inputs • Sequential: input dependent, also data flow, memory allocation • Dynamic: requires refresh/“keep alive”
  105. 105. Circuit Test Principles: Digital. Fault types: • SA0, SA1 • Stuck at input • “At speed” • Pattern sensitive • Etc.
  106. 106. Manual Test Equipment • Basic instruments – DMMs, power meters, ... • Instruments – oscilloscopes, waveform generators, spectrum analysers, logic analysers, ... • Special instruments – RF testers, optical signal testers, hi volt, ... • PC-based
  107. 107. Automatic Test Equipment (ATE) • Vision: automatic optical inspection (AOI), X-ray (AXI) • Manufacturing defects analyser (MDA) • In-circuit test (ICT) • Fixtureless/flying probe • Functional test (FT) (via circuit connectors) • Combined ICT/FT • Special test (RF, power supplies, manual, “hot rig”, ...)
  108. 108. Test Capability. ATE must: • Confirm correct operation of good circuits • Not classify good as faulty • Detect faulty items • Diagnose fault causes
  109. 109. Design for Test (DFT). Design must allow ATE to: • Initialize (start clocks, set logic states) • Control (e.g. open feedback loops, force logic, generate inputs) • Observe (access to important nodes) • Partition (reduce test program complexity)
  110. 110. Layout for ICT • Keep PCB edges clear • Location holes • Large components on top (for double sided PCBs) • Resistors between power lines and control signals (resets, enables, tristates) • Clock disable (provide link)
  111. 111. Built-in Test (BIT) • Boundary scan (IEEE 1149.1) • ASICs • Logic and function tests • Complexity, false alarms
  112. 112. EMI/EMC Test. Must test for: • Radiated emissions • Conducted emissions (power lines, signal lines) • Compatibility (susceptibility) (radiated, power, signals) • Internal problems • Special situations (rail signalling, avionics, lightning, nuclear (NEMP, etc.)). Standards and regulations
  113. 113. Test Control and Data Acquisition (DAQ). Test databus standards: • General purpose interface bus (GPIB) (IEEE 488) • Peripheral Component Interconnect (PCI), PCI extensions for instrumentation (PXI) • VME extensions for instrumentation (VXI)
  114. 114. IC Test • Special/expensive ATE • Test cost ≅ IC manufacture cost! • IDDQ test • BIST • Standard tests (MIL-STD-883, etc.) • Rely on IC manufacturer’s tests
  115. 115. IDDQ Test. [Figure 8.11: IDDQ plot. IDDQ (mA, 0.1 to 0.3) vs. node state (1 to 15, etc.) for a good device and a defective device (defect shows at states 2, 3, 10, ...).]
  116. 116. Standards, References, Software • MIL-STD-2165 (USA) • DEF STAN 00-13 (UK) • ‘Design for Testability’ - Jon Turino • ‘Testability Advisor’ - Logical Solutions Inc.
  117. 117. Software Reliability • All new systems involved (operating & test) • Cannot predict failure modes and effects • Cannot test complete system* • Errors are present in all copies* • S/W - H/W interfaces (keyboards, sensors, devices, EMI) (* compare VLSI hardware)
  118. 118. Hardware/Software Reliability Differences (1)
       1. Hardware: Failures can be caused by deficiencies in design, production, use and maintenance. Software: Failures are primarily due to design faults.
       2. Hardware: Failures can be due to wear or other energy-related phenomena. Software: There are no wearout phenomena. Software failures occur without warning.
       3. Hardware: No two items are identical. Failures can be caused by variation. Software: There is no variation: all copies of a program are identical.
       4. Hardware: Repairs can be made to make equipment more reliable. Software: There is no repair. The only solution is redesign (reprogramming).
       5. Hardware: Reliability may be time-related, with failures occurring as a function of operating (or storage) time, cycles, etc. Software: Reliability is not time related. Failures occur when a specific program step or path is executed or a specific input condition is encountered, which triggers a failure.
       6. Hardware: Reliability may be related to environmental factors (temperature, vibration, humidity, etc.). Software: The external environment does not affect reliability except insofar as it might affect program inputs.
       7. Hardware: Reliability can be predicted, in principle but mostly with large uncertainty, from knowledge of design, parts, usage, and environmental stress factors. Software: Reliability cannot be predicted from any physical bases, since it entirely depends on human factors in design.
  119. 119. Hardware/Software Reliability Differences (2)
       8. Hardware: Reliability can be improved by redundancy. Software: Reliability cannot be improved by redundancy if the parallel paths are identical, since if one path fails, the other will have the error.
       9. Hardware: Failures can occur in components of a system in a pattern that is, to some extent, predictable from the stresses on the components and other factors. Reliability critical lists are useful to identify high risk items. Software: Failures are rarely predictable from analyses of separate statements. Errors are likely to exist randomly throughout the program, and any statement may be in error. Reliability critical lists are not appropriate.
       10. Hardware: Hardware interfaces are visual; one can see a 10-pin connector. Software: Software interfaces are conceptual rather than visual.
       11. Hardware: Computer-aided design systems exist that can be used to create and analyse designs. Software: There are no computerised methods for software design and analysis.
       12. Hardware: Hardware products use standard components as basic building blocks. Software: There are no standard parts in software, although there are standardised logic structures. Software reuse is being deployed, but on a limited basis.
  120. 120. Software in Engineering • “Real time” • Wide range of interfaces (hardware, human, timing, ...) • Different levels of embedding (ASICs, PGAs, BIOS, ...) • Hardware/software options for functions • Electrically “noisy” environments • Usually smaller
  121. 121. Software Reliability. ERROR → FAULT → FAILURE. Sources of error: • Specification (60%) • Design (20%) • Code (20%) (typo, numerical, omissions, etc.) • Timing/EMI • Data (information) integrity
  122. 122. Error Reduction • Modular design • Error traps • Remarks • Spec & code review • Test
  123. 123. Fault Tolerance • Internal tests (rates of change, cycle times, logic) • Resets, fault indications • Redundancy, voting • Hardware failure protection
  124. 124. Languages • Machine code/microcode • Assembly level/symbolic assemblers – Both processor specific – Faster, less memory – Difficult, error prone • High level (HLL) (BASIC, Fortran, *Pascal, *Ada, *C, *C++) – Processor independent – Easier, error protection* – Assemblers, compilers • Programmable logic controllers (PLCs) • Assemblers, compilers
  125. 125. Software Testing (1) • Total paths = 2ⁿ (n = branches + loops) • Test specs – All requirements (“must do”, “must not do”) – Extreme conditions (timing, parameter values, rates of change, memory utilisation, ...) – Input sequences – Fault tolerance/error recovery
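The path count of 2 to the power n (n = branches + loops) grows quickly enough to rule out exhaustive path testing for all but the smallest programs. The Python sketch below makes the point; the function names and the assumed rate of one million path tests per second are hypothetical.

```python
SECONDS_PER_YEAR = 365.25 * 24 * 3600


def total_paths(n):
    """Number of paths through code with n two-way decision points."""
    return 2 ** n


def exhaustive_test_years(n, tests_per_second=1_000_000):
    """Years needed to exercise every path once at the assumed test rate."""
    return total_paths(n) / tests_per_second / SECONDS_PER_YEAR


for n in (10, 20, 40, 64):
    print(f"n={n}: {total_paths(n)} paths, {exhaustive_test_years(n):.3g} years")
```

This is why test specifications concentrate on requirements, extreme conditions and input sequences rather than on covering every path.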
  126. 126. Software Testing (2) • Module & interface tests (“white box”) – Data/control flow – Memory allocation – Lookups – Etc. • System tests – Verification – Validation (“black box”)
  127. 127. Documentation • Specifications • Code, remarks • Notebooks • Changes, corrections • Test results: – Version – Test – Faults
  128. 128. Software Reliability Prediction and Measurement • Methods: – Error/bug count – Time-based (hours, days, CPU seconds) • “Cleanroom” approach (IBM) • Do not use!
  129. 129. Test in Manufacture. Manufactured items are either: 1. Good 2. Defective, but detected and fixed or scrapped 3. Defective, but shipped, and might/will fail later. We must inspect/test to discriminate
  130. 130. Manufacturing Test Principles (1) • All testing costs. So minimise (ideal = zero) • But: – Manufacturing processes generate variation & defects – Later costs of variation & defects can exceed costs of detection & correction/removal • So: – Must consider total life cycle (manufacturing, use, ...). Value-added testing
  131. 131. Manufacturing Test Principles (2). Test cost justification is difficult, because: • Test costs arise in manufacture; failure costs arise later • Failure occurrences and costs cannot be predicted. Some testing might be obligatory: calibration, EMI/EMC, safety, etc.
  132. 132. Test Capability. Tests must: • Identify good items • Detect defects (parts, processes, suppliers, ...) • Indicate defect source/location
  133. 133. Test Pass-Fail Logic. [Figure 10.6: Test pass-fail logic. Flowchart with the steps Test, Next test and Diagnose/repair, and the decisions Pass?, OK? and Detect? (Y/N branches).]
  134. 134. Test Criteria and Stresses • Manufacturing tests are not tests of the design • Manufacturing tests must not damage good items (contrast with development)
  135. 135. Manufacturing Test Economics. Aspects to consider: • Cost of test(s) (setup, run, repairs, ...) • Defects that might be generated upstream • Test capability • Alternatives to test (inspection, ...) • Methods to reduce/prevent defects • Downstream costs of undetected defects • 100% or sample test?
  136. 136. Manufacturing Test Economics. Examples: • Screw • Integrated circuit • Automotive gearbox • Car • Spacecraft • Electronics assembly
  137. 137. Inspection and Measurement. Inspection: • Visual (manual, automatic). Measurement: • Dimensional (metrology) – Micrometers, CMMs, ... • Parameters – mech. (strength, torque, ...) – elec. (instruments, ATE, ...) (Module 8). Inspection, measurement, test: not absolute definitions
  138. 138. Stress Screening. Definition: application of stresses to cause defective items to fail/show without damaging good ones. Alternative terms: • Environmental stress screening (ESS) • Burn-in (electronic components & systems) • STRIFE test • etc. Guidelines, etc.: • US NAVMAT P-9492 • US MIL-STD-2164 • IEST ESSEH Guidelines
  139. 139. Highly Accelerated Stress Screening (HASS) • Highly accelerated stresses (temp., vib., elec., ...) • Developed via HALT in development testing • Stresses are not extrapolations of service conditions • Can be applied only to products that have been subjected to HALT in development
  140. 140. HASS Philosophy (1). [Figure: combined-stress axis marked with lower destruct limit, lower operating limit, product spec., upper operating limit and upper destruct limit, showing the precipitation screen and the detection screen.]
  141. 141. HALT/HASS Philosophy (2). [Figure: stress (S) vs. cycles to fail (log N), marking the HALT/HASS, ESS and in-use stress levels.]
  142. 142. HASS Philosophy (3) • Proof (safety) of screen (POS) • HASA (audit): sample v. 100% • Review/adapt (e.g. repeat POS) • Can apply to any technology (elec., mech.) • Keep flexible (no standard procedures)
  143. 143. Electronics Manufacturing Faults. In rough order: • Solder problems (permanent/intermittent o/c or s/c, weak, ...) • Parts missing/wrong place/wrong value • Part parameters/functions • Damage (physical, ESD, ...) • System/assembly level (cables/connectors, variation, EMI/EMC, ...). In the 1970s the list could have been reversed!
  144. 144. Electronics Test Options/Economics. Board test. [Figure 10.3: Electronics assembly test flow example. Assemble (CA) → AOI (CI) → MDA (CM) → ICT/FT (CF) → Ship on pass; fail branches (proportions dI, dm, df) go to Diagnose/repair (CR). C = cost, d = proportion failed.]
  145. 145. Electronics Test Options/Economics. A simple model for the manufacturing and test cost per unit is: $C = C_A + C_I + C_M + C_F + (C_R + C_M + C_F)(d_I + d_m + d_f)$. If, for example, C_A = $200, C_I = $10, C_M = $10, C_F = $20, C_R = $50 and d_I = d_m = d_f = 0.05, then the total cost per unit would be $252
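The cost model can be evaluated directly, which makes it easy to see how the total changes with the failed proportions. The Python sketch below uses the slide's example values; the function and parameter names are assumptions made for illustration.

```python
def unit_cost(c_a, c_i, c_m, c_f, c_r, d_i, d_m, d_f):
    """C = CA + CI + CM + CF + (CR + CM + CF) * (dI + dm + df)."""
    base = c_a + c_i + c_m + c_f             # assembly plus one pass through each test
    rework = (c_r + c_m + c_f) * (d_i + d_m + d_f)  # repair and retest of failed units
    return base + rework


cost = unit_cost(c_a=200, c_i=10, c_m=10, c_f=20, c_r=50,
                 d_i=0.05, d_m=0.05, d_f=0.05)
print(f"cost per unit = ${cost:.2f}")

# Halving every failed proportion shows the sensitivity to defect rates.
improved = unit_cost(200, 10, 10, 20, 50, 0.025, 0.025, 0.025)
print(f"with halved defect proportions = ${improved:.2f}")
```

With the example values the base cost is $240 and the rework term adds $12.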
  146. 146. Fault Proportions & Coverage
       Fault                        % of faults   Coverage %: AOI   AXI   MDA/ICT   FT   HASS
       Open circuit                 25                        40    95    85        95   *
       Insufficient solder          18                        40    80    0         0    20-80
       Short circuit                13                        60    99    99        95   *
       Component missing            12                        90    99    85        85   *
       Component misaligned         8                         80    80    50        0    0
       Component elec. para error   8                         0     0     20/80     80   *
       Wrong component              5                         15    10    80        90   *
       Other non-electrical         4                         80    0     0         0    20-80
       Excess solder                3                         90    90    0         0    0
       Component reversed           2                         90    90    80        90   *
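One way to use a fault-proportion and coverage table is to weight each method's coverage by how often each fault occurs. The Python sketch below does this for the three columns that contain only single numbers (AOI, AXI, FT), using the figures as read from the slide. The function name is an assumption, and because the listed fault proportions sum to 98%, the result is normalised by that total.

```python
# fault: (share of all faults in %, coverage in % for each method)
FAULT_TABLE = {
    "Open circuit":               (25, {"AOI": 40, "AXI": 95, "FT": 95}),
    "Insufficient solder":        (18, {"AOI": 40, "AXI": 80, "FT": 0}),
    "Short circuit":              (13, {"AOI": 60, "AXI": 99, "FT": 95}),
    "Component missing":          (12, {"AOI": 90, "AXI": 99, "FT": 85}),
    "Component misaligned":       (8,  {"AOI": 80, "AXI": 80, "FT": 0}),
    "Component elec. para error": (8,  {"AOI": 0,  "AXI": 0,  "FT": 80}),
    "Wrong component":            (5,  {"AOI": 15, "AXI": 10, "FT": 90}),
    "Other non-electrical":       (4,  {"AOI": 80, "AXI": 0,  "FT": 0}),
    "Excess solder":              (3,  {"AOI": 90, "AXI": 90, "FT": 0}),
    "Component reversed":         (2,  {"AOI": 90, "AXI": 90, "FT": 90}),
}


def overall_coverage(method):
    """Fault-weighted coverage (%) of one test method across the listed faults."""
    total_share = sum(share for share, _ in FAULT_TABLE.values())
    detected = sum(share * cov[method] / 100.0 for share, cov in FAULT_TABLE.values())
    return 100.0 * detected / total_share


for method in ("AOI", "AXI", "FT"):
    print(f"{method}: {overall_coverage(method):.1f} % of listed faults")
```

No single method covers every fault type, which is the case for combining inspection and test stages.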
  147. 147. Assembly Test [Figure: assembly test arrangement showing Board 1, Board 2, backplane, PSU, keypad and display, with two blocks labelled Test.] 147
  148. 148. Electronic Assembly Burn-In (ESS) • Typically -30ºC to 70ºC, 5 cycles • Power on (monitor) • (Vibrate) • Finds production defects – Solder – Damage • Not effective against component defects (low temp, low stress) 148
  149. 149. Integrating Stress Screening• Integrate with functional test (FT)• Before/after AOI/ICT?• Assembly stages?: – Board – Intermediate – Final• Re-screen after repair? YES No fixed rules! 149
  150. 150. Post-Production Economics• TE Page 183 150
  151. 151. Electronic Component Test• All components tested by manufacturers• Generally not practicable/economic for OEMs/CEMs to test (IC tester $5M!)• No repair possible• Special cases: – Power devices? – Etc? 151
  152. 152. Electronic Component Population Categories [Figure: failure probability against time (h), 10 to 10000, showing three populations: good population (zero failures), infant mortality and “freaks”.] 152
  153. 153. IC Test • MIL-STD-883 (TE p. 186) – Level A, B, C screens – Burn-in (125°C, 168h) – Plastic/hermetic packages (autoclave test) • Other standards (CECC, IEC, ...) Don’t use! 153
  154. 154. In-Service Test Philosophy Test only: • If only way to determine correct function • To determine failure cause (diagnostic) • To confirm repair Optimise during development 154
  155. 155. Test Schedules• Continuous (BIT, monitors, ...)• Time run (electronics, aircraft, engines, ...)• Distance travelled (cars, trains, ...)• Operating cycles (electronics, aircraft engines, ...)• Calendar (calibration, seasonal, ...) Must be measured Intervals, tolerances 155
  156. 156. Examples• TE pages 191-193 156
  157. 157. Built-in (Self) Test (BIT/BIST)• Apply only to functions that are not observed• Keep it simple! – Sensors etc. fail – False alarms• Implement in software (no weight, power, complexity) 157
  158. 158. “No Fault Found” (NFF) Causes: • Intermittent failures (components, connections, ...) • Tolerance effects • Connectors • BIT false alarms • Incorrect diagnosis/repair • Inconsistent test criteria • People • Ambiguous cause: >1 suspect unit changed (Also “retest OK” (RTOK), etc.) 50%-80% of repairs! 158
  159. 159. RCM Objectives• Optimises preventive maintenance (PM)• Balances cost, availability, reliability, safety 159
  160. 160. Maintenance Categories (1) Corrective (CM): • Failure repair • Unplanned • Expensive/unsafe Minimise by high reliability and durability, + effective PM 160
  161. 161. Maintenance Categories (2) Preventive (PM): • Failure Prevention • Planned • Less Expensive/Safe Optimise by RCM 161
  162. 162. RCM Decision Logic (1) Failure Pattern: • Increasing (wearout)? Consider replacement – Failure-free life (light bulbs/tubes, drive belts, bearings, ...) • Decreasing/constant? No replacement (electronics, ...) 162
  163. 163. RCM Replacement Intervals (1) [Figure: two plots of hazard rate against time, with replacements at m, 2m, 3m.] • Decreasing hazard rate: scheduled replacement increases failure probability • Constant hazard rate: scheduled replacement has no effect on failure probability 163
  164. 164. RCM Replacement Intervals (2) [Figure: two plots of hazard rate against time, with replacements at m, 2m, 3m.] • Increasing hazard rate: scheduled replacement reduces failure probability • Increasing hazard rate with failure-free life > m: scheduled replacement makes failure probability = 0 164
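The four cases on these two slides can be illustrated numerically. The sketch below assumes a Weibull life distribution, which the slides do not name: shape beta < 1, = 1 and > 1 give decreasing, constant and increasing hazard rates, and gamma is a failure-free life. It compares the probability of failure over three intervals when an item is left in place with the probability when a new item is fitted every m hours. All names and parameter values are illustrative.

```python
import math


def reliability(t, eta, beta, gamma=0.0):
    """Weibull survival probability at age t (gamma = failure-free life)."""
    if t <= gamma:
        return 1.0
    return math.exp(-(((t - gamma) / eta) ** beta))


def p_fail_no_replacement(m, n, eta, beta, gamma=0.0):
    """Probability that one item fails before age n*m."""
    return 1.0 - reliability(n * m, eta, beta, gamma)


def p_fail_with_replacement(m, n, eta, beta, gamma=0.0):
    """Probability of at least one failure when a new item is fitted every m."""
    return 1.0 - reliability(m, eta, beta, gamma) ** n


cases = [
    ("decreasing hazard", 0.5, 0.0),
    ("constant hazard", 1.0, 0.0),
    ("increasing hazard", 3.0, 0.0),
    ("increasing, failure-free life > m", 3.0, 1.5),
]
for label, beta, gamma in cases:
    kept = p_fail_no_replacement(1.0, 3, 10.0, beta, gamma)
    swapped = p_fail_with_replacement(1.0, 3, 10.0, beta, gamma)
    print(f"{label}: no replacement {kept:.4f}, with replacement {swapped:.4f}")
```

Under these assumptions, replacement raises the failure probability when beta < 1, leaves it unchanged when beta = 1, lowers it when beta > 1, and brings it to zero when the failure-free life exceeds the replacement interval.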
  165. 165. RCM Decision Logic (2) Failure Effect (FMECA): • Critical? Consider replacement / PM • Detectable? Consider PM (e.g. fatigue) 165
  166. 166. RCM Decision Logic (3) Failure Cost: • High? Consider replacement (gearboxes, engines, ...) • Low? Consider replacement on failure (light bulbs/tubes, hydraulic hoses (?), ...) 166
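  167. 167. RCM Decision Logic (4) [Flowchart] • Failure rate (FR) increasing? No: no replacement. Yes: go to failure effect. • Failure effect (FE) critical? Yes: go to failure detectable. No: go to failure cost. • Failure cost high? No: replace on failure. Yes: go to failure detectable. • Failure detectable? Yes: PM. No: scheduled replacement. 167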
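The decision logic on this slide can be written as a small function. This is a sketch of one reading of the flattened flowchart: in particular, the assumption that a "yes" on failure cost leads to the same "failure detectable?" question as a "yes" on criticality is inferred from the layout, and the function and argument names are illustrative.

```python
def rcm_action(rate_increasing, effect_critical, cost_high, detectable):
    """Return the maintenance action suggested by the RCM decision logic."""
    # A decreasing or constant failure rate gains nothing from replacement.
    if not rate_increasing:
        return "no replacement"
    # Non-critical, cheap failures are simply fixed when they happen.
    if not effect_critical and not cost_high:
        return "replace on failure"
    # Critical or costly failures: inspect if the failure can be detected,
    # otherwise replace on a schedule.
    return "PM" if detectable else "scheduled replacement"


# Example inputs (illustrative): a wearout item with a low-cost, non-critical failure
print(rcm_action(True, False, False, False))  # prints "replace on failure"
# A wearout item whose critical failure can be detected in advance
print(rcm_action(True, True, False, True))    # prints "PM"
```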
  168. 168. (Incipient) Failure Detection Methods Mechanical: • Manual (corrosion, wear, condition, ...) • NDT for fatigue (ultrasonic, dye penetrant, radiographic, ...) • Oil analysis (spectroscopic, magnetic) • Vibration/acoustic Electrical/Electronic: • Built-in test • Functional test/calibration 168
  169. 169. Stress Screens for Repairs• Proves repair effectiveness• Reduces NFF• Use HASS if units subjected to HALT/HASS 169
  170. 170. Calibration • Regular test to ensure accuracy – Measuring devices – Instruments – Sensors • Traceability • Accuracy (ISO 5725) • Management, records, labels 170
  171. 171. Organisation and Responsibilities Test Department: • Provide facilities (strategic, tactical) • Knowledge (methods, requirements, regulations, standards, ...) • External facilities (contracts, hire, ...) • Maintenance and calibration • Training 171
  172. 172. Organisation and Responsibilities Projects: • Create and manage team • Plan and manage testing • Liaison with Test Department • Identify/obtain project-specific requirements 172
  173. 173. Organisation and Responsibilities Design: • Design product • Design processes (manufacture, test, maintenance) • Integrate design analysis & development test • Design review (specification, pre-test, pre-production) 173
  174. 174. Test Procedures Include: • Organisation and responsibilities • Methods (design analysis, test) • Test planning and action • Failure reporting (FRACAS) • Project/design reviews • Integration (development, production, maintenance test) • Test equipment maintenance & calibration • In-service maintenance & calibration 174
  175. 175. Development Test Programme What/when to test? • Components, modules, system • Component test: – earlier – more/cheaper – higher stresses – selection • External suppliers’ products • Output module(s) first 175
  176. 176. Development Test Programme How many to test? • As many as practicable (components/modules/systems) • Consider design analyses, risks, time, costs • Rotate items through tests (e.g. software, proving, environmental, ...) Ever heard of too much testing? 176
  177. 177. Testing Purchased Items Base testing on: • Project requirements • Existing knowledge – supplier’s data – past use • Application/risks/novelty/costs ... • Supplier’s test programme/results Integrate! Retain Repeat 177
  178. 178. In-House v. External Facilities In-house: • Core technologies/confidentiality • Designers more involved • More flexible (?) • Cheaper (?) External: • Lower capital outlay (?) • Better facilities/expertise (?) Consider balanced use of both TE homepage (/testservices.htm) 178
  179. 179. Project Test Plan (1) Include: • Requirements (performance, reliability, standards, ...) • Failures that must/should not occur • Design/design analysis inputs (design review) • Tests to be performed • Test items/allocations • Suppliers’ test requirements • Integration through project phases • Responsibilities (primary, support) • Schedules 179
  180. 180. Project Test Plan (2)• Single test plan• Link to other project plans – reliability – safety – quality, ...• Link/refer to procedures, standards, ... Flowchart: TE Fig. 14.1 (p. 241) Example: Appendix 3 180
  181. 181. Manufacturing Test Plan• Develop from development test results• HALT/HASS Flowchart: TE Fig. 14.2 (p. 242) Example: Appendix 4 181
  182. 182. Management Issues • Training – degree courses – short courses – on-the-job (HALT/HASS) • Integration – across functions – through phases • Economics – Long v. short term – Test adds value. See: The Practice of Engineering Management, P.D.T. O’Connor (Wiley) 182
  183. 183. The Future of Test• Virtual test – EDA, FEA, CFD, ... – Simulation – Virtual reality• “Intelligent” CAE – Integrated physics, variation, ergonomics, ... – automatic design• Internet• Test hardware (BIT, “Sentient™”, ...)• Computer-based test• Teaching (?) 183
