Overview of halt a paradigm shift for rev3

966 views
1,001 views

Published on

Published in: Technology, Business
0 Comments
0 Likes
Statistics
Notes
  • Be the first to comment

  • Be the first to like this

No Downloads
Views
Total views
966
On SlideShare
0
From Embeds
0
Number of Embeds
583
Actions
Shares
0
Downloads
32
Comments
0
Likes
0
Embeds 0
No embeds

No notes for slide
  • Modeling can be useful and an important tool for LIMITED applications – such a Finite Element Analysis – especially in mechanical wear, hinges, bearings, Single BGA – but many if not most have more than one stress driverComponent failure doesn’t always result in systems failure ….Mention the 26 capacitors in a system that was defective but only 5 locations cause hard failures of the system…Variation of manufacturing quality, excursions, and errors or abuse are the cause of the vast majority of failures before technological obsolesence
  • Modeling can be useful and an important tool for LIMITED applications – such a Finite Element Analysis – especially in mechanical wear, hinges, bearings, Single BGA – but many if not most have more than one stress driverComponent failure doesn’t always result in systems failure ….Mention the 26 capacitors in a system that was defective but only 5 locations cause hard failures of the system…Variation of manufacturing quality, excursions, and errors or abuse are the cause of the vast majority of failures before technological obsolesence
  • Old AdageChain is only as strong as weakest LengthElectronics systems are only as robust as the weakest componentLike a chain – most electronics systems have significant strength - won’t know until you find the limit – In thermal its mostly OPERATIONAL LIMITS ONLY – no need to go beyond operation - dangerous If limits are found, you can use the “stress Capability” as a benchmark of materials and system strength to test underNot testing beyond Spec – may not find “weak link” before it’s a field reliability problem
  • Old AdageChain is only as strong as weakest LengthElectronics systems are only as robust as the weakest componentLike a chain – most electronics systems have significant strength - won’t know until you find the limit – In thermal its mostly OPERATIONAL LIMITS ONLY – no need to go beyond operation - dangerous If limits are found, you can use the “stress Capability” as a benchmark of materials and system strength to test underNot testing beyond Spec – may not find “weak link” before it’s a field reliability problem
  • Since we know that thermal stimulation can skew the timing,We also know that there will be SOME variation in the timing parametrics of critical components in quantity production – how much will it vary in the future or if a second source of that component is Substituted during the systems production life
  • The challenge in development is that the component and assembly variations will occur throughout the production life of the productWe probably do not know the how wide these variations might be and how they will affect our producthopefully occur in a small percentage of the population – 1 to 3%The probability of seeing these failures with 10 – 20 samples is very LOW We can help see these potential reliability risks when we use thermal stress to skew the silicon
  • The challenge in development is that the component and assembly variations will occur throughout the production life of the productWe probably do not know the how wide these variations might be and how they will affect our producthopefully occur in a small percentage of the population – 1 to 3%The probability of seeing these failures with 10 – 20 samples is very LOW We can help see these potential reliability risks when we use thermal stress to skew the silicon
  • Overview of halt a paradigm shift for rev3

    1. 1. Overview of HALT and HASS: AParadigm Shift in Reliability TestingKirk Gray, Principal ConsultantAccelerated Reliability Solutions, L.L.C.Email: kirk@acceleratedreliabilitysolutions.comPhone: 512-554-3111Central Texas Electronics AssociationElectronics Design, Manufacturing & Test Symposium
    2. 2. 2NTSB chairman Deborah Hersman“The design and certification assessment and the assumptionsthat were made were not born out by what we saw,” …speakingabout the battery fires in Boston and Japan. “We had two eventsin two weeks on two separate aircraft. The fleet has less than100,000 hours and [Boeing] did not expect in their assessment tosee a smoke event in but less than 1 in every 10,000,000 hours.”From NTSBWebsiteFrom NTSBWebsite
    3. 3. 3The U.S. Food and Drug Administration todayissued a proposed order aimed at helpingmanufacturers improve the quality and reliabilityof automated external defibrillatorsFDA has received approximately 45,000 adverseevent reports between 2005 and 2012 associatedwith the failure of these devices
    4. 4. 4Electronics Reliability Prediction
    5. 5. Fundamental mismatch of time scales5Wear out IS NOT a significant cause of un-reliability forvast majority of (non-mechanical) electronicsassembliesHazard rateTime7 to 30 years?TechnologyObsolescenceWear outUse period 5-7 yearsThe Life Cycle Bathtub Curve
    6. 6. Costs and Reliability Development6Most of the costs of unreliability occur in the firstseveral months or yearsHazard rateTime7 to 15 years?TechnologyObsolescence$$$$$Warranty ReturnsPerception of poor qualityLost future sales!Prediction-Modelingof wear out
    7. 7. There is a Drain in the Bathtub Curve7$$$$$$ $???(ex. Powergeneration systems)
    8. 8. Basis for HALT8A chain is only as strong as its weakestlink.
    9. 9. Fundamental Basis for HALT9Fastest way to find a weak link is find thestrength and stress limit…Pull until it breaks
    10. 10. HALT10 HALT – Highly Accelerated Life Test Controlled Stepped Stress test to empiricaloperational and sometimes destruct limits A methodology and significant reliabilityparadigm shift – not a type of stress or type ofchamber A discovery process, - a Stimulation not aSimulation process
    11. 11. The Base HALT processContinue until operating & destruct limitsof UUT are found or until test equipmentlimits are reached.Level of Applied Stress Stimuli(vibration or thermal)Time (hour:minute)Step AStep BStep CStep D0:00 0:10 0:20 0:30Thermal Steps: typically +10ºC and -10ºCVibration steps: typically 5-10 Grmscold11
    12. 12. The Base HALT processContinue until operating & destruct limitsof UUT are found or until test equipmentlimits are reached.Level of Applied Stress Stimuli(vibration or thermal)Time (hour:minute)Step AStep BStep CStep D0:00 0:10 0:20 0:30Thermal Steps: typically +10ºC and -10ºCVibration steps: typically 5-10 GrmsHeat12
    13. 13. The Base HALT processContinue until operating & destruct limitsof UUT are found or until test equipmentlimits are reached.Level of Applied Stress Stimuli(vibration or thermal)Time (hour:minute)Step AStep BStep CStep D0:00 0:10 0:20 0:30Thermal Steps: typically +10ºC and -10ºCVibration steps: typically 5-10 Grmsvibration13After finding limits with each single stress, try thermalcycling, combinations of stress
    14. 14. Outputs of HALT141) List of potential failure mechanisms, weaknessesand the relevance to field failures.2) Opportunity to increase reliability “tolerance” atlowest costs when done early.3) Safe limits and stress boundaries to create a costeffective HASS processes.
    15. 15. HALT IS15 Deterministic- find weak links by stressing to inherent limits Based on fundamental limits (strength) of standard electronicsand - not end-use environmental conditions Done while products are powered and functionally monitoredduring stress application Not a quantifiable “life” test - no stress test for systems canaccelerate all fatigue or chemical degradation mechanisms atequivalent field rates
    16. 16. Common Responses to HALT Discoveries16 “Of course it failed, you took it above specifications” “It will never see that stress level in use” “It wasn’t designed for that vibration level” “If you wanted it to operate in those conditions wewould have designed it for those conditions”Why many companies claiming to do “HALT” only dothe first part – find limits but do nothing to improvethem
    17. 17. HALT/HASSIS NOT to develop products to survive extremeconditions or environments17
    18. 18. HALT vs. HASS Stress Levels18Stress Level(example temperature)UpperDesignSpecUpperOperatingLimitUpperStrengthLimitLowerDesignSpecLowerOperatingLimitLowerStrengthLimit(rare)HALTHASSTypicalUse
    19. 19. Level of AppliedStress StimuliTime (hour:minute)Upper operationthermal limitLowerOperationthermal Limit0:00 0:10 0:20 0:30Vibration levelvibtemperatureTemperature level• Rapid thermal cycling (up to 60ºC/minute) – two to five thermal cycles• Combined with multi-axis vibration – vary intensity during application• Other stresses (power cycling, voltage margining, frequency margining)HASS and HASA Process Parameters19Vibration levelUDL
    20. 20. The Stress Strength Model20In assemblies and structures there is a loadLoad = cumulative life-cycle fatigue (aging) from normal useStrength = the material fatigue life of electronic materialsAs Long as Load < Strength no failures occurStress/StrengthLoad Strength# of field units
    21. 21. The Stress Strength Model21 For multiple units - results distribution around thenominal valuesVariations in end-use stresses for the most part are uncontrolledVariations in assembly strengthAlso applies to thermal and electrical stress/strengthStress StrengthLoad= cumulativefatigue from useStrength= assemblyfatigue strength# of field unitsHighFatigueLowFatigue LowStrengthHighstrength
    22. 22. Stress/Strength Diagram and Failures22If Stress < Strength, no failures occur As fatigue damage (aging) accumulates, the mean ofthe strength shifts left Field failures occur when the two distributions overlap -the weakest units are subjected to the highest life cyclestressesStress Strengthcumulativefatiguedamage# of unitsStrengthLowFatigueHighFatigueHighstrengthLowStrength
    23. 23. Stress/Strength Diagram and Failures23If Stress < Strength, no failures occur As fatigue damage (aging) accumulates, the mean ofthe strength shifts left Field failures occur when the two distributions overlap -the weakest units are subjected to the highest life cyclestressesStress Strengthcumulativefatiguedamage# of unitsFieldFailures StrengthLowFatigueHighFatigue
    24. 24. Develop Tolerance of Variations24New Path to Optimal Reliability: Find the functional margins by testing to absolute operational limits(thermal OTP defeated) Improve margins by finding the limiting component(s) and determiningif and how the margin can be improved Small changes can result in large margin gainsStress Strength#offieldunitsGreatest tolerance forManufacturing variationMaximize Functional margin=
    25. 25. Thermal Stress to Skew Parametrics25 Marginal designs may not be observable until asufficient number of units are in the field The field is a costly place to find these marginalconditionsParametrictiming valueParametric timingvalue#units#units100 100,000Limited samples During development Mass Production variationParameterSpecificationLower oplimitUpper oplimitLower oplimitUpper oplimitParameterSpecificationMarginaloperationregions1,000
    26. 26. Timing Skewing from Thermal Stress26Applying thermal stress stimulates a timing shiftParametrictiming valueunits100ParameterSpecificationLower oplimitUpper oplimitCold• Thermal Step Stress and cycling provides a muchhigher probability of detecting low incidence rate issues• Cold speeds up signal propagation
    27. 27. Timing Skewing from Thermal Stress27Applying thermal stress stimulates a timing shiftParametrictiming valueunits100ParameterSpecificationLower oplimitUpper oplimitHOT• Thermal Step Stress and cycling provides a muchhigher probability of detecting low incidence rate issues• Heat slows down signal propagation
    28. 28. Power supply HALT and HASS Success28A worldwide leader in the development andmarketing of power conversion and controlsystem solutions was looking for methods toreduce warranty returns.The Company makes power supplies andsubsystems used in Capital Equipment for thin filmapplications
    29. 29. Company’s First HALT and HASS Results29Results after HALT To HASS on the a shipping powersupply unit (6KW) DC production Manufacturing testing for reliability cycle time wasreduced from 4 days of “burn-in” to a HASSprocess lasting 40 minutes per 2 units. Warranty returns were reduced 90% - from 5.0% to0.5% - within months after introduction of designchange and HASS
    30. 30. 21st Century path to optimizingelectronics reliability30 Design using good design practices and lessons learned Test to empirical operational limits and sometimesdestruct Determine root causes of limits Understand the physics of failure or limit Remove or improve weak links – to make robust products Apply combined safe stresses to make shortestscreening (HASS/HASA) processes Improve screens - Find best discriminators to observefailures in manufacturing capability or process control
    31. 31. 31Relevant Quote“A new scientific truth does not triumph byconvincing its opponents and makingthem see the light, but rather because itsopponents eventually die, and a newgeneration grows up that is familiar withit”-Max Planck, Scientific Autobiography

    ×