Your SlideShare is downloading. ×
0

Thanks for flagging this SlideShare!

Oops! An error has occurred.

×
Saving this for later? Get the SlideShare app to save on your phone or tablet. Read anywhere, anytime – even offline.
Text the download link to your phone
Standard text messaging rates apply

Accelerated reliability techniques in the 21st century

22,986

Published on

Accelerated life testing (ALT) is widely used to expedite failures of a product in a short time period for predicting the product’s reliability under normal operating conditions. The resulting ALT …

Accelerated life testing (ALT) is widely used to expedite failures of a product in a short time period for predicting the product’s reliability under normal operating conditions. The resulting ALT data are often characterized by a probability distribution, such as Weibull, Lognormal, Gamma distribution, along with a life-stress relationship. However, if the selected failure time distribution is not adequate in describing the ALT data, the resulting reliability prediction would be misleading. In this talk, we provide a generic method for modeling ALT data which will assist engineers in dealing with a variety of failure time distributions. The method uses Erlang-Coxian (EC) distributions, which belong to a particular subset of phase-type (PH) distributions, to approximate the underlying failure time distributions arbitrarily closely. To estimate the parameters of such an EC-based ALT model, two statistical inference approaches are proposed. First, a mathematical programming approach is formulated to simultaneously match the moments of the EC-based ALT model to the ALT data collected at all test stress levels. This approach resolves the feasibility issue of the method of moments. In addition, the maximum likelihood estimation (MLE) approach is proposed to handle ALT data with type-I censoring. Numerical examples are provided to illustrate the capability of the generic method in modeling ALT data.

Published in: Technology
0 Comments
3 Likes
Statistics
Notes
  • Be the first to comment

No Downloads
Views
Total Views
22,986
On Slideshare
0
From Embeds
0
Number of Embeds
1
Actions
Shares
0
Downloads
172
Comments
0
Likes
3
Embeds 0
No embeds

Report content
Flagged as inappropriate Flag as inappropriate
Flag as inappropriate

Select your reason for flagging this presentation as inappropriate.

Cancel
No notes for slide
  • Started with demand for monthly webinars to increase company footprint . DFR DRF & DFSS Solar Reliability Green Reliability Root cause analysis Medical risk based testing DOE for Reliability DOE for software testing DOE for simulation
  • Design for Reliability (DFR)AXD axiomatic design
  • Stress condition: Static, Cyclic, Dynamic LoadingFM’s: Load loss, creep, set, yielding, Fracture, damaged spring end , fatigue, buckling ,surging, complex stress change(t)Fracture, fracture fatigue, fretting corrosionCauses: Hydrogen embrittlement, flaws, high temp operation, stress concentrations from nicks, misalignment, low & high freq vibration, cycling temperature, corrosive atmosphere, insufficient space for operation, resonance surging. Sharp bends on spring endsFailures occur near spring ends. Conical, barrel, hourglass shaped
  • Transcript

    • 1. Accelerated Reliability Techniques in the 21st Century Mike Silverman & Lou LaValle, Doug Goodman, & Milena Krasich ©2013 ASQ & Presentation Silvermanhttp://reliabilitycalendar.org/webinars/
    • 2. ASQ Reliability Division English Webinar Series One of the monthly webinars on topics of interest to reliability engineers. To view recorded webinar (available to ASQ Reliability Division members only) visit asq.org/reliability To sign up for the free and available to anyone live webinars visit reliabilitycalendar.org and select English Webinars to find links to register for upcoming eventshttp://reliabilitycalendar.org/webinars/
    • 3. Accelerated Reliability Techniques in the 21 st Century byLou LaVallee, Senior Reliability Engineer, Ops A La Carte Doug Goodman, CEO, Ridgetop Group Mike Silverman, Managing Partner, Ops A La CarteMilena Krasich, Sr Principal Systems Engineer, Raytheon Page 3
    • 4. ABSTRACT• As product development cycles are shortening, the need for more accelerated reliability tools is becoming increasingly more important.• This webinar will focus on the best reliability tools focused on accelerating the reliability learning process. In this webinar, we will focus on four important accelerated tools: • Robust design and Reliability Engineering Synergy • Prognostics as a Tool for Reliable Systems • HALT • Accelerated Reliability Growth TestingThe audience will understand a variety of reliability tools thatcan be used to accelerate product development from designthrough testing. This will also provide a snapshot of thelearning that will be provided as part of the AcceleratedStress Testing and Reliability Workshop for 2013. Page 4
    • 5. Agenda• Introduction 5 min• Robust Design and Reliability Engineering Synergy 10 min Lou LaVallee, Ops A La Carte• Prognostics as a Tool for Reliable Systems10 min Doug Goodman, Ridgetop Group• HALT/AST – History and Trends 10 min Mike Silverman, Ops A La Carte• Accelerated Reliability Growth Testing 10 min Milena Krasich, Raytheon• Tieing them all together 5 min• Questions 10 min Page 5
    • 6. Biography for Lou LaVallee• Mr. LaVallee is founder of Upstate Reliability Engineering Services an upstate New York based consulting firm delivering advanced reliability support to a wide variety of industries. He joined forces with Ops a la Carte in 2010.• He has a strong technical background in physics, engineering materials/polymer science and a solid grounding in consumer product design, development, and delivery. His comprehensive background includes electronic films , robust design, modeling & analytics, critical parameter management, six sigma DFSS & DMAIC, optimization of product quality/reliability, experimental design, reliability test methods, and design tool development and deployment. He successfully managed systems engineering groups for development of ink jet print heads at Xerox Corp.• Mr. LaVallee has held other technical management positions in manufacturing technology, engineering excellence (trained several thousand engineers worldwide). He also managed the robust engineering center at Xerox for 10 years, managed a high volume printing product quality and reliability group, and worked extensively with high volume printing product service organization.• He has strong validation experience of design quality and reliability through product reviews and customer interaction Mr. LaVallee holds a Bachelor of Science degree in Physics (BS), and an MS from the University of Rochester in materials/polymer engineering.• He holds several U.S. patents involving fluidics and engineering design processes. He is currently a senior reliability engineering consultant with Ops a la Carte LLC.. Mr. LaVallee is an ASQ certified reliability engineer. Page 6
    • 7. Biography for Doug Goodman• Mr Goodman is CEO and Founder of Ridgetop Group, Inc. an Arizona-based leader of advanced diagnostic, prognostic and health management tools, instrumentation, and rad hard microelectronics.• He is accustomed to being a pioneer of innovative electronic technology and establishing engineering firsts. His comprehensive background encompasses low-noise instrumentation design, design-for-test (DFT), fault simulation techniques, and design tool development at firms such as Tektronix and Honeywell.• He was also part of the team that developed the first DSP-based IF processing for spectrum analyzers.• He successfully steered engineering at Analogy Inc. (electromechanical design simulation tools) as vice president until its IPO. Afterwards, he moved to co- found and head Opmaxx Inc., a design-for-test IP firm that later merged with Credence Systems.• Mr. Goodman also serves on the Board of Engineering Synthesis Design, Inc. (ESDI), a waveform and surface metrology instrumentation firm based in Tucson, Arizona. (ESDI.com). Page 7
    • 8. Biography for Mike Silverman• Mike Silverman is the founder and a managing partner at Ops A La Carte, a Professional Consulting Company that has an intense focus on helping customers with end-to-end reliability.• Mike has over 25 years of experience in reliability engineering, reliability management and reliability training. He is an experienced leader in reliability improvement through analysis and testing.• Through Ops A La Carte, Mike has had extensive experience as a consultant to high-tech companies, and has consulted for over 500 companies in over 100 different industries in most of the US and 15 countries around the world.• Mike is an expert in accelerated reliability techniques and owns HALT and HASS Labs, one of the oldest and most experienced reliability labs in the world.• Mike has recently completed his first book on reliability entitled “How Reliable Is Your Product: 50 Ways to Improve Product Reliability”.• Mike has authored and published 25 papers on reliability techniques and has presented these around the world including Canada, China, Germany, Japan, Korea, Singapore, Taiwan, and the USA. He has also developed and currently teaches over 30 courses on reliability techniques.• Mike is the chair of this year’s ASTR conference and chair of the Santa Clara Valley IEEE Reliability Society. Page 8
    • 9. Biography for Milena Krasich• Milena Krasich is a Senior Principal Systems Engineer in Raytheon Integrated Defense Systems, RAM Engineering Group in MA.• Prior to joining Raytheon, she was a Senior Technical Lead of Reliability Engineering in Design Quality Engineering of Bose Corporation, Automotive Systems Division. Before joining Bose, she was a Member of Technical Staff in the Reliability Engineering Group of General Dynamics Advanced Technology Systems formerly Lucent Technologies, after the five year tenure at the Jet Propulsion Laboratory in Pasadena, California.• While in California, she was a part-time professor at the California State University Dominguez Hills, where she taught graduate courses in System Reliability, Advanced Reliability and Maintainability, and Statistical Process Control. At that time, she was also a part-time professor at the California State Polytechnic University, Pomona, teaching undergraduate courses in Engineering Statistics, Reliability, SPC, Environmental Testing, Production Systems Design.• She holds a BS and MS in EE from the University of Belgrade, Yugoslavia, and is a California registered Professional Electrical Engineer.• She is also a member of the IEEE and ASQC Reliability Society, and a Fellow and the president Emeritus of the Institute of Environmental Sciences and Technology. Currently, she is the Technical Advisor (Chair) to the US Technical Advisory Group (TAG) to the International Electrotechnical Committee, IEC, Technical Committee, TC56, Dependability. Page 9
    • 10. &Accelerated Stress Testing and Reliability Workshop October 9-11, 2013 San Diego, CA Accelerating Reliability into the 21st Century Keynote Presenter Day 1: Vice Admiral Walter Massenburg Keynote Presenter Day 2: Alain Bensoussan, Thales Avionics CALL FOR PRESENTATIONS: We are now Accepting Abstracts. Email to: don.gerstle@gmail.com. Guidelines on website www.ieee-astr.org For more details, click here to join our LinkedIn Group: IEEE/CPMT Workshop on Accelerated Stress Testing and Reliability
    • 11. Accelerated Reliability Techniques in the 21 st Century Page 11
    • 12. IntroductionIn this webinar, we will introduce four of the mosteffective reliability techniques that can acceleratereliability learning on your product.• Robust Design and Reliability Engineering Synergy• Prognostics as a Tool for Reliable Systems• Highly Accelerated Life Testing (HALT)• Accelerated Reliability Growth Testing (RGT)We invite you to determine which can be mosteffective for your reliability program. Page 12
    • 13. Robust Design and Reliability Engineering Synergy by Lou LaVallee Page 13
    • 14. Poll Question 1Have you ever used Design for RobustnessTechniques ? a) We use all the time b) We’ve used a few times c) We tried once d) We haven’t used but are planning to e) We have never used Page 14
    • 15. Robust Design &Reliability Engineering Synergies Louis LaVallee Sr Reliability Consultant Ops A La Carte
    • 16. Abstract for full tutorial Robust Design (RD) Methodology is discussed for hardwaredevelopment. Comparison is made with reliability engineering (RE)tools and practices. Differences and similarities are presented. Proximity to ideal function for robust design is presented andcompared to physics of failure and other reliability modeling andprediction approaches. Measurement selection is shown to stronglydifferentiates RD and reliability engineering methods. When andhow to get the most from each methodology is outlined. Pitfalls foreach set of practices are also covered. (This presentation is a tasteof a larger presentation to be delivered in San Diego) Page 16
    • 17. Many Design methods & Interfaces AXD TRIZ QFD DFR PUGH DOE ROBUST DESIGN VA/VE DFSS 6 CP/CS MNGMT Page 17
    • 18. RD Reliabilit y Life Tests P-diagram Root cause Analysis Tolerance Design Expt Layout Ideal Function POF Response DOE RCM Tuning Engineering Maintainability CBM 6 FlexibilityLean Science Warranty $ Robust Design Simulation ReliabilityQuality Testing ModelsLoss Reuse FMECA transformability HALT/HASS Planning S/N RSM ADT Life prediction Redundancy Online QC ALT Parameter design FTA Availability Generic Function RBD Page 18
    • 19. Robustness is… “The ability to transform input to output as closely to idealfunction as possible. Proximity to ideal function is highlydesirable. A design is more robust if ratio of useful part toharmful part [of input energy ] is large. A design is morerobust if it operates close to ideal, even when exposed tovarious noise factors, including time”Reliability is… “The ability of a system, subsystem, assembly, or component to perform its required functions under stated conditions for a specified period of time” Page 19
    • 20. Harmful Variation & Countermeasures• Search for root cause & eliminate it• Screen out defectives (scrap and rework)• Feedback/feed forward control systems• Tighten tolerances (control, noise, signal factors)• Add a subsystem to balance the problem• Calibration & adjustment• Robust design (Parameter design & RSM)• Change the concept to better one• Turn off or turn down the power• Correct design mistakes (e.g. installing diodes backwards) Page 20
    • 21. Robustness Growth S/N Factors Can be changed today time S/N Factors Can be changed in 1 week time S/N Benchmark Target Factors Can be changed in 2 weeks Robustness gains time Page 21
    • 22. Progression of Robustness to Ideal Function Development A B C LSL USL Zero Defects Cpk Static S/N Dynamic S/N RatioWhen a product’s performance deviates from target, its quality isconsidered inferior. Such deviations in performance cause losses tothe user of the product, and in varying degrees to the rest ofsociety. Page 22
    • 23. Useful Input signals Output Main Function Mi Y=f(x)+ Harmful Output Noise Control Factors FactorsTaxonomy of Design Function -- P Diagram Page 23
    • 24. Transformability & Robustness ImprovementResponse Response N1 N1 N2 N2 0,0 M signal 0,0 M signal Minimizing the effects of noise factors on transformation of input to output improves reliability. Sensitivity increase can be used for power reduction. Noise factor here might be fatigue cycles, or stress in one or two directions, or … Page 24
    • 25. Typical Failure Modes and Causes for Mechanical SpringsTYPE OF SPRING/STRESS FAILURE MODES FAILURE CAUSES CONDITION - Load loss - Parameter change- Static (constant deflection - Creep - Hydrogen embrittlement or constant load) -Compression Set - Yielding - Fracture - Damaged spring end - Corrosive atmosphere - Cyclic (10,000 cycles or - Fatigue failure - Misalignment more during - Buckling - Excessive stress range of the life of the spring) - Surging reverse stress ** - Complex stress change as a - Cycling temperature function of time … - Dynamic (intermittent - Surface defects - Fracture - Excessive stress range of occurrences of - Fatigue failure reverse stress a load surge) - Resonance surging Page 25
    • 26. Ideal Function & Failure Modes If data remain close to ideal function, even underpredicted stressful conditions of use, and there is no wayfor failure to occur without affecting functional variation ofthe data, then moving closer to ideal function is highlydesirable. For example, spring fatigue, if it did occur woulddramatically change force-deflection (F-D) data and inflatevariation. Similarly, for yielding, F-D results would changeand inflate the variation. Other failure modes would followin most cases. Page 26
    • 27. Measurement System Ideal Function Y= M+e M=true value of measurand Y=measured valueAuto Steering Ideal function Y= M+e M=steering wheel angle Y=Turning radiusCommunication system ideal function Y=M+e M=signal sent Y=signal receivedCantilever beam Ideal Function Y= M/M*+e M=Load M*=Cross sectional areaFuel Pump Ideal Function Y= M Y=Fuel volumetric flow rate M=IV/P current, voltage,& backpressure Page 27
    • 28. Summary• RD methods and Reliability methods both have functionality at their core. RD methods attempt to optimize the designs toward ideal function, diverting energy from creating problems and dysfunction. Reliability methods attempt to minimize dysfunction through mechanistic understand and mitigation of the root causes for problems.• RD methods actively change design parameters to efficiently and cost effectively explore viable design space. Reliability methods subject the designs to stresses, accelerating stresses, and even highly accelerated stresses, [to improve time and cost of testing]. First principle physical models are considered where available to predict stability.• Both RE and RD methods have strong merits, and learning when and how to apply each is a great advantage to product engineering teams. Page 28
    • 29. Prognostics as a Toolfor Reliable Systems by Doug Goodman Page 29
    • 30. Poll Question 2Have you ever used Prognostics ? a) We use all the time b) We’ve used a few times c) We tried once d) We haven’t used but are planning to e) We have never used Page 30
    • 31. Reliability “Bathtub Curve” Prognostics Trigger Point Failure rate Infant Useful Life Normal Mortality Period Wearout Period TimeThreshold Trigger Points Advanced Warning of Failure (RUL)are selectable Page 31
    • 32. Prognostic Solutions• Ridgetop develops electronic prognostic solutions for critical systems: – Sensor array detectors – Harnesses for “prognostics-enabling” critical systems, and – Sentinel software to comprise a complete solution. Page 32
    • 33. Electronic Prognostics• Electronics are the keystone to successful deployment of complex systems (50+ MPUs in an automobile)• Large MTBF and Statistical Process Control and Centering methods are not sufficient alone for reliability due to “outliers” (e.g. Toyota Prius, Deepwater Horizon Drilling Rig, Boeing 787)• Ridgetop technology exists to pinpoint degrading systems before they fail; supporting operational readiness objectives and cost-saving Prognostics/Health Management (PHM) and Condition Based Maintenance (CBM) initiatives. Page 33
    • 34. Prognostics Health Management (PHM) Page 34
    • 35. Degradation Rates Depend on Environmental Conditions MTBF statistical expected lifeUsage Environment Usage monitoring would provide a safety benefit if actual usage is more severe than predicted (see the red region, T1). T1 T2 Service life can be extended beyond normal replacement time if the actual usage severity is known (see the green region, T2). PHM enables replacement only upon evidence of need Page 35
    • 36. Degradation Example Good Power System Degraded Power System State of State of Health Health Degraded VR Threshold End-of-Life End-of-LifeBoth supplies provide regulated voltage, but one is degraded and willsoon fail. Page 36
    • 37. Prognostic Advantages• Prognostics provides advanced warning of impending failure conditions on critical systems.• Physical evidence of degradation is the basis for service, not an arbitrary time interval.• PHM and CBM maintenance strategies can reduce support costs through optimized timing of service and parts replacement.• Autonomic logistics systems can be established, placing spare parts and provisions where needed. Page 37
    • 38. Faults Occur at Multiple Levels in Systems Page 38
    • 39. Sentinel NetworkTM• Collection and analysis hub for PHM• Scalable, system level State of Health (SoH) Analysis & Prognostics• Automatic SNMP-based Sensor Network Discovery• Troubleshooter• System stability cost reduction for tactical networks Page 39
    • 40. Airborne Power System Monitoring• PHM applied to power systems in harsh environment• Apache Helicopter where vibration, heat, shock all can reduce lifetime of deployed systems• Extracts and processes eigenvalues as a metric of health Page 40
    • 41. Prognostic Health Management Ecosystem 2 Identified Design Integrated ImprovementsCommunicate Diagnostic/PrognosticsPHM sensor Design Platform data 3 Address 1 ECRs Real-time and Health & RUL Improve Subsystem Parts OEM CBM Actions 4 Scheduler Minimize Inventory 5 Replacement Parts Line Replaceable Parts Unit (LRU) Maintenance Page 41
    • 42. HALT/ASTby Mike Silverman Page 42
    • 43. Poll Question 3Have you ever used the technique HALT ? a) We use all the time b) We’ve used a few times c) We tried once d) We haven’t used but are planning to e) We have never used Page 43
    • 44. INTRODUCTION HALT began 40 years ago with a simple idea of testing beyond specifications in order to better understand design margins. Over the past 40 years, thousands of engineers around the world have been exposed to the concepts of HALT and have tried the techniques. What have we learned in the past 40 Years? Page 44
    • 45. HISTORY OF HALT/HASS HP started performing Stress for Life (STRIFE) testing in the early 70’s. Some people consider STRIFE the predecessor to HALT.  Reliability cannot be achieved by adhering to detailed specifications. Reliability cannot be achieved by formula or by analysis. Some of these may help to some extent, but there is only one road to reliability. Build it, test it, and fix the things that go wrong. Repeat the process until the desired reliability is achieved. It is a feedback process and there is no other way.  David Packard, 1972 Page 45
    • 46. HISTORY OF HALT/HASS Dr. Gregg Hobbs officially coined the term HALT in 1988. For the next two decades, Gregg traveled around the world teaching the concepts of HALT and HASS. Many of you in this room probably attended that seminar. Page 46
    • 47. HISTORY OF HALT/HASS Over the next 17 years, HALT labs popped up around the world. Today I estimate there are about 200 HALT labs in the world. This has exposed literally thousands of engineers to the processes. However, the methodology being practiced is inconsistent. Standards have and are being written (more like guidelines) Books have been published Conferences have been formed Page 47
    • 48. HISTORY OF HALT/HASS Standards/Guidelines/References to HALT/HASS  IEC 62506: “Accelerated Testing”  IEST-RP-PR003: “HALT and HASS”  IPC-9592: “Performance Parameters for Power Conversion Devices” Page 48
    • 49. HISTORY OF HALT/HASS Books on HALT/HASS  “Accelerated Reliability Engineering: HALT and HASS”, Gregg Hobbs, 2000  “HALT, HASS, and HASA Explained”, Harry W. McLean, 2009  “Improving Product Reliability: Strategies and Implementation”, Levin and Kalal, 2003  “Accelerated Testing and Validation”, Alex Porter of Intertek, 2004  “How Reliable Is Your Product: 50 Ways to Improve Your Product Reliability”, Mike Silverman, 2010 Page 49
    • 50. HISTORY OF HALT/HASS Conferences  This ASTR Conference started in 1995 and we have held every year since except 2001. This is the only conference solely dedicated to accelerated testing  Other conferences with an ALT track  Applied Reliability Symposium (ARS)  Reliability and Maintainability Symposium (RAMS)  Other conference with a reliability focus  IRPS  Prognostics Conference (two of them) Page 50
    • 51. WHAT IS HALT?HALT: A design technique used to discover productweaknesses and improve design margins. The intent is tosystematically subject a product to stress stimuliwell beyond the expected field environments in order todetermine and expand the operating and destruct limits ofyour product.- 50 Ways to Improve Your Product Reliability, Mike Silverman Page 51
    • 52. WHAT IS NOT HALT?What are some classic HALT misconceptions: My product does not experience vibration so we can’t use it The spec for this component is 70C so we can’t go above that in HALT We can’t drill holes in the product because it will change the airflow We must mount it in the same direction as it will be mounted in the field We don’t need to go above the first failure point because that is what will fail first Run to preset levels (remember this is not a pass/fail test) Don’t stress beyond specifications Only perform HALT at system level Just perform HALT only when diags are fully ready Page 52
    • 53. BASICS OF HALT Start low and step up the stress, testing the product during the stressing Page 53
    • 54. BASICS OF HALT Gradually increase stress level until a failure occurs Page 54
    • 55. BASICS OF HALT Analyze the failure Page 55
    • 56. BASICS OF HALTMaketemporaryimprovements Page 56
    • 57. BASICS OF HALTIncreasestress andstartprocessover Page 57
    • 58. BASICS OF HALTFundamentalTechnological Limit Page 58
    • 59. BASICS OF HALT Classic S-N Diagram (stress vs. number of cycles) S0= Normal Stress conditionsS2 N0= Projected Normal LifeS1S0 N2 N1 N0 Page 59
    • 60. BASICS OF HALT Classic S-N Diagram (stress vs. number of cycles) Point at which failures become non-relevant S0= Normal Stress conditionsS2 N0= Projected Normal LifeS1S0 N2 N1 N0 Page 60
    • 61. BASICS OF HALT Lower Lower Upper UpperDestruct Oper. Product Oper. Destruct Limit Limit Operational Limit Limit Specs Stress Page 61
    • 62. BASICS OF HALT Lower Lower Upper UpperDestruct Oper. Product Oper. Destruct Limit Limit Operational Limit Limit Specs Destruct Margin Operating Margin Stress Page 62
    • 63. NEW ADVANCES IN HALT Along with improvements in chamber technology, there have been advances in the methodology as well.  Harry McLean’s HALT Calculator  To determine “Guard Band” Limits during the HALT Plan  To determine AFR after HALT  Using FMEA to determine specific areas to test for  Linking HALT to ALT  Using HALT for software/firmware issues Page 63
    • 64. FUTURE OF HALT AND HASS The number of companies performing HALT will continue to rise as more labs obtain HALT equipment The need for more education will continue to increase Standards/guidance docs will gain more importance as more companies and labs are doing HALT, many incorrectly. Chambers will need to provide stresses in addition to temperature and vibration to keep up with the physics of the failures (especially due to smaller packages and MEMs devices). Move away from people and move to process HALT as acronym will fade away Less HALT and more emphasis on DFR including HALT Page 64
    • 65. CONCLUSION In this presentation  we took you through 40 years of HALT  showed you advances that have been made  pointed out areas where improvements are still needed Page 65
    • 66. Accelerated Reliability Growth Testing by Milena Krasich Page 66
    • 67. Poll Question 4For the last RGT you performed, did youhave a chance to plan the duration and thestresses? a) Planned both b) Planned duration only c) Planned test environments d) Did not plan RG test Page 67
    • 68. Tutorial ObjectivesShow a synopsis of the tutorial which will be presented at theASTR 2013. Reliability Growth Test objectives Explain traditional Reliability Growth test methodology Show shortcomings of the traditional methods • Entire item failure rate not calculated • Test duration too long for the modern high reliability items • Little or no relationship of reliability and stresses Show principles of the Physics of Failure test methodology Show how the Reliability growth test based on PoF is constructed Show how the expected stresses are applied and accelerated Show reliability measures Show advantages of the test PoF test design and acceleration Show achieved considerable test cost reduction. Page 68
    • 69. Traditional RG Test MethodologyApplied stresses in test - magnitude equal to those in use Involved assumptions of stress average magnitudesOverall test duration determined based on the initial and goalreliability measure: failure rates Mean Time BetweenFailures, MTBF (or MTTF)Environmental or operational stresses applied in sequence orsimultaneously at the use levels Applied stress duration determined by engineering judgment Overall test duration and stress application are unrelated to use profiles or required life or mission of the productAdditional errors: Random failure rates and those of not corrected failure modes not added into the final failure rates Page 69
    • 70. Mathematics of Traditional Reliability Growth Failure modes types in test:  Systematic: corrected in test (Type B), not corrected (Type A), Random - constant 0,06 Item (t ) B (t ) A (t ) r (t ) 1 Item (t ) t A (t ) r (t ) 1 Item (t ) t 0,05Failure intensity/failure rate (failures/hour) Only type B failure modes failure 0,04 rates are accounted for in a reliability test program – those that show growth expressed by the 0,03 power law model; the type A and S(t)= A(t)+ r(t)+ B(t) random remain constant. 0,02 r(t) The only failure modes with decreasing failure rates 0,01 B(t) (power law) A(t) 0 0 1000 2000 3000 4000 5000 6000 Test duration (hours) Page 70
    • 71. Information Needed – Information ObtainedTest duration is mathematically determined from: 1 tF log F tF log 1 t1 tF log t1 F 1 t1 1 t1 tF e  Where:  F = final product MTBF (for mitigated. “fixed” failure modes only) – given goal  I = initial product MTBF (for failure modes that will be mitigated) - assumed  tF =test duration needed to achieve the final MTBF for fixed failure modes  tI = initial test time (has various explanations) - assumed  = parameter initially assumed, then determined from analysis (power law) –  the test duration might change dependent on failure mitigation success The test duration too high for high reliability items and depends on three assumed parameters Failure rate of the non-mitigated failure modes and those considered random are assumed or predictedInformation obtained from mathematical Reliability Growth technique isNOT product MTBF, it only is MTBF from the improved, B, failure modes The A type (not fixed) and random failure modes are – forgotten! Page 71
    • 72. Physics of Failure and ReliabilityFailures occur when an item is not strong enough to withstand one ormore attributes of a stress:  Level, duration, or repetitions of its application • The higher the level the shorter duration or less repetitions induce a failure The area of overlap of strength and stress distributions represents probability of failure for each of the stresses; L, L = mean and standard deviation of the load distribution = b L S, S = mean and standard deviation of the strength distribution = a S • If the mean of strength is a k times multiple of the mean of stress (load) and the standard deviations of each are a and b times their respective mean values, reliability of an item regarding each use stress (i), and the total reliability will be: S k L_i L_i RItem (t0 ) RStressi (ti ) Ri (k , L _i) 2 2 i 1 a k L_i b L_i Page 72
    • 73. Physics of Failure Reliability – Margin k Selection Allocate reliability regarding each of the expected stresses in use  The cumulative damage and ultimately failure due to a stress is proportional to the stress level and its duration. For the stress applied at the same level as in life, the cumulative damage model is: D(t ) S (t ) dt 1.00 t For the allocated reliability 0.95 regarding each stress, select 0.90 the value of margin k which 0.85 would multiply its duration in use to be applied in test; 0.80 Apply stresses simultaneously Reliability Reliability 0.75 b=0,5 whenever possible; 0.70 a=0,05 b=0,2 If the same stress type is a=0,05 b=0,05 applied at different levels in a=0,05 0.65 b=0,2 a=0,02 use, recalculate their durations 0.60 b=0,1 a=0.02 to the highest level (using 0.55 b=0,05 a=0,02 acceleration factors); The most common values for a 0.50 1.00 1.05 1.10 1.15 1.20 1.25 1.30 1.35 1.40 1.45 1.50 and b are: Multiplier k a = 0.05, b = 0.2 Page 73
    • 74. Test AccelerationEach of the stresses is accelerated in test to allow for shorter testdurationTotal item failure rate is the sum of its failure rates regarding eachindividual stress ( 0 is the item total failure rate in use condition and A isthe accelerated item total failure rate (in reliability growth is equivalentto ): NS A ATest 0 Aj i i 1 j iProduct j exists when the stresses 1 to j produce the same failure mode.Stress acceleration models for different stresses – example:  inverse power law model (usually applicable to thermal cycling, vibration, shock, humidity);  Arrhenius model (used for temperature acceleration using absolute temperature);  Eyring model (used also when the thermal stress is a factor in process acceleration);  step stress model, where the stress is increasing in steps;  fatigue model representing the degradation due to the repetitious stress. Page 74
    • 75. Test Example B Failure Modes Stress/Requirement/Property Symbol/Value Units Determination of factor k – for majorProduct life t0 h stresses: 1Time ON ta h/day Ri (t 0 ) R0 (t 0 ) 4 0.946 k=1.5Internal temperature when ON T ON °C 1Internal temperature when OFF T OFF °C 0,95 0,9Temperature change T Use °C 0,85Rate of temperature change ς Use °C/min 0,8Number of thermal cycles cT Cycles/day Reliability 0,75 a=0,1Temperature rise over the ambient T °C b=0,1 0,7Relative humidity RH Use % 0,65Distance travelled in product life D miles 0,6Vibration level in use W Use g 0,55Operational. (ON/OFF) cycling c Cycles/day 0,5 1,00 1,05 1,10 1,15 1,20 1,25 1,30 1,35 1,40 1,45 1,50 Stresses: Multiplier k Thermal cycling Thermal exposure (thermal dwell) Thermal dwell (normalize exposure when OFF to duration Humidity at ON temperature): Vibration tON _ N tON tOFF exp Ea 1 1 Operational cycling kB TOFF 273 TON 273 Thermal cycling tON _ N 8,754 hours TTest m 1/ 3 NTC _ Use k Duration of accelerated exposure:ATC ARamp _ Rate Test NTC _ Test TUse Uset ATC ARamp _ Rate tT _ Test tON _ N k exp Ea 1 1 kB TON 273 TTest 273 One thermal cycle in test = 24 hours in life tT _ Test 168.1 h Page 75 .
    • 76. Test Example, Cont.  The thermal exposure is combined with the thermal cycling, distributed over the high temperature:  The test cycle profile: tTC 2 (ramp time) (temp.Stabilizat ion ThermalDwell) Dwell at cold 125 tTC 2 22.3 5 52.3 min 0.875 h 10  Humidity: Test 95% RH and temperature TRH= 85 °C (65 °C chamber + 20 °C internal temperature rise) h RHUse Ea 1 1 t RH _ Test _ Test tON _ N exp RH Test kB TON 273 TRH 273 h 2.3 t RH _ Test 300 h  Vibration: 150,000 miles, 150 hours per axis vibration at 1.7 g rms. Test level: 3.2 g rms w WUse tVib _ Test k tVib _ Use WTest With : w 4 tVib _ Test 18 hours per axis Data for reliability plotting:Failure Time to Cumulative (t) log(t) log[ (t)] failure time to h failure Initial B failure modes MTBF 100,000 hours, final 106hours (n=24) 1 3,821.33 91,711,92 91 ,711.92 4.96 4.96 Initial test time: 100 hours 2 5,781.33 138,751.92 69 ,375.96 5.14 4.84 3 14,016 336,384.00 112 ,128 5.53 5.05 Total traditional test time: 4.6x103hours 4 18,563.44 445 522,56 111, 380.64 5.65 5.05 t 0*k 131.400 3 ,153 ,600 788 ,400 6.50 5.90 Final test reliability (B failure modes): 0.99997 Final MTBF (improved failure modes):1,431,964 hours Page 76 Total accelerated test time; 526 hours
    • 77. Why Accelerated Reliability Growth ?The test duration covers product entire life  It allows detection of all design problems, not only those that appear in a small fraction of product life  It enables estimate of failure rate regarding product random events, disregarded in traditional RG testing  The failure rate achieved by design improvement with the random failure rate provides realistic estimate of total product reliabilityTest duration is determined based on required total reliability in view ofproduct physical cumulative damage from life stresses in use;Test acceleration allows achievement of very reasonable test duration,shorter than traditional mathematically derived testing  The reliability improvement through test is no longer cost prohibitiveTest failure times are projected to their appearance in real life and theanalysis uses this data;Even though covering the product expected life (durability information), it isstill considerably shorter than the traditional reliability Page 77
    • 78. SummaryWhat each of these 4 techniques have incommon is that1) Each is a progressive accelerated reliability technique being used today2) Each will be highlighted as tutorials in our Accelerated Stress Testing and Reliability (ASTR) Workshop Oct 9-11 in San Diego Page 78 78
    • 79. SummaryWhat each of these 4 techniques have incommon is that1) Each is a progressive accelerated reliability technique being used today2) Each will be highlighted as tutorials in our Accelerated Stress Testing and Reliability (ASTR) Workshop Oct 9-11 in San Diego Page 79 79
    • 80. To find out more about this year’s Accelerated Stress Testing & Reliability (ASTR) Workshop October 9-11, 2013 San Diego, CA EMAIL: mikes@opsalacarte.com (MIKE IS THIS YEAR’S ASTR GENERAL CHAIR)ALL THAT RESPOND WILL BE ENTERED INTO A DRAWING FOR A FREE REGISTRATION TO THE CONFERENCE Page 80
    • 81. Q&A Page 81
    • 82. Contact Information Ops A La Carte, LLC Ops A La Carte, LLC Mike Silverman Lou LaVallee Managing Partner Senior Reliability Engineer (408) 472-3889 (585) 281-1882mikes@opsalacarte.com loul@opsalacarte.com www.opsalacarte.com www.opsalacarte.com Raytheon Ridgetop Group Milena Krasich Doug Goodman Sr. System Engineer CEO 978-440-1578 (520) 742-3300Milena_Krasich@raytheon.com doug.goodman@ridgetopgroup.com www.ridgetopgroup.com Page 82

    ×