SPE 171517-MS
Estimating Probability of Failure for Drilling Tools with Life Prediction
K. Carter-Journet, A. Kale, D. Zhang, E. Pradeep, T. Falgout, and L. Heuermann-Kuehn, Baker Hughes.
Copyright 2014, Society of Petroleum Engineers
This paper was prepared for presentation at the SPE Asia Pacific Oil & Gas Conference and Exhibition held in Adelaide, Australia, 14–16 October 2014.
This paper was selected for presentation by an SPE program committee following review of information contained in an abstract submitted by the author(s). Contents of the paper have not been
reviewed by the Society of Petroleum Engineers and are subject to correction by the author(s). The material does not necessarily reflect any position of the Society of Petroleum Engineers, its
officers, or members. Electronic reproduction, distribution, or storage of any part of this paper without the written consent of the Society of Petroleum Engineers is prohibited. Permission to
reproduce in print is restricted to an abstract of not more than 300 words; illustrations may not be copied. The abstract must contain conspicuous acknowledgment of SPE copyright.
Abstract
Drilling tools are subject to numerous operational parameters such as revolutions per minute (RPM), vibration (lateral, stick-slip and axial), pressure, torque and temperature. These parameters can greatly fatigue even the most robust tool depending on where and how the tool is operated. Lifetime prediction methodologies offer an affordable, statistically grounded way to estimate the probability of failure (risk) of drilling tools. Understanding the potential risk is vital to ensuring reliability, performing the most efficient maintenance on the equipment and improving drilling performance.
Sophisticated risk-modeling techniques reduce uncertainty in drilling operations by making use of readily available operational field data, thus eliminating the need for costly laboratory experiments. Blind spots in the decision making process are eliminated by proactively identifying precursors to costly failures in the field. The techniques also enable preemptive guidance during maintenance periods for parts that might otherwise have been overlooked based strictly on procedure. Statistical models that relate the operating environment to component life are derived from field component failure data and introduce a fresh way to boost drilling tool efficiency. A Bayesian-based model selection technique is also developed. It incorporates operating environment variables after each successful drilling run to dynamically select the model that gives the best survival probability, ensuring maximum utilization of a component while avoiding failure and improving the overall reliability of the tool in the field. The implementation of lifetime prediction methodologies also leads to lowered life-cycle and maintenance costs, reduced risk and improved operational performance. The paper presents the methodology used to estimate the probability of failure of drilling tools and further illustrates how to reach risk-informed decisions.
Introduction
Optimum drilling services minimize the non-productive time (NPT) experienced from tool degradation and/or failures. This objective of reliability starts with innovative tool design and spreads to the primary areas of application engineering, maintenance and well site execution. A universal approach for greater project efficiency, with minimized risk¹, is necessary as the oil and gas industry seeks unconventional sources to meet increasing demands.
Almost every product and service is designed to reduce costs, lessen risk or increase productivity during activities related to hydrocarbon extraction, further advancing reservoir performance. Consistent methodologies that provide preemptive guidance for optimizing drilling parameters and reducing the probability of failures in the field are necessary. These methodologies are especially important when analyzing electrical component anomalies. For instance, the reliability of electronic printed circuit board assemblies (PCBAs) in the bottomhole assembly (BHA) is vital to the success of any drilling operation. PCBAs are multi-scale devices (encased in electronic packaging) comprising multiple components. The individual components vary in size and composition and are not easily assessable without disassembling a tool. Electronic packaging can also be subject to thermal expansion mismatch, accelerated corrosion, dendrite growth, metal whiskers, solder fatigue and outgassing, all of which can lead to failure. Understanding the risk and amount of consumed life of PCBAs prior to deploying a drilling tool into the field improves reliability and overall drilling performance. The ever-present need for more flexibility in drilling regimes, greater reliability of drilling tools and higher rates of penetration puts further strain on the drilling tool’s electrical components. High-performance drilling tools must drill in harsher environments, at higher temperatures (often beyond 150°C), vibrations (levels exceeding 15 g) and pressures (30 kpsi or more), along horizontal paths (rather than conventional vertical bore holes), at increasing depths and through abrasive formations, because more readily accessible oil and gas reserves were depleted long ago (Figure 1).
¹ Risk, for the purpose of this paper, refers to the uncertainty in drilling tool performance at the component level and/or as a whole. Risk centers on predicting the probability of failures that can lead to severe damage to the tool and/or the inability to perform the run or function to the best advantage. The consequences can be technical, safety, cost, or schedule related. The ability to quantify and understand risk provides a foundation for proactive risk management throughout the drilling tool’s lifetime.
These demanding conditions can often influence companies’ decisions to operate drilling tools beyond their design specifications. The trend also leads to higher maintenance costs and more frequent system downtime. However, instead of over-maintaining drilling tools, companies must target enhancing system performance. For example, focus should be on preventing failure and reducing system downtime, meeting customer demands, reducing maintenance costs and ensuring equipment reliability.
Failure in the field extends a planned drilling program beyond the scheduled time frame, adding unnecessary cost. Consequently, the capability to estimate the probability of failure (PoF) for drilling tools by using lifetime prediction methodologies introduces an alternative way to avert expensive downhole tool failures and supports the success of a drilling operation by indicating the overall risk.
Background: Lifetime Prediction Methodologies
In lifetime prediction analysis and reliability engineering, the output of the analysis is always an estimate (ReliaSoft, 2005). The true value of the probability of failure (PoF), probability of success (reliability), mean life, parameters of a distribution or any other applicable parameter is never truly known. In fact, these values will likely remain unknown for all practical purposes. However, through lifetime prediction analysis, engineers are able to use operational field data to estimate the PoF for parts, components and systems. Understanding the PoF is useful in determining whether drilling tools can be used in harsh or benign environments for a desired length of time without failure. Lifetime prediction encompasses an assortment of statistical techniques, ranging from best-fit modeling to machine learning and text/data mining, that analyze historical and current data to make predictions about the future (forecasting). Current lifetime prediction techniques often require test data obtained over extended periods that approach the actual life of a part; this type of testing can be costly and time-consuming. An alternative to laboratory testing is to obtain, catalog and statistically analyze operational field data using predictive analytics.
To meet the growing demand for more reliable drilling tools, there is mounting interest in health prognostics for electronics components using physics-based models, operational field data, design and qualification testing data and in-service inspection data. The field data can be used to build a part/tool profile² to evaluate operational fitness based on the historical usage of an entire population of the same part/tool. This is very similar to the way a doctor compares the results of an individual’s blood test against a specified range from a larger population to determine if the values are within an acceptable range. The determination factors into what risk category (low, medium or high) the individual belongs in. A risk-informed decision is then made on whether any corrective action is required. In the case of a part/tool, the risk indicates the fitness of the part/tool to operate optimally in the next run (meaning a recommendation is made on whether to proceed with using the tool as is, perform some level of maintenance or entirely replace/retire it).
² A profile would describe events for a part, from manufacture through end of life. Natural or induced factors (performance or environmental) would be included. All associated failures (confirmed or unconfirmed) would be documented.
Figure 1: Illustration of Drilling System (labels: Rig, Drill Pipe, Drilling Tool, Abrasive Formation)
Identifying precursors to failure and quantifying the associated risk in real time is challenging because it is often not realistic to take measurements during drilling; for example, PCBAs are built into the tool and require disassembly for tests and measurements. Therefore, using algorithms as a diagnostic tool to detect anomalies is a fast and practical approach. Table 1³ provides further details on contemporary methodologies used for lifetime prediction.
Table 1: Lifetime Prediction Methodologies
Methodology | Description
Measurement of failure precursors | The first process, measuring failure precursors as an indicator of impending failure, is established on the hypothesis that a degraded circuit board produces a significantly different signature from that of a defect-free board.
Detecting anomalies using fuses/sensors | The second technique of research in electronics prognostics and health management (PHM) uses sacrificial circuits like fuses, canaries, circuit breakers and self-diagnostics sensors for detecting whether the device is operating outside of its design limits. Sacrificial circuits are widely used in consumer electronics products and appliances.
Physics based modeling | The third approach for life prediction uses modeling and simulation to relate the fundamental physical and chemical behavior of materials to the action of the surrounding environment and applied loads. Typically for electronics, the PoF-based modeling process starts by exposing the product to highly accelerated life tests (HALT) and highly accelerated stress tests (HAST) to find the significant mode(s) and root cause(s) of failure.
Field data driven analytics & statistical modeling | The fourth methodology gained momentum because of the availability of large volumes of data and the limitations of data-agnostic methods.
[Rating columns in the original table: (+/-), FD, RD, TD, R&M, SP, Comp, RC, SS. The per-method circle ratings are not reproduced here.]
Proposed Lifetime Prediction Methodology
This paper introduces a methodology to estimate the lifetime of drilling electronics using operational field data, drilling dynamics and historical maintenance information. Reliability analyses on specific drilling parameters and Bayesian statistics are combined in a probabilistic framework. Parameter estimation is used to calibrate statistical equations to field data, and probabilistic analysis is used to obtain the likelihood of failure. Model parameters are represented as random variables, each with a probability distribution. The methodology takes into account that drilling electronics in downhole conditions can have varied failure modes, and each failure mode can be caused by the interaction of multiple variables, either independently or interdependently. Because each failure mode of a component cannot be modeled individually in the field, several candidate models were developed. Bayesian updating further improves the model results by updating prior probability estimates to produce a posterior probability estimate based on operational run history updates for individual part numbers (PNs) within a drilling tool. Bayesian updating thereby allows more accurate failure models to be selected dynamically for a given part as a function of usage. Sophisticated risk-modeling techniques can reduce uncertainties in drilling operations for oil and gas companies by quantitatively identifying the risk.
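The Bayesian updating step described above can be illustrated with a minimal sketch. It assumes, purely for illustration, that each candidate model reports the probability that the part survives the run just completed and that this survival probability serves as the likelihood of the observed outcome. The function name, the model labels and the numbers are hypothetical and are not taken from the paper.

```python
def update_model_weights(prior, survival_prob):
    """Apply Bayes' rule after one successful run.

    prior         -- dict: model name -> prior probability of the model
    survival_prob -- dict: model name -> probability, under that model,
                     that the part survives the run that was just completed
    Returns the posterior probability of each candidate model.
    """
    unnormalized = {m: prior[m] * survival_prob[m] for m in prior}
    total = sum(unnormalized.values())
    if total == 0.0:
        raise ValueError("no candidate model assigns positive probability")
    return {m: w / total for m, w in unnormalized.items()}


# Hypothetical example: three candidate models with equal prior weight.
prior = {"weibull": 1 / 3, "lognormal": 1 / 3, "exponential": 1 / 3}
survival = {"weibull": 0.9, "lognormal": 0.6, "exponential": 0.3}
posterior = update_model_weights(prior, survival)
best_model = max(posterior, key=posterior.get)
```

Repeating the update after every run lets the weights drift toward the candidate model that best explains the accumulated run history of the part.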
³ Fully darkened circles denote complete correlation with method.
The essential information necessary to estimate the probability of failure is contained in the historical life cycle data normally found within a company’s Failure Reporting, Analysis and Corrective Action System (FRACAS). FRACAS data is important because it reveals when and how components fail and provides detail on the material properties, loads (electrical and mechanical), material response, the physics of failure and corresponding corrective actions (upgrade or revision). The data is fed directly into a life prediction model that is used to assist in the Risk-Informed Decision Making (RIDM⁴) process, maintain drilling tools and increase reliability (Figure 2).
Operational Field Data Requirements
Foremost, oil and gas companies must have the ability to predict business outcomes and make risk-informed decisions that enable them to surpass their competitors. This is heavily contingent on how successful these companies are at harnessing the available data. In the proposed lifetime prediction approach, historical data is leveraged to forecast the future performance of drilling tools down to the part level. However, this cannot be done effectively unless these companies understand the data available.
Developing field data-driven models for lifetime prediction of electronic assemblies in drilling operations is challenging for two reasons. First, not all of the factors impacting component life can be measured in real time. Second, the data that can be measured contains errors and noise because of limitations of the measurement system and human factors. Challenges for historical data of PCBAs (Figure 3) include a variable operating environment, incomplete information on failures and operating history, and statistical variation in components (manufacturing defects, material properties, etc.).
⁴ The primary objective of RIDM is to provide the decision maker with the necessary risk information to make a choice that has the most potential for successfully meeting objectives (e.g., completing a drilling mission without failure and within the specified timetable).
Figure 2: Proposed Life Prediction Methodology
[Flowchart. Inputs: environment, run and failure data; repair and maintenance data. Boxes: gather operational field data (failures and suspensions); select/screen desired data (filter missing/incomplete data, consider the percentage of missing data and the size of the dataset; data must have high fidelity and quality); develop life prediction model (select an appropriate lifetime distribution to fit the data, such as Weibull or lognormal; variable selection: drilling hours, temperature and vibration; train data and create model: outlier detection, weighting factors, Bayesian updating and best-fit models); create independent models for each part number (PN) in a drilling tool; test model (model is valid / model is invalid); lifetime prediction (risk and remaining useful life); utilize models to evaluate probability of failure (generate results and plots); utilize probability of failure in the risk-informed decision making (RIDM) process. Outcomes: optimize drilling performance; improve reliability and reduce risk; reduce maintenance costs.]
There are some basic requirements for integrating historical data with predictive models. The first requirement is that the operational field data used in lifetime prediction must be plentiful⁵, and the second requirement is that the parts must be serialized⁶. This may require that the data be assembled and formatted in a way that provides the necessary data fields for each PN. Failure (parts that are no longer operational) and suspension (parts that have either been scrapped or have not failed) data must be included in the dataset for each PN. Basic information for a PN must also include the job number, run and incident information for any failures.
Table 2 presents some data fields that are helpful in developing a predictive lifetime model (the list is not all-encompassing; other fields may also be applicable):
Table 2: Potential Drilling Tool Data Fields of Interest
Data Field Name | Field Description
Part Number (PN) | Referencing identifier for a group of parts that share common design.
Serial Number (SN) | Unique identifier of a single piece with associated history.
Revision (Rev) | Tracking method for non-functionality related changes associated with a PN.
Upgrade History (per Revision) | Whether or not the PN-SN-Revision has multiple combinations.
Repair History (per Revision) | Whether or not this PN-SN-Revision has any repair activity.
Last Repair Date | The last repair activity date during the reported time frame.
Scrap | Whether or not the most recent component has finally been scrapped.
Scrap Reason | The scrap reason if the most recent component has finally been scrapped.
Last Maintenance Level | Most recent level of maintenance performed on this PN-SN-Rev.
Last Maintenance Location | Geographical location where the most recent maintenance activity was completed.
Product Description | The top level assembly display name.
Last Job Location | Geographical location where the tool was most recently operated.
Last Job Number | Referencing (SAP) number used to link with a customer’s well name.
Drilling Hours | The length of time the BHA/bit was actively making hole.
Circulation Hours | The length of time drilling fluid was pumped through the BHA.
Distance Drilled | The total distance drilled by a PN-SN in the reported time frame.
Average Temperature | The average of average temperatures in the reported time frame.
Average RPM | The average of average revolutions per minute (RPM) in the reported time frame.
Min Depth In | The minimum of Depth-In in the reported time frame.
Average Flow Rate | The average of average flow rate in the reported time frame.
Incident Date | The incident date of the last failure in the reported time frame.
Incident Description | Description of events that led to the incident in the reported time frame.
Root Cause Description | The root cause of an incident in the reported time frame.
Failure Mode | The failure mode of the last failure in the reported time frame.
Average Vibration | The average vibration (axial, lateral, and stick-slip).
⁵ There must be sufficient failure data for a part to be adequately modelled. Since there is some pre-screening of the data prior to building a model, there must be at least a year’s worth of data to use in the process.
⁶ Each part has a part number (PN) and a serial number (SN) that uniquely identifies it and its applicable history. One PN may have multiple SNs and multiple revisions.
Figure 3: PCBA Images
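As an illustration of how a few of the Table 2 fields might be held in code before modeling, the following sketch defines a per-run record, computes the share of runs with missing measurements and separates failures from suspensions. The class, field and function names are assumptions made for this example; they do not reflect the actual database schema behind the paper.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class RunRecord:
    """One run of one serialized part (hypothetical subset of Table 2)."""
    part_number: str
    serial_number: str
    drilling_hours: Optional[float]          # None marks a missing value
    avg_temperature_c: Optional[float]
    avg_lateral_vibration_g: Optional[float]
    failed: bool                             # True = failure, False = suspension


def is_complete(run: RunRecord) -> bool:
    """A run is usable only if every measured field is present."""
    measured = (run.drilling_hours, run.avg_temperature_c,
                run.avg_lateral_vibration_g)
    return all(value is not None for value in measured)


def missing_fraction(runs: List[RunRecord]) -> float:
    """Fraction of runs with at least one missing measurement."""
    return sum(1 for r in runs if not is_complete(r)) / len(runs)


def split_failures_suspensions(
        runs: List[RunRecord]) -> Tuple[List[RunRecord], List[RunRecord]]:
    """Screen out incomplete runs, then separate failures from suspensions."""
    usable = [r for r in runs if is_complete(r)]
    failures = [r for r in usable if r.failed]
    suspensions = [r for r in usable if not r.failed]
    return failures, suspensions
```

Keeping missing values explicit (rather than filling them with defaults) makes it straightforward to report the percentage of missing data alongside each prediction.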
Model Development
Models are developed at the part number level first (each revision of a part is evaluated separately). Next, all of the parts that are installed in a specific drilling tool are grouped together to provide the overall risk of that tool, so that each part’s contribution to the overall condition of the drilling tool can be assessed. Comprehensive data for the entire history of the part is required to analyze the relationship between operating environment and life.
A typical time-to-failure model comprises a life distribution function, which incorporates the statistical scatter in failure time, and a life characteristic function (Appendix A), which describes a general relation between failure time and stress levels (Kale et al., 2014). Weibull, lognormal and exponential distributions are considered in this methodology for each part’s model. The life characteristic can be any life measure, such as the mean, median or hazard rate, that represents a bulk property of the distribution. The life characteristic is expressed as a function of stress (as shown in Appendix A). The unknown parameters of the composite model are determined by tuning the model equation to fit field data using the Iterative Maximum Likelihood Estimation technique.
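To make the estimation step concrete, the sketch below fits a two-parameter Weibull distribution to failure and suspension (right-censored) times by maximum likelihood. It is a simplified stand-in for the composite model: the stress-dependent life characteristic of Appendix A is omitted, the shape parameter is found by bisection on the standard likelihood equation, and the function name and data are hypothetical.

```python
import math


def fit_weibull_mle(times, failed, tol=1e-10):
    """Return (shape, scale) maximizing the right-censored Weibull likelihood.

    times  -- positive operating hours for each unit
    failed -- matching flags: True for a failure, False for a suspension
    """
    n_fail = sum(1 for f in failed if f)
    if n_fail == 0:
        raise ValueError("at least one observed failure is required")
    t_max = max(times)
    scaled = [t / t_max for t in times]  # rescaling keeps x**k from overflowing
    mean_log_fail = sum(math.log(x) for x, f in zip(scaled, failed) if f) / n_fail

    def score(k):
        # Likelihood equation for the shape parameter; it increases with k,
        # so its single root can be bracketed and bisected.
        weights = [x ** k for x in scaled]
        weighted_log = sum(w * math.log(x) for w, x in zip(weights, scaled))
        return weighted_log / sum(weights) - 1.0 / k - mean_log_fail

    lo, hi = 1e-6, 1.0
    while score(hi) < 0.0:               # expand until the root is bracketed
        hi *= 2.0
        if hi > 1e3:
            raise ValueError("could not bracket the shape; data may be degenerate")
    while hi - lo > tol:                 # plain bisection
        mid = 0.5 * (lo + hi)
        if score(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    shape = 0.5 * (lo + hi)
    scale = t_max * (sum(x ** shape for x in scaled) / n_fail) ** (1.0 / shape)
    return shape, scale


# Hypothetical data: five failures plus two parts still running at 600 h.
hours = [100.0, 200.0, 300.0, 400.0, 500.0, 600.0, 600.0]
flags = [True, True, True, True, True, False, False]
shape, scale = fit_weibull_mle(hours, flags)
```

Including the suspensions matters: parts that are still running carry information that the characteristic life is longer than the failures alone would suggest.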
Optimizing Problem
One of the main focuses of this paper is to optimize the allocation of assets by incorporating operational constraints on the life and reliability of the individual components that make up the tool. A case study is presented describing a scenario where two assets in a maintenance shop are awaiting overhaul. There is a fixed number of spare parts that can be used as replacements, and each tool must meet stringent threshold reliability constraints after a maintenance action. The optimization problem is to maximize the reliability of the two assets by swapping the existing parts between the two tools and the additional spare parts. Which parts to swap is calculated using a constrained linear programming algorithm. A linear programming problem may be defined as the problem of maximizing or minimizing a linear function subject to linear constraints. The constraints may be equalities or inequalities.
$$\max_{x} \; C^{T} x \quad \text{subject to} \quad A x \le B \tag{1}$$
where x represents the vector of variables whose optimum values are to be determined, C is a vector containing the sensitivity of the objective function with respect to each unknown x, B contains the known coefficients representing constraint bounds, A is a matrix of coefficients containing the sensitivity of the constraints with respect to each x, and the superscript T stands for matrix transpose. The unknown variable x_i represents the system reliability of the ith asset. For typical assets used in drilling systems, it is fair to assume that failure of a single component leads to service failure; consequently, the reliability of each asset is modeled as a series system (Equation 2).
$$x_i = \prod_{j=1}^{n} R_{ij}(t) \tag{2}$$
The number of subcomponents in the asset is represented by n, and R_ij(t) denotes the reliability of the jth subcomponent of the ith asset. The example used in this paper shows the application of the linear programming method to optimize the allocation of individual subcomponents between two assets awaiting maintenance. When considering typical assets (such as PCBAs) used in drilling, the vector C = {1, 1}, since the reliability of each asset has to be maximized, and A = {1, 1}. The example also incorporates a typical scenario where an asset is deployed in more critical jobs (for example, an award-based contract). In this scenario, the selected drilling tool will need to have a higher threshold for reliability than the other. The best parts from the lower reliability drilling tool will be swapped to the tool with higher reliability. The overall asset optimization problem can be summarized as:
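$$\max \; \prod_{j=1}^{n} R_{1j}(t) + \prod_{j=1}^{n} R_{2j}(t) \tag{3}$$
subject to
$$\prod_{j=1}^{n} R_{1j}(t) \ge R_{1}^{\min}$$
$$\prod_{j=1}^{n} R_{2j}(t) \ge R_{2}^{\min}$$
where R_1j(t) and R_2j(t) are the reliabilities of the jth subcomponent of the first and second asset, and R_1^min and R_2^min are the threshold reliabilities required of each asset.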
The optimization problem is solved using the simplex method in Microsoft Excel. Theoretical discussion of the simplex LP method can be found in Dantzig et al. (1997) and Murty et al. (1983).
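For a small number of parts, the allocation problem can also be explored by exhaustive enumeration, which is a useful cross-check on a solver result. The sketch below is not the simplex method used in the paper: it simply tries every way of assigning one part per slot to each of two tools, computes each tool's series-system reliability as the product of its part reliabilities, and keeps the feasible assignment with the largest summed reliability. The function name, the pools and the reliability values are hypothetical.

```python
from itertools import permutations, product
from math import prod


def best_allocation(pools, min_a, min_b):
    """Search all part assignments for two tools, A and B.

    pools -- one list per slot (part type) holding the reliabilities of the
             interchangeable parts available for that slot: the two installed
             parts plus any spares
    min_a, min_b -- threshold reliabilities each tool must meet
    Returns a dict describing the best feasible assignment, or None.
    """
    # For every slot, each ordered pair (part for A, part for B) of distinct parts.
    per_slot = [list(permutations(pool, 2)) for pool in pools]
    best = None
    for choice in product(*per_slot):
        r_a = prod(pair[0] for pair in choice)   # series system: product of parts
        r_b = prod(pair[1] for pair in choice)
        if r_a < min_a or r_b < min_b:
            continue                             # violates a reliability threshold
        if best is None or r_a + r_b > best["total"]:
            best = {"total": r_a + r_b, "r_a": r_a, "r_b": r_b, "choice": choice}
    return best


# Hypothetical case: slot 1 has two installed parts and one spare,
# slot 2 has only the two installed parts. Tool A serves the critical job.
pools = [[0.95, 0.80, 0.90], [0.99, 0.85]]
result = best_allocation(pools, min_a=0.90, min_b=0.60)
```

Enumeration grows quickly with the number of slots and spares, so it is only practical for small cases such as the two-asset scenario; a solver is the better choice beyond that.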
Case Study 1: Utilizing Lifetime Prediction for Identifying Risk
Drilling tool parts are maintained or replaced depending on how many circulating hours they are exposed to. Those circulating hours may be in the range of the mean time between failure (MTBF⁷) or may have exceeded it. This case study uses the example of predicting the time to failure of PCBAs within a drilling tool prior to deployment to the field. The diagnosis is similar to taking a vehicle in for maintenance. The vehicle has sensors that reveal data on its current condition depending on how it is used. The results of the screening indicate the type of maintenance required so the vehicle can continue to perform optimally and achieve the mission (operate without failure when the owner is driving).
Based on the location of the upcoming run, a desired risk threshold (for example, 50%, or half of the calculated operating life) can be set to determine whether the parts have consumed more than the risk threshold.
Fig. 4: Drilling Tool
Baseline Case: Initial Drilling Tool Assessment
A drilling tool was analyzed in April 2014 prior to deployment. In the case study, the risk threshold⁸ is set at 50%, which is a conservative setting (the expected life of each part is displayed graphically and interpreted in Figure 5).
Figure 6 shows the prediction range for actual drilling hours⁹ (DrillHrs) and a fuel gauge chart that enables the user to see the percentage of life consumed.
⁷ MTBF is the predicted elapsed time between inherent failures of a system during operation.
⁸ The risk threshold is the amount of risk that a job is willing to incur. For instance, in the case of an award driven contract where risk is less tolerable, electronics with greater than 50% life consumed are considered risky. However, for a contract where the drilling conditions are more benign, greater risk in the range of 75% could be more acceptable.
⁹ Actual drilling hours are represented by the diamond.
Figure 5: Data Interpretation
Further guidance on replacing parts at a maintenance cycle can also be extracted from the results, depending on how the data is flagged (in this case, risk greater than 50%), as shown in Table 3.
Table 3: Predicted Life of Electronic Parts in Drilling Tool Results – April 2014
Results Interpretation Key: Low Risk: 0.0 – 0.25; Medium Risk: 0.25 – 0.50; High Risk: >0.50; Uncertainty in prediction due to missing data (>30%)
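The interpretation key translates directly into a small classification rule. The sketch below is one possible reading of the key: the function name is hypothetical, and treating a risk of exactly 0.25 as low and exactly 0.50 as medium is an assumption, since the key does not say which band owns the boundary values.

```python
def classify_risk(risk, missing_fraction=0.0):
    """Map a risk value in [0, 1] to the bands of the interpretation key.

    Returns (band, uncertain), where `uncertain` is True when more than 30%
    of the run data behind the prediction is missing.
    """
    if not 0.0 <= risk <= 1.0:
        raise ValueError("risk must lie between 0 and 1")
    if risk > 0.50:
        band = "High"
    elif risk > 0.25:        # assumption: exactly 0.25 counts as Low
        band = "Medium"
    else:
        band = "Low"
    return band, missing_fraction > 0.30


# Risk values and missing-data percentages as reported for three parts in Table 3.
print(classify_risk(0.81, 0.13))   # part xxxx-7
print(classify_risk(0.38, 0.13))   # part xxxx-9
print(classify_risk(0.11, 0.06))   # part xxxx-3
```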
A maintenance center can identify that one part (xxxx-7) requires replacement and can apply lower levels of maintenance to the other electronics (with low or medium risk) or leave them undisturbed (avoiding induced failures caused by human error or process escapes). In this case, serial number xxxx-7 can be examined more closely to review the predictions for each run. Four parts are identified as medium risk and can either be assigned a certain level of maintenance or be left untouched, depending on the associated risk. In addition, the maintenance facility can incorporate the risk values into a sparing forecast for the parts shown in Table 3. Historical data and lifetime prediction risk values can thus be used as indicators of future demand. As a result, the accuracy of the sparing forecast is improved and additional cost savings are generated.
Figure 7 shows the run history of the part flagged as high risk. The diamonds represent the actual drilling time of the part. Although the drilling hours have not reached the upper limit of the confidence bound, there is still uncertainty. A recommendation is made for the repair¹⁰, if applicable, or replacement of the part before the next run.
¹⁰ Repair is more likely to be considered at the assembly level. PCBAs are more likely to be replaced.
Part Number | Serial Number | Last Job No | Cumulative Temperature (°C) | Cumulative StickSlip | Cumulative Lateral (g_RMS) | DrillHrs [h] | Worst Case Life | 25Q | Predicted Mean Life | 75Q | Best Case Life | Risk | Part Description | Comments
1 | xxxx-1 | 10000 | 50 (L0) | 0.35 (L1) | 1.39 (L2) | 23.33 | 301.10 | 428.47 | 518.63 | 622.46 | 834.54 | 0.00 | PCBA | (1) Missing data for 1 run (12%)
2 | xxxx-2 | 10000 | 50 (L0) | 0.38 (L1) | 1.5 (L2) | 19.83 | 132.53 | 236.27 | 307.20 | 377.36 | 505.40 | 0.00 | BATTERY |
3 | xxxx-3 | 10000 | 91.57 (L1) | 0.32 (L1) | 1.39 (L2) | 332.25 | 278.65 | 451.23 | 601.87 | 731.80 | 997.62 | 0.11 | PCBA | (1) Missing data for 1 run (6%)
4 | xxxx-4 | 10000 | 50 (L0) | 0.38 (L1) | 1.5 (L2) | 19.83 | 373.68 | 587.34 | 744.87 | 937.90 | 1390.76 | 0.00 | PCBA |
5 | xxxx-5 | 10000 | 50 (L0) | 0.32 (L1) | 1.34 (L2) | 216.99 | 216.73 | 389.55 | 534.26 | 739.48 | 1229.68 | 0.03 | TRANSDUCER | (1) Missing data for 2 runs (11%)
6 | xxxx-6 | 10000 | 50 (L0) | 0.35 (L1) | 1.39 (L2) | 23.33 | 124.35 | 204.65 | 266.52 | 318.90 | 418.51 | 0.00 | BATTERY ASSY | (1) Missing data for 1 run (12%)
7 | xxxx-7 | 10000 | 66.5 (L0) | 0.29 (L1) | 1.15 (L2) | 692.17 | 378.20 | 470.53 | 543.91 | 647.85 | 922.11 | 0.81 | PCBA | (1) Missing data for 3 runs (13%)
8 | xxxx-8 | 10000 | 66.5 (L0) | 0.29 (L1) | 1.15 (L2) | 692.17 | 547.85 | 776.58 | 953.76 | 1183.45 | 1750.25 | 0.15 | MAGNETOMETER | (1) Missing data for 3 runs (13%)
9 | xxxx-9 | 10000 | 66.5 (L0) | 0.29 (L1) | 1.15 (L2) | 692.17 | 414.99 | 617.51 | 761.79 | 940.06 | 1285.00 | 0.38 | ACCELEROMETER | (1) Missing data for 3 runs (13%)
10 | xxxx-10 | 10000 | 66.5 (L0) | 0.29 (L1) | 1.15 (L2) | 692.17 | 528.26 | 916.09 | 1224.80 | 1612.98 | 2303.18 | 0.12 | PCBA | (1) Missing data for 3 runs (13%)
11 | xxxx-11 | 10000 | 66.5 (L0) | 0.29 (L1) | 1.15 (L2) | 692.17 | 253.96 | 517.32 | 736.48 | 1008.65 | 1805.98 | 0.44 | PCBA | (1) Missing data for 3 runs (13%)
12 | xxxx-12 | 10000 | 91.57 (L1) | 0.32 (L1) | 1.39 (L2) | 332.25 | 223.98 | 480.48 | 685.26 | 969.48 | 1686.29 | 0.12 | SENSOR | (1) Missing data for 1 run (6%)
13 | xxxx-13 | 10000 | 50 (L0) | 0.38 (L1) | 1.5 (L2) | 19.83 | 266.49 | 581.45 | 923.16 | 1315.53 | 2178.79 | 0.00 | POWER SUPPLY |
14 | xxxx-14 | 10000 | 91.57 (L1) | 0.32 (L1) | 1.39 (L2) | 332.25 | 223.98 | 480.48 | 685.26 | 969.48 | 1686.29 | 0.12 | SENSOR | (1) Missing data for 1 run (6%)
15 | xxxx-15 | 10000 | 50 (L0) | 0.38 (L1) | 1.5 (L2) | 19.83 | 266.49 | 581.45 | 923.16 | 1315.53 | 2178.79 | 0.00 | POWER SUPPLY |
16 | xxxx-16 | 10000 | 66.5 (L0) | 0.29 (L1) | 1.15 (L2) | 692.17 | 528.26 | 916.09 | 1224.80 | 1612.98 | 2303.18 | 0.12 | PCBA | (1) Missing data for 3 runs (13%)
17 | xxxx-17 | 10000 | 66.5 (L0) | 0.29 (L1) | 1.15 (L2) | 692.17 | 414.99 | 617.51 | 762.53 | 940.06 | 1285.00 | 0.38 | ACCELEROMETER | (1) Missing data for 3 runs (13%)
18 | xxxx-18 | 10000 | 66.5 (L0) | 0.29 (L1) | 1.15 (L2) | 692.17 | 414.99 | 617.51 | 762.53 | 940.06 | 1285.00 | 0.38 | ACCELEROMETER | (1) Missing data for 3 runs (13%)
19 | xxxx-19 | 10000 | 66.5 (L0) | 0.29 (L1) | 1.15 (L2) | 692.17 | 514.10 | 748.73 | 927.73 | 1151.33 | 1714.98 | 0.19 | MAGNETOMETER | (1) Missing data for 3 runs (13%)
20 | xxxx-20 | 10000 | 50 (L0) | 0.38 (L1) | 1.5 (L2) | 19.83 | 123.18 | 202.70 | 264.81 | 316.51 | 416.86 | 0.00 | BATTERY ASSY |
Figure 6: Graphical Representation of Predicted Life of Electronic Parts in Drilling Tool – April 2014
The probability of failure increases as the part’s life (measured in drilling hours) reaches the 75% life estimate of the prediction (as shown on the chart on the left-hand side of Figure 7). If the part were to remain unchanged in the drilling tool and be operated under the same or similar conditions as shown in Table 4, the probability of failure would continue to increase.
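The growth of failure probability with accumulated hours can be illustrated with a conditional probability calculation. The sketch assumes a plain two-parameter Weibull life distribution (one of the distribution families the methodology considers) with hypothetical shape and scale values; the paper's models additionally depend on temperature and vibration, which are left out here.

```python
import math


def weibull_cdf(hours, shape, scale):
    """Probability that a part has failed by `hours` under a Weibull model."""
    return 1.0 - math.exp(-((hours / scale) ** shape))


def conditional_pof(hours_used, next_run_hours, shape, scale):
    """Probability of failing during the next run, given survival so far."""
    failed_now = weibull_cdf(hours_used, shape, scale)
    failed_after = weibull_cdf(hours_used + next_run_hours, shape, scale)
    return (failed_after - failed_now) / (1.0 - failed_now)


# Hypothetical wear-out behaviour (shape > 1): the same 50 h run becomes
# riskier as the part accumulates drilling hours.
early = conditional_pof(100.0, 50.0, shape=2.0, scale=600.0)
late = conditional_pof(500.0, 50.0, shape=2.0, scale=600.0)
```

With a shape parameter of exactly 1 the distribution reduces to the exponential case, and the conditional probability no longer depends on the hours already consumed.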
Table 4: Predicted Life versus Run for Part Outside of Risk Threshold
Decision Case
The drilling tool was analyzed again in May 2014, prior to deployment. In this case study, the risk threshold is set at 50%
again. Maintenance was previously performed based on the recommendations made in Table 3. Fig. 8 shows the prediction
range for actual drilling hours and a fuel gauge chart that enables the user to distinguish the percentage of life consumed.
Last Job No | Cumulative Temperature (°C) | Cumulative Lateral (g_RMS) | Cumulative StickSlip (g_RMS) | DrillHrs [h] | Worst Case Life | 25Q | Predicted Mean Life | 75Q | Best Case Life | Risk | Comments
10000 | 66.50 | 1.15 | 0.29 | 692.17 | 378.20 | 470.53 | 543.91 | 647.85 | 922.11 | 0.81 |
10000 | 66.72 | 1.15 | 0.28 | 677.75 | 377.84 | 470.27 | 543.06 | 647.52 | 921.70 | 0.79 |
10000 | 66.77 | 1.15 | 0.28 | 677.42 | 378.16 | 470.66 | 543.40 | 648.06 | 922.53 | 0.79 |
10000 | 66.80 | 1.15 | 0.28 | 674.42 | 377.97 | 470.41 | 543.08 | 647.71 | 922.10 | 0.79 |
10000 | 66.81 | 1.15 | 0.28 | 673.92 | 377.80 | 470.20 | 542.84 | 647.42 | 921.70 | 0.79 |
9000 | 66.91 | 1.14 | 0.28 | 672.34 | 378.17 | 470.66 | 543.45 | 648.15 | 922.93 | 0.78 | Missing Data For This Run
9000 | 66.91 | 1.14 | 0.28 | 599.51 | 381.24 | 474.37 | 548.12 | 653.31 | 930.57 | 0.63 | Missing Data For This Run
9000 | 66.91 | 1.14 | 0.28 | 595.68 | 381.26 | 474.40 | 548.16 | 653.36 | 930.63 | 0.62 | Missing Data For This Run
6000 | 66.91 | 1.14 | 0.28 | 594.01 | 381.21 | 474.33 | 548.08 | 653.25 | 930.50 | 0.62 |
6000 | 68.60 | 1.25 | 0.31 | 585.11 | 347.15 | 432.64 | 499.96 | 595.60 | 846.42 | 0.72 |
5800 | 67.58 | 1.32 | 0.33 | 541.21 | 335.40 | 417.57 | 483.78 | 574.98 | 811.59 | 0.67 |
5123 | 69.40 | 1.44 | 0.36 | 495.11 | 309.81 | 386.13 | 447.29 | 530.82 | 745.99 | 0.65 |
5123 | 71.92 | 1.44 | 0.34 | 432.61 | 170.63 | 325.55 | 450.20 | 619.38 | 1060.60 | 0.47 |
4726 | 76.43 | 1.42 | 0.23 | 348.44 | 202.54 | 385.29 | 535.00 | 736.66 | 1279.67 | 0.20 |
4726 | 77.31 | 1.46 | 0.24 | 339.24 | 197.54 | 375.06 | 521.84 | 716.61 | 1244.64 | 0.20 |
4575 | 78.99 | 1.56 | 0.25 | 317.14 | 188.14 | 356.96 | 496.13 | 681.03 | 1179.46 | 0.20 |
4575 | 75.97 | 1.46 | 0.24 | 283.72 | 200.54 | 381.58 | 528.37 | 727.27 | 1263.72 | 0.13 |
3723 | 76.04 | 1.47 | 0.24 | 273.80 | 200.15 | 380.77 | 527.33 | 725.79 | 1261.03 | 0.12 |
3723 | 50.00 | 1.25 | 0.13 | 85.70 | 338.55 | 629.71 | 867.77 | 1188.73 | 1764.98 | 0.00 |
3723 | 50.00 | 1.12 | 0.15 | 46.80 | 312.93 | 584.04 | 801.60 | 1102.97 | 1764.98 | 0.00 |
2555 | 50.00 | 0.85 | 0.12 | 42.70 | 337.10 | 623.24 | 857.42 | 1189.40 | 1764.98 | 0.00 |
1859 | 50.00 | 0.82 | 0.11 | 30.00 | 356.99 | 661.58 | 913.72 | 1261.60 | 1764.98 | 0.00 |
Figure 7: Predicted Life versus Run for Part Outside of Risk Threshold
Fig. 8: Graphical Representation of Predicted Life of Electronic Parts in Drilling Tool – May 2014
Most of the serial numbers have remained the same. However, based on the previous data there have been some changes
made to the tool build. There is also a noticeable change in the percentage of missing data, which these dynamic models take
into account, and a new risk is calculated for each part. The update to the data has also had a positive impact on the results, as
shown in Table 5.
Table 5: Predicted Life of Electronic Parts in Drilling Tool Results – May 2014
Results Interpretation Key
Low Risk: 0.0 – 0.25 Medium Risk: 0.25 – 0.50 High Risk: >0.50 Uncertainty in prediction due to missing data (>30%)
Comparing Table 3 with Table 5, serial numbers 7, 9, 11, 17 and 18 were identified as high/medium risk in Table 3, and appropriate maintenance actions were taken. Parts that had risk lower than the risk threshold were used, and previous parts that met the risk threshold remained unchanged. Table 5 enables better decision making because the probability of failure can be assessed before a tool is sent into the field.
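As an illustration of how the interpretation key and the 50% risk threshold could be applied, the minimal Python sketch below bands a few of the risk values listed for the May 2014 build. The function names, the treatment of values that fall exactly on a band edge (assigned to the lower band) and the reading of the percentage in the Comments column as the fraction of missing data are assumptions made for this example, not part of the published method.

```python
# Illustrative sketch only: band edges follow the Results Interpretation Key.
# Assigning a value exactly on an edge to the lower band is an assumption.
RISK_THRESHOLD = 0.50      # risk threshold used in both analyses of the case study
MISSING_DATA_LIMIT = 0.30  # the key flags predictions with more than 30% missing data


def risk_band(risk: float) -> str:
    """Return 'low', 'medium' or 'high' for a probability of failure in [0, 1]."""
    if not 0.0 <= risk <= 1.0:
        raise ValueError("risk must be a probability between 0 and 1")
    if risk > 0.50:
        return "high"
    if risk > 0.25:
        return "medium"
    return "low"


def assess(risk: float, missing_fraction: float = 0.0) -> dict:
    """Summarize one part: risk band, threshold check and data-quality flag."""
    return {
        "band": risk_band(risk),
        "exceeds_threshold": risk > RISK_THRESHOLD,
        "uncertain": missing_fraction > MISSING_DATA_LIMIT,
    }


# (risk, assumed missing-data fraction) for a few May 2014 entries
parts = {
    "xxxx-3": (0.11, 0.06),
    "xxxx-7": (0.49, 0.08),
    "xxxx-11": (0.33, 0.08),
    "xxxx-20": (0.00, 0.00),
}
summary = {sn: assess(risk, missing) for sn, (risk, missing) in parts.items()}
for sn, result in summary.items():
    print(sn, result)
```

With these inputs none of the four parts exceeds the 50% threshold, and serial numbers 7 and 11 fall in the medium band.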
Case Study 2: Sparing Optimization
This section applies the optimization technique developed in the previous section to determine the best possible selection of sub-components, that is, the selection that maximizes the overall system reliability of both assets (example: Fig. 9).
Part Number | Serial Number | Last Job No | Cumulative Temperature (°C) | Cumulative StickSlip (g_RMS) | Cumulative Lateral (g_RMS) | DrillHrs [h] | Worst Case Life | 25Q | Predicted Mean Life | 75Q | Best Case Life | Risk | Part Description | Comments
1 | xxxx-1 | 10000 | 50 (L0) | 0.35 (L1) | 1.39 (L2) | 23.33 | 301.10 | 428.47 | 518.63 | 622.46 | 834.54 | 0.00 | PCBA | (1) Missing Data For 1 Run (12%)
2 | xxxx-2 | 10000 | 50 (L0) | 0.38 (L1) | 1.5 (L2) | 19.83 | 132.53 | 236.27 | 307.20 | 377.36 | 505.40 | 0.00 | BATTERY |
3 | xxxx-3 | 10000 | 91.57 (L1) | 0.32 (L1) | 1.39 (L2) | 332.25 | 278.65 | 451.23 | 601.87 | 731.80 | 997.62 | 0.11 | PCBA | (1) Missing Data For 1 Run (6%)
4 | xxxx-4 | 10000 | 50 (L0) | 0.38 (L1) | 1.5 (L2) | 19.83 | 373.68 | 587.34 | 744.87 | 937.90 | 1390.76 | 0.00 | PCBA |
5 | xxxx-5 | 10000 | 50 (L0) | 0.32 (L1) | 1.34 (L2) | 216.99 | 216.73 | 389.55 | 534.26 | 739.48 | 1229.68 | 0.03 | TRANSDUCER | (1) Missing Data For 2 Runs (11%)
6 | xxxx-6 | 10000 | 50 (L0) | 0.35 (L1) | 1.39 (L2) | 23.33 | 124.35 | 204.65 | 266.52 | 318.90 | 418.51 | 0.00 | BATTERY ASSY | (1) Missing Data For 1 Run (12%)
7 | xxxx-7 | 10000 | 59.32 (L0) | 0.21 (L1) | 0.97 (L1) | 614.74 | 237.15 | 448.23 | 628.43 | 865.80 | 1511.01 | 0.49 | PCBA | (1) Missing Data For 2 Runs (8%)
8 | xxxx-8 | 10000 | 59.32 (L0) | 0.21 (L1) | 0.97 (L1) | 614.74 | 571.43 | 850.71 | 1066.84 | 1343.04 | 2055.70 | 0.05 | MAGNETOMETER | (1) Missing Data For 2 Runs (8%)
9 | xxxx-9 | 10000 | 59.32 (L0) | 0.21 (L1) | 0.97 (L1) | 614.74 | 426.25 | 638.38 | 800.76 | 989.99 | 1285.00 | 0.22 | ACCELEROMETER | (1) Missing Data For 2 Runs (8%)
10 | xxxx-10 | 10000 | 59.32 (L0) | 0.21 (L1) | 0.97 (L1) | 614.74 | 453.21 | 792.79 | 1076.48 | 1410.04 | 2236.77 | 0.13 | PCBA | (1) Missing Data For 2 Runs (8%)
11 | xxxx-11 | 10000 | 59.32 (L0) | 0.21 (L1) | 0.97 (L1) | 614.74 | 254.15 | 538.13 | 773.65 | 1101.58 | 2038.32 | 0.33 | PCBA | (1) Missing Data For 2 Runs (8%)
12 | xxxx-12 | 10000 | 91.57 (L1) | 0.32 (L1) | 1.39 (L2) | 332.25 | 223.98 | 480.48 | 685.26 | 969.48 | 1686.29 | 0.12 | SENSOR | (1) Missing Data For 1 Run (6%)
13 | xxxx-13 | 10000 | 50 (L0) | 0.38 (L1) | 1.5 (L2) | 19.83 | 266.49 | 581.45 | 923.16 | 1315.53 | 2178.79 | 0.00 | POWER SUPPLY |
14 | xxxx-14 | 10000 | 91.57 (L1) | 0.32 (L1) | 1.39 (L2) | 332.25 | 223.98 | 480.48 | 685.26 | 969.48 | 1686.29 | 0.12 | SENSOR | (1) Missing Data For 1 Run (6%)
15 | xxxx-15 | 10000 | 50 (L0) | 0.38 (L1) | 1.5 (L2) | 19.83 | 266.49 | 581.45 | 923.16 | 1315.53 | 2178.79 | 0.00 | POWER SUPPLY |
16 | xxxx-201 | 9859 | 59.32 (L0) | 0.21 (L1) | 0.97 (L1) | 614.74 | 453.21 | 792.79 | 1076.48 | 1410.04 | 2236.77 | 0.13 | PCBA | (1) Missing Data For 2 Runs (8%)
17 | xxxx-1715 | 9859 | 59.32 (L0) | 0.21 (L1) | 0.97 (L1) | 614.74 | 426.25 | 638.38 | 800.76 | 989.99 | 1285.00 | 0.22 | ACCELEROMETER | (1) Missing Data For 2 Runs (8%)
18 | xxxx-1869 | 9859 | 59.32 (L0) | 0.21 (L1) | 0.97 (L1) | 614.74 | 426.25 | 638.38 | 800.76 | 989.99 | 1285.00 | 0.22 | ACCELEROMETER | (1) Missing Data For 2 Runs (8%)
19 | xxxx-1975 | 9859 | 59.32 (L0) | 0.21 (L1) | 0.97 (L1) | 614.74 | 571.43 | 851.33 | 1066.84 | 1344.01 | 2055.70 | 0.05 | MAGNETOMETER | (1) Missing Data For 2 Runs (8%)
20 | xxxx-20 | 10000 | 50 (L0) | 0.38 (L1) | 1.5 (L2) | 19.83 | 123.18 | 202.70 | 264.81 | 316.51 | 416.86 | 0.00 | BATTERY ASSY |
Fig. 9: Asset Swapping
The reliability of each component is calculated using the life prediction method described in Kale et al. (2014). The system reliability of the asset is calculated using Equation 2. The baseline reliability of the two assets is shown in Table 6. The first and third columns in the table list the names of the subcomponents that make up each asset; the second and fourth columns show the risk of failure of each individual subcomponent. The part name shown in each row represents a unique part serial number present in each asset. For example, Asset1-01 is functionally different from Asset1-02, Asset1-02 is functionally different from Asset1-03, and so on. Part names with the same numerical index represent different samples of the same functional part. For example, Asset1-01 and Asset2-01 represent identically manufactured parts that have the same functionality. The last row in the table shows the overall system reliability of the assets computed using Equation 2.
Table 6: Risk of Individual Sub-Components in Two Assets
Part Name Asset 1 | Risk of Asset 1 | Part Name Asset 2 | Risk of Asset 2
Asset1-01 | 5×10⁻⁴ | Asset2-01 | 5×10⁻⁴
Asset1-02 | 5×10⁻⁴ | Asset2-02 | 5×10⁻⁴
Asset1-03 | 5×10⁻⁴ | Asset2-03 | 0.025
Asset1-04 | 5×10⁻⁴ | Asset2-04 | 5×10⁻⁴
Asset1-05 | 0.051 | Asset2-05 | 5×10⁻⁴
Asset1-06 | 5×10⁻⁴ | Asset2-06 | 5×10⁻⁴
Asset1-07 | 0.076 | Asset2-07 | 0.038
Asset1-08 | 0.001 | Asset2-08 | 5×10⁻⁴
Asset1-09 | 5×10⁻⁴ | Asset2-09 | 0.029
Asset1-10 | 0.031 | Asset2-10 | 5×10⁻⁴
Asset1-11 | 5×10⁻⁴ | Asset2-11 | 5×10⁻⁴
Asset1-12 | 5×10⁻⁴ | Asset2-12 | 5×10⁻⁴
Asset1-13 | 5×10⁻⁴ | Asset2-13 | 0.061
Asset1-14 | 5×10⁻⁴ | Asset2-14 | 0.061
Asset1-15 | 5×10⁻⁴ | Asset2-15 | 5×10⁻⁴
Asset1-16 | 5×10⁻⁴ | Asset2-16 | 5×10⁻⁴
Asset1-17 | 0.030 | Asset2-17 | 0.0287
Asset1-18 | 0.030 | Asset2-18 | 0.0287
Asset1-19 | 0.030 | Asset2-19 | 0.0287
System Reliability | 0.774 | | 0.735
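The system reliabilities in the last row of Table 6 can be approximated from the individual risks. Equation 2 is not reproduced in this section, so the Python sketch below assumes a series system with independent failures, in which an asset survives only if every sub-component survives and its reliability is the product of (1 − risk). The function name is hypothetical. Under that assumption the tabulated risks give roughly 0.77 and 0.73, close to the reported 0.774 and 0.735; the small gap is plausibly due to rounding of the tabulated risks.

```python
from math import prod

# Risks of sub-components 01..19, transcribed from Table 6 (index 0 is part 01).
ASSET1_RISK = [5e-4, 5e-4, 5e-4, 5e-4, 0.051, 5e-4, 0.076, 0.001, 5e-4, 0.031,
               5e-4, 5e-4, 5e-4, 5e-4, 5e-4, 5e-4, 0.030, 0.030, 0.030]
ASSET2_RISK = [5e-4, 5e-4, 0.025, 5e-4, 5e-4, 5e-4, 0.038, 5e-4, 0.029, 5e-4,
               5e-4, 5e-4, 0.061, 0.061, 5e-4, 5e-4, 0.0287, 0.0287, 0.0287]


def series_reliability(risks):
    """Reliability of a series system: every sub-component must survive.

    Assumption: sub-component failures are statistically independent.
    """
    return prod(1.0 - r for r in risks)


r1 = series_reliability(ASSET1_RISK)
r2 = series_reliability(ASSET2_RISK)
print(f"Asset 1: {r1:.3f}  Asset 2: {r2:.3f}  sum: {r1 + r2:.2f}")
```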
The sum of the system reliabilities for the two assets (calculated by Equation 3) is 1.51. The total system reliability of these two assets is maximized by swapping parts using the simplex linear programming method. The condition for swapping parts is that the system reliability of each asset must be at least 50%. Table 7 shows the results for this optimization.
Table 7: Risk of Individual Sub-Components in Two Assets after Swapping Sub-Components Between Them to Maximize Cumulative System Reliability
Part Name Asset 1 | Risk of Asset 1 | Part Name Asset 2 | Risk of Asset 2 | Comments
Asset1-01 | 5×10⁻⁴ | Asset2-01 | 5×10⁻⁴ |
Asset1-02 | 5×10⁻⁴ | Asset2-02 | 5×10⁻⁴ |
Asset1-03 | 5×10⁻⁴ | Asset2-03 | 0.025 |
Asset1-04 | 5×10⁻⁴ | Asset2-04 | 5×10⁻⁴ |
Asset2-05 | 5×10⁻⁴ | Asset1-05 | 0.051 | Swap
Asset1-06 | 5×10⁻⁴ | Asset2-06 | 5×10⁻⁴ |
Asset2-07 | 0.038 | Asset1-07 | 0.076 | Swap
Asset2-08 | 5×10⁻⁴ | Asset1-08 | 0.00055 | Swap
Asset1-09 | 5×10⁻⁴ | Asset2-09 | 0.03 |
Asset2-10 | 5×10⁻⁴ | Asset1-10 | 0.03 | Swap
Asset1-11 | 5×10⁻⁴ | Asset2-11 | 5×10⁻⁴ |
Asset1-12 | 5×10⁻⁴ | Asset2-12 | 5×10⁻⁴ |
Asset1-13 | 5×10⁻⁴ | Asset2-13 | 0.061 |
Asset1-14 | 5×10⁻⁴ | Asset2-14 | 0.061 |
Asset1-15 | 5×10⁻⁴ | Asset2-15 | 5×10⁻⁴ |
Asset1-16 | 5×10⁻⁴ | Asset2-16 | 5×10⁻⁴ |
Asset2-17 | 0.0287 | Asset1-17 | 0.03 | Swap
Asset2-18 | 0.0287 | Asset1-18 | 0.03 | Swap
Asset2-19 | 0.0287 | Asset1-19 | 0.03 | Swap
System Reliability | 0.881 | | 0.646 |
Table 7 shows that the combined reliability of the two assets can be maximized by swapping parts between them. The fifth column in Table 7 shows which parts were swapped between the two assets. The outcome of this swapping is that the reliability of the first asset increases to 88% from a baseline value of 77%, and the reliability of the second asset decreases to 64% from a baseline value of 73%. A second scenario is presented in which additional new spare parts (assumed to have a risk of 0) are available to replace Asset-07, Asset-17, Asset-18 and Asset-19. The cumulative system reliability of the two assets is maximized by swapping parts between them and utilizing the additional spare parts. The results for this optimization are shown in Table 8. The overall reliability of assets can be enhanced by optimally utilizing the existing subcomponents in the tool. Cost-based optimization can add the economics of spares and repairs to the decision-making process to determine which components must be replaced and which must remain active. A cost-of-failure factor may be added to decide the optimal maintenance interval and the level of repairs and replacements from the calculated risk of failure and cost of maintenance.
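The part-swapping step without spare parts can be illustrated with a small search. The paper uses the simplex linear programming method; the Python sketch below substitutes a plain exhaustive enumeration, which is practical here because only 11 of the 19 positions carry different risks in the two assets. It assumes a series-system product for reliability (Equation 2 is not reproduced in this section), that a part can only be exchanged with the part of the same index in the other asset, and it takes the risks from Table 6. All function names are hypothetical.

```python
from itertools import product
from math import prod

# Risks of sub-components 01..19 from Table 6 (index 0 is part 01).
ASSET1 = [5e-4, 5e-4, 5e-4, 5e-4, 0.051, 5e-4, 0.076, 0.001, 5e-4, 0.031,
          5e-4, 5e-4, 5e-4, 5e-4, 5e-4, 5e-4, 0.030, 0.030, 0.030]
ASSET2 = [5e-4, 5e-4, 0.025, 5e-4, 5e-4, 5e-4, 0.038, 5e-4, 0.029, 5e-4,
          5e-4, 5e-4, 0.061, 0.061, 5e-4, 5e-4, 0.0287, 0.0287, 0.0287]
MIN_RELIABILITY = 0.50  # each asset must keep at least 50% system reliability


def reliability(risks):
    """Series-system reliability (assumed form): product of survival probabilities."""
    return prod(1.0 - r for r in risks)


def best_swaps(a, b, floor=MIN_RELIABILITY):
    """Try every combination of same-index swaps and maximize R(a) + R(b)."""
    # Swapping parts with identical risk changes nothing, so skip those positions.
    differing = [i for i, (x, y) in enumerate(zip(a, b)) if x != y]
    best = (reliability(a) + reliability(b), list(a), list(b), [])
    for choice in product((False, True), repeat=len(differing)):
        new_a, new_b = list(a), list(b)
        swapped = [i for i, flag in zip(differing, choice) if flag]
        for i in swapped:
            new_a[i], new_b[i] = new_b[i], new_a[i]
        ra, rb = reliability(new_a), reliability(new_b)
        if ra >= floor and rb >= floor and ra + rb > best[0]:
            best = (ra + rb, new_a, new_b, swapped)
    return best


total, new_a, new_b, swapped = best_swaps(ASSET1, ASSET2)
print("swapped positions:", [f"{i + 1:02d}" for i in swapped])
print(f"R1={reliability(new_a):.3f}  R2={reliability(new_b):.3f}  sum={total:.3f}")
```

Under these assumptions the search concentrates the lower-risk part of each pair in one asset, giving reliabilities of about 0.87 and 0.64 (sum about 1.52), compared with the 0.881 and 0.646 reported in Table 7. Because the objective is symmetric in the two assets, which asset receives the better parts is arbitrary in this sketch.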
Table 8: Risk of Individual Sub-Components in Two Assets after Swapping Sub-Components between Them and Utilizing Additional Spare Parts to Maximize Cumulative System Reliability
Part Name Asset 1 | Risk of Asset 1 | Part Name Asset 2 | Risk of Asset 2 | Comments
Asset1-01 | 5×10⁻⁴ | Asset2-01 | 5×10⁻⁴ |
Asset1-02 | 5×10⁻⁴ | Asset2-02 | 5×10⁻⁴ |
Asset2-03 | 0.025 | Asset1-03 | 5×10⁻⁴ | Swap
Asset1-04 | 5×10⁻⁴ | Asset2-04 | 5×10⁻⁴ |
Asset1-05 | 0.051 | Asset2-05 | 5×10⁻⁴ |
Asset1-06 | 5×10⁻⁴ | Asset2-06 | 5×10⁻⁴ |
Asset2-07 | 0.038 | Asset1-07 | 0.000 | New part/Swap
Asset1-08 | 0.001 | Asset2-08 | 5×10⁻⁴ |
Asset2-09 | 0.029 | Asset1-09 | 5×10⁻⁴ | Swap
Asset1-10 | 0.031 | Asset2-10 | 5×10⁻⁴ |
Asset1-11 | 5×10⁻⁴ | Asset2-11 | 5×10⁻⁴ |
Asset1-12 | 5×10⁻⁴ | Asset2-12 | 5×10⁻⁴ |
Asset2-13 | 0.061 | Asset1-13 | 5×10⁻⁴ | Swap
Asset2-14 | 0.061 | Asset1-14 | 5×10⁻⁴ | Swap
Asset1-15 | 5×10⁻⁴ | Asset2-15 | 5×10⁻⁴ |
Asset1-16 | 5×10⁻⁴ | Asset2-16 | 5×10⁻⁴ |
Asset2-17 | 0.0287 | Asset1-17 | 0.000 | New part/Swap
Asset2-18 | 0.0287 | Asset1-18 | 0.000 | New part/Swap
Asset2-19 | 0.0287 | Asset1-19 | 0.000 | New part/Swap
System Reliability | 0.676 | | 0.999 |
Conclusion
The paper presents how maintenance plans and reliability for drilling tools can be improved, while reducing cost, by taking advantage of the forecasting capability that lifetime prediction provides. The improvements ultimately lead to preventing costly failures in the field. Lifetime prediction is a way to make risk-informed decisions and is a catalyst to maintaining drilling tools by letting the data show where improvement/change must be made. This shift in standard practice leads to lower maintenance costs without sacrificing reliability. Improvement in sparing forecasts is an additional benefit to this methodology because higher risk parts are more easily identified.
Future work will focus on refining model predictions by using additional environmental variables, incorporating other statistical methodologies and integrating data from design and qualification tests to optimize drilling performance.
Acknowledgements
The authors thank Baker Hughes for permitting them the chance to work on such a trailblazing methodology.
Nomenclature
Drillhrs = Drilling hours
BHA = Bottomhole assembly
FRACAS = Failure Reporting and Corrective Action System
HALT = Highly accelerated life test
HAST = Highly accelerated stress test
MLE = Maximum likelihood estimation
NPT = Nonproductive Time
PCBA = Printed circuit board assembly
PHM = Prognostics and health management
PoF = Probability of failure
PN = Part number
RIDM = Risk-Informed Decision Making
RPM = Revolutions per minute
SN = Serial number
F = Failure
L = Lateral vibration
M_i = i-th model identifier
N = Symbol used to represent negative decision, generally “no” or “0”
S = Symbol used to represent stick-slip or suspensions
T = Temperature
X = Vector of parameters such as temperature and vibrations
Y = Symbol used to represent affirmative decision, generally “yes” or “1”
f = Probability density function
m = Number of models
n = Number of records
p = Probability
p(a|b) = Conditional probability of occurrence of event a provided b is true
revid = revision identifier
t_f = Time to failure (drilling hours)
w_i = Weight of i-th data point
x_ave = Average value of parameter x
x_stdev = Standard deviation of parameter x
α = Calibration parameters of reliability model
= Likelihood
η = Characteristic life or scale factor of a probability distribution
β = Shape factor of a probability distribution
σ = Standard deviation
λ = Hazard function
{CF} = Set of life data for confirmed failure
{O} = Set of outliers
{S} = Set of life data for suspension
{UF} = Set of life data for unconfirmed failure
Load, Stress and Severity are used interchangeably to describe the impact of the operational environment (mechanical and
thermal) on the durability of parts.
Nominal part is a representative part that has a life equal to the average of several parts produced using the same manufacturing process and operating under identical conditions.
Suspensions are used in reliability modeling to represent hours accumulated on parts that are in operation or removed from
service for reasons other than failure.
Bibliography
Barker, D., Dasgupta, A., and Pecht, M. (1992, February). PWB solder joint life calculations under thermal and vibrational loading. Journal of the IES, 35(1), 17-25.
Chatterjee, K., Modarres, M., and Bernstein, J. (2012). Fifty Years of Physics of Failure. Journal of Reliability Information
Analysis Center.
Duffek, D. (2004). Effect of Combined Thermal and Mechanical Loading on the Fatigue of Solder Joints. University of Notre
Dame. Notre Dame: Master's Thesis.
Garvey, D. R., Baumann, J., Lehr, J., and Hines, J. (2009). Pattern Recognition Based Remaining Useful Life Estimation of
Bottom Hole Assembly Tools. SPE/IADC Drilling Conference and Exhibition. Amsterdam, The Netherlands.
Dantzig, G. B., and Thapa, M. N. (1997). Linear Programming 1: Introduction. Springer-Verlag.
Kale, A. A., Carter-Journet, K., Heuermann-Kuehn, L., Falgout, T., and Zurcher, D. (2014). A Probabilistic Approach to
Reliability and Life Prediction of Electronics in Drilling and Evaluation Tools.
Litt, J., Soditus, S., Hendricks, R., and Zaretsky, E. (2001). Structural Life and Reliability Metrics Benchmarking and Verification of Probabilistic Life Prediction Codes. 5th Annual FAA/Air Force/NASA/Navy Workshop.
Mishra, S., and Pecht, M. (2002). In-situ Sensors for Product Reliability Monitoring. Proceedings of SPIE, 4755, pp. 10-19.
Murty, Katta G. (1983). Linear programming. New York: John Wiley and Sons Inc. pp. xix+482. ISBN 0-471-09725-X. MR
720547.
Reich, M. (2004). The Fascinating World of Drilling Technology: Products from Baker Hughes and their Functions. Celle: Baker Hughes.
Tuchband, B. A. (2007). Implementation of Prognostics and Health Management for Electronic Systems. College Park:
University of Maryland.
Appendix A
A. General Log-Linear Model
The relation between characteristic life and the stress variables is represented by one of three models: generalized log-linear (GLL), proportional hazard (PH) and cumulative damage (CD). The GLL model represents life using Equation A-1
$$\eta(\bar{X}) = \exp\left(\alpha_0 + \sum_{i} \alpha_{T,i}\,T_i + \sum_{j} \alpha_{L,j}\,L_j + \sum_{k} \alpha_{S,k}\,S_k\right) \tag{A-1}$$
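where $\bar{X}$ = {T, L, S}. For a Weibull distribution, the probability density function is shown in Equation A-2, where β is the shape parameter, η is the scale parameter and the α’s are unknown parameters calculated from field data using the maximum likelihood estimation technique.
$$f(t,\bar{X}) = \frac{\beta}{\eta(\bar{X})}\left(\frac{t}{\eta(\bar{X})}\right)^{\beta-1}\exp\left[-\left(\frac{t}{\eta(\bar{X})}\right)^{\beta}\right] \tag{A-2}$$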
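To make the Weibull form concrete, the following Python sketch evaluates the density, the reliability and the probability of failure for a given shape β and scale η, with the scale obtained from a log-linear function of the stresses. The coefficient values, the stress values and the function names are invented for illustration; only the functional forms (a log-linear life-stress relation and the two-parameter Weibull distribution) are standard.

```python
import math


def gll_scale(alpha0, alphas, stresses):
    """Log-linear life-stress relation: eta = exp(alpha0 + sum(alpha_j * x_j))."""
    if len(alphas) != len(stresses):
        raise ValueError("one coefficient per stress term is required")
    return math.exp(alpha0 + sum(a * x for a, x in zip(alphas, stresses)))


def weibull_pdf(t, beta, eta):
    """Two-parameter Weibull probability density at time t (drilling hours)."""
    z = t / eta
    return (beta / eta) * z ** (beta - 1.0) * math.exp(-(z ** beta))


def weibull_reliability(t, beta, eta):
    """Probability that the part survives beyond t."""
    return math.exp(-((t / eta) ** beta))


def probability_of_failure(t, beta, eta):
    """Risk, defined as 1 - reliability."""
    return 1.0 - weibull_reliability(t, beta, eta)


# Hypothetical coefficients for temperature (deg C), lateral and stick-slip (g_RMS).
eta = gll_scale(7.5, [-0.010, -0.30, -0.50], [66.5, 1.15, 0.29])
beta = 2.0  # hypothetical shape parameter
for hours in (100.0, 400.0, 700.0):
    print(f"{hours:6.0f} h  PoF = {probability_of_failure(hours, beta, eta):.3f}")
```

A useful sanity check on any such implementation is that the probability of failure at t = η equals 1 − 1/e (about 63%) regardless of β, and that β = 1 reduces the density to the exponential case.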
The probability density function (PDF) for an exponential distribution can be obtained by simply putting β=1 in Equation A-2. For a lognormal distribution, the probability density function for a GLL stress function is shown in Equation A-3
$$f(t,\bar{X}) = \frac{1}{t\,\sigma\sqrt{2\pi}}\exp\left[-\frac{1}{2}\left(\frac{\ln(t) - \ln\eta(\bar{X})}{\sigma}\right)^{2}\right] \tag{A-3}$$
B. Proportional Hazard Model
For a proportional hazard model, the hazard rate of a component is affected by hours in operation and stress variables. The
instantaneous hazard rate of a part is given by Equation A-4
$$\lambda(t,\bar{X}) = \frac{f(t,\bar{X})}{R(t,\bar{X})} = \lambda_0(t)\,\eta(\bar{X},\bar{\alpha}) \tag{A-4}$$
where f is the probability density function and R is the reliability function. The instantaneous hazard rate, λ0, is a function of time only, and the stress function, η, is a function of operating stresses such as temperature and vibration. The vector of unknown model parameters, $\bar{\alpha}$, is obtained by calibrating the model to test data using maximum likelihood estimation (MLE). The stress function, η, is given by Equation A-5
$$\eta(\bar{X},\bar{\alpha}) = \exp\left(\alpha_0 + \sum_{i} \alpha_{T,i}\,T_i + \sum_{j} \alpha_{L,j}\,L_j + \sum_{k} \alpha_{S,k}\,S_k\right) \tag{A-5}$$
Substituting Equation A-5 in Equation A-2, the hazard function for a Weibull distribution is written using Eq. (A-6)
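$$\lambda(t,\bar{X}) = \frac{\beta}{\eta}\left(\frac{t}{\eta}\right)^{\beta-1}\exp\left(\alpha_0 + \sum_{i} \alpha_{T,i}\,T_i + \sum_{j} \alpha_{L,j}\,L_j + \sum_{k} \alpha_{S,k}\,S_k\right) \tag{A-6}$$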
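For reference, the reliability implied by a hazard of this form follows from the general identity that reliability is the exponential of the negative integrated hazard. The short derivation below assumes a Weibull baseline hazard with shape β and scale η and stress levels that stay constant over the interval. The symbol g, standing for the exponential stress factor, is introduced here only for brevity and does not appear in the paper.

```latex
\begin{aligned}
\lambda(t) &= \frac{\beta}{\eta}\left(\frac{t}{\eta}\right)^{\beta-1} g, \\
\int_0^t \lambda(u)\,du &= g\left(\frac{t}{\eta}\right)^{\beta}, \\
R(t) &= \exp\!\left[-\,g\left(\frac{t}{\eta}\right)^{\beta}\right],
\qquad \mathrm{PoF}(t) = 1 - R(t).
\end{aligned}
```

In this form a stress factor g greater than 1 scales the cumulative hazard up at every t, which is the sense in which the hazards are proportional.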
C. Cumulative Damage Model
The cumulative damage model incorporates the effect of time-varying stress on the life of components. The model takes into account the impact of damage accumulated at each stress level on the reliability of parts. Damage accumulation can take place at various rates for various stress levels and can be determined using the linear damage sum (Miner’s rule), the inverse power law or cycle counting techniques such as rainflow counting. The cumulative damage model used in this paper is established from Miner’s rule, which is based on the hypothesis that if there are n different stress levels and the time to failure at the ith stress σi is Tfi, then the damage fraction, p, is given by Equation A-7
$$p = \sum_{i=1}^{n} \frac{t_i}{T_{fi}} \tag{A-7}$$
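where ti is the number of cycles accumulated at stress σi and failure occurs when the damage fraction equals unity. The probability distribution functions for Weibull and lognormal distributions are obtained by substituting equation A-7 in equations A-2 and A-3, respectively. Given the stress variables $\bar{X} = \{T(t), L(t), S(t)\}$, the PDF for a Weibull distribution is given by
$$I(t,\bar{X}) = \int_0^t \exp\left[-\left(\alpha_0 + \sum_{i} \alpha_{T,i}\,T_i(u) + \sum_{j} \alpha_{L,j}\,L_j(u) + \sum_{k} \alpha_{S,k}\,S_k(u)\right)\right] du$$
$$f(t,\bar{X}) = \frac{\beta}{\eta(\bar{X})}\left(I(t,\bar{X})\right)^{\beta-1}\exp\left[-\left(I(t,\bar{X})\right)^{\beta}\right] \tag{A-8}$$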
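A small numerical example may help relate Miner’s rule to the cumulative damage distribution. The Python sketch below assumes piecewise-constant stress, so that the damage integral reduces to a sum of exposure time divided by characteristic life at each level, and then evaluates a Weibull probability of failure on that accumulated damage. The exposure hours, lives and shape parameter are invented for illustration, and the function names are hypothetical.

```python
import math


def miner_damage(exposures):
    """Linear damage sum: sum of t_i / Tf_i over (t_i, Tf_i) pairs."""
    total = 0.0
    for t_i, tf_i in exposures:
        if tf_i <= 0:
            raise ValueError("time to failure must be positive")
        total += t_i / tf_i
    return total


def cumulative_damage_pof(exposures, beta):
    """Weibull probability of failure evaluated on the accumulated damage.

    Assumption: the characteristic life at each stress level plays the role
    of Tf_i, so the normalized age t/eta is replaced by the damage sum.
    """
    damage = miner_damage(exposures)
    return 1.0 - math.exp(-(damage ** beta))


# Hypothetical history: (hours spent at a level, characteristic life at that level)
history = [(100.0, 1000.0), (200.0, 800.0), (50.0, 400.0)]
damage = miner_damage(history)
print(f"damage fraction p = {damage:.3f}")
print(f"probability of failure = {cumulative_damage_pof(history, beta=2.0):.3f}")
```

Note that in this probabilistic form a damage sum of 1 corresponds to a probability of failure of 1 − 1/e rather than certain failure, because the characteristic life is a scale parameter of the distribution and not a deterministic limit.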
D. Characteristic Life Function
The life characteristic function describes a general relation between failure time and stress levels. The life characteristic can be any time-to-failure measure, such as the mean, median or hazard rate, that represents a bulk property of a probability distribution. Ideally, the function must incorporate the governing equations that represent the physical phenomenon of degradation of the material under the application of load. Typical electronic circuit boards used in drilling and evaluation tools are complex, and the governing equations representing degradation and failure mechanisms are difficult to model; therefore, the paper evaluates several empirical functions between stress variables and selects the one that best fits the field data.

  • 1. SPE 171517-MS Estimating Probability of Failure for Drilling Tools with Life Prediction K. Carter-Journet, A. Kale, D. Zhang, E. Pradeep, T. Falgout, and L. Heuermann-Kuehn, Baker Hughes. Copyright 2014, Society of Petroleum Engineers This paper was prepared for presentation at the SPE Asia Pacific Oil & Gas Conference and Exhibition held in Adelaide, Australia, 14–16 October 2014. This paper was selected for presentation by an SPE program committee following review of information contained in an abstract submitted by the author(s). Contents of the paper have not been reviewed by the Society of Petroleum Engineers and are subject to correction by the author(s). The material does not necessarily reflect any position of the Society of Petroleum Engineers, its officers, or members. Electronic reproduction, distribution, or storage of any part of this paper without the written consent of the Society of Petroleum Engineers is prohibited. Permission to reproduce in print is restricted to an abstract of not more than 300 words; illustrations may not be copied. The abstract must contain conspicuous acknowledgment of SPE copyright. Abstract Drilling tools are subject to numerous operational parameters such as revolutions per minute (RPM), vibration (lateral, stick- slip and axial), pressure, torque and temperature. These parameters can greatly fatigue even the most robust tool depending on where and how the tool is operated. Lifetime prediction methodologies represent an affordable and statistically significant way to estimate the probability of failure (risk) of drilling tools in a cost effective way. Understanding the potential risk is vital to ensuring reliability, performing the most efficient maintenance on the equipment and improving drilling performance. Sophisticated risk-modeling techniques reduce uncertainty in drilling operations by making use of readily available opera- tional field data, thus eliminating the need for costly laboratory experiments. 
Blind spots in the decision making process are eliminated by proactively identifying precursors to costly failures in the field. Preemptive guidance during maintenance peri- ods, for parts that may have otherwise been overlooked based strictly on procedure, is enabled. Statistical models that relate the operating environment to component life are derived from field component failure data, and introduce a fresh way to boost the drilling tool efficiency. A Bayesian-based model selection technique is also developed which incorporates operating environment variables after each successful drilling run to dynamically select the model that gives the best survival probabil- ity, ensuring maximum utilization of a component, while avoiding failure and improving the overall reliability of the tool in the field. The implementation of lifetime prediction methodologies also leads to lowered life-cycle and maintenance costs, reduced risk and improved operational performance. The paper presents the methodology used to estimate the probability of failure of drilling tools and further illustrates how to reach risk-informed decisions. Introduction Optimum drilling services minimize the non-productive time (NPT) experienced from tool degradation and/or failures. This objective of reliability starts with innovative tool design and spreads to the primary areas of application engineering, mainte- nance and well site execution. A universal approach for greater project efficiency, with minimized risk1 , is necessary as the oil and gas industry seeks unconventional sources to meet increasing demands. Almost every product and service is designed to reduce costs, lessen risk or increase productivity during activities related to hydrocarbon extraction, further advancing reservoir performance. Consistent methodologies, which provide preemptive guid- ance for optimizing drilling parameters and reducing the probability of failures in the field, are necessary. 
These types of methodologies are especially important when analyzing electrical component anomalies. For instance, the reliability of electronic printed circuit board assemblies (PCBAs) in the bottomhole assembly (BHA) is vital to the success of any drilling operation. PCBAs are multi-scale devices (encased in electronic packaging) comprising multiple components; the individual components vary in size and composition and cannot be easily assessed without disassembling a tool. Electronic packaging can also be subject to thermal expansion mismatch, accelerated corrosion, dendrite growth, metal whiskers, solder fatigue and outgassing, any of which can lead to failure. Understanding the risk and amount of consumed life of PCBAs prior to deploying a drilling tool into the field improves reliability and overall drilling performance. The ever-present need for more flexibility in drilling regimes, greater reliability of drilling tools and higher rates of penetration puts further strain on the drilling tool’s electrical components.
¹ Risk, for the purpose of this paper, refers to the uncertainty in drilling tool performance at the component level and/or as a whole. Risk centers on predicting the probability of failures that can lead to severe damage to the tool and/or the inability to perform the run or function to the best advantage. The consequences can be technical, safety, cost, or schedule related. The ability to quantify and understand risk provides a foundation for proactive risk management throughout the drilling tool’s lifetime.
  • 2. High-performance drilling tools in the industry must drill in harsher environments, at higher temperatures (often beyond 150°C), vibration levels (exceeding 15 g) and pressures (30 kpsi or more), along horizontal paths (rather than conventional vertical boreholes), at increasing depths and through abrasive formations, because more readily accessible oil and gas reserves were depleted long ago (Figure 1).
These demanding conditions can often influence companies’ decisions to operate drilling tools beyond their design specifications. The trend also leads to higher maintenance costs and more frequent system downtime. However, instead of over-maintaining drilling tools, companies must target enhancing system performance. For example, focus should be on preventing failure and reducing system downtime, meeting customer demands, reducing maintenance costs and ensuring equipment reliability. Failure in the field extends a planned drilling program beyond the scheduled time frame, adding unnecessary cost. Consequently, the capability to estimate the probability of failure (PoF) for drilling tools by using lifetime prediction methodologies introduces an alternative way to avert expensive downhole tool failures and supports the success of a drilling operation by indicating the overall risk.
Background: Lifetime Prediction Methodologies
In lifetime prediction analysis and reliability engineering, the output of the analysis is always an estimate (ReliaSoft, 2005). The true value of probability of failure (PoF), probability of success (reliability), mean life, parameters of a distribution or any other applicable parameter is never truly known. In fact, these values will (likely) remain unknown for all practical purposes. However, through the use of lifetime prediction analysis, engineers are able to use operational field data to determine the PoF for parts, components and systems.
Understanding the PoF is useful in determining whether drilling tools can be used in harsh or benign environments for a desired length of time without failure. Lifetime prediction includes an assortment of statistical techniques, ranging from best-fit modeling to machine learning and text/data mining, used to analyze historical and current data for making predictions about the future (forecasting). Current lifetime prediction techniques often require test data obtained throughout extended periods that approach the actual life of a part; this type of testing can be costly and time-consuming. An alternate approach to laboratory testing is to obtain, catalog and statistically analyze operational field data using predictive analytics.
To meet the growing demand for more reliable drilling tools, there is mounting interest in the area of health prognostics for electronics components by using physics-based models, operational field data, design and qualification testing data and in-service inspections data. The field data can be used to build a part/tool profile² to evaluate operational fitness based on the historical usage of an entire population of the same part/tool. This is very similar to the way a doctor compares the results of an individual’s blood test against a specified range from a larger population to determine if the values are within an acceptable range. The determination factors into what risk category (low, medium or high) the individual belongs in. A risk-informed decision is made that determines whether any corrective action is required. In the case of a part/tool, the risk indicates the fitness of the part/tool to operate optimally in the next run (meaning a recommendation is made on whether to proceed with using the tool as is, perform some level of maintenance or entirely replace/retire it).
² A profile would describe events for a part, from manufacture through end of life. Natural or induced factors (performance or environmental) would be included. All associated failures (confirmed or unconfirmed) would be documented.
Figure 1: Illustration of Drilling System (labels: Rig, Drill Pipe, Drilling Tool, Abrasive Formation)
  • 3. Identifying precursors to failure and quantifying the associated risk in real time is challenging because it is not realistic to take measurements during drilling; for example, PCBAs are built into the tool and require disassembly before tests and measurements can be performed. Therefore, using algorithms as a diagnostic tool to detect anomalies is a fast and practical approach. Table 1³ provides further details on contemporary methodologies used for lifetime prediction.
Table 1: Lifetime Prediction Methodologies
Columns: Methodology | Description | (+/-) | FD | RD | TD | R&M | SP | Comp | RC | SS (the circle markers for the indicator columns are not recoverable from this transcript)
Measurement of failure precursors | The first process of measuring failure precursors as an indicator of impending failure is established on the hypothesis that a degraded circuit board produces a significantly different signature from that of a defect-free board.
Detecting anomalies using fuses/sensors | The second technique of research in electronics prognostics and health management (PHM) uses sacrificial circuits like fuses, canaries, circuit breakers and self-diagnostics sensors for detecting whether the device is operating outside of its design limits. Sacrificial circuits are widely used in consumer electronics products and appliances.
Physics based modeling | The third approach for life prediction uses modeling and simulation to relate the fundamental physical and chemical behavior of materials to the action of the surrounding environment and applied loads. Typically for electronics, the PoF-based modeling process starts by exposing the product to highly accelerated life tests (HALT) and highly accelerated stress tests (HAST) to find the significant mode(s) and root cause(s) of failure.
Field data driven analytics & statistical modeling | The fourth methodology gained momentum because of the availability of large volumes of data and the limitations of data-agnostic methods.
Proposed Lifetime Prediction Methodology
This paper introduces a methodology to estimate the lifetime of drilling electronics using operational field data, drilling dynamics and historical maintenance information. Reliability analyses on specific drilling parameters and Bayesian statistics are combined in a probabilistic framework. Parameter estimation is used to calibrate statistical equations to field data, and probabilistic analysis is used to obtain the likelihood of failure. Model parameters are represented as random variables, each with a probability distribution. The methodology takes into account that drilling electronics in downhole conditions can have varied failure modes, and each failure mode can be caused by the interaction of multiple variables, either independently or interdependently. Several candidate models were developed to account for the inability to model each failure mode of a component in the field. Bayesian updating further improves the model results by updating prior probability estimates to produce a posterior probability estimate based on operational run history updates for individual part numbers (PNs) within a drilling tool. The inclusion of Bayesian updating adds precision to dynamically selecting more accurate failure models for a selected part as a function of usage. Sophisticated risk-modeling techniques can reduce uncertainties in drilling operations for oil and gas companies by quantitatively identifying the risk.
³ Fully darkened circles denote complete correlation with method.
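The Bayesian updating idea described above can be illustrated with a small sketch. The paper does not give its update equations at this point, so everything below is an assumption made for illustration: two hypothetical candidate Weibull models (`model_A`, `model_B`), equal prior weights, and a likelihood equal to the probability that a part which survived to the previous cumulative drilling hours also survives the next run.

```python
import math


def weibull_survival(t, eta, beta):
    """Survival probability S(t) = exp(-(t/eta)**beta) for a two-parameter Weibull model."""
    return math.exp(-((t / eta) ** beta))


def update_model_weights(prior, likelihoods):
    """Bayes' rule: posterior weight is proportional to prior weight times likelihood."""
    unnormalized = [p * l for p, l in zip(prior, likelihoods)]
    total = sum(unnormalized)
    return [u / total for u in unnormalized]


# Hypothetical candidate models: (eta in drilling hours, beta dimensionless).
candidates = {"model_A": (400.0, 1.5), "model_B": (900.0, 1.5)}
weights = [0.5, 0.5]  # assumed equal prior belief in each candidate

# Hypothetical cumulative drilling hours after each successful run.
t_prev = 0.0
for t_now in (100.0, 250.0, 400.0):
    # Conditional probability of surviving to t_now given survival to t_prev.
    likelihoods = [
        weibull_survival(t_now, eta, beta) / weibull_survival(t_prev, eta, beta)
        for eta, beta in candidates.values()
    ]
    weights = update_model_weights(weights, likelihoods)
    t_prev = t_now

posterior = dict(zip(candidates, weights))
print(posterior)
```

Each successful run shifts weight toward the candidate that considered survival more likely, which is one plausible reading of selecting a model "as a function of usage"; the paper's actual selection rule may differ.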
  • 4. The essential information necessary to estimate the probability of failure is contained in the historical life cycle data normally found within a company’s Failure Reporting Analysis and Corrective Action System (FRACAS). FRACAS data is important because it reveals when and how components fail, and provides detail on the material properties, loads (electrical and mechanical), material response, the physics of failure and corresponding corrective actions (upgrade or revision). The data is fed directly into a life prediction model that is used to assist in the Risk-Informed Decision Making (RIDM⁴) process, maintain drilling tools and increase reliability (Figure 2).
Operational Field Data Requirements
Foremost, oil and gas companies must have the ability to predict business outcomes and make risk-informed decisions that enable them to surpass their competitors. This is heavily contingent on how successful these companies are at harnessing the available data. In the case of the proposed lifetime prediction approach, historical data is leveraged to forecast future performance of drilling tools down to the part level. However, this cannot be effectively done unless these companies understand the data available.
Building field data-driven models for lifetime prediction of electronic assemblies in drilling operations is challenging for two reasons. First, not all of the factors impacting component life can be measured in real time. Second, the data that can be measured has errors and noise because of limitations of the measurement system and human factors. Challenges for historical data of PCBAs (Figure 3) include variable operating environment, incomplete information on failures and operating history, and statistical variation in components (manufacturing defects, material properties, etc.).
⁴ The primary objective of RIDM is to provide the decision maker with the necessary risk information to make a choice that has the most potential for successfully meeting objectives (ex: completing a drilling mission without failure and within the specified timetable).
Figure 2: Proposed Life Prediction Methodology
Flowchart labels: Gather Operational Field Data (Failures and Suspensions): environment, run, and failure data; repair and maintenance data. Develop Life Prediction Model. Select/Screen Desired Data (filter missing/incomplete data): consider percentage of missing data; consider size of dataset; data must have high fidelity and quality. Select Appropriate Lifetime Distribution to Fit Data (Weibull, Lognormal, etc.). Variable Selection (Drilling Hours, Temperature and Vibration). Train Data & Create Model (Outlier Detection, Weighting Factors, Bayesian Updating, and Best Fit Models). Test Model (Model is Valid / Model is Invalid). Lifetime Prediction (Risk and Remaining Useful Life). Create Independent Models for each Part Number (PN) in a Drilling Tool. Utilize Models to Evaluate Probability of Failure (Generate Results and Plots). Utilize Probability of Failure in Risk-Informed Decision Making Process (RIDM). Outcomes: Optimize Drilling Performance; Improve Reliability and Reduce Risk; Reduce Maintenance Costs.
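The data screening step named in Figure 2 (filtering missing or incomplete data while keeping both failures and suspensions) could look roughly like the sketch below. The record layout, the field names and the sample values are hypothetical and chosen only for illustration; the paper does not describe its data schema in code form.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class RunRecord:
    """One run of one serialized part (hypothetical layout)."""
    serial_number: str
    drilling_hours: Optional[float]
    avg_temperature_c: Optional[float]
    avg_vibration_g: Optional[float]
    failed: bool  # True = failure, False = suspension (scrapped or not failed)


def is_complete(record: RunRecord) -> bool:
    """A record is usable only if every model variable was recorded."""
    values = (record.drilling_hours, record.avg_temperature_c, record.avg_vibration_g)
    return all(v is not None for v in values)


def screen(records: List[RunRecord]) -> Tuple[List[RunRecord], float]:
    """Return the complete records and the fraction of records that were dropped."""
    complete = [r for r in records if is_complete(r)]
    missing_fraction = 1.0 - len(complete) / len(records) if records else 0.0
    return complete, missing_fraction


# Made-up sample data, not taken from the paper.
sample = [
    RunRecord("SN-001", 120.0, 95.0, 1.2, False),
    RunRecord("SN-001", 80.0, None, 1.4, False),  # temperature was not recorded
    RunRecord("SN-002", 300.0, 110.0, 2.1, True),
    RunRecord("SN-003", 45.0, 60.0, 0.9, False),
]
usable, missing_fraction = screen(sample)
print(len(usable), missing_fraction)
```

Reporting the dropped fraction alongside the screened data keeps the "percentage of missing data" consideration from Figure 2 visible to whoever fits the model.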
  • 5. There are some basic requirements for integrating historical data with predictive models. The first requirement is that the operational field data used in lifetime prediction must be plentiful⁵ and the second requirement is that the parts must be serialized⁶. This may require that the data is assembled and formatted in a way that provides the necessary data fields for each PN. Failure (parts that are no longer operational) and suspension (parts that have either been scrapped or have not failed) data must be included in the dataset for each PN. Basic information for a PN must also include the job number, run and incident information for any failures.
Table 2 presents some data fields that are helpful in developing a predictive lifetime model (the list is not all-encompassing; other fields may also be applicable):
Table 2: Potential Drilling Tool Data Fields of Interest
Data Field Name | Field Description
Part Number (PN) | Referencing identifier for a group of parts that share common design.
Serial Number (SN) | Unique identifier of a single piece with associated history.
Revision (Rev) | Tracking method for non-functionality related changes associated with a PN.
Upgrade History (per Revision) | Whether or not the PN-SN-Revision has multiple combinations.
Repair History (per Revision) | Whether or not this PN-SN-Revision has any repair activity.
Last Repair Date | The last repair activity date during the reported time frame.
Scrap | Whether or not the most recent component has finally been scrapped.
Scrap Reason | The scrap reason if the most recent component has finally been scrapped.
Last Maintenance Level | Most recent level of maintenance performed on this PN-SN-Rev.
Last Maintenance Location | Geographical location where the most recent maintenance activity was completed.
Product Description | The top level assembly display name.
Last Job Location | Geographical location where the tool was most recently operated.
Last Job Number | Referencing (SAP) number used to link with a customer’s well name.
Drilling Hours | The length of time the BHA/bit was actively making hole.
Circulation Hours | The length of time drilling fluid was pumped through the BHA.
Distance Drilled | The total distance drilled by a PN-SN in the reported time frame.
Average Temperature | The average of average temperatures in the reported time frame.
Average RPM | The average of average revolutions per minute (RPM) in the reported time frame.
Min Depth In | The minimum of Depth-In in the reported time frame.
Average Flow Rate | The average of average flow rate in the reported time frame.
Incident Date | The incident date of the last failure in the reported time frame.
Incident Description | Description of events that led to the incident in the reported time frame.
Root Cause Description | The root cause of an incident in the reported time frame.
Failure Mode | The failure mode of the last failure in the reported time frame.
Average Vibration | The average vibration (axial, lateral, and stick-slip).
⁵ There must be sufficient failure data for a part to be adequately modelled. Since there is some pre-screening of the data prior to building a model, there must be at least a year’s worth of data to use in the process.
⁶ Each part has a part number (PN) and a serial number (SN) that uniquely identifies it and its applicable history. One PN may have multiple SNs and multiple revisions.
Figure 3: PCBA Images
  • 6. Model Development
Models are developed at the part number level first (each revision of a part is evaluated separately). Next, all of the parts that are installed in a specific drilling tool are grouped together to provide the overall risk of that tool. Therefore, each part’s contribution to the overall condition of the drilling tool can be assessed. Comprehensive data for the entire history of the part is required to analyze the relationship between operating environment and life. A typical time to failure model comprises a life distribution function to incorporate the statistical scatter in failure time and a life characteristics function (Appendix A) that describes a general relation between failure time and stress levels (Kale et al., 2014). Weibull, lognormal and exponential distributions are considered in this methodology for each part’s model. The life characteristic can be any life measure such as the mean, median or hazard rate that represents a bulk property of the distribution. The life characteristic is expressed as a function of stress (as shown in Appendix A). The unknown parameter of the composite model is determined by tuning the model equation to fit field data using the Iterative Maximum Likelihood Estimation technique.
Optimizing Problem
One of the main focuses of this paper is to optimize allocation of assets by incorporating operational constraints on life and reliability of individual components that make up the tool.
A case study is presented describing a scenario where two assets in a maintenance shop are awaiting overhaul. Furthermore, there are a fixed number of spare parts that can be used as replacements and stringent threshold reliability constraints that each tool must meet after a maintenance action. The optimizing problem is to maximize reliability of the two assets by swapping the existing parts between the two tools and the additional spare parts. Calculating which assets to swap is done by using the constrained linear programming algorithm. A linear programming problem may be defined as the problem of maximizing or minimizing a linear function subject to linear constraints. The constraints may be equalities or inequalities.
$$\text{maximize (or minimize)} \quad C^{T} x \quad \text{subject to} \quad A x \le B \qquad (1)$$
where $x$ represents the vector of variables whose optimum values are to be determined, $C$ is a vector containing the sensitivity of the objective function with respect to each unknown $x$, $B$ are vectors of known coefficients representing constraint bounds, $A$ is a matrix of coefficients containing the sensitivity of the constraints with respect to each $x$, and the superscript $T$ stands for matrix transpose. The unknown variable $x_i$ represents the system reliability of the $i$th asset. For typical assets used in drilling systems, it is fair to assume that failure of a single component leads to service failure; consequently, the reliability of an asset is modeled using the series system (Equation 2).
$$x_i = \prod_{j=1}^{n} \left(1 - P_{f,ij}\right) \qquad (2)$$
The number of subcomponents in the asset is represented by $n$, and $P_{f,ij}$ denotes the probability of failure (risk) of the $j$th subcomponent of asset $i$. The example used in this paper shows the application of the linear programming method to optimize allocation of individual subcomponents between two assets awaiting maintenance. When considering typical assets (such as PCBAs) used in drilling, the vector $C = \{1, 1\}$ since the reliability of each asset has to be maximized, and $A = \{1, 1\}$. The example also incorporates a typical scenario where an asset is deployed in more critical jobs (example: award based contract).
In this scenario, the selected drilling tool will need to have a higher threshold for reliability than the other. The best parts from the lower reliability drilling tool will be swapped to the tool with higher reliability. The overall asset optimization problem can be summarized as:
$$\text{maximize} \quad \prod_{j=1}^{n} \left(1 - P_{f,1j}\right) + \prod_{j=1}^{n} \left(1 - P_{f,2j}\right) \qquad (3)$$
$$\text{subject to} \quad \prod_{j=1}^{n} \left(1 - P_{f,1j}\right) \ge R_{1}^{\min}, \qquad \prod_{j=1}^{n} \left(1 - P_{f,2j}\right) \ge R_{2}^{\min}$$
where $P_{f,ij}$ is the probability of failure of the $j$th subcomponent of asset $i$ and $R_{i}^{\min}$ is the threshold reliability that asset $i$ must meet.
  • 7. The optimization problem is solved using the simplex method in MS Excel. Theoretical discussion of the simplex LP method can be found in Dantzig et al. (1997) and Murty et al. (1983).
Case Study 1: Utilizing Lifetime Prediction for Identifying Risk
Drilling tool parts are maintained or replaced depending on how many circulating hours they are exposed to. Those circulating hours may be in the range of the mean time between failure (MTBF⁷) or exceed it. Predicting time to failure of PCBAs within a drilling tool prior to deployment to the field is the example used in this case study. This is similar to the diagnosis performed when someone takes a vehicle in for maintenance. The vehicle has sensors that reveal data on the current condition of the vehicle depending on how it is used. The results of the screening indicate the type of maintenance required so the vehicle can continue to perform optimally and achieve the mission (operate without failure when the owner is driving). Based on the location of the upcoming run, a desired risk threshold (for example 50%, or half of the calculated operating life) can be set to determine whether the parts have consumed more than the risk threshold.
Fig. 4: Drilling Tool
Baseline Case: Initial Drilling Tool Assessment
A drilling tool was analyzed in April 2014 prior to deployment. In the case study, the risk threshold⁸ is set at 50%, which is a conservative setting (the expected life of each part is displayed graphically and interpreted in Figure 5). Figure 6 shows the prediction range for actual drilling hours⁹ (DrillHrs) and a fuel gauge chart that enables the user to see the percentage of life consumed.
⁷ MTBF is the predicted elapsed time between inherent failures of a system during operation.
⁸ The risk threshold is the amount of risk that a job is willing to incur. For instance, in the case of an award driven contract where risk is less tolerable, electronics with greater than 50% life consumed are considered risky. However, for a contract where the drilling conditions are more benign, greater risk in the range of 75% could be more acceptable.
⁹ Actual drilling hours are represented by the diamond.
Figure 5: Data Interpretation
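To make the idea of comparing a part's risk against a job-dependent threshold concrete, here is a minimal sketch. It assumes the probability of failure is read from the cumulative distribution function of a fitted life distribution at the part's accumulated drilling hours, using two of the candidate families named under Model Development (Weibull and lognormal). The parameter values and the job-type keys are hypothetical; the 50% and 75% thresholds are the examples given in footnote 8.

```python
import math
from statistics import NormalDist


def weibull_pof(t, eta, beta):
    """Two-parameter Weibull CDF: F(t) = 1 - exp(-(t/eta)**beta)."""
    return 1.0 - math.exp(-((t / eta) ** beta))


def lognormal_pof(t, mu, sigma):
    """Lognormal CDF: standard normal CDF evaluated at (ln t - mu) / sigma."""
    return NormalDist(mu, sigma).cdf(math.log(t))


# Example thresholds from footnote 8; the dictionary keys are hypothetical labels.
RISK_THRESHOLDS = {"award_based": 0.50, "benign": 0.75}


def exceeds_threshold(pof, job_type):
    """True when the part has consumed more risk than the job tolerates."""
    return pof > RISK_THRESHOLDS[job_type]


# Hypothetical fitted parameters and accumulated drilling hours.
hours = 500.0
pof = weibull_pof(hours, eta=500.0, beta=2.0)
print(round(pof, 3), exceeds_threshold(pof, "award_based"), exceeds_threshold(pof, "benign"))
```

With these illustrative numbers the same part would be flagged for an award based job but accepted for a benign one, which is the kind of distinction the risk threshold is meant to support.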
  • 8. Further guidance on replacing parts at a maintenance cycle can also be extracted from the results, depending on how the data is flagged (in this case, >50% risk), as shown in Table 3.
Table 3: Predicted Life of Electronic Parts in Drilling Tool Results – April 2014
Results Interpretation Key: Low Risk: 0.0 – 0.25; Medium Risk: 0.25 – 0.50; High Risk: >0.50; Uncertainty in prediction due to missing data (>30%)
A maintenance center can determine that one part (xxxx-7) requires replacement and can apply lower levels of maintenance to the other electronics (with low or medium risk) or leave them undisturbed (avoiding induced failures caused by human error or process escapes). In this case, serial number xxxx-7 can be examined more closely to review the predictions for each run. Four parts are identified as medium risk and can either be assigned a certain level of maintenance or be left untouched, depending on the associated risk. In addition, the maintenance facility can incorporate the risk values into a sparing forecast for the parts shown in Table 3. Historical data and lifetime prediction risk values can now be used as indicators of future demand. As a result, the accuracy of the sparing forecast is improved and additional cost savings are generated.
Figure 7 shows the run history of the part flagged as high risk. The diamonds represent the actual drilling time of the part. Although the drilling hours have not reached the upper limit of the confidence bound, there is still uncertainty. A recommendation is made for the repair¹⁰, if applicable, or replacement of the part before the next run.
¹⁰ Repair is more likely to be considered at the assembly level. PCBAs are more likely to be replaced.
Part Number | Serial Number | Last Job No | Cumulative Temperature C | Cumulative StickSlip | Cumulative Lateral (g_RMS) | DrillHrs [h] | Worst Case Life | 25Q | Predicted Mean Life | 75Q | Best Case Life | Risk | Part Description | Comments
1 | xxxx-1 | 10000 | 50 (L0) | 0.35 (L1) | 1.39 (L2) | 23.33 | 301.10 | 428.47 | 518.63 | 622.46 | 834.54 | 0.00 | PCBA | (1) Missing Data For 1Runs..(12%)
2 | xxxx-2 | 10000 | 50 (L0) | 0.38 (L1) | 1.5 (L2) | 19.83 | 132.53 | 236.27 | 307.20 | 377.36 | 505.40 | 0.00 | BATTERY |
3 | xxxx-3 | 10000 | 91.57 (L1) | 0.32 (L1) | 1.39 (L2) | 332.25 | 278.65 | 451.23 | 601.87 | 731.80 | 997.62 | 0.11 | PCBA | (1) Missing Data For 1Runs..(6%)
4 | xxxx-4 | 10000 | 50 (L0) | 0.38 (L1) | 1.5 (L2) | 19.83 | 373.68 | 587.34 | 744.87 | 937.90 | 1390.76 | 0.00 | PCBA |
5 | xxxx-5 | 10000 | 50 (L0) | 0.32 (L1) | 1.34 (L2) | 216.99 | 216.73 | 389.55 | 534.26 | 739.48 | 1229.68 | 0.03 | TRANSDUCER | (1) Missing Data For 2Runs..(11%)
6 | xxxx-6 | 10000 | 50 (L0) | 0.35 (L1) | 1.39 (L2) | 23.33 | 124.35 | 204.65 | 266.52 | 318.90 | 418.51 | 0.00 | BATTERY ASSY | (1) Missing Data For 1Runs..(12%)
7 | xxxx-7 | 10000 | 66.5 (L0) | 0.29 (L1) | 1.15 (L2) | 692.17 | 378.20 | 470.53 | 543.91 | 647.85 | 922.11 | 0.81 | PCBA | (1) Missing Data For 3Runs..(13%)
8 | xxxx-8 | 10000 | 66.5 (L0) | 0.29 (L1) | 1.15 (L2) | 692.17 | 547.85 | 776.58 | 953.76 | 1183.45 | 1750.25 | 0.15 | MAGNETOMETER | (1) Missing Data For 3Runs..(13%)
9 | xxxx-9 | 10000 | 66.5 (L0) | 0.29 (L1) | 1.15 (L2) | 692.17 | 414.99 | 617.51 | 761.79 | 940.06 | 1285.00 | 0.38 | ACCELEROMETER | (1) Missing Data For 3Runs..(13%)
10 | xxxx-10 | 10000 | 66.5 (L0) | 0.29 (L1) | 1.15 (L2) | 692.17 | 528.26 | 916.09 | 1224.80 | 1612.98 | 2303.18 | 0.12 | PCBA | (1) Missing Data For 3Runs..(13%)
11 | xxxx-11 | 10000 | 66.5 (L0) | 0.29 (L1) | 1.15 (L2) | 692.17 | 253.96 | 517.32 | 736.48 | 1008.65 | 1805.98 | 0.44 | PCBA | (1) Missing Data For 3Runs..(13%)
12 | xxxx-12 | 10000 | 91.57 (L1) | 0.32 (L1) | 1.39 (L2) | 332.25 | 223.98 | 480.48 | 685.26 | 969.48 | 1686.29 | 0.12 | SENSOR | (1) Missing Data For 1Runs..(6%)
13 | xxxx-13 | 10000 | 50 (L0) | 0.38 (L1) | 1.5 (L2) | 19.83 | 266.49 | 581.45 | 923.16 | 1315.53 | 2178.79 | 0.00 | POWER SUPPLY |
14 | xxxx-14 | 10000 | 91.57 (L1) | 0.32 (L1) | 1.39 (L2) | 332.25 | 223.98 | 480.48 | 685.26 | 969.48 | 1686.29 | 0.12 | SENSOR | (1) Missing Data For 1Runs..(6%)
15 | xxxx-15 | 10000 | 50 (L0) | 0.38 (L1) | 1.5 (L2) | 19.83 | 266.49 | 581.45 | 923.16 | 1315.53 | 2178.79 | 0.00 | POWER SUPPLY |
16 | xxxx-16 | 10000 | 66.5 (L0) | 0.29 (L1) | 1.15 (L2) | 692.17 | 528.26 | 916.09 | 1224.80 | 1612.98 | 2303.18 | 0.12 | PCBA | (1) Missing Data For 3Runs..(13%)
17 | xxxx-17 | 10000 | 66.5 (L0) | 0.29 (L1) | 1.15 (L2) | 692.17 | 414.99 | 617.51 | 762.53 | 940.06 | 1285.00 | 0.38 | ACCELEROMETER | (1) Missing Data For 3Runs..(13%)
18 | xxxx-18 | 10000 | 66.5 (L0) | 0.29 (L1) | 1.15 (L2) | 692.17 | 414.99 | 617.51 | 762.53 | 940.06 | 1285.00 | 0.38 | ACCELEROMETER | (1) Missing Data For 3Runs..(13%)
19 | xxxx-19 | 10000 | 66.5 (L0) | 0.29 (L1) | 1.15 (L2) | 692.17 | 514.10 | 748.73 | 927.73 | 1151.33 | 1714.98 | 0.19 | MAGNETOMETER | (1) Missing Data For 3Runs..(13%)
20 | xxxx-20 | 10000 | 50 (L0) | 0.38 (L1) | 1.5 (L2) | 19.83 | 123.18 | 202.70 | 264.81 | 316.51 | 416.86 | 0.00 | BATTERY ASSY |
Figure 6: Graphical Representation of Predicted Life of Electronic Parts in Drilling Tool – April 2014
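The results interpretation key of Table 3 can be written as a small classifier. This is only a sketch: the key does not say which band the boundary values 0.25 and 0.50 belong to, so the code assumes each boundary falls in the lower band, and it treats the missing-data flag as a separate indicator. The risk values are those listed in the Risk column of Table 3.

```python
from collections import Counter


def classify_risk(risk, missing_fraction=0.0):
    """Map a risk value in [0, 1] to a band; also flag high missing-data uncertainty.

    Assumption: boundary values belong to the lower band (0.25 -> low, 0.50 -> medium).
    """
    if not 0.0 <= risk <= 1.0:
        raise ValueError("risk must lie between 0 and 1")
    if risk > 0.50:
        band = "high"
    elif risk > 0.25:
        band = "medium"
    else:
        band = "low"
    uncertain = missing_fraction > 0.30  # key: uncertainty when missing data exceeds 30%
    return band, uncertain


# Risk column of Table 3 (April 2014), parts 1 through 20.
table3_risk = [0.00, 0.00, 0.11, 0.00, 0.03, 0.00, 0.81, 0.15, 0.38, 0.12,
               0.44, 0.12, 0.00, 0.12, 0.00, 0.12, 0.38, 0.38, 0.19, 0.00]

bands = [classify_risk(r)[0] for r in table3_risk]
counts = Counter(bands)
print(counts)
```

Under these assumptions the tally agrees with the discussion of Table 3: one part (number 7) is high risk and four parts are medium risk.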
  • 9. The probability of failure increases as the part’s life (measured in drilling hours) reaches the 75% life estimate of the prediction (as shown on the chart on the left-hand side of Figure 7). If the part were to remain unchanged in the drilling tool and be operated under the same or similar conditions as shown in Table 4, then the probability of failure would continue to increase.
Table 4: Predicted Life versus Run for Part Outside of Risk Threshold
Last Job No | Cumulative Temperature C | Cumulative Lateral (g_RMS) | Cumulative StickSlip (g_RMS) | DrillHrs [h] | Worst Case Life | 25Q | Predicted Mean Life | 75Q | Best Case Life | Risk | Comments
10000 | 66.50 | 1.15 | 0.29 | 692.17 | 378.20 | 470.53 | 543.91 | 647.85 | 922.11 | 0.81 |
10000 | 66.72 | 1.15 | 0.28 | 677.75 | 377.84 | 470.27 | 543.06 | 647.52 | 921.70 | 0.79 |
10000 | 66.77 | 1.15 | 0.28 | 677.42 | 378.16 | 470.66 | 543.40 | 648.06 | 922.53 | 0.79 |
10000 | 66.80 | 1.15 | 0.28 | 674.42 | 377.97 | 470.41 | 543.08 | 647.71 | 922.10 | 0.79 |
10000 | 66.81 | 1.15 | 0.28 | 673.92 | 377.80 | 470.20 | 542.84 | 647.42 | 921.70 | 0.79 |
9000 | 66.91 | 1.14 | 0.28 | 672.34 | 378.17 | 470.66 | 543.45 | 648.15 | 922.93 | 0.78 | Missing Data For This Run
9000 | 66.91 | 1.14 | 0.28 | 599.51 | 381.24 | 474.37 | 548.12 | 653.31 | 930.57 | 0.63 | Missing Data For This Run
9000 | 66.91 | 1.14 | 0.28 | 595.68 | 381.26 | 474.40 | 548.16 | 653.36 | 930.63 | 0.62 | Missing Data For This Run
6000 | 66.91 | 1.14 | 0.28 | 594.01 | 381.21 | 474.33 | 548.08 | 653.25 | 930.50 | 0.62 |
6000 | 68.60 | 1.25 | 0.31 | 585.11 | 347.15 | 432.64 | 499.96 | 595.60 | 846.42 | 0.72 |
5800 | 67.58 | 1.32 | 0.33 | 541.21 | 335.40 | 417.57 | 483.78 | 574.98 | 811.59 | 0.67 |
5123 | 69.40 | 1.44 | 0.36 | 495.11 | 309.81 | 386.13 | 447.29 | 530.82 | 745.99 | 0.65 |
5123 | 71.92 | 1.44 | 0.34 | 432.61 | 170.63 | 325.55 | 450.20 | 619.38 | 1060.60 | 0.47 |
4726 | 76.43 | 1.42 | 0.23 | 348.44 | 202.54 | 385.29 | 535.00 | 736.66 | 1279.67 | 0.20 |
4726 | 77.31 | 1.46 | 0.24 | 339.24 | 197.54 | 375.06 | 521.84 | 716.61 | 1244.64 | 0.20 |
4575 | 78.99 | 1.56 | 0.25 | 317.14 | 188.14 | 356.96 | 496.13 | 681.03 | 1179.46 | 0.20 |
4575 | 75.97 | 1.46 | 0.24 | 283.72 | 200.54 | 381.58 | 528.37 | 727.27 | 1263.72 | 0.13 |
3723 | 76.04 | 1.47 | 0.24 | 273.80 | 200.15 | 380.77 | 527.33 | 725.79 | 1261.03 | 0.12 |
3723 | 50.00 | 1.25 | 0.13 | 85.70 | 338.55 | 629.71 | 867.77 | 1188.73 | 1764.98 | 0.00 |
3723 | 50.00 | 1.12 | 0.15 | 46.80 | 312.93 | 584.04 | 801.60 | 1102.97 | 1764.98 | 0.00 |
2555 | 50.00 | 0.85 | 0.12 | 42.70 | 337.10 | 623.24 | 857.42 | 1189.40 | 1764.98 | 0.00 |
1859 | 50.00 | 0.82 | 0.11 | 30.00 | 356.99 | 661.58 | 913.72 | 1261.60 | 1764.98 | 0.00 |
Figure 7: Predicted Life versus Run for Part Outside of Risk Threshold
Decision Case
The drilling tool was analyzed again in May 2014, prior to deployment. In this case study, the risk threshold is set at 50% again. Maintenance was previously performed based on the recommendations made in Table 3. Fig. 8 shows the prediction range for actual drilling hours and a fuel gauge chart that enables the user to distinguish the percentage of life consumed.
  • 10. Fig. 8: Graphical Representation of Predicted Life of Electronic Parts in Drilling Tool – May 2014
Most of the serial numbers have remained the same. However, based on the previous data, some changes have been made to the tool build. There is also a noticeable change in the percentage of missing data, which these dynamic models take into account, and a new risk is calculated for each part. The update to the data has also had a positive impact on the results, as shown in Table 5.
Table 5: Predicted Life of Electronic Parts in Drilling Tool Results – May 2014
Results Interpretation Key: Low Risk: 0.0 – 0.25; Medium Risk: 0.25 – 0.50; High Risk: >0.50; Uncertainty in prediction due to missing data (>30%)
Serial numbers 7, 9, 11, 17 and 18 were identified as high or medium risk in Table 3, and appropriate maintenance actions were taken. Parts that had risk lower than the risk threshold were used and previous parts that met the risk threshold remained unchanged. Table 5 enables better decision making because the probability of failure can be assessed before a tool is sent into the field.
Case Study 2: Sparing Optimization
This section shows the application of the optimization technique developed in the previous section to determine the best possible selection of sub-components that will maximize the overall system reliability of both assets (example: Fig. 9).
Part Number Serial Number Last Job No Cumulative Temperature C Cumulative StickSlip Cumulative Lateral (g_RMS) DrillHrs [h] Worst Case Life 25Q Predicted Mean Life 75Q Best Case Life Risk Part Description Comments 1 xxxx-1 10000 50 (L0) 0.35 (L1) 1.39 (L2) 23.33 301.10 428.47 518.63 622.46 834.54 0.00 PCBA (1) Missing Data For 1Runs..(12%) 2 xxxx-2 10000 50 (L0) 0.38 (L1) 1.5 (L2) 19.83 132.53 236.27 307.20 377.36 505.40 0.00 BATTERY 3 xxxx-3 10000 91.57 (L1) 0.32 (L1) 1.39 (L2) 332.25 278.65 451.23 601.87 731.80 997.62 0.11 PCBA (1) Missing Data For 1Runs..(6%) 4 xxxx-4 10000 50 (L0) 0.38 (L1) 1.5 (L2) 19.83 373.68 587.34 744.87 937.90 1390.76 0.00 PCBA 5 xxxx-5 10000 50 (L0) 0.32 (L1) 1.34 (L2) 216.99 216.73 389.55 534.26 739.48 1229.68 0.03 TRANSDUCER (1) Missing Data For 2Runs..(11%) 6 xxxx-6 10000 50 (L0) 0.35 (L1) 1.39 (L2) 23.33 124.35 204.65 266.52 318.90 418.51 0.00 BATTERY ASSY (1) Missing Data For 1Runs..(12%) 7 xxxx-7 10000 59.32 (L0) 0.21 (L1) 0.97 (L1) 614.74 237.15 448.23 628.43 865.80 1511.01 0.49 PCBA (1) Missing Data For 2Runs..(8%) 8 xxxx-8 10000 59.32 (L0) 0.21 (L1) 0.97 (L1) 614.74 571.43 850.71 1066.84 1343.04 2055.70 0.05 MAGNETOMETER (1) Missing Data For 2Runs..(8%) 9 xxxx-9 10000 59.32 (L0) 0.21 (L1) 0.97 (L1) 614.74 426.25 638.38 800.76 989.99 1285.00 0.22 ACCELEROMETER (1) Missing Data For 2Runs..(8%) 10 xxxx-10 10000 59.32 (L0) 0.21 (L1) 0.97 (L1) 614.74 453.21 792.79 1076.48 1410.04 2236.77 0.13 PCBA (1) Missing Data For 2Runs..(8%) 11 xxxx-11 10000 59.32 (L0) 0.21 (L1) 0.97 (L1) 614.74 254.15 538.13 773.65 1101.58 2038.32 0.33 PCBA (1) Missing Data For 2Runs..(8%) 12 xxxx-12 10000 91.57 (L1) 0.32 (L1) 1.39 (L2) 332.25 223.98 480.48 685.26 969.48 1686.29 0.12 SENSOR (1) Missing Data For 1Runs..(6%) 13 xxxx-13 10000 50 (L0) 0.38 (L1) 1.5 (L2) 19.83 266.49 581.45 923.16 1315.53 2178.79 0.00 POWER SUPPLY 14 xxxx-14 10000 91.57 (L1) 0.32 (L1) 1.39 (L2) 332.25 223.98 480.48 685.26 969.48 1686.29 0.12 SENSOR (1) Missing Data For 1Runs..(6%) 
15 xxxx-15 10000 50 (L0) 0.38 (L1) 1.5 (L2) 19.83 266.49 581.45 923.16 1315.53 2178.79 0.00 POWER SUPPLY 16 xxxx-201 9859 59.32 (L0) 0.21 (L1) 0.97 (L1) 614.74 453.21 792.79 1076.48 1410.04 2236.77 0.13 PCBA (1) Missing Data For 2Runs..(8%) 17 xxxx-1715 9859 59.32 (L0) 0.21 (L1) 0.97 (L1) 614.74 426.25 638.38 800.76 989.99 1285.00 0.22 ACCELEROMETER (1) Missing Data For 2Runs..(8%) 18 xxxx-1869 9859 59.32 (L0) 0.21 (L1) 0.97 (L1) 614.74 426.25 638.38 800.76 989.99 1285.00 0.22 ACCELEROMETER (1) Missing Data For 2Runs..(8%) 19 xxxx-1975 9859 59.32 (L0) 0.21 (L1) 0.97 (L1) 614.74 571.43 851.33 1066.84 1344.01 2055.70 0.05 MAGNETOMETER (1) Missing Data For 2Runs..(8%) 20 xxxx-20 10000 50 (L0) 0.38 (L1) 1.5 (L2) 19.83 123.18 202.70 264.81 316.51 416.86 0.00 BATTERY ASSY
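As a small illustration of how the Results Interpretation Key of Table 5 could be applied to a computed risk, the sketch below maps a probability of failure to a band and flags predictions with more than 30% missing data. The function name and return shape are hypothetical, and assigning the shared boundary value 0.25 to the medium band is an assumption, since the key lists 0.25 in both ranges.

```python
def classify_risk(risk, missing_fraction=0.0):
    """Return (band, uncertain) for a predicted probability of failure.

    Bands follow the Results Interpretation Key of Table 5.
    Assumption: the shared boundary 0.25 is treated as Medium.
    """
    if not 0.0 <= risk <= 1.0:
        raise ValueError("risk must be a probability between 0 and 1")
    if risk > 0.50:
        band = "High"
    elif risk >= 0.25:
        band = "Medium"
    else:
        band = "Low"
    # The key flags predictions built on more than 30% missing data.
    uncertain = missing_fraction > 0.30
    return band, uncertain


# Risk and missing-data values taken from rows 7, 9 and 11 of Table 5.
for part, risk, missing in [(7, 0.49, 0.08), (9, 0.22, 0.08), (11, 0.33, 0.08)]:
    print(part, classify_risk(risk, missing))
```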
Fig. 9: Asset Swapping

The reliability of each component is calculated using the life prediction method described in Kale et al. (2014). The system reliability of the asset is calculated using Equation 2. The baseline reliability of the two assets is shown in Table 6. The first and third columns of the table list the names of the subcomponents that make up each asset; the second and fourth columns show the risk of failure of each individual subcomponent. The part name in each row identifies a unique part serial number present in each asset. For example, Asset1-01 is functionally different from Asset1-02, Asset1-02 is functionally different from Asset1-03, and so on. Part names with the same numerical index represent different samples of the same functional part. For example, Asset1-01 and Asset2-01 are identically manufactured parts that have the same functionality. The last row of the table shows the overall system reliability of each asset, computed using Equation 2.

Table 6: Risk of Individual Sub-Components in Two Assets

Part Name Asset 1 | Risk of Asset 1 | Part Name Asset 2 | Risk of Asset 2
Asset1-01 | 5×10^-4 | Asset2-01 | 5×10^-4
Asset1-02 | 5×10^-4 | Asset2-02 | 5×10^-4
Asset1-03 | 5×10^-4 | Asset2-03 | 0.025
Asset1-04 | 5×10^-4 | Asset2-04 | 5×10^-4
Asset1-05 | 0.051 | Asset2-05 | 5×10^-4
Asset1-06 | 5×10^-4 | Asset2-06 | 5×10^-4
Asset1-07 | 0.076 | Asset2-07 | 0.038
Asset1-08 | 0.001 | Asset2-08 | 5×10^-4
Asset1-09 | 5×10^-4 | Asset2-09 | 0.029
Asset1-10 | 0.031 | Asset2-10 | 5×10^-4
Asset1-11 | 5×10^-4 | Asset2-11 | 5×10^-4
Asset1-12 | 5×10^-4 | Asset2-12 | 5×10^-4
Asset1-13 | 5×10^-4 | Asset2-13 | 0.061
Asset1-14 | 5×10^-4 | Asset2-14 | 0.061
Asset1-15 | 5×10^-4 | Asset2-15 | 5×10^-4
Asset1-16 | 5×10^-4 | Asset2-16 | 5×10^-4
Asset1-17 | 0.030 | Asset2-17 | 0.0287
Asset1-18 | 0.030 | Asset2-18 | 0.0287
Asset1-19 | 0.030 | Asset2-19 | 0.0287
System Reliability | 0.774 | | 0.735

The sum of the system reliabilities of the two assets (calculated by Equation 3) is 1.51.
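Equation 2 is not reproduced in this part of the paper. The sketch below assumes it is the usual series-system form, in which the asset survives only if every subcomponent survives, so that R = Π(1 − p_i), and assumes Equation 3 is the plain sum of the two asset reliabilities. Both forms and all names are illustrative assumptions; the risk values are transcribed from Table 6.

```python
from math import prod

# Risk of failure per subcomponent (slots 01..19), transcribed from Table 6.
ASSET1_RISK = [5e-4, 5e-4, 5e-4, 5e-4, 0.051, 5e-4, 0.076, 0.001, 5e-4, 0.031,
               5e-4, 5e-4, 5e-4, 5e-4, 5e-4, 5e-4, 0.030, 0.030, 0.030]
ASSET2_RISK = [5e-4, 5e-4, 0.025, 5e-4, 5e-4, 5e-4, 0.038, 5e-4, 0.029, 5e-4,
               5e-4, 5e-4, 0.061, 0.061, 5e-4, 5e-4, 0.0287, 0.0287, 0.0287]


def system_reliability(risks):
    """Series-system reliability: product of the part survival probabilities.

    Assumed form of Equation 2; the paper's exact expression is not shown here.
    """
    return prod(1.0 - p for p in risks)


r1 = system_reliability(ASSET1_RISK)
r2 = system_reliability(ASSET2_RISK)
print(f"Asset 1: {r1:.3f}  Asset 2: {r2:.3f}  Sum: {r1 + r2:.3f}")
```

Under these assumptions the computed values land within about 0.005 of the 0.774 and 0.735 reported in Table 6, but not exactly on them, so the paper's Equation 2 may treat the smallest risk entries differently.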
The total system reliability of these two assets is maximized by swapping parts using the simplex linear programming method. The condition for swapping parts is that the system reliability of each asset must be at least 50%. Table 7 shows the results for this optimization.
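The paper solves this problem with the simplex linear programming method, whose formulation is not given here. As a stand-in, the sketch below uses plain exhaustive enumeration of slot-wise swaps, which is feasible because only 11 of the 19 slots in Table 6 carry different risks. It assumes a series-system reliability R = Π(1 − p_i) and a sum-of-reliabilities objective; the function names are hypothetical.

```python
from itertools import product
from math import prod

# Risk of failure per subcomponent (slots 01..19), transcribed from Table 6.
ASSET1_RISK = [5e-4, 5e-4, 5e-4, 5e-4, 0.051, 5e-4, 0.076, 0.001, 5e-4, 0.031,
               5e-4, 5e-4, 5e-4, 5e-4, 5e-4, 5e-4, 0.030, 0.030, 0.030]
ASSET2_RISK = [5e-4, 5e-4, 0.025, 5e-4, 5e-4, 5e-4, 0.038, 5e-4, 0.029, 5e-4,
               5e-4, 5e-4, 0.061, 0.061, 5e-4, 5e-4, 0.0287, 0.0287, 0.0287]


def reliability(risks):
    """Series-system reliability (assumed form of Equation 2)."""
    return prod(1.0 - p for p in risks)


def best_swaps(risk_a, risk_b, min_reliability=0.5):
    """Enumerate slot-wise swaps and keep the one maximizing R_a + R_b.

    Returns (total, R_a, R_b, swapped_slot_indices), or None when no
    assignment keeps both assets at or above min_reliability.
    """
    # Slots with identical risks cannot change the result, so skip them.
    slots = [i for i, (a, b) in enumerate(zip(risk_a, risk_b)) if a != b]
    best = None
    for choice in product((False, True), repeat=len(slots)):
        new_a, new_b = list(risk_a), list(risk_b)
        for do_swap, i in zip(choice, slots):
            if do_swap:
                new_a[i], new_b[i] = new_b[i], new_a[i]
        rel_a, rel_b = reliability(new_a), reliability(new_b)
        if min(rel_a, rel_b) < min_reliability:
            continue  # violates the per-asset reliability constraint
        total = rel_a + rel_b
        if best is None or total > best[0]:
            swapped = [i for do_swap, i in zip(choice, slots) if do_swap]
            best = (total, rel_a, rel_b, swapped)
    return best


total, rel_a, rel_b, swapped = best_swaps(ASSET1_RISK, ASSET2_RISK)
print(f"R1={rel_a:.3f}  R2={rel_b:.3f}  sum={total:.3f}")
print("swapped slots:", [f"{i + 1:02d}" for i in swapped])
```

Because the objective is a sum of products, the search ends up concentrating the lower-risk part of every slot in one asset, subject to the 50% floor on the other. Enumeration grows as 2^n in the number of differing slots, so it is only a checking aid for small cases, not a replacement for the linear programming approach.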
Table 7: Risk of Individual Sub-Components in Two Assets after Swapping Sub-Components Between Them to Maximize Cumulative System Reliability

Part Name Asset 1 | Risk of Asset 1 | Part Name Asset 2 | Risk of Asset 2 | Comments
Asset1-01 | 5×10^-4 | Asset2-01 | 5×10^-4 |
Asset1-02 | 5×10^-4 | Asset2-02 | 5×10^-4 |
Asset1-03 | 5×10^-4 | Asset2-03 | 0.025 |
Asset1-04 | 5×10^-4 | Asset2-04 | 5×10^-4 |
Asset2-05 | 5×10^-4 | Asset1-05 | 0.051 | Swap
Asset1-06 | 5×10^-4 | Asset2-06 | 5×10^-4 |
Asset2-07 | 0.038 | Asset1-07 | 0.076 | Swap
Asset2-08 | 5×10^-4 | Asset1-08 | 0.00055 | Swap
Asset1-09 | 5×10^-4 | Asset2-09 | 0.03 |
Asset2-10 | 5×10^-4 | Asset1-10 | 0.03 | Swap
Asset1-11 | 5×10^-4 | Asset2-11 | 5×10^-4 |
Asset1-12 | 5×10^-4 | Asset2-12 | 5×10^-4 |
Asset1-13 | 5×10^-4 | Asset2-13 | 0.061 |
Asset1-14 | 5×10^-4 | Asset2-14 | 0.061 |
Asset1-15 | 5×10^-4 | Asset2-15 | 5×10^-4 |
Asset1-16 | 5×10^-4 | Asset2-16 | 5×10^-4 |
Asset2-17 | 0.0287 | Asset1-17 | 0.03 | Swap
Asset2-18 | 0.0287 | Asset1-18 | 0.03 | Swap
Asset2-19 | 0.0287 | Asset1-19 | 0.03 | Swap
System Reliability | 0.881 | | 0.646 |

Table 7 shows that the combined reliability of the two assets can be maximized by swapping parts between them. The fifth column of Table 7 shows which parts were swapped between the two assets. As a result of the swaps, the reliability of the first asset increases to 88% from a baseline value of 77%, and the reliability of the second asset decreases to 64% from a baseline value of 73%.

A second scenario is presented in which additional new spare parts (assumed to have a risk of 0) are available to replace Asset-07, Asset-17, Asset-18 and Asset-19. The cumulative system reliability of the two assets is maximized by swapping parts between them and utilizing the additional spare parts. The results of this optimization are shown in Table 8. The overall reliability of the assets can be enhanced by optimally utilizing the existing subcomponents in the tool.
Cost-based optimization can bring the economics of spares and repairs into the decision-making process to determine which components must be replaced and which can remain in service. A cost-of-failure factor may be added to decide the optimal maintenance interval and the level of repairs and replacements from the calculated risk of failure and the cost of maintenance.
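The paper does not specify a cost model. One simple way to picture the idea is an expected-cost comparison: replace a part when its risk of failure multiplied by the cost of a field failure exceeds the cost of replacing it. The rule, the function name and the cost figures below are all hypothetical and serve only to illustrate the trade-off.

```python
def should_replace(risk, cost_of_failure, cost_of_replacement):
    """Return True when the expected failure cost exceeds the replacement cost.

    Hypothetical decision rule, not the paper's method:
    replace if risk * cost_of_failure > cost_of_replacement.
    """
    if not 0.0 <= risk <= 1.0:
        raise ValueError("risk must be a probability between 0 and 1")
    return risk * cost_of_failure > cost_of_replacement


# Illustrative cost figures only (arbitrary currency units).
COST_OF_FAILURE = 100_000.0
COST_OF_REPLACEMENT = 20_000.0

for risk in (0.49, 0.22, 0.05):
    decision = should_replace(risk, COST_OF_FAILURE, COST_OF_REPLACEMENT)
    print(f"risk={risk:.2f} -> replace: {decision}")
```

A fuller treatment would also account for the maintenance interval and repair level mentioned above, which this one-line rule leaves out.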
Table 8: Risk of Individual Sub-Components in Two Assets after Swapping Sub-Components between Them and Utilizing Additional Spare Parts to Maximize Cumulative System Reliability

Part Name Asset 1 | Risk of Asset 1 | Part Name Asset 2 | Risk of Asset 2 | Comments
Asset1-01 | 5×10^-4 | Asset2-01 | 5×10^-4 |
Asset1-02 | 5×10^-4 | Asset2-02 | 5×10^-4 |
Asset2-03 | 0.025 | Asset1-03 | 5×10^-4 | Swap
Asset1-04 | 5×10^-4 | Asset2-04 | 5×10^-4 |
Asset1-05 | 0.051 | Asset2-05 | 5×10^-4 |
Asset1-06 | 5×10^-4 | Asset2-06 | 5×10^-4 |
Asset2-07 | 0.038 | Asset1-07 | 0.000 | New part/Swap
Asset1-08 | 0.001 | Asset2-08 | 5×10^-4 |
Asset2-09 | 0.029 | Asset1-09 | 5×10^-4 | Swap
Asset1-10 | 0.031 | Asset2-10 | 5×10^-4 |
Asset1-11 | 5×10^-4 | Asset2-11 | 5×10^-4 |
Asset1-12 | 5×10^-4 | Asset2-12 | 5×10^-4 |
Asset2-13 | 0.061 | Asset1-13 | 5×10^-4 | Swap
Asset2-14 | 0.061 | Asset1-14 | 5×10^-4 | Swap
Asset1-15 | 5×10^-4 | Asset2-15 | 5×10^-4 |
Asset1-16 | 5×10^-4 | Asset2-16 | 5×10^-4 |
Asset2-17 | 0.0287 | Asset1-17 | 0.000 | New part/Swap
Asset2-18 | 0.0287 | Asset1-18 | 0.000 | New part/Swap
Asset2-19 | 0.0287 | Asset1-19 | 0.000 | New part/Swap
System Reliability | 0.676 | | 0.999 |

Conclusion

The paper presents how maintenance plans and reliability for drilling tools can be improved, while reducing cost, by taking advantage of the forecasting capability that lifetime prediction provides. The improvements ultimately help prevent costly failures in the field. Lifetime prediction supports risk-informed decisions and drives the maintenance of drilling tools by letting the data show where improvements or changes must be made. This shift in standard practice leads to lower maintenance costs without sacrificing reliability. Improved sparing forecasts are an additional benefit of this methodology because higher-risk parts are more easily identified.

Future work will focus on refining model predictions by using additional environmental variables, incorporating other statistical methodologies and integrating data from design and qualification tests to optimize drilling performance.
Acknowledgements

The authors thank Baker Hughes for permitting them the chance to work on such a trailblazing methodology.

Nomenclature

Drillhrs = Drilling hours
BHA = Bottomhole assembly
FRACAS = Failure Reporting and Corrective Action System
HALT = Highly accelerated life test
HAST = Highly accelerated stress test
MLE = Maximum likelihood estimation
NPT = Nonproductive time
PCBA = Printed circuit board assembly
PHM = Prognostics and health management
PoF = Probability of failure
PN = Part number
RIDM = Risk-informed decision making
RPM = Revolutions per minute
SN = Serial number
F = Failure
L = Lateral vibration
M_i = ith model identifier
N = Symbol used to represent negative decision, generally “no” or “0”
S = Symbol used to represent stick-slip or suspensions
T = Temperature
X = Vector of parameters such as temperature and vibrations
Y = Symbol used to represent affirmative decision, generally “yes” or “1”
f = Probability density function
m = Number of models
n = Number of records
p = Probability
p(a|b) = Conditional probability of occurrence of event a provided b is true
revid = Revision identifier
t_f = Time to failure (drilling hours)
w_i = Weight of ith data point
x_ave = Average value of parameter x
x_stdev = Standard deviation of parameter x
α = Calibration parameters of reliability model
= Likelihood
η = Characteristic life or scale factor of a probability distribution
β = Shape factor of a probability distribution
σ = Standard deviation
λ = Hazard function
{CF} = Set of life data for confirmed failure
{O} = Set of outliers
{S} = Set of life data for suspension
{UF} = Set of life data for unconfirmed failure

Load, Stress and Severity are used interchangeably to describe the impact of the operational environment (mechanical and thermal) on the durability of parts.

Nominal part is a representative part that has a life equal to the average of several parts produced using the same manufacturing process and operating under identical conditions.

Suspensions are used in reliability modeling to represent hours accumulated on parts that are in operation or removed from service for reasons other than failure.

Bibliography

Barker, D., Dasgupta, A., and Pecht, M. (1992, February). PWB solder joint life calculations under thermal and vibrational loading. Journal of the IES, 35(1), 17-25.
Chatterjee, K., Modarres, M., and Bernstein, J. (2012). Fifty Years of Physics of Failure. Journal of Reliability Information Analysis Center.
Duffek, D. (2004). Effect of Combined Thermal and Mechanical Loading on the Fatigue of Solder Joints. Master's thesis, University of Notre Dame, Notre Dame.
Garvey, D. R., Baumann, J., Lehr, J., and Hines, J. (2009). Pattern Recognition Based Remaining Useful Life Estimation of Bottom Hole Assembly Tools. SPE/IADC Drilling Conference and Exhibition. Amsterdam, The Netherlands.
Dantzig, G. B., and Thapa, M. N. (1997). Linear Programming 1: Introduction. Springer-Verlag.
Kale, A. A., Carter-Journet, K., Heuermann-Kuehn, L., Falgout, T., and Zurcher, D. (2014). A Probabilistic Approach to Reliability and Life Prediction of Electronics in Drilling and Evaluation Tools.
Litt, J., Soditus, S., Hendricks, R., and Zaretsky, E. (2001). Structural Life and Reliability Metrics Benchmarking and Verification of Probabilistic Life Prediction Codes. 5th Annual FAA/Air Force/NASA/Navy Workshop.
Mishra, S., and Pecht, M. (2002). In-situ Sensors for Product Reliability Monitoring. Proceedings of SPIE, 4755, pp. 10-19.
Murty, K. G. (1983). Linear Programming. New York: John Wiley and Sons Inc. pp. xix+482. ISBN 0-471-09725-X. MR 720547.
Reich, M. (2004). The Fascinating World of Drilling Technology: Products from Baker Hughes and their Functions. Celle: Baker Hughes.
Tuchband, B. A. (2007). Implementation of Prognostics and Health Management for Electronic Systems. College Park: University of Maryland.
Appendix A

A. General Log-Linear Model

The relation between characteristic life and the stress variables is represented using one of three models: general log-linear (GLL), proportional hazard (PH) and cumulative damage (CD). The GLL model represents life using Equation A-1

$$\eta(\bar{X}) = \exp\left(\alpha_0 + \sum_i \alpha_{T,i}\,T_i + \sum_j \alpha_{L,j}\,L_j + \sum_k \alpha_{S,k}\,S_k\right) \tag{A-1}$$

where X̄ = {T, L, S} and the three sums run over the temperature, lateral vibration and stick-slip terms, respectively. For a Weibull distribution, the probability density function is shown in Equation A-2, where β is the shape parameter, η is the scale parameter and the α's are unknown parameters calculated from field data using the maximum likelihood estimation technique.

$$f(t,\bar{X}) = \frac{\beta}{\eta(\bar{X})}\left(\frac{t}{\eta(\bar{X})}\right)^{\beta-1}\exp\left[-\left(\frac{t}{\eta(\bar{X})}\right)^{\beta}\right] \tag{A-2}$$

The probability density function (PDF) for an exponential distribution is obtained by setting β = 1 in Equation A-2. For a lognormal distribution, the probability density function for a GLL stress function is shown in Equation A-3

$$f(t,\bar{X}) = \frac{1}{t\,\sigma\sqrt{2\pi}}\exp\left[-\frac{1}{2}\left(\frac{\ln(t)-\ln\eta(\bar{X})}{\sigma}\right)^{2}\right] \tag{A-3}$$

B. Proportional Hazard Model

For a proportional hazard model, the hazard rate of a component is affected by hours in operation and by the stress variables. The instantaneous hazard rate of a part is given by Equation A-4

$$\lambda(t,\bar{X}) = \frac{f(t,\bar{X})}{R(t,\bar{X})} = \lambda_0(t)\,\eta(\bar{X},\bar{\alpha}) \tag{A-4}$$

where f is the probability density function and R is the reliability function. The baseline hazard rate, λ0, is a function of time only, and the stress function, η, is a function of operating stresses such as temperature and vibration. The vector of unknown model parameters, ᾱ, is obtained by calibrating the model to test data using maximum likelihood estimation (MLE). The stress function, η, is given by Equation A-5

$$\eta(\bar{X},\bar{\alpha}) = \exp\left(\alpha_0 + \sum_i \alpha_{T,i}\,T_i + \sum_j \alpha_{L,j}\,L_j + \sum_k \alpha_{S,k}\,S_k\right) \tag{A-5}$$

Substituting Equation A-5 into Equation A-4 for a Weibull distribution (Equation A-2), the hazard function is written as Equation A-6

$$\lambda(t,\bar{X}) = \beta\,t^{\beta-1}\exp\left(\alpha_0 + \sum_i \alpha_{T,i}\,T_i + \sum_j \alpha_{L,j}\,L_j + \sum_k \alpha_{S,k}\,S_k\right) \tag{A-6}$$

C. Cumulative Damage Model

The cumulative damage model incorporates the effect of time-varying stress on the life of components. The model takes into account the impact of the damage accumulated at each stress level on the reliability of parts.
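For constant stress, the Weibull relations of Sections A and B can be sketched numerically as follows. The characteristic life is taken as the exponential of a linear combination of the stresses, which is the standard log-linear form and is assumed here to match Equation A-1; the coefficient values are invented purely for illustration and are not calibrated to any data.

```python
import math


def characteristic_life(stress, alpha0, alpha):
    """Log-linear characteristic life: eta = exp(alpha0 + sum_j alpha_j * X_j).

    Assumed standard GLL form; `stress` and `alpha` share the same keys.
    """
    return math.exp(alpha0 + sum(alpha[k] * stress[k] for k in alpha))


def weibull_pdf(t, eta, beta):
    """Weibull probability density with scale eta and shape beta."""
    return (beta / eta) * (t / eta) ** (beta - 1) * math.exp(-((t / eta) ** beta))


def weibull_reliability(t, eta, beta):
    """Weibull reliability R(t) = exp(-(t / eta) ** beta)."""
    return math.exp(-((t / eta) ** beta))


def weibull_hazard(t, eta, beta):
    """Instantaneous hazard rate lambda(t) = f(t) / R(t)."""
    return weibull_pdf(t, eta, beta) / weibull_reliability(t, eta, beta)


# Hypothetical coefficients and stresses (T in deg C, L in g_RMS, S unitless).
alpha = {"T": -0.01, "L": -0.5, "S": -1.0}
stress = {"T": 100.0, "L": 1.0, "S": 0.3}
eta = characteristic_life(stress, alpha0=8.0, alpha=alpha)
risk_at_300h = 1.0 - weibull_reliability(300.0, eta, beta=1.5)
print(f"eta = {eta:.1f} h, probability of failure by 300 h = {risk_at_300h:.3f}")
```

Setting `beta=1.0` in `weibull_pdf` reduces it to the exponential density, and `weibull_hazard` reproduces the ratio f/R of Equation A-4.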
Damage accumulation can take place at various rates for various stress levels and can be determined using the linear damage sum (Miner’s rule), the inverse power law or cycle counting techniques such as rainflow counting. The cumulative damage model used in this paper is based on Miner’s rule, which rests on the hypothesis that if there are n different stress levels and the time to failure at the ith stress level σ_i is T_{f,i}, then the damage fraction, p, is given by Equation A-7

$$p = \sum_{i=1}^{n} \frac{t_i}{T_{f,i}} \tag{A-7}$$

where t_i is the number of cycles accumulated at stress σ_i, and failure occurs when the damage fraction equals unity. The probability distribution functions for the Weibull and lognormal distributions are obtained by substituting Equation A-7 into Equations A-2 and A-3, respectively. Given the stress variables X̄ = {T, L, S}, the PDF for a Weibull distribution is given by

$$I(t,\bar{X}) = \int_0^t \frac{du}{\eta(\bar{X}(u))}$$

$$f(t,\bar{X}) = \frac{\beta}{\eta(\bar{X}(t))}\left(I(t,\bar{X})\right)^{\beta-1}\exp\left[-\left(I(t,\bar{X})\right)^{\beta}\right] \tag{A-8}$$

where I(t, X̄) is the damage accumulated up to time t under the time-varying stress X̄(u), and η is the characteristic life of Equation A-1.

D. Characteristic Life Function

The life characteristic function describes a general relation between failure time and stress levels. The life characteristic can be any time-to-failure measure, such as the mean, median or hazard rate, that represents a bulk property of a probability distribution. Ideally, the function must incorporate the governing equations that represent the physical phenomenon of degradation of the material under the application of load. Typical electronic circuit boards used in drilling and evaluation tools are complex, and the governing equations representing degradation and failure mechanisms are difficult to model; therefore, the paper evaluates several empirical functions between stress variables and selects the one that best fits the field data.
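Returning to the cumulative damage model of Section C, Miner's rule (Equation A-7) amounts to summing the fraction of life consumed at each stress level. A minimal sketch follows; the exposure history and times to failure are made-up numbers, and the helper for remaining life is a hypothetical convenience that simply assumes the part stays at one stress level from now on.

```python
def damage_fraction(exposure):
    """Miner's rule: sum of t_i / Tf_i over the stress levels seen so far.

    `exposure` is a list of (t_i, Tf_i) pairs, where t_i is the time (or
    cycles) accumulated at stress level i and Tf_i is the time to failure
    at that level.
    """
    return sum(t / tf for t, tf in exposure)


def remaining_life(damage, time_to_failure):
    """Time left at a single stress level before the damage sum reaches 1.

    Hypothetical helper: assumes all future operation occurs at the stress
    level whose time to failure is `time_to_failure`.
    """
    return max(0.0, (1.0 - damage) * time_to_failure)


# Illustrative history: 100 h at a level with Tf = 400 h, 50 h at Tf = 200 h.
history = [(100.0, 400.0), (50.0, 200.0)]
p = damage_fraction(history)
print(f"damage fraction p = {p:.2f}")
print(f"remaining life at a level with Tf = 300 h: {remaining_life(p, 300.0):.0f} h")
```

Failure is predicted when the damage fraction reaches unity, so a part with p = 0.5 has consumed half of its life regardless of the order in which the stress levels were applied, which is the defining simplification of the linear damage sum.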