This document discusses using Tableau and R integration to perform Weibull analysis. It provides an overview of Weibull reliability analysis and the bathtub curve. It then demonstrates how to set up R scripts in Tableau to calculate Weibull parameters like beta, eta, survival and failure probabilities, and confidence bands. Plots of survival data can then be created in Tableau using these R-calculated values. Links are also provided to download the necessary R packages and Tableau.
Weibull using Tableau + R Integration
1. Weibull using Tableau + R Integration
18-JUL-17
Monica Willbrand
Tableau Professional Services
2. Relevant Background:
• 3 years Tableau Professional Services
+ Tableau Desktop Qualified Associate +
+ Tableau Server Qualified Associate +
• 8 years Semiconductor industry
3. Agenda
• Talk about the tools we will use
• Brief overview of Weibull analysis
• Walk through process and calculations
• If we DON’T get through it, the organizers will have this deck and the Tableau workbook
4. What is Tableau?
• Visualization software
• On a mission to help people see and
understand data
• 2017 Gartner Magic Quadrant for BI and
Analytics: 5 years a leader
10. R + R Server, and a couple of packages for Weibull
• Get R @ https://www.r-project.org/
• Get Rserve package
• Get packages invoked by Rscript:
– flexsurv
– plyr
• Download and unzip in your R library, e.g.
Program Files\R\R-3.3.3\library
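As an alternative to downloading and unzipping by hand, the same setup can usually be done from the R console. This is a minimal sketch, assuming the machine can reach CRAN and that Rserve runs on the same machine as Tableau Desktop:

```r
# One-time setup: install the packages that the Tableau scripts load.
install.packages(c("Rserve", "flexsurv", "plyr"))

# Start the Rserve process that Tableau connects to.
library(Rserve)
Rserve()  # listens on port 6311 by default
```

Tableau is then pointed at that host and port in its R connection settings.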
14. Rscript in Tableau for R(t)
• R(t) = SCRIPT_REAL('
library(flexsurv);
library(plyr);
input <- data.frame(time = .arg1);
S <- Surv(.arg1, .arg2);
fit <- flexsurvreg(S ~ 1, dist = "weibull");
s <- summary(fit, cl = 0.8, tidy = TRUE, type = .arg3[1]);
res <- join(input, s, by = "time");
res$est',
AVG([Time In Field]), ATTR([Failure Flag]), "survival")
arg1 = Time in Field
arg2 = Failure Flag
arg3 = survival
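To check the Tableau result in the R console, the same fit can be run on a small data frame. This is a sketch only: the `field` data and the `surv_tab` name are made up for illustration, and the block does nothing if flexsurv is not installed.

```r
# Hypothetical field data: time in field and a 0/1 failure flag (1 = failed).
field <- data.frame(
  time = c(120, 340, 560, 610, 800, 950, 1100, 1300),
  fail = c(1, 1, 0, 1, 1, 0, 1, 1)
)

if (requireNamespace("flexsurv", quietly = TRUE)) {
  library(flexsurv)  # also attaches survival, which provides Surv()
  fit <- flexsurvreg(Surv(time, fail) ~ 1, data = field, dist = "weibull")
  # Same call shape as the Tableau script: survival curve with an 80% interval.
  surv_tab <- summary(fit, t = field$time, cl = 0.8,
                      type = "survival", tidy = TRUE)
  print(surv_tab)  # columns: time, est, lcl, ucl
}
```

The `est`, `lcl` and `ucl` columns are the values the three Tableau scripts return one at a time.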
15. Confidence bands
• Create parameter in Tableau
– Data type: Float
– Allowable values: Range
– Enter range of values
• Rscript for each band
16. R(t) lower interval
SCRIPT_REAL('
library(flexsurv);
library(plyr);
input <- data.frame(time = .arg1);
S <- Surv(.arg1, .arg2);
fit <- flexsurvreg(S ~ 1, dist = "weibull");
s <- summary(fit, cl = .arg3[1], tidy = TRUE, type = .arg4[1]);
res <- join(input, s, by = "time");
res$lcl',
AVG([Time In Field]), ATTR([Failure Flag]), [interval range], "survival")
arg1 = Time in Field
arg2 = Failure Flag
arg3 = interval range
arg4 = survival
17. R(t) upper interval
SCRIPT_REAL('
library(flexsurv);
library(plyr);
input <- data.frame(time = .arg1);
S <- Surv(.arg1, .arg2);
fit <- flexsurvreg(S ~ 1, dist = "weibull");
s <- summary(fit, cl = .arg3[1], tidy = TRUE, type = .arg4[1]);
res <- join(input, s, by = "time");
res$ucl',
AVG([Time In Field]), ATTR([Failure Flag]), [interval range], "survival")
arg1 = Time in Field
arg2 = Failure Flag
arg3 = interval range
arg4 = survival
18. Calc for F(t)
• F(t)=1-R(t) = Q(t) = Unreliability over time
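The relation between R(t) and F(t) can be checked in base R without any packages. A minimal sketch, with made-up values for beta and eta:

```r
# Hypothetical Weibull parameters: shape (beta) and scale (eta).
beta <- 1.5
eta  <- 1000
t    <- c(100, 500, 1000, 2000)

# Reliability R(t) = exp(-(t/eta)^beta); unreliability F(t) = 1 - R(t).
R_t <- exp(-(t / eta)^beta)
F_t <- 1 - R_t

# Cross-check against base R's Weibull CDF, which uses the same shape/scale form.
stopifnot(isTRUE(all.equal(F_t, pweibull(t, shape = beta, scale = eta))))

print(data.frame(t, R_t, F_t))
```

At t = eta the unreliability is 1 - exp(-1), about 0.632, whatever the value of beta.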
19. Beta calc
SCRIPT_REAL('
library(flexsurv);
S <- Surv(.arg1, .arg2);
fit <- flexsurvreg(S ~ 1, dist = "weibull");
shape <- fit$res[1];
shape',
AVG([Time In Field]), ATTR([Failure Flag]))
arg1 = Time in Field
arg2 = Failure Flag
20. Eta calc
SCRIPT_REAL('
library(flexsurv);
S <- Surv(.arg1, .arg2);
fit <- flexsurvreg(S ~ 1, dist = "weibull");
scale <- fit$res[2];
scale',
AVG([Time In Field]), ATTR([Failure Flag]))
arg1 = Time in Field
arg2 = Failure Flag
21. Plotting data points. Some calcs.
• Censored?=
IF [Failure Suspension Flag] = 'S' THEN 1 ELSE 0 END
• Cohort Data=
IF SUM([Censored?]) = 1 THEN NULL ELSE
([Adj Median Rank]-.3)/(TOTAL(SUM([Number of Records]))+.4)
END
• Adj Median Rank=
IF SUM([Censored?]) = 1 THEN PREVIOUS_VALUE(0)
ELSE
(([Inverse Rank] * PREVIOUS_VALUE(0))+(TOTAL(SUM([Number of Records]))+1))/([Inverse Rank]+1)
END
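The table calcs above can be traced by hand in R. This is a sketch on five made-up units, where the `prev` variable stands in for PREVIOUS_VALUE(0) and the data frame name `units` is hypothetical:

```r
# Hypothetical units sorted by time in field; censored = 1 marks a suspension.
units <- data.frame(
  time     = c(100, 200, 300, 400, 500),
  censored = c(0, 1, 0, 0, 1)
)
n <- nrow(units)
inverse_rank <- n:1  # n for the earliest unit, 1 for the latest

adj_rank <- numeric(n)
prev <- 0  # plays the role of PREVIOUS_VALUE(0)
for (i in seq_len(n)) {
  if (units$censored[i] == 0) {
    prev <- (inverse_rank[i] * prev + (n + 1)) / (inverse_rank[i] + 1)
  }
  adj_rank[i] <- prev  # suspensions carry the previous adjusted rank forward
}

# Median rank estimate for failures; suspensions get NA and are not plotted.
median_rank <- ifelse(units$censored == 1, NA, (adj_rank - 0.3) / (n + 0.4))
print(cbind(units, inverse_rank, adj_rank, median_rank))
```

For this data the adjusted ranks come out as 1, 1, 2.25, 3.5, 3.5, so only the three failures get a plotted point.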
24. Review
• Tools to perform Weibull
– Tableau
– R Integration
– Scripts
– Reference links in the slide notes
25. Links:
• Download Tableau
• Get R
• Download packages:
– https://cran.r-project.org/web/packages/Rserve/index.html
– https://cran.r-project.org/web/packages/flexsurv/index.html
– https://cran.r-project.org/web/packages/plyr/index.html
• 2017 Gartner Magic Quadrant for BI and Analytics: 5 years a leader
Assumes general knowledge of reliability, Tableau, and R
AWESOME ability to execute, completeness of vision
https://public.tableau.com/en-us/s/gallery
Wonderful online community https://community.tableau.com/welcome
>100k users
Tableau + R
https://www.tableau.com/learn/whitepapers/using-r-and-tableau
What is Rserve? https://www.rforge.net/Rserve/
Rserve is a TCP/IP server which allows other programs to use facilities of R (see www.r-project.org) from various languages without the need to initialize R or link against R library. Every connection has a separate workspace and working directory. Client-side implementations are available for popular languages such as C/C++, PHP and Java. Rserve supports remote connection, authentication and file transfer. Typical use is to integrate R backend for computation of statistical models, plots etc. in other applications.
Think about the trip up today: MTBF for switching gear
Care about these coefficients
They tell us about our population, our failure rate
Infant mortality (IM) when beta < 1, a result of defects, design, or assembly
Normal life when beta = 1, constant/random failures, e.g. “stress exceeds strength”
End-of-life wear-out when beta > 1
Talking about beta values, you may be thinking of the bathtub curve
Kaplan Meier method from OCatherin: https://community.tableau.com/thread/171437
Instruction
Workbook
Validation method using R
- Relative failure rate of an entire population of products, two coefficients
β and η determine when a given portion of the population will fail
Intercept (eta) & Beta [shape]
$F(t) = 1 - R(t)$
$R(t) = e^{-(t/\eta)^{\beta}}$
Non-R: https://community.tableau.com/thread/171437
If infant mortality, highly accelerated stress testing
Identify failing components, RCA for defectivity or material variation
--R console feels like command line, not interactive
--KM uses Max Likelihood Estimate (MLE) to get the parameter
The KM method uses MLE, whereas Weibull uses LSE (least squares estimate). With Weibull we get beta and eta values that match what one would obtain from the Reliasoft program (validated w/ client). Ultimately, SE is much lower w/ Weibull. To do this in Tableau, we must integrate with R server.
Workbook outcome, presenting in Powerpoint, not Tableau
$F(t) = 1 - R(t)$
F(t) = unreliability
$R(t) = e^{-(t/\eta)^{\beta}}$
logarithmic scale
Show failure points
Overview of connection between Tableau + RServe
Script
See in R console, it’s fun, kinda visual
In Tableau, beautifully executed, very visual, interactive Weibull, CI bands
Download & install R console, libraries
- Bell Labs, 1976
Statistical analysis, modeling
Programming, based off of S
R, 1993, Ross Ihaka and Robert Gentleman
Open source, free!
Verifiable, e.g. give someone a drug, claim improvement in survivability
Anyone can contribute stats packages
NIST resource for Weibull: http://itl.nist.gov/div898/handbook/apr/section4/apr413.r
Point Desktop client to Rserve
Commands to point Tableau Server to Rserve: http://kb.tableau.com/articles/HowTo/configuring-tableau-server-for-r-and-rserve?userSource=1
Data required:
Time in field
Failure Flag
ID for patient/part/widget
F(t) = 1 - R(t) = Q(t) = Unreliability over time
arg1 = response variable
arg2 = censor flag
The "survival" argument can be replaced with "hazard" or "cumhaz" (cumulative hazard), depending on the curve we want to plot
Parameterize and swap on the fly
http://www.itl.nist.gov/div898/handbook/eda/section3/eda362.htm#CDF
Hazard: the hazard function is the ratio of the probability density function to the survival function
cl: width of symmetric confidence intervals for maximum likelihood estimates, 0.95 by default in the flexsurv package
- Add an argument for the confidence interval range, fed by a Tableau parameter
https://cran.r-project.org/web/packages/flexsurv/flexsurv.pdf
Shape parameter of the Weibull distribution, beta (β), represents the failure rate behavior.
If beta is less than 1, then the failure rate decreases with time;
If beta is greater than 1, then the failure rate increases with time.
If beta is equal to 1, then the failure rate is constant.
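The three beta regimes can be seen by computing the hazard as density over survival in base R. A minimal sketch: `weibull_hazard` is a helper written for this example, and the eta and beta values are made up.

```r
# Weibull hazard h(t) = f(t) / R(t), built from base R density and CDF.
weibull_hazard <- function(t, beta, eta) {
  dweibull(t, shape = beta, scale = eta) /
    (1 - pweibull(t, shape = beta, scale = eta))
}

t   <- c(100, 500, 1000)
eta <- 1000  # hypothetical scale parameter

h_infant  <- weibull_hazard(t, beta = 0.5, eta = eta)  # beta < 1
h_random  <- weibull_hazard(t, beta = 1.0, eta = eta)  # beta = 1
h_wearout <- weibull_hazard(t, beta = 2.0, eta = eta)  # beta > 1

stopifnot(all(diff(h_infant) < 0))                        # failure rate decreases
stopifnot(isTRUE(all.equal(h_random, rep(1 / eta, 3))))   # failure rate is constant
stopifnot(all(diff(h_wearout) > 0))                       # failure rate increases
```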
http://www.itl.nist.gov/div898/handbook/apr/section2/apr221.htm
For each time ti of the i-th failure, calculate the CDF or percentile estimate using 100(i−0.3)/(n+0.4).
http://reliawiki.org/index.php/Parameter_Estimation
Least squares (rank regression): minimize the vertical deviation from the line
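Rank regression can be sketched in base R by linearizing the Weibull CDF and fitting a straight line. The failure times below are made up and assumed complete (no suspensions), so plain median ranks are used:

```r
# Hypothetical complete (uncensored) failure times, sorted ascending.
time <- c(150, 320, 480, 610, 790, 1020)
n    <- length(time)

# Median-rank estimate of F(t_i): (i - 0.3) / (n + 0.4).
F_hat <- (seq_len(n) - 0.3) / (n + 0.4)

# Linearized Weibull CDF: log(-log(1 - F)) = beta * log(t) - beta * log(eta).
x <- log(time)
y <- log(-log(1 - F_hat))

# Ordinary least squares on y minimizes the vertical deviations from the line.
fit  <- lm(y ~ x)
beta <- unname(coef(fit)["x"])
eta  <- exp(-unname(coef(fit)["(Intercept)"]) / beta)

print(c(beta = beta, eta = eta))
```

The slope of the fitted line is beta, and eta is the time at which the line crosses y = 0.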
Table calcs compute using Specific dimensions:
Time in Field
ID (e.g. Serial Number or Patient ID)
At the level: Deepest
Restarting every: None
Sort order: Specific Dimensions (default)
Combined field to create path/lines from point-to-point for Confidence Bands
Hmmmm, beta < 1... The failure rate decreases over time
Infant mortality failures
Stress testing of product
Maybe our gear wouldn’t have been problematic this a.m. (Amtrak!)