Using R for Statistical Training: An Application to Six Sigma Methodology for Process Improvement.
17/04/2012
Emilio L. Cano, Andrés Redchuk and Javier M. Moguerza
Departamento de Estadística e Investigación Operativa
Universidad Rey Juan Carlos (Madrid)
XXXIII Congreso Nacional de Estadística e Investigación Operativa
SEIO 2012

Contents
1 Statistical Training
   The Problem
   Approaches
2 The R Choice
   The R framework
   Sweave
3 Application
   Six Sigma
   Examples
   Environments

The Problem
Elements of Statistical Training

Copy-paste Approach
Approaches
Inconsistencies
Errors
Out-of-date
Non-reproducible
Painful changes

Reproducible Research Approach
Approaches
Reproducible Research
The goal of reproducible research is to tie specific instructions to data analysis and experimental data so that scholarship can be recreated, better understood and verified.
Literate Programming
Literate programming is a methodology that combines a programming language with a documentation language.
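
As a minimal sketch of the literate programming idea, the snippet below stores prose and code in one source file and then extracts ("tangles") only the code. It uses R's Sweave syntax, which this talk covers in a later slide; the file name demo.Rnw and the sample measurements are hypothetical and chosen only for illustration.

```r
# A literate source: LaTeX prose plus one R chunk (contents are illustrative)
src <- c(
  "\\documentclass{article}",
  "\\begin{document}",
  "We estimate the process mean from four measurements.",
  "<<process-mean>>=",
  "x <- c(9.8, 10.1, 10.0, 10.3)",
  "mean(x)",
  "@",
  "\\end{document}"
)
workdir <- tempfile("literate-demo")
dir.create(workdir)
writeLines(src, file.path(workdir, "demo.Rnw"))

# Stangle() "tangles" the document: it writes the code chunks to demo.R
# in the current working directory, leaving the prose behind
old <- setwd(workdir)
utils::Stangle("demo.Rnw", quiet = TRUE)
setwd(old)

code_lines <- readLines(file.path(workdir, "demo.R"))
print(code_lines)
```

The same source can also be "woven" into a typeset document, so prose and analysis never drift apart.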

Reproducible Research
Workflow

The R System
Choosing R
What is R?
R is a language and environment for statistical computing and graphics.
Open Source
Platform independent
Huge community
Extensible
3 730 available packages
http://www.r-project.org

LaTeX, Beamer, PDF
Choosing R
LaTeX
LaTeX is a high-quality typesetting system; it includes features designed for the production of technical and scientific documentation.
Beamer
Beamer is a LaTeX class for creating presentations that are held using a projector, but it can also be used to create transparency slides.
LaTeX files can easily be converted to PDF.

Sweave Documents
An Efficient Framework
Sweave
A Sweave document is a plain-text file which merges LaTeX code and R code. The R function Sweave() converts the Sweave document (*.Rnw) into a LaTeX file (*.tex). The code chunks are executed and the results embedded into the LaTeX file.
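
The conversion step described above can be tried with a few lines of R. This is a minimal sketch, not the authors' material: the file name report.Rnw, the chunk label and the data are hypothetical.

```r
# Write a tiny Sweave document (*.Rnw): LaTeX text with one R code chunk
rnw <- c(
  "\\documentclass{article}",
  "\\begin{document}",
  "The average of the sample is computed by R:",
  "<<average, echo=TRUE>>=",
  "sample_values <- c(1, 2, 3, 4)",
  "mean(sample_values)",
  "@",
  "\\end{document}"
)
workdir <- tempfile("sweave-demo")
dir.create(workdir)
writeLines(rnw, file.path(workdir, "report.Rnw"))

# Sweave() executes the chunk and writes report.tex to the working directory
old <- setwd(workdir)
utils::Sweave("report.Rnw", quiet = TRUE)
setwd(old)

tex_path <- file.path(workdir, "report.tex")
tex_lines <- readLines(tex_path)
print(tex_lines)
```

The resulting report.tex holds the original LaTeX text plus the chunk's code and its printed result, ready to be compiled with LaTeX.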

Methodology at a Glance
Six Sigma
The Essence
The application of the Scientific Method to process improvement, using an easy language.
DMAIC Cycle: Define, Measure, Analyze, Improve, Control
Roles: Champion, Master Black Belt, Black Belt, Green Belt

SixSigma Package
Six Sigma
Using packages
Manuals
Data sets
Templates
Learn-by-Code
[Figure: Six Sigma with R, paper helicopter template; labelled dimensions: wings length, body length, body width]
[Figure: Six Sigma process map for the Paper Helicopter Project; steps: INSPECTION, ASSEMBLY, TEST, LABELING; legend: (C)ontrollable, (Cr)itical, (N)oise, (P)rocedure]

Book
Six Sigma
Six Sigma with R
A live example: The entire book has been produced using Sweave.
The roadmap: The DMAIC Cycle
The case study: paper helicopter
SixSigma package: data sets, functions
Easy explanations, further readings

Sweave Example
Six Sigma Application
\documentclass[a4paper]{article}
\usepackage{Sweave}
\title{Design of Experiments}
\author{EL Cano and JM Moguerza and A Redchuk}
\begin{document}
\maketitle
\section{Introduction}
Design of experiments is the most important tool in the Improve phase of the
DMAIC cycle \ldots.
<<>>=
library(SixSigma)
doe.model1 <- lm(score ~ flour + salt + bakPow +
                   flour * salt + flour * bakPow +
                   salt * bakPow + flour * salt * bakPow,
                 data = ss.data.doe1)
summary(doe.model1)
@
This is the general model:
\begin{equation}
\label{eq:doe:model}
y_{ijkl} = \mu + \alpha_i + \beta_j + \gamma_k + (\alpha\beta)_{ij} +
(\alpha\gamma)_{ik} + (\beta\gamma)_{jk} + (\alpha\beta\gamma)_{ijk} +
\varepsilon_{ijkl},
\end{equation}
And here we have a plot of effects:
<<maineff, echo=FALSE, fig=TRUE>>=
# The slide truncates the next-but-one line after "coef(doe";
# the index [2] (first main effect) is an assumption.
plot(c(-1, 1), ylim = range(ss.data.doe1$score),
     coef(doe.model1)[1] + c(-1, 1) * coef(doe.model1)[2],
     type = "b", pch = 16)
abline(h = coef(doe.model1)[1])
@
% \input{section2}
\end{document}
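
The model formula in the example spells out every main effect and interaction, matching the terms of the general model term by term. As a small sketch using only base R (no data or extra packages needed), the snippet below checks that the compact formula score ~ flour * salt * bakPow expands to the same seven terms; the object names are illustrative.

```r
# Formula written out in full, as on the slide
f_long <- score ~ flour + salt + bakPow +
  flour * salt + flour * bakPow +
  salt * bakPow + flour * salt * bakPow

# Compact equivalent: '*' expands to main effects plus all interactions
f_short <- score ~ flour * salt * bakPow

# terms() works on the formula alone, so no data set is required here
labels_long  <- attr(terms(f_long),  "term.labels")
labels_short <- attr(terms(f_short), "term.labels")

print(labels_long)
print(setequal(labels_long, labels_short))
```

The three main effects, three two-way interactions and one three-way interaction correspond to the alpha, beta, gamma terms and their products in the equation.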

Project Example
Divide and Conquer!
Strategies
Partial Sweave files can be compiled to get partial LaTeX files. R scripts can Sweave .Rnw files and "source" .R files. The final document is obtained by compiling the "master" LaTeX file.
> library(tools)  # texi2pdf() is provided by the tools package
> source("code/myoptions.R")
> source("code/myfunctions.R")
> source("code/mydata.R")
> Sweave("rnw/theorem01.Rnw")
> Sweave("rnw/lesson01.Rnw")
> Sweave("rnw/exercises01.Rnw")
> ...
> texi2pdf("master.tex")
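
The strategy above can be sketched as a self-contained build script. Everything here is a hypothetical stand-in for a real project: the temporary folder, the two partial files and the master file are created on the fly, and the final LaTeX compilation is left commented out because it needs a LaTeX installation.

```r
# Hypothetical project layout: rnw/ holds partial Sweave files
proj <- tempfile("training-project")
dir.create(file.path(proj, "rnw"), recursive = TRUE)

# Partial files carry no preamble; the master file pulls them in
writeLines(c("\\section{Lesson 1}",
             "<<lesson-chunk>>=",
             "summary(c(2, 4, 6))",
             "@"),
           file.path(proj, "rnw", "lesson01.Rnw"))
writeLines(c("\\section{Exercises 1}",
             "The answer is \\Sexpr{1 + 1}."),
           file.path(proj, "rnw", "exercises01.Rnw"))
writeLines(c("\\documentclass{article}",
             "\\usepackage{Sweave}",
             "\\begin{document}",
             "\\input{lesson01}",
             "\\input{exercises01}",
             "\\end{document}"),
           file.path(proj, "master.tex"))

old <- setwd(proj)
# Weave every partial file; each .tex lands in the project root
rnw_files <- list.files("rnw", pattern = "\\.Rnw$", full.names = TRUE)
for (f in rnw_files) utils::Sweave(f, quiet = TRUE)
woven <- list.files(pattern = "^(lesson|exercises).*\\.tex$")
# tools::texi2pdf("master.tex")  # final step; requires LaTeX
setwd(old)

exercise_tex <- readLines(file.path(proj, "exercises01.tex"))
print(woven)
```

Looping over the folder means a new lesson only requires adding one .Rnw file and one \input line to the master file.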

Some useful extensions
Packages
knitr, pgfSweave: enhanced options for Sweave
RGIFT: Automatic generation of questionnaires for Moodle
exams: Automatic generation of printable exams
odfWeave: Open Document format documents generation
More in the "Reproducible Research" Task View at CRAN.
http://cran.r-project.org/web/views/ReproducibleResearch.html

Integrated Environments
R GUI
RStudio
EMACS + ESS
Eclipse + StatET

Summary
Statistical training entails some challenges regarding contents and materials.
R is the perfect partner for statistical training.
Reproducible research and literate programming enhance the quality of training materials.
Using R and LaTeX through Sweave provides a complete framework for generating statistical documentation.
Extensions and integrated environments make it easy to exploit R's capabilities.

Acknowledgements
R Core Team and R enthusiasts in general.
Springer
This work has been partially funded by the projects:
AGORANET project (IPT-430000-2010-32)
VRTUOSI (www.vrtuosi.org: 502869-LLP-1-2009-ES-ERASMUS-EVC)
HAUS: IPT-2011-1049-430000
EDUCALAB: IPT-2011-1071-430000
DEMOCRACY4ALL: IPT-2011-0869-430000
CORPORATE COMMUNITY: IPT-2011-0871-430000

Discussion
Thanks for your attention!