Applied Statistics And Doe Mayank

Applied Statistics and DOE
Mayank

Applied Statistics
Measures of central tendency (central position of data)
  Mean: population µ, sample x̄
  Median
  Mode
Measures of dispersion (spread of data)
  Variance: population σ², sample s²
  Standard deviation: population σ, sample s
  Coefficient of variation

Measures of central tendency
Data: 34, 43, 81, 106, 106 and 115
Mean (average): $\bar{x} = \sum x / n = 80.83$
Mode (highest frequency): 106
Median (middle score): (81 + 106)/2 = 93.5
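
The three measures above can be checked with a minimal sketch using Python's standard `statistics` module. The variable names are illustrative; only the six data values come from the slide.

```python
import statistics

data = [34, 43, 81, 106, 106, 115]

mean = statistics.mean(data)      # sum of values / number of values
median = statistics.median(data)  # average of the two middle values when n is even
mode = statistics.mode(data)      # most frequent value

print(round(mean, 2), median, mode)  # 80.83 93.5 106
```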

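Measures of dispersion: variance and standard deviation
Sum of squares: $SS = \sum (x - \bar{x})^2$
Variance (mean square): $MS = SS / (n - 1)$
Standard deviation: $sd = \sqrt{MS}$
Most of the data lies within one standard deviation of the mean: 44.5 ± 4.57, i.e. about 39.9 to 49.1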

Measures of dispersion: coefficient of variation
$CV = \frac{s}{\bar{x}} \times 100\%$
4.57/44.5 × 100% ≈ 10.3%
The standard deviation is about 10.3% of the mean
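
The data behind the 44.5 ± 4.57 example is not shown on the slides, so the sketch below reuses the six values from the central tendency slide instead (an assumption made purely for illustration). It walks through the same chain: sum of squares, variance, standard deviation, coefficient of variation.

```python
import math

# Assumption: reusing the central-tendency data, not the data behind 44.5 ± 4.57.
data = [34, 43, 81, 106, 106, 115]

n = len(data)
mean = sum(data) / n
ss = sum((x - mean) ** 2 for x in data)  # sum of squares (SS)
ms = ss / (n - 1)                        # sample variance (MS)
sd = math.sqrt(ms)                       # sample standard deviation
cv = sd / mean * 100                     # coefficient of variation, in percent

print(f"SS={ss:.2f}  MS={ms:.2f}  sd={sd:.2f}  CV={cv:.1f}%")
```

For this data set the spread is large relative to the mean, so the CV comes out far higher than in the slide's example.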

Measures of dispersion: Normal distribution
Example: IQ score

Measures of dispersion: Normal distribution
[Figure: histogram of IQ scores; x-axis: score, from <55 through 55, 70, 85, 100, 115, 130 and 145 to >145; y-axis: count]

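Measures of dispersion: Normal distribution
Probability of a score falling in each band, by distance from the mean µ (the same on either side):
  0 to 1σ: 34.13%
  1σ to 2σ: 13.59%
  2σ to 3σ: 2.14%
  3σ to 4σ: 0.13%
  4σ to 5σ: 0.0031%
  5σ to 6σ: 0.000028%
Probability of a score falling within:
  µ ± 1σ: 68.2689%
  µ ± 2σ: 95.4499%
  µ ± 3σ: 99.7300%
  µ ± 4σ: 99.9936%
  µ ± 5σ: 99.999942669%
  µ ± 6σ: 99.999999802%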
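
The coverage percentages can be reproduced from the standard normal cumulative distribution function. This is a minimal sketch using `statistics.NormalDist` from the standard library; the helper name `coverage` is illustrative.

```python
from statistics import NormalDist

std_normal = NormalDist()  # mean 0, standard deviation 1


def coverage(k):
    """Probability that a normal value lies within k standard deviations of the mean."""
    return std_normal.cdf(k) - std_normal.cdf(-k)


for k in range(1, 7):
    print(f"within ±{k}σ: {coverage(k) * 100:.7f}%")
```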

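Measures of dispersion: Normal distribution and Six Sigma
DPMO (defects per million opportunities), DPHO
[Figure: normal curve from −6σ to +6σ around µ, with the lower specification limit (LSL) at −6σ and the upper specification limit (USL) at +6σ; 99.999999802% of values fall between them]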

Measures of dispersion: Normal distribution
[Figure: normal curves compared against the specification limits LSL and USL]

Measures of dispersion: Normal distribution
[Figure: normal curve with its mean shifted by 1.5σ between LSL and USL, on an axis from −6σ to +6σ around µ; the shifted process gives 3.4 DPMO]
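
The 3.4 DPMO figure follows from the 1.5σ shift: the nearer specification limit ends up 4.5σ from the shifted mean. The sketch below assumes specification limits at ±6σ and a normally distributed process; the variable names are illustrative.

```python
from statistics import NormalDist

std_normal = NormalDist()
shift = 1.5  # drift of the process mean, in sigma (from the slide)
limit = 6.0  # assumption: specification limits at ±6 sigma from the target

# After the shift, the nearer limit is 4.5 sigma away and the farther one 7.5 sigma.
p_defect = std_normal.cdf(-(limit - shift)) + std_normal.cdf(-(limit + shift))
dpmo = p_defect * 1_000_000

print(round(dpmo, 1))  # 3.4
```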

Statistical significance tests
Significance tests:
  Z-test
  t-test
  F-test
  ANOVA

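Statistical significance tests: Z-test
Z-value: how many standard deviations away from the mean?
Positive z: the value is above the mean; negative z: the value is below the mean
One point compared to a population: $z = \frac{x - \mu}{\sigma}$
One point compared to a sample: $z = \frac{x - \bar{x}}{s}$
Group compared to a population: $z = \frac{\bar{x} - \mu}{s / \sqrt{n}}$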

Statistical significance tests: Z-test (sample)
Sample: BMI
Mean (x̄) = 26.20
Standard deviation (s) = 6.57
What is the probability of a person having a BMI below 19.2? Above 19.2?
A person with a BMI of 19.2 has a z-score of: $z = \frac{19.2 - 26.20}{6.57} = -1.07$
So this person has a BMI 1.07 standard deviations below the mean

Statistical significance tests: Z-test (sample)
[Figure: normal curve of BMI with a cut at 19.6, one standard deviation below the mean (z = −1); 16% of people fall below 19.6 and 84% above]
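
A minimal sketch of the BMI example, assuming BMI is normally distributed so that the z-score can be turned into a probability with the standard normal CDF. Only the mean, standard deviation and the value 19.2 come from the slides.

```python
from statistics import NormalDist

mean_bmi, sd_bmi = 26.20, 6.57

z = (19.2 - mean_bmi) / sd_bmi           # standard deviations from the mean
p_below = NormalDist().cdf(z)            # share of people expected below BMI 19.2
p_one_sd_below = NormalDist().cdf(-1.0)  # share below one sd under the mean (BMI 19.6)

print(round(z, 2))  # -1.07
print(f"{p_below:.1%} below 19.2, {p_one_sd_below:.0%} below 19.6")
```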

Statistical significance tests: Z-test (population)
Test group: employees who have a two-wheeler
Test: commuting time from home to Biocon
Claim: average commuting time is less than 24 min
At the 0.01 level of significance (α = 0.01): is there enough evidence to support the research claim?
Samples: 30
Commuting times (min):
18 16 23 19 25 48 13 17 20 23
16 21 18 16 29 15 8 19 20 7
15 16 24 15 6 11 14 23 18 12

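Statistical significance tests: Z-test (population)
Assumption: the population is normally distributed
[Figure: normal curve of commuting time (x-axis: score; y-axis: probability) with the population mean of 24 and the sample mean x̄ marked]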

Statistical significance tests: Z-test (population)
Hypothesis testing: test group vs population, comparison of means (µ is the mean commuting time of the test group)
Null hypothesis H0: no difference (claim not true), H0: µ ≥ 24
Alternate hypothesis H1: it is different (claim is true), H1: µ < 24

Statistical significance tests: Z-test (population)
[Figure: normal curve of the score (mean 24, sample mean x̄) mapped onto the Z-value axis; level of significance α = 0.01 in the lower tail; critical value Z = −2.33]

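Statistical significance tests: Z-test (population)
Ztest < Zcritical: rejection region; Ztest > Zcritical: acceptance region (Zcritical = −2.33)
x̄ = 18.2, s = 7.7, µ = 24, n = 30
$Z = \frac{\bar{x} - \mu}{s / \sqrt{n}} = \frac{18.2 - 24}{7.7 / \sqrt{30}} = -4.13$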

Statistical significance tests: Z-test (population)
The test value Z = −4.13 lies in the rejection region (below −2.33)
Is the test value significantly different from (lower than) the mean? Yes: there is significant evidence to reject the null hypothesis
H0: µ ≥ 24 is rejected
Therefore the claim is accepted: H1: µ < 24 is significantly supported
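
The whole commuting-time test can be reproduced from the 30 sample values. This is a sketch using only the standard library; the critical value comes from the inverse normal CDF rather than a table, and the variable names are illustrative.

```python
import math
import statistics
from statistics import NormalDist

times = [18, 16, 23, 19, 25, 48, 13, 17, 20, 23,
         16, 21, 18, 16, 29, 15, 8, 19, 20, 7,
         15, 16, 24, 15, 6, 11, 14, 23, 18, 12]

mu0 = 24      # claimed upper bound on the mean commuting time (min)
alpha = 0.01  # level of significance

n = len(times)
x_bar = statistics.mean(times)
s = statistics.stdev(times)               # sample standard deviation (n - 1)

z_test = (x_bar - mu0) / (s / math.sqrt(n))
z_critical = NormalDist().inv_cdf(alpha)  # lower-tail critical value
reject_h0 = z_test < z_critical           # one-tailed (left) test

print(round(x_bar, 1), round(s, 1), round(z_test, 2), round(z_critical, 2), reject_h0)
# 18.2 7.7 -4.13 -2.33 True
```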

Statistical significance tests: t-test
Comparison of means between two groups
H0: µ1 = µ2
H1: µ1 ≠ µ2
Null hypothesis will be rejected: ttest > tcritical
Null hypothesis will not be rejected: ttest < tcritical

Statistical significance tests: t-test
Comparison of means between two groups
$t = \frac{\text{signal}}{\text{noise}} = \frac{\text{difference between group means}}{\text{variability of groups}}$

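Statistical significance tests: t-test
Effect of fertilizer on plant height, case 1
Groups: with fertilizer, without fertilizer (response: plant height)
$t_{test} = \frac{27.15 - 17.9}{\text{variability of groups}} = 2.4$
tcritical with 38 df (df = 2n − 2) at the 0.05 significance level = 2.03
ttest > tcritical, so the mean height with fertilizer is significantly different from the mean height without fertilizer
H0: rejected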

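Statistical significance tests: t-test
Effect of fertilizer on plant height, case 2
Groups: with fertilizer, without fertilizer (response: plant height)
ttest = 1.3
tcritical = 2.03
ttest < tcritical, so the mean height with fertilizer is not significantly different from the mean height without fertilizer
H0: not rejected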
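
The plant-height measurements behind the two cases are not on the slides, so the sketch below uses small hypothetical groups. It assumes a pooled-variance two-sample t-test, which matches the slides' df = 2n − 2 for equal group sizes; the function name and the data are illustrative only.

```python
import math
import statistics


def pooled_t(group_a, group_b):
    """Return (t, df) for a two-sample t-test with pooled variance."""
    n_a, n_b = len(group_a), len(group_b)
    df = n_a + n_b - 2
    pooled_var = ((n_a - 1) * statistics.variance(group_a)
                  + (n_b - 1) * statistics.variance(group_b)) / df
    signal = statistics.mean(group_a) - statistics.mean(group_b)  # difference of means
    noise = math.sqrt(pooled_var * (1 / n_a + 1 / n_b))           # variability of groups
    return signal / noise, df


# Hypothetical plant heights, not the data behind the slides.
with_fertilizer = [22, 24, 26, 28, 30]
without_fertilizer = [17, 19, 21, 23, 25]

t_test, df = pooled_t(with_fertilizer, without_fertilizer)
t_critical = 2.306  # two-tailed t-table value for df = 8 at alpha = 0.05
reject_h0 = abs(t_test) > t_critical

print(round(t_test, 2), df, reject_h0)
```

The critical value is passed in from a t-table because the standard library has no t-distribution; with more data, only the table lookup changes.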

Statistical significance tests: t-test
Overview

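Statistical significance tests: F-test
Comparison of variances
$F = \frac{s_1^2}{s_2^2}$, where $s_1^2$ and $s_2^2$ are the sample variances
The F hypothesis test is defined as:
H0: $\sigma_1^2 = \sigma_2^2$
Ha: $\sigma_1^2 < \sigma_2^2$, $\sigma_1^2 > \sigma_2^2$ or $\sigma_1^2 \neq \sigma_2^2$
H0 is rejected if Ftest > Fcritical (at the chosen significance level)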
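
A minimal sketch of the F statistic for two samples. The data are hypothetical and the function name is illustrative; the critical value would still be looked up in an F-table for the two degrees of freedom returned.

```python
import statistics


def f_ratio(sample_1, sample_2):
    """Return F = s1^2 / s2^2 with numerator and denominator degrees of freedom."""
    f = statistics.variance(sample_1) / statistics.variance(sample_2)
    return f, len(sample_1) - 1, len(sample_2) - 1


# Hypothetical measurements, chosen so the variances are 40 and 10.
sample_1 = [10, 14, 18, 22, 26]
sample_2 = [12, 14, 16, 18, 20]

f_test, df_num, df_den = f_ratio(sample_1, sample_2)
print(f_test, df_num, df_den)
```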

Statistical significance tests: ANOVA
ANalysis Of VAriance
One way: effect of one factor (variable)
Two way: effect of two factors (variables) and effect of their interaction

Statistical significance tests: One-way ANOVA
Strategy: compare the variability within groups ($MS_{wg}$) to the variability between groups ($MS_{bg}$)
$F = \frac{MS_{bg}}{MS_{wg}}$
[Figure: Group 1 and Group 2 drawn twice, once to show variation between groups and once to show variation within groups]

Statistical significance tests: One-way ANOVA
Is there any impact of exam room temperature on student performance?
Factor (independent variable): temperature (cold, optimum, hot)
Effect (dependent variable): score (marks obtained)
Null hypothesis (H0): no effect (µ1 = µ2 = µ3)
Alternate hypothesis (H1): there is an effect (at least one mean differs from the others)

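Statistical significance tests: One-way ANOVA
[Table: scores for the cold (C), optimum (O) and hot (H) groups, with the number of attendees, mean X̄ and sum of squares SS for each group]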

Statistical significance tests: One-way ANOVA
$F = \frac{MS_{bg}}{MS_{wg}} = 6.40$
Fcritical for numerator degrees of freedom 2 and denominator degrees of freedom 33, at significance level α = 0.05: 3.28
Ftest > Fcritical, so there is enough evidence to reject the null hypothesis
H0: all means are the same (no effect of temperature): rejected
At the 95% confidence level we can say that the variation between the means is not just due to chance
Examination room temperature matters significantly
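
The scores behind F = 6.40 are not recoverable from the slides, so the sketch below computes the same statistic on hypothetical scores for three temperature groups. The function name and the data are illustrative only.

```python
import statistics


def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of groups of observations."""
    all_values = [x for g in groups for x in g]
    grand_mean = statistics.mean(all_values)
    # Between groups: how far each group mean sits from the grand mean.
    ss_bg = sum(len(g) * (statistics.mean(g) - grand_mean) ** 2 for g in groups)
    # Within groups: how far each observation sits from its own group mean.
    ss_wg = sum((x - statistics.mean(g)) ** 2 for g in groups for x in g)
    df_bg = len(groups) - 1
    df_wg = len(all_values) - len(groups)
    return (ss_bg / df_bg) / (ss_wg / df_wg), df_bg, df_wg


# Hypothetical exam scores, not the data behind the slides.
cold = [50, 55, 60]
optimum = [60, 65, 70]
hot = [55, 60, 65]

f_test, df_bg, df_wg = one_way_anova_f([cold, optimum, hot])
print(f_test, df_bg, df_wg)
```

The resulting F is then compared with the F-table value for (df_bg, df_wg) at the chosen α, exactly as on the slide.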

Statistical significance tests: Two-way ANOVA
Factors (independent variables):
  1) Gender: man, woman
  2) Type of sport: indoor, outdoor
Effect (dependent variable):
  1) Number of participants
Relative impact of gender or type of sport?
Any interaction between gender and type of sport?
Null hypothesis (H0a): no effect of gender
Null hypothesis (H0b): no effect of type of sport
Null hypothesis (H0c): no interaction
Alternate hypothesis (H1): there is an effect

Statistical significance tests: Two-way ANOVA
[Table: number of participants by gender (columns: man, woman) and type of sport (rows: indoor, outdoor)]

Statistical significance tests: Two-way ANOVA
[Figure: participants for indoor and outdoor sports]
Null hypothesis (H0a): no effect of gender: rejected
Null hypothesis (H0b): no effect of type of sport: rejected
Null hypothesis (H0c): no interaction: rejected

Statistical significance tests: Two-way ANOVA
Factors (independent variables):
  1) Temperature: 30 °C, 35 °C
  2) pH: 5, 7
Effect (dependent variable):
  1) Total product (g)
[Figure: total product at pH 5 and pH 7, plotted for 30 °C and 35 °C]
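
Before any significance test, the size of each main effect and of the interaction can be read off the four cell means of a 2 × 2 layout. The sketch below does only that effect estimation, not the ANOVA F-tests themselves, and the product values are hypothetical.

```python
# Hypothetical mean total product (g) for each (temperature in °C, pH) combination.
product = {
    (30, 5): 10.0,
    (30, 7): 14.0,
    (35, 5): 12.0,
    (35, 7): 22.0,
}

# Main effect: average response at the high level minus average at the low level.
temp_effect = ((product[35, 5] + product[35, 7]) - (product[30, 5] + product[30, 7])) / 2
ph_effect = ((product[30, 7] + product[35, 7]) - (product[30, 5] + product[35, 5])) / 2

# Interaction: half the difference between the temperature effect at pH 7 and at pH 5.
interaction = ((product[35, 7] - product[30, 7]) - (product[35, 5] - product[30, 5])) / 2

print(temp_effect, ph_effect, interaction)  # 5.0 7.0 3.0
```

A non-zero interaction means the lines for pH 5 and pH 7 are not parallel when plotted against temperature; two-way ANOVA then tests whether each of the three effects is larger than the experimental noise.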

Regression and correlation: Regression analysis
Investigation of the relationship between variables

Regression and correlation: Regression analysis
Investigation of the relationship between variables
Simple linear regression (one independent variable): $y = ax + b$
[Figure: scatter plot with fitted line $y = -0.951x + 50.49$, $R^2 = 0.955$]

Regression and correlation: Regression analysis
Simple linear regression: $y = ax + b$
Multiple regression:
  Linear: $y = a_1 x_1 + a_2 x_2 + a_3 x_3 + b$
  Non-linear: $y = a_1 x_1 + a_2 x_2 + a_{11} x_1^2 + a_{12} x_1 x_2 + b$

Regression and correlation: Correlation analysis
To find how well (or badly) a line fits the observations
What is the strength of this relationship?
  r² (coefficient of determination) or adjusted r²
Is the relationship we have described statistically significant?
  Significance tests

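Regression and correlation: Correlation analysis
$\hat{y} = ax + b$, where a is the slope and b is the intercept
$\hat{y}$: predicted value
$y_i$: true value
$\varepsilon$: residual error, $\varepsilon = y - \hat{y}$
The values of a and b are calculated so that they minimize the sum of squares (SS) of the residuals: $\sum (y - \hat{y})^2$ is a minimum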
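
For a single predictor, the a and b that minimize the residual sum of squares have a closed form. A minimal sketch follows; the function name and the data points are hypothetical, chosen to lie exactly on y = 2x + 1.

```python
def fit_line(xs, ys):
    """Least-squares slope a and intercept b for y = a*x + b."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    a = sxy / sxx          # slope
    b = y_bar - a * x_bar  # intercept: the fitted line passes through (x_bar, y_bar)
    return a, b


xs = [1, 2, 3, 4, 5]
ys = [3, 5, 7, 9, 11]  # hypothetical data

a, b = fit_line(xs, ys)
residuals = [y - (a * x + b) for x, y in zip(xs, ys)]
print(a, b)  # 2.0 1.0
```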

Regression and correlation: Correlation analysis
r²: coefficient of determination
  $SS_{Total} = \sum (y_i - \bar{y})^2$
  $SS_{Error} = \sum (y - \hat{y})^2$
  $r^2 = 1 - \frac{SS_{Error}}{SS_{Total}}$
  Always between 0 and 1
  Increases with the number of predictors
Adjusted r²
  $\text{Adjusted } r^2 = 1 - \frac{SS_{Error} / (n - p - 1)}{SS_{Total} / (n - 1)}$
  It can also be negative
  True representative of the strength of the relationship
  n = total number of observations, p = number of predictors
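
Both formulas translate directly into code. This is a minimal sketch with hypothetical observed and predicted values; the function names are illustrative.

```python
def r_squared(y, y_hat):
    """Coefficient of determination: 1 - SS_error / SS_total."""
    y_bar = sum(y) / len(y)
    ss_error = sum((yi - fi) ** 2 for yi, fi in zip(y, y_hat))
    ss_total = sum((yi - y_bar) ** 2 for yi in y)
    return 1 - ss_error / ss_total


def adjusted_r_squared(y, y_hat, p):
    """Adjusted r^2 for n observations and p predictors."""
    n = len(y)
    y_bar = sum(y) / n
    ss_error = sum((yi - fi) ** 2 for yi, fi in zip(y, y_hat))
    ss_total = sum((yi - y_bar) ** 2 for yi in y)
    return 1 - (ss_error / (n - p - 1)) / (ss_total / (n - 1))


# Hypothetical true values and model predictions.
y = [1.0, 2.0, 3.0, 4.0]
y_hat = [1.1, 1.9, 3.2, 3.8]

r2 = r_squared(y, y_hat)
adj_r2 = adjusted_r_squared(y, y_hat, p=1)
print(round(r2, 2), round(adj_r2, 2))  # 0.98 0.97
```

The adjusted value is lower because it charges the model for each predictor it uses, which is why it does not rise automatically when predictors are added.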

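Regression and correlation: Correlation analysis
Statistical significance of the relationship
$F = \frac{MS_{Model}}{MS_{Error}}$, in the same way that $F = \frac{MS_{bg}}{MS_{wg}}$ in ANOVA
[Figure: variation explained by the model versus error, shown alongside the between-group versus within-group comparison for Group 1 and Group 2]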

Design of experiment
Traditional method: one factor at a time (OFAT)
Statistical method: multiple factors at a time (MFAT)

Design of experiment
How to select a design?

Design of experiment: terminology
Factors: independent variable/s
  Continuous: numeric, any value between the lower and upper value, e.g. temperature, pH, concentration
  Categorical: numeric or non-numeric, only characters or levels, e.g. gender, operator, type, temperature
Levels: range of a factor/s
  −1 (lower), 0 (middle), +1 (higher)
Effects
  Response: dependent variable/s
  Main effect/s: effect/s due to individual factor/s
  Interaction effect/s: effect/s due to the interaction of multiple factors
Confounding/aliasing: when two or more effects cannot be distinguished
  e.g. a main effect is confounded with interaction effects; main effects and interaction effects are aliased

Design of experiment
Resolution of a design
Power of a design
Higher-order interactions are less significant than lower-order interactions

Design of experiment: Factorial design
Full factorial: number of experiments = $L^f$, where L = number of levels and f = number of factors

Design of experiment: Factorial design
$2^2$ (factors a, b): 4 experiments

Design of experiment: Factorial design
$2^3$ (factors a, b, c): 8 experiments

Design of experiment: Factorial design
$3^2$ (factors a, b): 9 experiments

Design of experiment: Factorial design
$3^3$ (factors a, b, c): 27 experiments
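
A full factorial design is every combination of levels across the factors, so the run list can be generated with `itertools.product`. A minimal sketch with coded levels; the function name is illustrative.

```python
from itertools import product


def full_factorial(levels, n_factors):
    """All combinations of the given levels across n_factors factors (L**f runs)."""
    return list(product(levels, repeat=n_factors))


two_level = [-1, +1]
three_level = [-1, 0, +1]

run_counts = [
    len(full_factorial(two_level, 2)),    # 2^2
    len(full_factorial(two_level, 3)),    # 2^3
    len(full_factorial(three_level, 2)),  # 3^2
    len(full_factorial(three_level, 3)),  # 3^3
]
print(run_counts)  # [4, 8, 9, 27]
```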

Design of experiment: Fractional factorial design
$2^{3-1}$: the full $2^3$ design needs 8 experiments; the half fraction needs 4 experiments
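
One common way to build the half fraction is to run a full 2² design in two factors and set the third from them. The sketch below assumes the generator c = a·b, which the slides do not specify; with this choice the main effect of c is aliased with the a×b interaction.

```python
from itertools import product

# Full 2^2 design in a and b; the third factor follows the assumed generator c = a*b.
half_fraction = [(a, b, a * b) for a, b in product([-1, +1], repeat=2)]

full_design = list(product([-1, +1], repeat=3))

for run in half_fraction:
    print(run)
print(len(half_fraction), "of", len(full_design), "runs")  # 4 of 8 runs
```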

Design of experiment: Response surface methodology

Design of experiment: Geometry of some important response surface designs
Box-Behnken
e.g. 3 factors, 3 levels: 12 experiments
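
The 12 runs of the three-factor Box-Behnken design are the midpoints of the cube's edges: each pair of factors takes all ±1 combinations while the remaining factor stays at 0. A sketch under that construction follows; centre-point replicates, which are normally added in practice, are left out, and the function name is illustrative.

```python
from itertools import combinations, product


def box_behnken(n_factors):
    """Edge-midpoint runs: each pair of factors at ±1, all other factors at 0."""
    runs = []
    for i, j in combinations(range(n_factors), 2):
        for level_i, level_j in product([-1, +1], repeat=2):
            run = [0] * n_factors
            run[i], run[j] = level_i, level_j
            runs.append(tuple(run))
    return runs


design = box_behnken(3)
print(len(design))  # 12
```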

Design of experiment: Geometry of some important response surface designs
Central composite design
e.g. 2 factors, 2 levels
[Figure: 2-level factorial points + axial (star) points = central composite design]
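
A central composite design can be sketched as three sets of points: the 2-level factorial corners, the axial (star) points and a centre point. The axial distance α = √2 used below is an assumption (the usual rotatable choice for two factors), as is the single centre point; the slides fix neither.

```python
import math
from itertools import product


def central_composite(n_factors, alpha):
    """Factorial corners + axial points at ±alpha on each axis + one centre point."""
    factorial = list(product([-1.0, 1.0], repeat=n_factors))
    axial = []
    for i in range(n_factors):
        for sign in (-1.0, 1.0):
            point = [0.0] * n_factors
            point[i] = sign * alpha
            axial.append(tuple(point))
    center = [tuple([0.0] * n_factors)]
    return factorial + axial + center


# Assumption: alpha = sqrt(2), the rotatable value for 2 factors.
design = central_composite(2, alpha=math.sqrt(2))
print(len(design))  # 9
```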

Design of experiment: Geometry of some important response surface designs
Taguchi design
Inner array (signal): controllable variables during production, e.g. media, pH, feed rate
Outer array (noise): uncontrollable variables during production, e.g. temperature, DO
