Hypothesis testing and interpretation of data
Testing Of Hypothesis
The basic logic of hypothesis testing is to assess whether the data support or contradict the research question. When a researcher conducts quantitative research, he/she is attempting to answer a research question or hypothesis that has been formulated. One method of evaluating this research question is via a process called hypothesis testing, which is sometimes also referred to as significance testing.
Example:
Two lecturers, Sandy and Mandy, each think that they use the best method to teach their students. Each lecturer has 50 statistics students who are studying for a graduate degree in management. In Sandy’s class, students have to attend one lecture and one seminar class every week, whilst Mandy’s students attend lectures only, because Mandy believes that lectures are sufficient by themselves and that students can study in their own time. This is the first year that Sandy has given seminars, but since they take up a lot of her time, she wants to make sure that she is not wasting her time and that seminars improve the students’ performance.
The Research Hypothesis
The first step in hypothesis testing is to set a research hypothesis. In Sandy and Mandy’s study, the aim is to examine the effect that two different teaching methods, providing both lectures and seminar classes (Sandy) and providing lectures only (Mandy), had on the performance of the students. More specifically, they want to determine whether performance is different between the two teaching methods. Whilst Mandy is skeptical about the effectiveness of seminars, Sandy clearly believes that her students do better than those in Mandy’s class. This leads to the following research hypothesis:
Research hypothesis: When students attend seminar classes, in addition to lectures, their performance increases.
By taking a hypothesis testing approach, Sandy and Mandy want to generalize their results to a population (all students) rather than just the students in their sample. However, in order to use hypothesis testing, one needs to re-state the research hypothesis as a null and an alternative hypothesis.
Null hypothesis: the null hypothesis (H0) is the hypothesis which the researcher tries to disprove, reject or nullify. It is “the hypothesis that there is no relationship between two or more variables”, symbolized as H0.
Alternative hypothesis: the alternative, or research, hypothesis proposes a relationship between two or more variables, symbolized as H1.
Decision errors
Two types of errors can result from a hypothesis test.
Type I error: A Type I error occurs when the researcher rejects a null hypothesis when it is true. The probability of committing a Type I error is called the significance level. This probability is also called alpha, and is often denoted by α.
Type II error: A Type II error occurs when the researcher fails to reject a null hypothesis which is false. The probability of committing a Type II error is called beta, and is often denoted by β. The probability of not committing a Type II error (1 − β) is called the power of the test.
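To make the relationship between α, β and power concrete, the sketch below computes them for an upper-tailed one-sample z-test. It is only an illustration: the function name `z_test_power` and all numbers (a null mean of 60 marks, a true mean of 65, a standard deviation of 10, and 50 students) are assumptions chosen for the example, not values from the study described above.

```python
from statistics import NormalDist


def z_test_power(mu0, mu1, sigma, n, alpha=0.05):
    """Power of an upper-tailed one-sample z-test of H0: mu = mu0
    when the true population mean is mu1 (sigma assumed known)."""
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha)      # reject H0 when Z > z_crit
    shift = (mu1 - mu0) / (sigma / n ** 0.5)    # mean of Z when mu1 is the true mean
    beta = std_normal.cdf(z_crit - shift)       # P(fail to reject H0 | H0 is false)
    return 1 - beta


# Hypothetical values, for illustration only.
alpha = 0.05
power = z_test_power(mu0=60, mu1=65, sigma=10, n=50, alpha=alpha)
beta = 1 - power
print(f"alpha = {alpha:.2f}, beta = {beta:.3f}, power = {power:.3f}")
```

When the true mean equals the null mean, the same function returns α, which reflects the definition of a Type I error as rejecting a true null hypothesis.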
Steps/procedures in Hypothesis Testing
1. Identify the research problem:
The first step is to state the research problem. The research problem needs to identify the population of interest and the variables under investigation.
Example of research problem: To find out the effectiveness of two teaching methods, the only-lecture method and the lecture-cum-seminar method, with reference to the exam marks of the students.
In the above research problem, the population of interest is the students, and the variables are the teaching method and the marks.
This step enables the researcher to define not only what is to be tested but also what variable(s) will be used in sample data collection. The type of variable(s), whether categorical, discrete or continuous, further determines the statistical test which can be performed on the collected data.
2. Specify the null and alternative hypotheses:
The research problem or question is converted into a null hypothesis and an alternative hypothesis. The hypotheses are stated in such a way that they are mutually exclusive. That is, if one is true, the other must be false.
(a) Null Hypothesis: A null hypothesis (H0) is a statement that declares the observed difference is due to “chance”. It is the hypothesis the researcher hopes to reject or disprove.
A null hypothesis states that there is no relationship between two or more variables. Put simply, the null is the opposite of the alternative hypothesis (H1).
Example: “There is no difference between the two methods of teaching (only lecture method, and lecture-cum-seminar method) in the marks scored by students.”
(b) Alternative Hypothesis:
The alternative hypothesis proposes a relationship between two or more variables, symbolized as H1.
Example: “The lecture-cum-seminar method improves the marks scored by students as compared to the only lecture method.”
“Note that the two hypotheses we propose to test must be mutually exclusive, i.e., when one is true the other must be false. They must also be exhaustive; they must include all possible occurrences.”
From the above, it is clear that the null hypothesis is a hypothesis of no difference. The main problem in testing a hypothesis is whether to accept or to reject the null hypothesis. The alternative hypothesis specifies a definite relationship between the two variables. Only one alternative hypothesis is tested against the null hypothesis.
3. Significance Level:
After formulating the hypotheses, the researcher must determine a certain level of significance. The confidence with which a null hypothesis is accepted or rejected depends on the level of significance.
Generally, the level of significance is set at 5% or 1%:
A significance level of 5% means that the researcher accepts a risk of rejecting a true null hypothesis on 5 out of 100 occasions.
A significance level of 1% means that the risk of rejecting a true null hypothesis is 1%, i.e., the researcher may make this wrong decision once out of 100 occasions. Therefore, a 1% level of significance provides greater confidence in a decision to reject the null hypothesis than a 5% level of significance.
4. Test Statistic:
A test statistic is a statistic used to test the null hypothesis. The researcher needs to identify a test statistic that can be used to assess the truth of the null hypothesis, i.e., to decide whether the null hypothesis should be accepted or rejected.
The test statistic is calculated from the collected data. There are different types of test statistics. For instance, the z statistic compares the observed sample mean to an expected population mean μ0. Large test statistics indicate that the data are far from what is expected, providing evidence against the null hypothesis and in favor of the alternative hypothesis.
Every statistical test works in the same way. Based on the sample data, it gives the probability (P-value) of observing data at least as extreme as the sample if the null hypothesis is true. When the P-value is low, the sample data are statistically significant and provide evidence against the null hypothesis. When the P-value is high, it suggests that the collected data are within the normal range expected under the null hypothesis.
5. Region of Acceptance and Region of Rejection:
The region of acceptance is a range of values. If the test statistic falls within the region of acceptance, the null hypothesis is not rejected. The region of acceptance is defined so that the chance of making a Type I error is equal to the alpha (α) level of significance.
Type I error: a rejection of a true null hypothesis.
The set of values outside the region of acceptance is called the region of rejection. If the test statistic falls within the region of rejection, the null hypothesis is rejected at the alpha (α) level of significance.
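The regions of acceptance and rejection described above can be sketched for a z statistic as follows. This is a minimal illustration assuming a standard normal sampling distribution; the helper names `critical_values` and `decide` and the sample z values are hypothetical.

```python
from statistics import NormalDist


def critical_values(alpha, tail="two"):
    """Return (low, high), the bounds of the region of acceptance for a z statistic."""
    std_normal = NormalDist()
    if tail == "two":
        c = std_normal.inv_cdf(1 - alpha / 2)
        return -c, c
    if tail == "upper":
        return float("-inf"), std_normal.inv_cdf(1 - alpha)
    if tail == "lower":
        return std_normal.inv_cdf(alpha), float("inf")
    raise ValueError("tail must be 'two', 'upper' or 'lower'")


def decide(z_stat, alpha=0.05, tail="two"):
    """Reject H0 only when the statistic falls outside the region of acceptance."""
    low, high = critical_values(alpha, tail)
    return "do not reject H0" if low <= z_stat <= high else "reject H0"


# Hypothetical z statistics, for illustration only.
for z_stat in (1.80, 2.10):
    print(z_stat, decide(z_stat, alpha=0.05, tail="two"))
```

Under these assumptions, the same statistic can lead to different decisions depending on whether the test is one-tailed or two-tailed, because the region of rejection is placed differently.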
6. Select an Appropriate Test:
A hypothesis test may be one-tailed or two-tailed. Whether the test is one-sided or two-sided depends on the alternative hypothesis and the nature of the problem.
A test of a statistical hypothesis where the region of rejection is on only one side of the sampling distribution is called a one-tailed test. For example, suppose the null hypothesis states that the mean is less than or equal to 10. The alternative hypothesis would be that the mean is greater than 10. The region of rejection would consist of a range of values located on the right side of the sampling distribution; that is, a set of values sufficiently greater than 10.
In simple words, in a one-tailed test, the values of the test statistic that lead to rejection of the null hypothesis fall on only one side of the sampling distribution curve.
Significance Level
In hypothesis testing, the significance level is the criterion used for rejecting the null hypothesis. The significance level is used in hypothesis testing as follows: First, the difference between the results of the experiment and the null hypothesis is determined. Then, assuming the null hypothesis is true, the probability of a difference that large or larger is computed. Finally, this probability is compared to the significance level. If the probability is less than or equal to the significance level, then the null hypothesis is rejected and the outcome is said to be statistically significant. Traditionally, experimenters have used either the 0.05 level (sometimes called the 5% level) or the 0.01 level (1% level), although the choice of levels is largely subjective. The lower the significance level, the more the data must diverge from the null hypothesis to be significant. Therefore, the 0.01 level is more conservative than the 0.05 level. The Greek letter alpha (α) is sometimes used to indicate the significance level.
5) Identify the rejection region
• Is it an upper, lower, or two-tailed test?
• Determine the critical value associated with α, the level of significance of the test
The third step is to compute the probability value (also known as the p value). This is the probability of obtaining a sample statistic as different or more different from the parameter specified in the null hypothesis, given that the null hypothesis is true.
PARAMETRIC TESTS
1. Descriptive Statistics – overview of the attributes of a data set. These include measurements of central tendency (frequency histograms, mean, median, & mode) and dispersion (range, variance & standard deviation)
2. Inferential Statistics – provide measures of how well data support a hypothesis and whether the data are generalizable beyond what was tested (significance tests)
Data: Observations recorded during research
Types of data:
1. Nominal data: synonymous with categorical data; names/categories assigned based on characteristics, without ranking between categories. Ex. male/female, yes/no, death/survival
2. Ordinal data: ordered or graded data, expressed as scores or ranks. Ex. pain graded as mild, moderate and severe
3. Interval data: an equal and definite interval between two measurements; it can be continuous or discrete. Ex. weight expressed as 20, 21, 22, 23, 24; the interval between 20 & 21 is the same as between 23 & 24
Parametric hypothesis tests are frequently used to measure the quality of sample parameters or to test whether estimates of a given parameter are equal for two samples.
Parametric hypothesis tests set up a null hypothesis against an alternative hypothesis, testing, for instance, whether or not the population mean is equal to a certain value, and then use appropriate statistics to calculate the probability of obtaining the observed data if the null hypothesis is true. You can then reject or accept the null hypothesis based on the calculated probability.
Z test
The z-test is based on the normal probability distribution and is used for judging the significance of several statistical measures, particularly the mean. The relevant test statistic, z, is worked out and compared with its probable value (to be read from a table showing the area under the normal curve) at a specified level of significance for judging the significance of the measure concerned. This is a most frequently used test in research studies. This test is used even when the binomial distribution or t-distribution is applicable, on the presumption that such a distribution tends to approximate the normal distribution as ‘n’ becomes larger. The z-test is generally used for comparing the mean of a sample to some hypothesised mean for the population in case of a large sample, or when the population variance is known. The z-test is also used for judging the significance of the difference between the means of two independent samples in case of large samples, or when the population variance is known. The z-test is also used for comparing the sample proportion to a theoretical value of the population proportion, or for judging the difference in proportions of two independent samples when n happens to be large. Besides, this test may be used for judging the significance of the median, mode, coefficient of correlation and several other measures.
The t-test is based on the t-distribution and is considered an appropriate test for judging the significance of a sample mean or for judging the significance of the difference between the means of two samples in case of small sample(s) when the population variance is not known (in which case we use the variance of the sample as an estimate of the population variance). In case two samples are related, we use the paired t-test (or what is known as the difference test) for judging the significance of the mean of the differences between the two related samples. It can also be used for judging the significance of the coefficients of simple and partial correlations. The relevant test statistic, t, is calculated from the sample data and then compared with its probable value based on the t-distribution (to be read from the table that gives probable values of t for different levels of significance and different degrees of freedom) at a specified level of significance for the concerned degrees of freedom, for accepting or rejecting the null hypothesis. It may be noted that the t-test applies only in case of small sample(s) when the population variance is unknown.
A Z-test is any statistical test for which the distribution of the test statistic under the null hypothesis can be approximated by a normal distribution. Because of the central limit theorem, many test statistics are approximately normally distributed for large samples. For each significance level, the Z-test has a single critical value (for example, 1.96 for 5% two-tailed), which makes it more convenient than the Student's t-test, which has separate critical values for each sample size. Therefore, many statistical tests can be conveniently performed as approximate Z-tests if the sample size is large or the population variance is known. If the population variance is unknown (and therefore has to be estimated from the sample itself) and the sample size is not large (n < 30), the Student's t-test may be more appropriate.
If T is a statistic that is approximately normally distributed under the null hypothesis, the next step in performing a Z-test is to estimate the expected value θ of T under the null hypothesis, and then obtain an estimate s of the standard deviation of T. After that the standard score Z = (T − θ) / s is calculated, from which one-tailed and two-tailed p-values can be calculated as Φ(−Z) (for upper-tailed tests), Φ(Z) (for lower-tailed tests) and 2Φ(−|Z|) (for two-tailed tests), where Φ is the standard normal cumulative distribution function.
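The three p-value formulas above translate directly into code. The following sketch uses Python's standard library for Φ; the function name `z_p_values` and the input values (T = 103, θ = 100, s = 1.5) are hypothetical and serve only to illustrate the calculation.

```python
from statistics import NormalDist

PHI = NormalDist().cdf  # standard normal cumulative distribution function


def z_p_values(t, theta, s):
    """Standard score Z = (T - theta) / s and the associated p-values."""
    z = (t - theta) / s
    return {
        "z": z,
        "upper_tailed": PHI(-z),           # P(Z' >= z) under H0
        "lower_tailed": PHI(z),            # P(Z' <= z) under H0
        "two_tailed": 2 * PHI(-abs(z)),    # P(|Z'| >= |z|) under H0
    }


# Hypothetical inputs, for illustration only.
result = z_p_values(t=103.0, theta=100.0, s=1.5)
for name, value in result.items():
    print(f"{name}: {value:.4f}")
```

Note that the upper-tailed and lower-tailed p-values always sum to 1, so only one of them can be small for a given statistic.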
Use in location testing
The term "Z-test" is often used to refer specifically to the one-sample location test comparing the mean of a set of measurements to a given constant. If the observed data X1, ..., Xn are (i) uncorrelated, (ii) have a common mean μ, and (iii) have a common variance σ², then the sample average X̄ has mean μ and variance σ² / n. If our null hypothesis is that the mean value of the population is a given number μ0, we can use X̄ − μ0 as a test statistic, rejecting the null hypothesis if X̄ − μ0 is large.
To calculate the standardized statistic Z = (X̄ − μ0) / s, we need to either know or have an approximate value for σ², from which we can calculate s² = σ² / n. In some applications, σ² is known, but this is uncommon. If the sample size is moderate or large, we can substitute the sample variance for σ², giving a plug-in test. The resulting test will not be an exact Z-test since the uncertainty in the sample variance is not accounted for; however, it will be a good approximation unless the sample size is small. A t-test can be used to account for the uncertainty in the sample variance when the sample size is small and the data are exactly normal. There is no universal constant at which the sample size is generally considered large enough to justify use of the plug-in test. Typical rules of thumb range from 20 to 50 samples. For larger sample sizes, the t-test procedure gives almost identical p-values as the Z-test procedure.
Other location tests that can be performed as Z-tests are the two-sample location test and the paired difference test.
Conditions
For the Z-test to be applicable, certain conditions must be met.
• Nuisance parameters should be known, or estimated with high accuracy (an example of a nuisance parameter would be the standard deviation in a one-sample location test). Z-tests focus on a single parameter, and treat all other unknown parameters as being fixed at their true values. In practice, due to Slutsky's theorem, "plugging in" consistent estimates of nuisance parameters can be justified. However, if the sample size is not large enough for these estimates to be reasonably accurate, the Z-test may not perform well.
• The test statistic should follow a normal distribution. Generally, one appeals to the central limit theorem to justify assuming that a test statistic varies normally. There is a great deal of statistical research on the question of when a test statistic varies approximately normally. If the variation of the test statistic is strongly non-normal, a Z-test should not be used.
If estimates of nuisance parameters are plugged in as discussed above, it is important to use estimates appropriate for the way the data were sampled. In the special case of Z-tests for the one or two sample location problem, the usual sample standard deviation is only appropriate if the data were collected as an independent sample.
In some situations, it is possible to devise a test that properly accounts for the variation in plug-in estimates of nuisance parameters. In the case of one and two sample location problems, a t-test does this.
Example
Suppose that in a particular geographic region, the mean and standard deviation of scores on a reading test are 100 points and 12 points, respectively. Our interest is in the scores of 55 students in a particular school who received a mean score of 96. We can ask whether this mean score is significantly lower than the regional mean, that is, are the students in this school comparable to a simple random sample of 55 students from the region as a whole, or are their scores surprisingly low?
We begin by calculating the standard error of the mean:
$$\mathrm{SE} = \frac{\sigma}{\sqrt{n}} = \frac{12}{\sqrt{55}} \approx 1.62$$
where σ is the population standard deviation.
Next we calculate the z-score, which is the distance from the sample mean to the population mean in units of the standard error:
$$z = \frac{\bar{X} - \mu_0}{\mathrm{SE}} = \frac{96 - 100}{1.62} \approx -2.47$$
In this example, we treat the population mean and variance as known, which would be appropriate if all students in the region were tested. When population parameters are unknown, a t-test should be conducted instead.
The classroom mean score is 96, which is −2.47 standard error units from the population mean of 100. Looking up the z-score in a table of the standard normal distribution, we find that the probability of observing a standard normal value below −2.47 is approximately 0.5 − 0.4932 = 0.0068. This is the one-sided p-value for the null hypothesis that the 55 students are comparable to a simple random sample from the population of all test-takers. The two-sided p-value is approximately 0.014 (twice the one-sided p-value).
Another way of stating things is that with probability 1 − 0.014 = 0.986, a simple random sample of 55 students would have a mean test score within 4 units of the population mean. We could also say that with 98.6% confidence we reject the null hypothesis that the 55 test takers are comparable to a simple random sample from the population of test-takers.
The Z-test tells us that the 55 students of interest have an unusually low mean test score compared to most simple random samples of similar size from the population of test-takers. A deficiency of this analysis is that it does not consider whether the effect size of 4 points is meaningful. If instead of a classroom, we considered a subregion containing 900 students whose mean score was 99, nearly the same z-score and p-value would be observed. This shows that if the sample size is large enough, very small differences from the null value can be highly statistically significant.
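The reading-test example can be checked numerically. The sketch below recomputes the standard error, z-score and p-values from the figures given in the text (population mean 100, standard deviation 12, 55 students with mean 96), and repeats the calculation for the 900-student subregion with mean 99. The function name `one_sample_z` is an assumption made for this illustration.

```python
from math import sqrt
from statistics import NormalDist


def one_sample_z(sample_mean, mu0, sigma, n):
    """One-sample z-test with known population standard deviation sigma."""
    se = sigma / sqrt(n)                      # standard error of the mean
    z = (sample_mean - mu0) / se              # distance from mu0 in SE units
    p_lower = NormalDist().cdf(z)             # one-sided (lower-tailed) p-value
    p_two = 2 * NormalDist().cdf(-abs(z))     # two-sided p-value
    return se, z, p_lower, p_two


# Classroom of 55 students with mean score 96.
se, z, p_one, p_two = one_sample_z(96, 100, 12, 55)
print(f"SE = {se:.2f}, z = {z:.2f}, one-sided p = {p_one:.4f}, two-sided p = {p_two:.3f}")

# Subregion of 900 students with mean score 99.
se2, z2, p_one2, p_two2 = one_sample_z(99, 100, 12, 900)
print(f"SE = {se2:.2f}, z = {z2:.2f}, one-sided p = {p_one2:.4f}")
```

The second call illustrates the point about effect size: a difference of only 1 point yields a z-score close to the one for the classroom, because the larger sample shrinks the standard error.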
Z-tests other than location tests
Location tests are the most familiar Z-tests. Another class of Z-tests arises in maximum likelihood estimation of the parameters in a parametric statistical model. Maximum likelihood estimates are approximately normal under certain conditions, and their asymptotic variance can be calculated in terms of the Fisher information. The maximum likelihood estimate divided by its standard error can be used as a test statistic for the null hypothesis that the population value of the parameter equals zero. More generally, if θ̂ is the maximum likelihood estimate of a parameter θ, and θ0 is the value of θ under the null hypothesis,
$$\frac{\hat{\theta} - \theta_0}{\mathrm{SE}(\hat{\theta})}$$
can be used as a Z-test statistic.
When using a Z-test for maximum likelihood estimates, it is important to be aware that the normal approximation may be poor if the sample size is not sufficiently large. Although there is no simple, universal rule stating how large the sample size must be to use a Z-test, simulation can give a good idea as to whether a Z-test is appropriate in a given situation.
Z-tests are employed whenever it can be argued that a test statistic follows a normal distribution under the null hypothesis of interest. Many non-parametric test statistics, such as U statistics, are approximately normal for large enough sample sizes, and hence are often performed as Z-tests.
F test
The F-test is based on the F-distribution and is used to compare the variances of two independent samples. This test is also used in the context of analysis of variance (ANOVA) for judging the significance of more than two sample means at one and the same time. It is also used for judging the significance of multiple correlation coefficients. The test statistic, F, is calculated and compared with its probable value (to be seen in the F-ratio tables for different degrees of freedom for greater and smaller variances at a specified level of significance) for accepting or rejecting the null hypothesis.
An F-test is any statistical test in which the test statistic has an F-distribution under the null hypothesis. It is most often used when comparing statistical models that have been fitted to a data set, in order to identify the model that best fits the population from which the data were sampled. Exact "F-tests" mainly arise when the models have been fitted to the data using least squares. The name was coined by George W. Snedecor, in honour of Sir Ronald A. Fisher. Fisher initially developed the statistic as the variance ratio in the 1920s.
Common examples of F-tests
Common examples of the use of F-tests include the study of the following cases:
• The hypothesis that the means of a given set of normally distributed populations, all having the same standard deviation, are equal. This is perhaps the best-known F-test, and plays an important role in the analysis of variance (ANOVA).
• The hypothesis that a proposed regression model fits the data well. See Lack-of-fit sum of squares.
• The hypothesis that a data set in a regression analysis follows the simpler of two proposed linear models that are nested within each other.
In addition, some statistical procedures, such as Scheffé's method for multiple comparisons adjustment in linear models, also use F-tests.
F-test of the equality of two variances
The F-test is sensitive to non-normality.[2][3] In the analysis of variance (ANOVA), alternative tests include Levene's test, Bartlett's test, and the Brown–Forsythe test. However, when any of these tests are conducted to test the underlying assumption of homoscedasticity (i.e. homogeneity of variance), as a preliminary step to testing for mean effects, there is an increase in the experiment-wise Type I error rate.[4]
Formula and calculation
Most F-tests arise by considering a decomposition of the variability in a collection of data in terms of sums of squares. The test statistic in an F-test is the ratio of two scaled sums of squares reflecting different sources of variability. These sums of squares are constructed so that the statistic tends to be greater when the null hypothesis is not true. In order for the statistic to follow the F-distribution under the null hypothesis, the sums of squares should be statistically independent, and each should follow a scaled chi-squared distribution. The latter condition is guaranteed if the data values are independent and normally distributed with a common variance.
Multiple-comparison ANOVA problems
The F-test in one-way analysis of variance is used to assess whether the expected values of a quantitative variable within several pre-defined groups differ from each other. For example, suppose that a medical trial compares four treatments. The ANOVA F-test can be used to assess whether any of the treatments is on average superior, or inferior, to the others versus the null hypothesis that all four treatments yield the same mean response. This is an example of an "omnibus" test, meaning that a single test is performed to detect any of several possible differences. Alternatively, we could carry out pairwise tests among the treatments (for instance, in the medical trial example with four treatments we could carry out six tests among pairs of treatments). The advantage of the ANOVA F-test is that we do not need to pre-specify which treatments are to be compared, and we do not need to adjust for making multiple comparisons. The disadvantage of the ANOVA F-test is that if we reject the null hypothesis, we do not know which treatments can be said to be significantly different from the others: if the F-test is performed at level α we cannot state that the treatment pair with the greatest mean difference is significantly different at level α.
The formula for the one-way ANOVA F-test statistic is
$$F = \frac{\text{explained variance}}{\text{unexplained variance}},$$
or
$$F = \frac{\text{between-group variability}}{\text{within-group variability}}.$$
The "explained variance", or "between-group variability" is
$$\sum_{i=1}^{K} n_i (\bar{Y}_{i\cdot} - \bar{Y})^2 / (K - 1)$$
where $\bar{Y}_{i\cdot}$ denotes the sample mean in the i-th group, $n_i$ is the number of observations in the i-th group, $\bar{Y}$ denotes the overall mean of the data, and K denotes the number of groups.
The "unexplained variance", or "within-group variability" is
$$\sum_{i=1}^{K} \sum_{j=1}^{n_i} (Y_{ij} - \bar{Y}_{i\cdot})^2 / (N - K)$$
where $Y_{ij}$ is the j-th observation in the i-th out of K groups and N is the overall sample size. This F-statistic follows the F-distribution with K − 1, N − K degrees of freedom under the null hypothesis. The statistic will be large if the between-group variability is large relative to the within-group variability, which is unlikely to happen if the population means of the groups all have the same value.
Note that when there are only two groups for the one-way ANOVA F-test, F = t², where t is the Student's t statistic.
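The identity F = t² for two groups can be checked numerically. The sketch below assumes the usual pooled-variance (equal-variance) form of the Student's t statistic and the between-group over within-group form of the one-way ANOVA F statistic; the function names and the two small data sets are illustrative assumptions.

```python
from statistics import mean


def pooled_t(x, y):
    """Two-sample Student's t statistic with pooled variance."""
    n1, n2 = len(x), len(y)
    m1, m2 = mean(x), mean(y)
    ss1 = sum((v - m1) ** 2 for v in x)
    ss2 = sum((v - m2) ** 2 for v in y)
    pooled_var = (ss1 + ss2) / (n1 + n2 - 2)
    return (m1 - m2) / (pooled_var * (1 / n1 + 1 / n2)) ** 0.5


def one_way_f(groups):
    """One-way ANOVA F: between-group variability over within-group variability."""
    all_values = [v for g in groups for v in g]
    grand_mean = mean(all_values)
    k, n_total = len(groups), len(all_values)
    between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups) / (k - 1)
    within = sum((v - mean(g)) ** 2 for g in groups for v in g) / (n_total - k)
    return between / within


# Hypothetical data for two groups, for illustration only.
x = [6, 8, 4, 5, 3, 4]
y = [8, 12, 9, 11, 6, 8]

t_stat = pooled_t(x, y)
f_stat = one_way_f([x, y])
print(f"t = {t_stat:.3f}, t^2 = {t_stat ** 2:.3f}, F = {f_stat:.3f}")
```

With two groups the within-group mean square equals the pooled variance, which is why the two statistics agree up to rounding.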
Regression problems
Consider two models, 1 and 2, where model 1 is 'nested' within model 2. Model 1 is the restricted model, and model 2 is the unrestricted one. That is, model 1 has p1 parameters, and model 2 has p2 parameters, where p2 > p1, and for any choice of parameters in model 1, the same regression curve can be achieved by some choice of the parameters of model 2. (We use the convention that any constant parameter in a model is included when counting the parameters. For instance, the simple linear model y = mx + b has p = 2 under this convention.) The model with more parameters will always be able to fit the data at least as well as the model with fewer parameters. Thus typically model 2 will give a better (i.e. lower error) fit to the data than model 1. But one often wants to determine whether model 2 gives a significantly better fit to the data. One approach to this problem is to use an F test.
If there are n data points to estimate parameters of both models from, then one can calculate the F statistic, given by
$$F = \frac{(\mathrm{RSS}_1 - \mathrm{RSS}_2) / (p_2 - p_1)}{\mathrm{RSS}_2 / (n - p_2)}$$
where RSSi is the residual sum of squares of model i. If your regression model has been calculated with weights, then replace RSSi with χ², the weighted sum of squared residuals. Under the null hypothesis that model 2 does not provide a significantly better fit than model 1, F will have an F distribution, with (p2 − p1, n − p2) degrees of freedom. The null hypothesis is rejected if the F calculated from the data is greater than the critical value of the F-distribution for some desired false-rejection probability (e.g. 0.05). The F-test is a Wald test.
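As a rough illustration of the nested-model comparison, the sketch below fits a constant model (p1 = 1) and a straight line y = mx + b (p2 = 2) by ordinary least squares and computes the F statistic. It assumes the standard form F = ((RSS1 − RSS2) / (p2 − p1)) / (RSS2 / (n − p2)), which matches the degrees of freedom (p2 − p1, n − p2) stated above. The function names and the data are hypothetical.

```python
def nested_f_statistic(rss1, rss2, p1, p2, n):
    """F statistic for comparing nested least-squares models (model 1 nested in model 2)."""
    if not (p2 > p1 and n > p2):
        raise ValueError("need p2 > p1 and n > p2")
    return ((rss1 - rss2) / (p2 - p1)) / (rss2 / (n - p2))


def rss_constant(y):
    """Residual sum of squares of model 1: y = b (p1 = 1)."""
    y_bar = sum(y) / len(y)
    return sum((v - y_bar) ** 2 for v in y)


def rss_line(x, y):
    """Residual sum of squares of model 2: y = m*x + b (p2 = 2), fitted by least squares."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    sxx = sum((a - x_bar) ** 2 for a in x)
    sxy = sum((a - x_bar) * (b - y_bar) for a, b in zip(x, y))
    m = sxy / sxx
    b = y_bar - m * x_bar
    return sum((yi - (m * xi + b)) ** 2 for xi, yi in zip(x, y))


# Hypothetical data, for illustration only.
x = [1, 2, 3, 4, 5, 6, 7, 8]
y = [2.3, 2.9, 4.1, 4.8, 6.2, 6.8, 8.1, 9.0]

rss1 = rss_constant(y)
rss2 = rss_line(x, y)
f_stat = nested_f_statistic(rss1, rss2, p1=1, p2=2, n=len(y))
print(f"RSS1 = {rss1:.3f}, RSS2 = {rss2:.3f}, F = {f_stat:.1f}")
```

To complete the test, the computed F would be compared with the critical value of the F-distribution with (p2 − p1, n − p2) = (1, 6) degrees of freedom, read from an F table at the chosen significance level.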
One-way ANOVA example
Consider an experiment to study the effect of three different levels of a factor on a response (e.g. three levels of a fertilizer on plant growth). If we had 6 observations for each level, we could write the outcome of the experiment in a table like this, where a1, a2, and a3 are the three levels of the factor being studied.
a1    a2    a3
6     8     13
8     12    9
4     9     11
5     11    8
3     6     7
4     8     12
The null hypothesis, denoted H0, for the overall F-test for this experiment would be that all three levels of the factor produce the same response, on average. To calculate the F-ratio:
Step 1: Calculate the mean within each group:
$$\bar{Y}_1 = \tfrac{1}{6}(6 + 8 + 4 + 5 + 3 + 4) = 5$$
$$\bar{Y}_2 = \tfrac{1}{6}(8 + 12 + 9 + 11 + 6 + 8) = 9$$
$$\bar{Y}_3 = \tfrac{1}{6}(13 + 9 + 11 + 8 + 7 + 12) = 10$$
Step 2: Calculate the overall mean:
$$\bar{Y} = \frac{\sum_i \bar{Y}_i}{a} = \frac{5 + 9 + 10}{3} = 8$$
where a is the number of groups.
Step 3: Calculate the "between-group" sum of squares:
$$S_B = n(\bar{Y}_1 - \bar{Y})^2 + n(\bar{Y}_2 - \bar{Y})^2 + n(\bar{Y}_3 - \bar{Y})^2 = 6(5 - 8)^2 + 6(9 - 8)^2 + 6(10 - 8)^2 = 84$$
where n is the number of data values per group.
The between-group degrees of freedom is one less than the number of groups
$$f_b = 3 - 1 = 2$$
so the between-group mean square value is
$$MS_B = 84 / 2 = 42$$
Step 4: Calculate the "within-group" sum of squares. Begin by centering the data in each group
a1         a2          a3
6−5=1      8−9=−1      13−10=3
8−5=3      12−9=3      9−10=−1
4−5=−1     9−9=0       11−10=1
5−5=0      11−9=2      8−10=−2
3−5=−2     6−9=−3      7−10=−3
4−5=−1     8−9=−1      12−10=2
The within-group sum of squares is the sum of squares of all 18 values in this table
$$S_W = 1 + 9 + 1 + 0 + 4 + 1 + 1 + 9 + 0 + 4 + 9 + 1 + 9 + 1 + 1 + 4 + 9 + 4 = 68$$
The within-group degrees of freedom is
$$f_W = a(n - 1) = 3(6 - 1) = 15$$
Thus the within-group mean square value is
$$MS_W = S_W / f_W = 68 / 15 \approx 4.5$$
Step 5: The F-ratio is
$$F = \frac{MS_B}{MS_W} \approx 42 / 4.5 \approx 9.3$$
The critical value is the number that the test statistic must exceed to reject the null hypothesis. In this case, Fcrit(2,15) = 3.68 at α = 0.05. Since F = 9.3 > 3.68, the results are significant at the 5% significance level. One would reject the null hypothesis, concluding that there is strong evidence that the expected values in the three groups differ. The p-value for this test is 0.002.
After performing the F-test, it is common to carry out some "post-hoc" analysis of the group means. In this case, the first two group means differ by 4 units, the first and third group means differ by 5 units, and the second and third group means differ by only 1 unit. The standard error of each of these differences is $\sqrt{4.5/6 + 4.5/6} \approx 1.2$. Thus the first group is strongly different from the other groups, as the mean difference is more than three times the standard error, so we can be highly confident that the population mean of the first group differs from the population means of the other groups. However, there is no evidence that the second and third groups have different population means from each other, as their mean difference of one unit is comparable to the standard error.
Note: F(x, y) denotes an F-distribution cumulative distribution function with x degrees of freedom in the numerator and y degrees of freedom in the denominator.
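The five steps of the one-way ANOVA example, together with the post-hoc standard error, can be reproduced with a short script. The data and the critical value 3.68 come from the example; the variable names are assumptions made for this sketch, and the standard error uses the usual form sqrt(MS_W/n + MS_W/n) for a difference between two group means.

```python
from math import sqrt
from statistics import mean

# Observations for the three factor levels in the example.
groups = {
    "a1": [6, 8, 4, 5, 3, 4],
    "a2": [8, 12, 9, 11, 6, 8],
    "a3": [13, 9, 11, 8, 7, 12],
}
a = len(groups)   # number of groups
n = 6             # observations per group

# Step 1: group means. Step 2: overall mean (mean of the group means, equal group sizes).
group_means = {name: mean(values) for name, values in groups.items()}
overall_mean = mean(list(group_means.values()))

# Step 3: between-group sum of squares, degrees of freedom and mean square.
ss_between = sum(n * (m - overall_mean) ** 2 for m in group_means.values())
df_between = a - 1
ms_between = ss_between / df_between

# Step 4: within-group sum of squares, degrees of freedom and mean square.
ss_within = sum((v - group_means[name]) ** 2 for name, values in groups.items() for v in values)
df_within = a * (n - 1)
ms_within = ss_within / df_within

# Step 5: F-ratio, compared with the critical value quoted in the example.
f_ratio = ms_between / ms_within
f_crit = 3.68  # Fcrit(2, 15) at alpha = 0.05, as given in the text
print(f"S_B = {ss_between}, S_W = {ss_within}, F = {f_ratio:.2f}, reject H0: {f_ratio > f_crit}")

# Post-hoc: standard error of the difference between two group means.
se_diff = sqrt(ms_within / n + ms_within / n)
print(f"standard error of a mean difference = {se_diff:.2f}")
```

The script keeps full precision for the within-group mean square (68/15), so the printed F-ratio differs slightly in the second decimal place from the hand calculation that rounds this value to 4.5.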
ANOVA's robustness with respect to Type I errors for departures from population normality
The one-way ANOVA can be generalized to the factorial and multivariate layouts, as well as to the analysis of covariance.
It is often stated in popular literature that none of these F-tests are robust when there are severe violations of the assumption that each population follows the normal distribution, particularly for small alpha levels and unbalanced layouts.[5] Furthermore, it is also claimed that if the underlying assumption of homoscedasticity is violated, the Type I error properties degenerate much more severely.[6]
However, this is a misconception, based on work done in the 1950s and earlier. The first comprehensive investigation of the issue by Monte Carlo simulation was Donaldson (1966).[7] He showed that under the usual departures (positive skew, unequal variances) "the F-test is conservative", so it is less likely than it should be to find that a variable is significant. However, as either the sample size or the number of cells increases, "the power curves seem to converge to that based on the normal distribution". More detailed work was done by Tiku (1971).[8] He found that "The non-normal theory power of F is found to differ from the normal theory power by a correction term which decreases sharply with increasing sample size." The problem of non-normality, especially in large samples, is far less serious than popular articles would suggest.
The current view is that "Monte-Carlo studies were used extensively with normal distribution-based tests to determine how sensitive they are to violations of the assumption of normal distribution of the analyzed variables in the population. The general conclusion from these studies is that the consequences of such violations are less severe than previously thought. Although these conclusions should not entirely discourage anyone from being concerned about the normality assumption, they have increased the overall popularity of the distribution-dependent statistical tests in all areas of research."[9]
For nonparametric alternatives in the factorial layout, see Sawilowsky.[10] For more discussion see ANOVA on ranks.
References
1. Lomax, Richard G. (2007). Statistical Concepts: A Second Course, p. 10. ISBN 0-8058-5850-4.
2. Box, G. E. P. (1953). "Non-Normality and Tests on Variances". Biometrika 40 (3/4): 318–335. doi:10.1093/biomet/40.3-4.318. JSTOR 2333350.
3. Markowski, Carol A.; Markowski, Edward P. (1990). "Conditions for the Effectiveness of a Preliminary Test of Variance". The American Statistician 44 (4): 322–326. doi:10.2307/2684360. JSTOR 2684360.
4. Sawilowsky, S. (2002). "Fermat, Schubert, Einstein, and Behrens-Fisher: The Probable Difference Between Two Means When σ1² ≠ σ2²". Journal of Modern Applied Statistical Methods, 1(2), 461–472.
5. Blair, R. C. (1981). "A reaction to 'Consequences of failure to meet assumptions underlying the fixed effects analysis of variance and covariance.'" Review of Educational Research, 51, 499–507.
6. Randolf, E. A., & Barcikowski, R. S. (1989, November). "Type I error rate when real study values are used as population parameters in a Monte Carlo study". Paper presented at the 11th annual meeting of the Mid-Western Educational Research Association, Chicago.
7. https://www.rand.org/content/dam/rand/pubs/research_memoranda/2008/RM5072.pdf
8. Tiku, M. L. (1971). "Power Function of the F-Test Under Non-Normal Situations". Journal of the American Statistical Association, Vol. 66, No. 336 (Dec. 1971), p. 913.
9. https://www.statsoft.com/textbook/elementary-statistics-concepts/
10. Sawilowsky, S. (1990). Nonparametric tests of interaction in experimental design. Review of Educational Research, 25(20–59).
Hypothesis testing

  • 1. Page 1 of 52 Hypothesis testing and interpretation of data Testing Of Hypothesis The basic logic of hypothesis testing is to prove or disprove the research question. When a researcher conducts quantitatively research, he/she is attempting to answer a research question or hypothesis that has been formulated .One method of evaluating this research question is via a process called hypothesis testing, which is sometimes also referred to as significance testing. Example : Two lecturers, Sandy and Mandy, thinks that they use the best method to teach their students. Each lecturer has 50 statistics student who are studying a graduate degree in management. In sandy’s class, students have to attend one lecture and one seminar class every week, whilst in Mandy believes that lectures are sufficient by themselves in their own time. This is the first year that Sandy has given seminars, but since they take up a lot of her time, she wants to make sure that she is not wasting her time and that seminars improve the students’ performance. The ResearchHypothesis The first step in hypothesis testing is to set a research hypothesis. In a sandy and mandy,s study, the aim is to examine the effect that two different teaching methods – providing both lectures and seminars classes (sandy), and providing only lectures by themselves (mandy) – had on the performance of the students. More specifically , they want to determine whether performance is different between the two different teaching methods. Whilst mandy is skeptical about the
  • 2. Page 2 of 52 effectiveness of seminars, sandy clearly believes that students do better than those in mandy’s class. This leads to the following research hypothesis: Researchhypothesis: When student attend seminar classes, in addition to lecture, their performance increases. By taking a hypothesis testing approach, Sandy and Mandy want to generalize their result toa population(total students) rather than just the students in their sample. However, in order to use hypothesis testing, one needs to re-state the research hypothesis as a null and alternative hypothesis. Null hypothesis : the null hypothesis (H0) is a hypothesis which the researcher tries to disprove, reject or nullify. A null hypothesis is “the hypothesis that there is no relationship between two or more variables, symbolized as H0. Alternative hypothesis: the alternate, or research, hypothesis proposes a relationship between two or more variables, symbolized as H1. Decision errors Two type of errors can result from a hypothesis test. TypeⅠerror : A typeⅠerror occurs when the researcher rejects anull hypothesis when it is true. The probability of committing a type error is called the significance level. This probability is also called alpha, and is often denoted by α
  • 3. Page 3 of 52 Type Ⅱerror : A Type Ⅱ error occurs when the researcher fails to reject a null hypothesis, which is false. The probability of committing a Type Ⅱ error is called Beta, and is often denoted by β . The probability of not committing a TypeⅡ error is called the Power of the test.
  • 4. Page 4 of 52 Steps/procedures in Hypothesis Testing 1. Identify the research problem : The first step is to state the research problem The research problem needs to identify the population of interest ,and the variables under investigation. Example of research problem: To find out the effectiveness of two teaching methods- only lecture method- with reference to exam marks of the students. In the above research problem, the population of interest refers to the student, and the variable include the teaching methods and the marks. This step enable the researcher not only define what is not to be tested but what variable(s) will be used in sample data collection. The type of variable(s), wheter categorical, discreate or continuous, further defines the statistical test which can be performed on the collected data. 2.Specific the null and alternative Hypothesis: The research problem or question is converted into a null hypothesis and an alternative hypothesis. The hypothesis. The hypotheses are started in such a way that they are mutually exclusive. That is, if one is true, the other must be false. (a)Null Hypothesis: A null hypothesis (H0)is a statement that declares the observed difference is due to “chance”. It is the hypothesis the researcher hopes to reject or disprove. A null hypothesis states that there is no relationship between two or more variables. The simplistic definition of the null is - as the opposite of the alternative hypothesis(H1). Example: “There is no difference between the two methods of teaching( only lecture method, and lecture-cum-seminar method) on the scoring of marks of student.”
  • 5. Page 5 of 52 (b) Alternative Hypothesis: The alternative hypothesis, symbolized as H1, proposes a relationship between two or more variables. Example: “The lecture-cum-seminar method improves the marks scored by students as compared to the lecture-only method.” “Note that the two hypotheses we propose to test must be mutually exclusive, i.e., when one is true the other must be false. And we see that they must be exhaustive; they must include all possible occurrences.” From the above, it is clear that the null hypothesis is a hypothesis of no difference. The main problem in hypothesis testing is whether to accept or reject the null hypothesis. The alternative hypothesis specifies a definite relationship between the two variables. Only one alternative hypothesis is tested against the null hypothesis. 3. Significance Level: After formulating the hypotheses, the researcher must choose a level of significance. The confidence with which a null hypothesis is accepted or rejected depends on the level of significance. Generally, the level of significance falls between 1% and 5%. A significance level of 5% means the researcher accepts the risk of rejecting a true null hypothesis on 5 out of 100 occasions. A significance level of 1% means this risk is 1%: the researcher may wrongly reject a true null hypothesis
  • 6. Page 6 of 52 once out of 100 occasions. Therefore, a 1% level of significance provides greater confidence in a decision to reject the null hypothesis than a 5% level of significance. 4. Test Statistic: A statistic used to test the null hypothesis. The researcher needs to identify a test statistic that can be used to decide whether the null hypothesis should be accepted or rejected. The test statistic is calculated from the collected data. There are different types of test statistics. For instance, the z statistic compares the observed sample mean to an expected population mean μ0. Large test statistics indicate that the data are far from what is expected, providing evidence against the null hypothesis and in favor of the alternative hypothesis. Every statistical test works the same way: based on the sample data, it gives the probability (p-value) of observing data at least as extreme as the sample if the null hypothesis is true. When the p-value is low, the sample data are statistically significant, which is evidence against the null hypothesis. When the p-value is high, the collected data are within the range expected under the null hypothesis. 5. Region of Acceptance and Region of Rejection: The region of acceptance is a range of values. If the test statistic falls within the region of acceptance, the null hypothesis is not rejected. The region of acceptance is defined so that the chance of making a Type I error is equal to the alpha (α) level of significance.
  • 7. Page 7 of 52 Type I error: a rejection of a true null hypothesis. The set of values outside the region of acceptance is called the region of rejection. If the test statistic falls within the region of rejection, the null hypothesis is rejected at the alpha (α) level of significance. 6. Select an Appropriate Test: A hypothesis test may be one-tailed or two-tailed. Whether the test is one-sided or two-sided depends on the alternative hypothesis and the nature of the problem. A test of a statistical hypothesis where the region of rejection is on only one side of the sampling distribution is called a one-tailed test. For example, suppose the null hypothesis states that the mean is less than or equal to 10. The alternative hypothesis would be that the mean is greater than 10. The region of rejection would consist of a range of numbers located on the right side of the sampling distribution; that is, a set of numbers greater than 10. In simple words, in a one-tailed test, the values of the test statistic that lead to rejection of the null hypothesis fall on only one side of the sampling distribution curve.
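The one-tailed example above (null hypothesis: mean less than or equal to 10) can be sketched in Python. This is only an illustration under stated assumptions: the population standard deviation (2) and the sample size (25) are hypothetical values not given in the text, the function names are made up for this sketch, and a z statistic is assumed to be appropriate.

```python
from math import sqrt
from statistics import NormalDist


def upper_tail_critical_mean(mu0, sigma, n, alpha=0.05):
    """Boundary of the rejection region for H0: mean <= mu0 (one-tailed z-test).

    Sample means above the returned value fall in the region of rejection.
    """
    z_crit = NormalDist().inv_cdf(1 - alpha)  # upper-tail critical z value
    standard_error = sigma / sqrt(n)
    return mu0 + z_crit * standard_error


def reject_null(sample_mean, mu0, sigma, n, alpha=0.05):
    """True if the sample mean lies in the right-hand rejection region."""
    return sample_mean > upper_tail_critical_mean(mu0, sigma, n, alpha)


# Hypothetical inputs: sigma=2 and n=25 are assumptions for illustration only.
critical_mean = upper_tail_critical_mean(mu0=10, sigma=2, n=25)
print(f"Reject H0 when the sample mean exceeds {critical_mean:.3f}")
```

Under these assumed inputs the boundary lies somewhat above 10, which shows that the rejection region in practice starts at a critical value to the right of the hypothesised mean rather than at 10 itself.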
  • 8. Page 8 of 52 Significance Level In hypothesis testing, the significance level is the criterion used for rejecting the null hypothesis. The significance level is used in hypothesis testing as follows: First, the difference between the results of the experiment and the null hypothesis is determined. Then, assuming the null hypothesis is true, the probability of a difference that large or larger is computed. Finally, this probability is compared to the significance level. If the probability is less than or equal to the significance level, then the null hypothesis is rejected and the outcome is said to be statistically significant. Traditionally, experimenters have used either the 0.05 level (sometimes called the 5% level) or the 0.01 level (1% level), although the choice of levels is largely subjective. The lower the significance level, the more the data must diverge from the null hypothesis to be significant. Therefore, the 0.01 level is more conservative than the 0.05 level. The Greek letter alpha (α) is sometimes used to indicate the significance level.
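To illustrate why the 0.01 level is more conservative than the 0.05 level, the short Python sketch below compares the two-tailed critical values of a standard normal test statistic. It assumes a z statistic; the function name is illustrative and not taken from the text.

```python
from statistics import NormalDist


def two_tailed_critical_z(alpha):
    """Critical |z| for a two-tailed test at significance level alpha."""
    return NormalDist().inv_cdf(1 - alpha / 2)


z_05 = two_tailed_critical_z(0.05)  # about 1.96
z_01 = two_tailed_critical_z(0.01)  # about 2.58
print(f"alpha=0.05 -> |z| must exceed {z_05:.2f}")
print(f"alpha=0.01 -> |z| must exceed {z_01:.2f}")
```

The larger threshold at the 1% level means the data must diverge further from the null hypothesis before the result is called significant.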
  • 9. Page 9 of 52 5) Identify the rejection region • Is it an upper, lower, or two-tailed test? • Determine the critical value associated with α, the level of significance of the test. The third step is to compute the probability value (also known as the p-value). This is the probability of obtaining a sample statistic as different or more different from the parameter specified in the null hypothesis, given that the null hypothesis is true.
  • 12. Page 12 of 52 Hypothesis testing
  • 14. Page 14 of 52 PARAMETRIC TESTS 1. Descriptive Statistics: an overview of the attributes of a data set. These include measurements of central tendency (frequency histograms, mean, median and mode) and dispersion (range, variance and standard deviation). 2. Inferential Statistics: provide measures of how well data support a hypothesis and whether the data are generalizable beyond what was tested (significance tests). Data: observations recorded during research. Types of data: 1. Nominal data: synonymous with categorical data; names/categories are assigned based on characteristics, without ranking between categories, e.g. male/female, yes/no, death/survival. 2. Ordinal data: ordered or graded data, expressed as scores or ranks, e.g. pain graded as mild, moderate and severe. 3. Interval data: there is an equal and definite interval between two measurements; it can be continuous or discrete, e.g. weight expressed as 20, 21, 22, 23, 24, where the interval between 20 and 21 is the same as that between 23 and 24.
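The descriptive statistics listed above (central tendency and dispersion) can be computed with Python's standard library. The data below are hypothetical weights chosen only for illustration; they are not from the source.

```python
import statistics

# Hypothetical interval-scale data (weights); values are illustrative only.
weights = [20, 21, 22, 22, 23, 24]

summary = {
    "mean": statistics.mean(weights),
    "median": statistics.median(weights),
    "mode": statistics.mode(weights),
    "range": max(weights) - min(weights),
    "variance": statistics.variance(weights),  # sample variance (n - 1 divisor)
    "std_dev": statistics.stdev(weights),      # sample standard deviation
}

for name, value in summary.items():
    print(f"{name}: {value}")
```

Note that `statistics.variance` and `statistics.stdev` use the sample (n − 1) divisor; `statistics.pvariance` and `statistics.pstdev` would treat the data as a whole population.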
  • 16. Page 16 of 52 Parametric hypothesis tests are frequently used to measure the quality of sample parameters or to test whether estimates of a given parameter are equal for two samples. Parametric hypothesis tests set up a null hypothesis against an alternative hypothesis, testing, for instance, whether or not the population mean is equal to a certain value, and then use appropriate statistics to calculate the probability of observing the sample data if the null hypothesis is true. You can then reject or accept the null hypothesis based on the calculated probability.
  • 17. Page 17 of 52 Z test The z-test is based on the normal probability distribution and is used for judging the significance of several statistical measures, particularly the mean. The relevant test statistic, z, is worked out and compared with its probable value (to be read from a table showing the area under the normal curve) at a specified level of significance for judging the significance of the measure concerned. This is the most frequently used test in research studies. This test is used even when the binomial distribution or t-distribution is applicable, on the presumption that such a distribution tends to approximate the normal distribution as ‘n’ becomes larger. The z-test is generally used for comparing the mean of a sample to some hypothesised mean for the population in the case of a large sample, or when the population variance is known. The z-test is also used for judging the significance of the difference between the means of two independent samples in the case of large samples, or when the population variance is known. The z-test is also used for comparing the sample proportion to a theoretical value of the population proportion, or for judging the difference in proportions of two independent samples when n happens to be large. Besides, this test may be used for judging the significance of the median, mode, coefficient of correlation and several other measures. The t-test is based on the t-distribution and is considered an appropriate test for judging the significance of a sample mean, or for judging the significance of the difference between the means of two samples, in the case of small sample(s) when the population variance is not known (in which case we use the variance of the sample as an estimate of the population variance). In case two samples are related, we use the paired t-test (also known as the difference test) for judging the significance of the mean of the difference between the two related samples.
It can also be used for judging the significance of the coefficients of simple and partial correlations. The relevant test statistic, t, is calculated from the sample data and then compared with its probable value based on the t-distribution (to be read from the table that gives probable values of t for different levels of significance and different degrees of freedom) at a specified level of significance for the relevant degrees of freedom, for accepting or rejecting the null hypothesis. It may be noted that the t-test is mainly needed in the case of small sample(s) when the population variance is unknown. A Z-test is any statistical test for which the distribution of the test statistic under the null hypothesis can be approximated by a normal distribution. Because of the central limit theorem, many test statistics are approximately normally distributed for large samples. For each significance level, the Z-test has a single critical value (for example, 1.96 for 5% two-tailed), which makes it more convenient than the Student's t-test, which has separate critical values for each sample size. Therefore, many statistical tests can be conveniently performed as approximate Z-tests if the sample size is large or the population variance is known. If the population variance is unknown (and therefore has to be estimated from the sample itself) and the sample size is not large (n < 30), the Student's t-test may be more appropriate. If T is a statistic that is approximately normally distributed under the null hypothesis, the next step in performing a Z-test is to estimate the expected value θ of T under the null hypothesis, and then obtain an estimate s of the standard deviation of T. After that the standard score Z = (T − θ) / s is calculated, from which one-tailed and two-tailed p-values can be calculated as Φ(−Z) (for upper-tailed tests), Φ(Z) (for lower-tailed tests) and 2Φ(−|Z|) (for two-tailed tests), where Φ is the standard normal cumulative distribution function.
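The standard score and the three p-value formulas above translate directly into code. The following is a minimal sketch using Python's standard library; the function name and the dictionary keys are illustrative choices, and the inputs in the usage line are hypothetical.

```python
from statistics import NormalDist


def z_test_p_values(t_stat, theta0, s):
    """Standard score Z = (T - theta) / s and its one- and two-tailed p-values."""
    z = (t_stat - theta0) / s
    phi = NormalDist().cdf  # standard normal cumulative distribution function
    return {
        "z": z,
        "upper_tailed": phi(-z),           # Phi(-Z)
        "lower_tailed": phi(z),            # Phi(Z)
        "two_tailed": 2 * phi(-abs(z)),    # 2 * Phi(-|Z|)
    }


# Hypothetical inputs chosen so that Z equals the familiar 1.96 critical value.
result = z_test_p_values(t_stat=1.96, theta0=0.0, s=1.0)
print(result)
```

With Z = 1.96 the two-tailed p-value comes out very close to 0.05, matching the critical value quoted in the text for a 5% two-tailed test.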
  • 18. Page 18 of 52 Use in location testing The term "Z-test" is often used to refer specifically to the one-sample location test comparing the mean of a set of measurements to a given constant. If the observed data X1, ..., Xn are (i) uncorrelated, (ii) have a common mean μ, and (iii) have a common variance σ², then the sample average X̄ has mean μ and variance σ² / n. If our null hypothesis is that the mean value of the population is a given number μ0, we can use X̄ − μ0 as a test statistic, rejecting the null hypothesis if X̄ − μ0 is large. To calculate the standardized statistic Z = (X̄ − μ0) / s, we need to either know or have an approximate value for σ², from which we can calculate s² = σ² / n. In some applications, σ² is known, but this is uncommon. If the sample size is moderate or large, we can substitute the sample variance for σ², giving a plug-in test. The resulting test will not be an exact Z-test since the uncertainty in the sample variance is not accounted for; however, it will be a good approximation unless the sample size is small. A t-test can be used to account for the uncertainty in the sample variance when the sample size is small and the data are exactly normal. There is no universal constant at which the sample size is generally considered large enough to justify use of the plug-in test. Typical rules of thumb range from 20 to 50 samples. For larger sample sizes, the t-test procedure gives almost identical p-values to the Z-test procedure. Other location tests that can be performed as Z-tests are the two-sample location test and the paired difference test. Conditions For the Z-test to be applicable, certain conditions must be met. • Nuisance parameters should be known, or estimated with high accuracy (an example of a nuisance parameter would be the standard deviation in a one-sample location test).
Z-tests focus on a single parameter, and treat all other unknown parameters as being fixed at their true values. In practice, due to Slutsky's theorem, "plugging in" consistent estimates of nuisance parameters can be justified. However, if the sample size is not large enough for these estimates to be reasonably accurate, the Z-test may not perform well. • The test statistic should follow a normal distribution. Generally, one appeals to the central limit theorem to justify assuming that a test statistic varies normally. There is a great deal of statistical research on the question of when a test statistic varies approximately normally. If the variation of the test statistic is strongly non-normal, a Z-test should not be used. If estimates of nuisance parameters are plugged in as discussed above, it is important to use estimates appropriate for the way the data were sampled. In the special case of Z-tests for the one or two sample location problem, the usual sample standard deviation is only appropriate if the data were collected as an independent sample. In some situations, it is possible to devise a test that properly accounts for the variation in plug-in estimates of nuisance parameters. In the case of one and two sample location problems, a t-test does this. Example
  • 19. Page 19 of 52 Suppose that in a particular geographic region, the mean and standard deviation of scores on a reading test are 100 points and 12 points, respectively. Our interest is in the scores of 55 students in a particular school who received a mean score of 96. We can ask whether this mean score is significantly lower than the regional mean; that is, are the students in this school comparable to a simple random sample of 55 students from the region as a whole, or are their scores surprisingly low? We begin by calculating the standard error of the mean: $\mathrm{SE} = \sigma / \sqrt{n} = 12 / \sqrt{55} \approx 1.62$, where $\sigma$ is the population standard deviation. Next we calculate the z-score, which is the distance from the sample mean to the population mean in units of the standard error: $z = (\bar{X} - \mu) / \mathrm{SE} = (96 - 100) / 1.62 \approx -2.47$. In this example, we treat the population mean and variance as known, which would be appropriate if all students in the region were tested. When population parameters are unknown, a t-test should be conducted instead. The classroom mean score is 96, which is −2.47 standard error units from the population mean of 100. Looking up the z-score in a table of the standard normal distribution, we find that the probability of observing a standard normal value below −2.47 is approximately 0.5 − 0.4932 = 0.0068. This is the one-sided p-value for the null hypothesis that the 55 students are comparable to a simple random sample from the population of all test-takers. The two-sided p-value is approximately 0.014 (twice the one-sided p-value). Another way of stating things is that with probability 1 − 0.014 = 0.986, a simple random sample of 55 students would have a mean test score within 4 units of the population mean. We could also say that with 98.6% confidence we reject the null hypothesis that the 55 test-takers are comparable to a simple random sample from the population of test-takers.
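The arithmetic of this reading-test example can be reproduced in a few lines of Python. This is a sketch using the figures stated in the text (mean 100, standard deviation 12, 55 students, sample mean 96); the variable names are illustrative. Because the code uses the unrounded z-score rather than the table value at −2.47, its p-values may differ slightly in the last digit from 0.0068 and 0.014.

```python
from math import sqrt
from statistics import NormalDist

# Figures taken from the example in the text.
population_mean = 100
population_sd = 12
n = 55
sample_mean = 96

standard_error = population_sd / sqrt(n)
z = (sample_mean - population_mean) / standard_error

one_sided_p = NormalDist().cdf(z)  # probability of a standard normal value below z
two_sided_p = 2 * one_sided_p

print(f"SE = {standard_error:.2f}, z = {z:.2f}")
print(f"one-sided p = {one_sided_p:.4f}, two-sided p = {two_sided_p:.4f}")
```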
The Z-test tells us that the 55 students of interest have an unusually low mean test score compared to most simple random samples of similar size from the population of test-takers. A deficiency of this analysis is that it does not consider whether the effect size of 4 points is meaningful. If instead of a classroom we considered a subregion containing 900 students whose mean score was 99, nearly the same z-score and p-value would be observed. This shows that if the sample size is large enough, very small differences from the null value can be highly statistically significant. See statistical hypothesis testing for further discussion of this issue. Z-tests other than location tests Location tests are the most familiar Z-tests. Another class of Z-tests arises in maximum likelihood estimation of the parameters in a parametric statistical model. Maximum likelihood estimates are approximately normal under certain conditions, and their asymptotic variance can be calculated in
  • 20. Page 20 of 52 terms of the Fisher information. The maximum likelihood estimate divided by its standard error can be used as a test statistic for the null hypothesis that the population value of the parameter equals zero. More generally, if $\hat{\theta}$ is the maximum likelihood estimate of a parameter θ, and θ0 is the value of θ under the null hypothesis, $(\hat{\theta} - \theta_0) / \mathrm{SE}(\hat{\theta})$ can be used as a Z-test statistic. When using a Z-test for maximum likelihood estimates, it is important to be aware that the normal approximation may be poor if the sample size is not sufficiently large. Although there is no simple, universal rule stating how large the sample size must be to use a Z-test, simulation can give a good idea as to whether a Z-test is appropriate in a given situation. Z-tests are employed whenever it can be argued that a test statistic follows a normal distribution under the null hypothesis of interest. Many non-parametric test statistics, such as U statistics, are approximately normal for large enough sample sizes, and hence are often performed as Z-tests. F test The F-test is based on the F-distribution and is used to compare the variances of two independent samples. This test is also used in the context of analysis of variance (ANOVA) for judging the significance of more than two sample means at one and the same time. It is also used for judging the significance of multiple correlation coefficients. The test statistic, F, is calculated and compared with its probable value (to be seen in the F-ratio tables for different degrees of freedom for greater and smaller variances at a specified level of significance) for accepting or rejecting the null hypothesis. An F-test is any statistical test in which the test statistic has an F-distribution under the null hypothesis. It is most often used when comparing statistical models that have been fitted to a data set, in order to identify the model that best fits the population from which the data were sampled. Exact "F-tests" mainly arise when the models have been fitted to the data using least squares. The name was coined by George W. Snedecor, in honour of Sir Ronald A. Fisher. Fisher initially developed the statistic as the variance ratio in the 1920s.
  • 21. Page 21 of 52 Common examples of F-tests Common examples of the use of F-tests include the study of the following cases: • The hypothesis that the means of a given set of normally distributed populations, all having the same standard deviation, are equal. This is perhaps the best-known F-test, and plays an important role in the analysis of variance (ANOVA). • The hypothesis that a proposed regression model fits the data well. See Lack-of-fit sum of squares. • The hypothesis that a data set in a regression analysis follows the simpler of two proposed linear models that are nested within each other. In addition, some statistical procedures, such as Scheffé's method for multiple comparisons adjustment in linear models, also use F-tests. F-test of the equality of two variances The F-test is sensitive to non-normality.[2][3] In the analysis of variance (ANOVA), alternative tests include Levene's test, Bartlett's test, and the Brown–Forsythe test. However, when any of these tests are conducted to test the underlying assumption of homoscedasticity (i.e. homogeneity of variance), as a preliminary step to testing for mean effects, there is an increase in the experiment-wise Type I error rate.[4] Formula and calculation Most F-tests arise by considering a decomposition of the variability in a collection of data in terms of sums of squares. The test statistic in an F-test is the ratio of two scaled sums of squares reflecting different sources of variability. These sums of squares are constructed so that the statistic tends to be greater when the null hypothesis is not true. In order for the statistic to follow the F-distribution under the null hypothesis, the sums of squares should be statistically independent, and each should follow a scaled chi-squared distribution. The latter condition is guaranteed if the data values are independent and normally distributed with a common variance.
Multiple-comparison ANOVA problems The F-test in one-way analysis of variance is used to assess whether the expected values of a quantitative variable within several pre-defined groups differ from each other. For example, suppose that a medical trial compares four treatments. The ANOVA F-test can be used to assess whether any of the treatments is on average superior, or inferior, to the others versus the null hypothesis that all four treatments yield the same mean response. This is an example of an "omnibus" test, meaning that a single test is performed to detect any of several possible differences. Alternatively, we could carry out pairwise tests among the treatments (for instance, in the medical trial example with four treatments we could carry out six tests among pairs of treatments). The advantage of the ANOVA F-test is that we do not need to pre-specify which treatments are to be compared, and we do not need to adjust for making multiple comparisons. The disadvantage of the ANOVA F-test is that if we reject the null hypothesis, we do not know which treatments can be said to be significantly different from the others;
  • 22. Page 22 of 52 if the F-test is performed at level α we cannot state that the treatment pair with the greatest mean difference is significantly different at level α. The formula for the one-way ANOVA F-test statistic is $F = \frac{\text{explained variance}}{\text{unexplained variance}}$, or $F = \frac{\text{between-group variability}}{\text{within-group variability}}$. The "explained variance", or "between-group variability", is $\sum_{i=1}^{K} n_i (\bar{Y}_{i\cdot} - \bar{Y})^2 / (K - 1)$, where $\bar{Y}_{i\cdot}$ denotes the sample mean in the ith group, $n_i$ is the number of observations in the ith group, $\bar{Y}$ denotes the overall mean of the data, and K denotes the number of groups. The "unexplained variance", or "within-group variability", is $\sum_{i=1}^{K} \sum_{j=1}^{n_i} (Y_{ij} - \bar{Y}_{i\cdot})^2 / (N - K)$, where $Y_{ij}$ is the jth observation in the ith out of K groups and N is the overall sample size. This F-statistic follows the F-distribution with K−1, N−K degrees of freedom under the null hypothesis. The statistic will be large if the between-group variability is large relative to the within-group variability, which is unlikely to happen if the population means of the groups all have the same value. Note that when there are only two groups for the one-way ANOVA F-test, F = t², where t is the Student's t statistic. Regression problems Consider two models, 1 and 2, where model 1 is 'nested' within model 2. Model 1 is the restricted model, and model 2 is the unrestricted one. That is, model 1 has p1 parameters, and model 2 has p2 parameters, where p2 > p1, and for any choice of parameters in model 1, the same regression curve can be achieved by some choice of the parameters of model 2. (We use the convention that any constant parameter in a model is included when counting the parameters. For instance, the simple linear model y = mx + b has p = 2 under this convention.) The model with more parameters will always be able to fit the data at least as well as the model with fewer parameters. Thus typically model 2 will give a better (i.e. lower error) fit to the data than model 1. But one often wants to determine whether model 2 gives a significantly better fit to the data. One approach to this problem is to use an F test. If there are n data points from which to estimate the parameters of both models, then one can calculate the F statistic, given by $F = \frac{(\mathrm{RSS}_1 - \mathrm{RSS}_2) / (p_2 - p_1)}{\mathrm{RSS}_2 / (n - p_2)}$,
  • 23. Page 23 of 52 where $\mathrm{RSS}_i$ is the residual sum of squares of model i. If your regression model has been calculated with weights, then replace $\mathrm{RSS}_i$ with χ², the weighted sum of squared residuals. Under the null hypothesis that model 2 does not provide a significantly better fit than model 1, F will have an F distribution, with (p2−p1, n−p2) degrees of freedom. The null hypothesis is rejected if the F calculated from the data is greater than the critical value of the F-distribution for some desired false-rejection probability (e.g. 0.05). The F-test is a Wald test. One-way ANOVA example Consider an experiment to study the effect of three different levels of a factor on a response (e.g. three levels of a fertilizer on plant growth). If we had 6 observations for each level, we could write the outcome of the experiment in a table like this, where a1, a2, and a3 are the three levels of the factor being studied.
      a1   a2   a3
       6    8   13
       8   12    9
       4    9   11
       5   11    8
       3    6    7
       4    8   12
    The null hypothesis, denoted H0, for the overall F-test for this experiment would be that all three levels of the factor produce the same response, on average. To calculate the F-ratio: Step 1: Calculate the mean within each group: $\bar{Y}_1 = \frac{6+8+4+5+3+4}{6} = 5$, $\bar{Y}_2 = \frac{8+12+9+11+6+8}{6} = 9$, $\bar{Y}_3 = \frac{13+9+11+8+7+12}{6} = 10$. Step 2: Calculate the overall mean: $\bar{Y} = \frac{\sum_i \bar{Y}_i}{a} = \frac{5 + 9 + 10}{3} = 8$,
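  • 24. Page 24 of 52 where a is the number of groups. Step 3: Calculate the "between-group" sum of squares: $S_B = n(\bar{Y}_1 - \bar{Y})^2 + n(\bar{Y}_2 - \bar{Y})^2 + n(\bar{Y}_3 - \bar{Y})^2 = 6(5-8)^2 + 6(9-8)^2 + 6(10-8)^2 = 84$, where n is the number of data values per group. The between-group degrees of freedom is one less than the number of groups, $f_b = 3 - 1 = 2$, so the between-group mean square value is $MS_B = 84 / 2 = 42$. Step 4: Calculate the "within-group" sum of squares. Begin by centering the data in each group:
      a1          a2           a3
      6−5=1       8−9=−1       13−10=3
      8−5=3       12−9=3       9−10=−1
      4−5=−1      9−9=0        11−10=1
      5−5=0       11−9=2       8−10=−2
      3−5=−2      6−9=−3       7−10=−3
      4−5=−1      8−9=−1       12−10=2
    The within-group sum of squares is the sum of squares of all 18 values in this table: $S_W = (1+9+1+0+4+1) + (1+9+0+4+9+1) + (9+1+1+4+9+4) = 16 + 24 + 28 = 68$. The within-group degrees of freedom is $f_W = a(n - 1) = 3(6 - 1) = 15$.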
  • 25. Page 25 of 52 Thus the within-group mean square value is $MS_W = S_W / f_W = 68 / 15 \approx 4.5$. Step 5: The F-ratio is $F = MS_B / MS_W \approx 42 / 4.5 \approx 9.3$. The critical value is the number that the test statistic must exceed to reject the null hypothesis. In this case, Fcrit(2,15) = 3.68 at α = 0.05. Since F = 9.3 > 3.68, the results are significant at the 5% significance level. One would reject the null hypothesis, concluding that there is strong evidence that the expected values in the three groups differ. The p-value for this test is 0.002. After performing the F-test, it is common to carry out some "post-hoc" analysis of the group means. In this case, the first two group means differ by 4 units, the first and third group means differ by 5 units, and the second and third group means differ by only 1 unit. The standard error of each of these differences is $\sqrt{4.5/6 + 4.5/6} \approx 1.2$. Thus the first group is strongly different from the other groups, as the mean difference is more than 3 times the standard error, so we can be highly confident that the population mean of the first group differs from the population means of the other groups. However, there is no evidence that the second and third groups have different population means from each other, as their mean difference of one unit is comparable to the standard error. Note: F(x, y) denotes an F-distribution cumulative distribution function with x degrees of freedom in the numerator and y degrees of freedom in the denominator. ANOVA's robustness with respect to Type I errors for departures from population normality
  • 26. Page 26 of 52 The one-way ANOVA can be generalized to the factorial and multivariate layouts, as well as to the analysis of covariance. It is often stated in popular literature that none of these F-tests are robust when there are severe violations of the assumption that each population follows the normal distribution, particularly for small alpha levels and unbalanced layouts.[5] Furthermore, it is also claimed that if the underlying assumption of homoscedasticity is violated, the Type I error properties degenerate much more severely.[6] However, this is a misconception, based on work done in the 1950s and earlier. The first comprehensive investigation of the issue by Monte Carlo simulation was Donaldson (1966).[7] He showed that under the usual departures (positive skew, unequal variances) "the F-test is conservative", so it is less likely than it should be to find that a variable is significant. However, as either the sample size or the number of cells increases, "the power curves seem to converge to that based on the normal distribution". More detailed work was done by Tiku (1971).[8] He found that "The non-normal theory power of F is found to differ from the normal theory power by a correction term which decreases sharply with increasing sample size." The problem of non-normality, especially in large samples, is far less serious than popular articles would suggest. The current view is that "Monte-Carlo studies were used extensively with normal distribution-based tests to determine how sensitive they are to violations of the assumption of normal distribution of the analyzed variables in the population. The general conclusion from these studies is that the consequences of such violations are less severe than previously thought. Although these conclusions should not entirely discourage anyone from being concerned about the normality assumption, they have increased the overall popularity of the distribution-dependent statistical tests in all areas of research."[9] For nonparametric alternatives in the factorial layout, see Sawilowsky.[10] For more discussion see ANOVA on ranks.
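Both the one-way ANOVA worked example (three levels a1, a2, a3 with six observations each) and the robustness discussion above can be checked numerically. The sketch below is an illustration under stated assumptions, not the method of any cited study: the function names are made up, the critical value 3.68 for F(2, 15) at α = 0.05 is taken from the text, and an exponential distribution is used as one arbitrary example of a positively skewed population.

```python
import random


def one_way_f(groups):
    """Return (F, df_between, df_within) for a one-way ANOVA."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    means = [sum(g) / len(g) for g in groups]
    # Between-group and within-group sums of squares.
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    ss_within = sum((y - m) ** 2 for g, m in zip(groups, means) for y in g)
    df_between, df_within = k - 1, n_total - k
    f_ratio = (ss_between / df_between) / (ss_within / df_within)
    return f_ratio, df_between, df_within


# Data from the worked example in the text.
a1 = [6, 8, 4, 5, 3, 4]
a2 = [8, 12, 9, 11, 6, 8]
a3 = [13, 9, 11, 8, 7, 12]
f_ratio, df_b, df_w = one_way_f([a1, a2, a3])
print(f"F({df_b}, {df_w}) = {f_ratio:.2f}")  # the text rounds this to 9.3

F_CRIT_2_15 = 3.68  # critical value quoted in the text for alpha = 0.05


def type_one_error_rate(n_sims=2000, seed=0):
    """Monte Carlo estimate of the rejection rate when H0 is true but the
    data are skewed (all three groups drawn from the same exponential)."""
    rng = random.Random(seed)
    rejections = 0
    for _sim in range(n_sims):
        groups = [[rng.expovariate(1.0) for _obs in range(6)] for _grp in range(3)]
        f_sim, _, _ = one_way_f(groups)
        if f_sim > F_CRIT_2_15:
            rejections += 1
    return rejections / n_sims


rate = type_one_error_rate()
print(f"Estimated Type I error rate under skewed data: {rate:.3f}")
```

If the robustness claim holds for this setup, the estimated rejection rate should stay in the neighbourhood of the nominal 5% level rather than far above it; a different skewed distribution, group size, or seed could be substituted to probe the claim further.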
  • 31. Page 31 of 52 References
    1. Lomax, Richard G. (2007). Statistical Concepts: A Second Course, p. 10. ISBN 0-8058-5850-4.
    2. Box, G. E. P. (1953). "Non-Normality and Tests on Variances". Biometrika 40 (3/4): 318–335. doi:10.1093/biomet/40.3-4.318. JSTOR 2333350.
    3. Markowski, Carol A.; Markowski, Edward P. (1990). "Conditions for the Effectiveness of a Preliminary Test of Variance". The American Statistician 44 (4): 322–326. doi:10.2307/2684360. JSTOR 2684360.
    4. Sawilowsky, S. (2002). "Fermat, Schubert, Einstein, and Behrens-Fisher: The Probable Difference Between Two Means When σ1² ≠ σ2²". Journal of Modern Applied Statistical Methods, 1(2), 461–472.
    5. Blair, R. C. (1981). "A reaction to 'Consequences of failure to meet assumptions underlying the fixed effects analysis of variance and covariance.'" Review of Educational Research, 51, 499–507.
    6. Randolf, E. A., & Barcikowski, R. S. (1989, November). "Type I error rate when real study values are used as population parameters in a Monte Carlo study". Paper presented at the 11th annual meeting of the Mid-Western Educational Research Association, Chicago.
    7. https://www.rand.org/content/dam/rand/pubs/research_memoranda/2008/RM5072.pdf
    8. Tiku, M. L. (1971). "Power Function of the F-Test Under Non-Normal Situations". Journal of the American Statistical Association, Vol. 66, No. 336 (Dec., 1971), p. 913.
    9. https://www.statsoft.com/textbook/elementary-statistics-concepts/
    10. Sawilowsky, S. (1990). Nonparametric tests of interaction in experimental design. Review of Educational Research, 25(20–59).