Postgraduate Course
Sample vs population
We want to know about these (population: N). We have to work with these (sample: n).
[Figure: a sample is drawn from the population by selection; how well does the sample mean X̄ fit the population mean μ?]
Law of large numbers
The larger the sample size (or the number of observations), the more accurate the predictions of the characteristics of the whole population, and the smaller the expected deviation in comparisons of outcomes.
As a general principle it means that, in the long run, the average (mean) of a large number of observations will be close to (or: may be taken as the best estimate of) the true mean of the population.
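The principle can be checked with a small simulation. The sketch below is only an illustration: the population (100,000 normally distributed values with mean 100 and standard deviation 15), the sample sizes and the seed are all assumptions made for the example, not values from the course.

```python
import random
import statistics

rng = random.Random(42)  # fixed seed so the run is repeatable

# Hypothetical population, invented for this illustration.
population = [rng.gauss(100, 15) for _ in range(100_000)]
population_mean = statistics.fmean(population)

def average_error(n, repeats=200):
    """Average absolute gap between the sample mean and the population mean."""
    errors = []
    for _ in range(repeats):
        sample = rng.sample(population, n)  # simple random sample of size n
        errors.append(abs(statistics.fmean(sample) - population_mean))
    return statistics.fmean(errors)

# Larger samples should land closer to the population mean on average.
avg_errors = {n: average_error(n) for n in (10, 100, 1_000, 10_000)}
for n, err in avg_errors.items():
    print(f"n = {n:>6}: average |sample mean - population mean| = {err:.3f}")
```

The printed gap should shrink as n grows, which is the law of large numbers in action.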
Sample size: why does it matter?
- Law of large numbers: a reliable and accurate representation of the population
- Statistical power: to prevent a Type II error (false negative)
Don't confuse: representativeness and reliability
The sample size has no direct relationship with representativeness; even a large random sample can be insufficiently representative.
Variables
Variable: anything that can be measured and can differ across entities or time
Independent variable: predictor variable (value does not depend on any other variables)
Dependent variable: outcome variable (value depends on other variables)
3. Level of measurement
Level of measurement
Relationship between what is being measured and the numbers that represent what is being measured.
Level of measurement
Categorical: nominal, ordinal
Continuous: interval, ratio
Nominal scale
Classification of categorical data. There is no order to the values; they are just given a name ('nomen') or a number. The numbers cannot be used in calculations (you can't calculate the mean of fruit); only frequencies are meaningful.
1 = Apples
2 = Oranges
3 = Pineapples
4 = Bananas
5 = Pears
6 = Mangoes
Ordinal scale
Classification of categorical data. Values can be rank-ordered, but the distance between the values has no meaning. The numbers can only be used to calculate a mode or a median.
1. Full professor
2. Associate professor
3. Assistant professor
4. PhD
5. Master
6. Bachelor
Interval scale
Classification of continuous data. Values can be rank-ordered, and the distance between the values has meaning. However, there is no natural zero point.
1. John (1932)
2. Denise (1945)
3. Mary (1952)
4. Marc (1964)
5. Jeffrey (1978)
6. Sarah (1982)
Ratio scale
Classification of continuous data. Values can be rank-ordered, the distance between the values has meaning, and there is a natural zero point.
1. Jeffrey (192 cm)
2. John (187 cm)
3. Sarah (180 cm)
4. Marc (179 cm)
5. Mary (171 cm)
6. Denise (165 cm)
Levels of measurement

                           Categorical          Continuous
                           Nominal   Ordinal    Interval   Ratio
Classification             Yes       Yes        Yes        Yes
Rank-order                 No        Yes        Yes        Yes
Fixed and equal intervals  No        No         Yes        Yes
Natural 0 point            No        No         No         Yes

                           Nominal   Ordinal    Interval   Ratio
Mode                       Yes       Yes        Yes        Yes
Median                     No        Yes        Yes        Yes
Mean                       No        No         Yes        Yes
Levels of measurement
Ordinal or interval? Can I calculate a mean?
Q3: Every organization is unique, hence the findings from scientific research are not applicable.
☐ Strongly agree
☐ Somewhat agree
☐ Neither agree nor disagree
☐ Somewhat disagree
☐ Strongly disagree
4. Central tendency
The aim is to find a single number that characterises the typical value of the variable in the sample. Which one you use depends in part on the level of measurement of the variable.
Central tendency
Central tendency of a set of data / numbers (what number is most representative of the dataset / population?)
7, 9, 9, 9, 10, 11, 11, 13, 13
- Mean = 10.2
- Median = 10
- Mode = 9
Central tendency
Central tendency of a set of data / numbers (what number is most representative of the dataset / population?)
3, 3, 3, 3, 3, 3, 100
- Mean = 16.9
- Median = 3
- Mode = 3
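The three measures for both data sets can be reproduced with Python's standard `statistics` module. This is a minimal sketch; the variable names and the `summarize` helper are made up for the illustration.

```python
import statistics

# The two data sets from the slides.
typical = [7, 9, 9, 9, 10, 11, 11, 13, 13]
with_outlier = [3, 3, 3, 3, 3, 3, 100]

def summarize(values):
    """Return mean (rounded to one decimal), median and mode of a list of numbers."""
    return {
        "mean": round(statistics.mean(values), 1),
        "median": statistics.median(values),
        "mode": statistics.mode(values),
    }

print(summarize(typical))
print(summarize(with_outlier))
```

In the second data set a single extreme value (100) pulls the mean far away from the median and the mode, which is why the mean alone can misrepresent skewed data.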
Hypothesis: falsifiability
"It is easy to obtain evidence in favor of virtually any theory, but such 'corroboration' should count scientifically only if it is the positive result of a genuinely 'risky' prediction, which might conceivably have been false. … A theory is scientific only if it is refutable by a conceivable event. Every genuine test of a scientific theory, then, is logically an attempt to refute or to falsify it."
Karl Popper
Hypothesis
- Null hypothesis (H0): Big Brother contestants and members of the public will not differ in their scores on personality disorder questionnaires.
- Alternative hypothesis (H1): Big Brother contestants will score higher on personality disorder questionnaires than members of the public.
Hypothesis: type I vs type II error

             H0 is true             H0 is false
reject H0    type I error (α)       correct conclusion
accept H0    correct conclusion     type II error (β)
Confidence intervals
A confidence interval gives an estimated range of values which is likely to include an unknown population parameter (e.g. the mean). Confidence intervals are usually calculated at the 95% confidence level (95% CI).
Confidence intervals
When you see a 95% confidence interval for a mean, think of it like this: if we'd collected 100 samples and calculated the mean and its confidence interval for each sample, then for about 95 of these samples the confidence interval would contain the true population mean.
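The repeated-sampling reading of a 95% confidence interval can be checked by simulation: draw many samples, build an interval around each sample mean, and count how often the interval contains the true population mean. The sketch below makes several assumptions for the sake of illustration: a normal population with mean 100 and standard deviation 15, samples of 50, and the normal approximation (mean ± 1.96 standard errors) rather than an exact t-based interval.

```python
import random
import statistics
from math import sqrt

def ci95(sample):
    """Approximate 95% CI for the mean: mean +/- 1.96 standard errors."""
    mean = statistics.fmean(sample)
    std_error = statistics.stdev(sample) / sqrt(len(sample))
    return mean - 1.96 * std_error, mean + 1.96 * std_error

rng = random.Random(1)          # fixed seed so the run is repeatable
TRUE_MEAN, TRUE_SD = 100, 15    # hypothetical population parameters
SAMPLE_SIZE, REPEATS = 50, 2000

hits = 0
for _ in range(REPEATS):
    sample = [rng.gauss(TRUE_MEAN, TRUE_SD) for _ in range(SAMPLE_SIZE)]
    low, high = ci95(sample)
    hits += low <= TRUE_MEAN <= high  # True counts as 1

coverage = hits / REPEATS
print(f"{coverage:.1%} of the intervals contain the true mean")
```

The share should come out close to 95%. Note that each sample gets its own interval; the population mean stays fixed while the intervals vary from sample to sample.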
Confidence intervals
"According to the federal government, the unemployment rate has dropped from 4.3% to 3.8%."
95% CI = 3.5 to 4.1. This means the unemployment rate could have increased from 4.0 to 4.1!
[Figure: unemployment rate (%) in 2008 and 2009, vertical axis from 3.0 to 5.0]
Confidence intervals
When a point estimate (e.g. mean, percentage) is given, always check: standard deviation or confidence interval.
Statistical significance
1. Is there a difference / an effect?
2. How certain is it that the difference / effect found is not a chance finding?
[Figure: X_0 and X_1, with axis values 110 and 130]
Statistical significance
Testing multiple hypotheses
When you test 20 different hypotheses (or independent variables), there is a high chance that at least one will be statistically significant by chance alone.
Example: Do apples, bacon, cheese, eggs, fish, garlic, hazelnuts, ice cream, ketchup, lamb, melons, nuts, oranges, peanut butter, roasted food, salt, tofu, vinegar, wine or yoghurt cause cancer?
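How high that chance is can be worked out directly. The sketch below assumes that the tests are independent, that every null hypothesis is actually true, and that each test uses a significance level of .05; the function name is made up for the illustration.

```python
def chance_of_false_positive(num_tests, alpha=0.05):
    """P(at least one significant result) across independent tests
    when every null hypothesis is true."""
    return 1 - (1 - alpha) ** num_tests

for k in (1, 5, 20):
    print(f"{k:>2} tests: {chance_of_false_positive(k):.1%}")
```

Under these assumptions, 20 tests give roughly a 64% chance of at least one "significant" food, even if none of them has any effect.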
Statistical significance
Significance testing: always prospective, never retrospective
Statistical power
The statistical power: the probability of detecting a meaningful effect, given sample size, significance level, and effect size.

Sample size   Effect size (significant increase in IQ)
4             10
25            4
100           2
10,000        0.2
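The link between sample size and power can be sketched with a simple calculation. The example below assumes a two-sided one-sample z-test with a known standard deviation; the standard deviation of 15 IQ points, the effect of 2 points and the function name are assumptions chosen for the illustration, so the numbers will not reproduce the table exactly.

```python
from math import sqrt
from statistics import NormalDist

def z_test_power(effect, sd, n, alpha=0.05):
    """Power of a two-sided one-sample z-test (known sd) to detect a true mean shift."""
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha / 2)   # about 1.96 for alpha = .05
    shift = effect * sqrt(n) / sd                # true effect in standard-error units
    # Probability that the test statistic lands in either rejection region.
    return (1 - std_normal.cdf(z_crit - shift)) + std_normal.cdf(-z_crit - shift)

# Hypothetical scenario: a true increase of 2 IQ points, sd assumed to be 15.
for n in (4, 25, 100, 10_000):
    print(f"n = {n:>6}: power = {z_test_power(2, 15, n):.2f}")
```

For a fixed effect, power rises with sample size: a small study will usually miss a 2-point increase, while a very large one will almost always detect it.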
Statistical power
Overpowered: sample size so large that even trivially small effects become statistically significant (the Type I error rate itself stays fixed at α).
Underpowered: sample size too small, high probability of making a Type II error.
Effect size
Effect size: a standardized measure of the magnitude of effect, independent of sample size.
Standardized: makes it possible to compare effect sizes across different studies that have measured different variables, or have used different scales of measurement.
Effect sizes
- Cohen's d
- Pearson's r
- other
  - Hedges' g
  - Glass' Δ
  - odds ratio (OR)
  - relative risk (RR)
Effect sizes: Cohen's d
Effect size based on means or distances between/among means
Interpretation (Cohen's conventional benchmarks)
.20 = small
.50 = moderate
.80 = large
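For two independent groups, Cohen's d is commonly computed as the difference between the group means divided by the pooled standard deviation. The sketch below uses that common form; the scores are invented for the illustration and do not come from the course.

```python
import statistics
from math import sqrt

def cohens_d(group_a, group_b):
    """Difference between two group means divided by the pooled standard deviation."""
    n_a, n_b = len(group_a), len(group_b)
    var_a = statistics.variance(group_a)  # sample variance (n - 1 in the denominator)
    var_b = statistics.variance(group_b)
    pooled_sd = sqrt(((n_a - 1) * var_a + (n_b - 1) * var_b) / (n_a + n_b - 2))
    return (statistics.fmean(group_a) - statistics.fmean(group_b)) / pooled_sd

# Hypothetical scores, invented for illustration only.
control = [98, 102, 95, 105, 100]
treated = [104, 108, 101, 111, 106]

print(f"Cohen's d = {cohens_d(treated, control):.2f}")
```

Because d is expressed in standard deviation units, the same value means the same thing whether the underlying scale is IQ points, test scores or centimetres.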
Effect sizes: Pearson's r
Effect size based on 'variance explained'
Interpretation
.10 = small (explains 1% of the total variance)
.30 = moderate (explains 9% of the total variance)
.50 = large (explains 25% of the total variance)
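The percentages of variance explained are simply r squared. The sketch below shows that conversion and, as an illustration, computes Pearson's r by hand for a small made-up data set; the variable names and values are hypothetical.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equally long lists of numbers."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    co_dev = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    dev_x = sum((x - mean_x) ** 2 for x in xs)
    dev_y = sum((y - mean_y) ** 2 for y in ys)
    return co_dev / sqrt(dev_x * dev_y)

# Variance explained is r squared.
for r in (0.10, 0.30, 0.50):
    print(f"r = {r:.2f} explains {r ** 2:.0%} of the total variance")

# Hypothetical data, invented for illustration only.
hours = [1, 2, 3, 4, 5]
score = [2, 4, 5, 4, 5]
r = pearson_r(hours, score)
print(f"r = {r:.2f}, variance explained = {r ** 2:.0%}")
```

Squaring makes clear how quickly explained variance falls off: a correlation of .30 sounds sizeable, yet leaves 91% of the variance unexplained.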
12. Critical appraisal
When you critically appraise a study, what characteristics of the findings will you consider to determine their statistical significance and magnitude?
Critical appraisal
- p-value
- confidence interval
- sample size / power
- effect size
- practical relevance