Learning Module 1
Course Preparation:
Basic Statistical Concepts
Types of data analysis
• Quantitative Methods
– Testing theories using numbers
• Qualitative Methods
– Testing theories using language
• Magazine articles/Interviews
• Conversations
• Newspapers
• Media broadcasts
The research process
Test your theory
• Hypothesis:
– Coca-cola kills sperm.
• Independent Variable
– The proposed cause
– A predictor variable
– A manipulated variable (in experiments)
– Coca-cola in the hypothesis above
• Dependent Variable
– The proposed effect
– An outcome variable
– Measured not manipulated (in experiments)
– Sperm in the hypothesis above
Levels of measurement
• Categorical (entities are divided into distinct categories):
– Binary variable: There are only two categories
• e.g. dead or alive.
– Nominal variable: There are more than two categories
• e.g. whether someone is an omnivore, vegetarian, vegan, or fruitarian.
– Ordinal variable: The same as a nominal variable but the categories have a
logical order
• e.g. whether people got a fail, a pass, a merit or a distinction in their exam.
• Continuous (entities get a distinct score):
– Interval variable: Equal intervals on the variable represent equal differences in
the property being measured
• e.g. the difference between 6 and 8 is equivalent to the difference between 13 and 15.
– Ratio variable: The same as an interval variable, but the ratios of scores on the
scale must also make sense
• e.g. a score of 16 on an anxiety scale means that the person is, in reality, twice as anxious
as someone scoring 8.
https://www.scribbr.com/statistics/ratio-data
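The interval/ratio distinction can be illustrated with temperature, an example that is not in the slides: Celsius has an arbitrary zero (interval), while Kelvin has a true zero (ratio). A minimal sketch, with made-up temperatures:

```python
# Celsius is an interval scale (arbitrary zero); Kelvin is a ratio scale (true zero).
def celsius_to_kelvin(celsius: float) -> float:
    return celsius + 273.15

cold_c, warm_c = 10.0, 20.0          # hypothetical temperatures
cold_k = celsius_to_kelvin(cold_c)
warm_k = celsius_to_kelvin(warm_c)

# Differences are meaningful on both scales, and they agree.
diff_c = warm_c - cold_c
diff_k = warm_k - cold_k

# Ratios only make sense on the ratio scale.
ratio_c = warm_c / cold_c            # 2.0, yet 20 °C is not "twice as hot" as 10 °C
ratio_k = warm_k / cold_k            # about 1.035

print(diff_c, round(diff_k, 2), ratio_c, round(ratio_k, 3))
```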
Validity
• Whether an instrument measures what it set out to
measure.
• Content validity
– Evidence that the content of a test corresponds to the
content of the construct it was designed to cover
• Ecological validity
– Evidence that the results of a study, experiment or test
can be applied, and allow inferences, to real-world
conditions.
Reliability
• Reliability
– The ability of the measure to produce the same results
under the same conditions.
• Test-Retest Reliability
– The ability of a measure to produce consistent results
when the same entities are tested at two different points
in time.
How to measure
• Correlational research:
– Observing what naturally goes on in the world without
directly interfering with it.
• Experimental research:
– One or more variables are systematically manipulated to
see their effect (alone or in combination) on an outcome
variable.
– Statements can be made about cause and effect.
Types of variation
• Systematic Variation
– Differences in performance created by a specific
experimental manipulation.
• Unsystematic Variation
– Differences in performance created by unknown factors.
• Age, gender, IQ, time of day, measurement error, etc.
• Randomization
– Minimizes unsystematic variation.
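One way randomization could be carried out is to shuffle the participant list and deal people out to conditions in turn, so that factors such as age or IQ are spread across conditions by chance. This is only a sketch: the function name `randomly_assign`, the participant labels, and the condition names are hypothetical.

```python
import random

def randomly_assign(participants, conditions, seed=None):
    """Shuffle participants, then deal them out to the conditions in turn."""
    rng = random.Random(seed)          # a seed makes the allocation reproducible
    shuffled = list(participants)
    rng.shuffle(shuffled)
    groups = {condition: [] for condition in conditions}
    for index, person in enumerate(shuffled):
        groups[conditions[index % len(conditions)]].append(person)
    return groups

participants = [f"P{i:02d}" for i in range(1, 21)]   # 20 hypothetical participants
groups = randomly_assign(participants, ["treatment", "control"], seed=42)
print({name: len(members) for name, members in groups.items()})
```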
Analysing data: histograms
• Frequency Distributions (aka Histograms)
– A graph plotting values of observations on the horizontal
axis, with a bar showing how many times each value
occurred in the data set.
• The ‘Normal’ Distribution
– Bell shaped
– Symmetrical around the centre
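A frequency distribution is just a count of how often each value occurs. The sketch below tallies some made-up scores and prints a text histogram with one `#` per observation.

```python
from collections import Counter

scores = [3, 4, 4, 5, 5, 5, 6, 6, 7]   # hypothetical scores
frequencies = Counter(scores)          # value -> number of times it occurred

# One row per value, with a bar as long as its frequency.
for value in sorted(frequencies):
    print(f"{value:>2} | {'#' * frequencies[value]}")
```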
The normal distribution
The curve shows the idealized shape.
[Figure: frequency (vertical axis, 0 to 3000) plotted against score (horizontal axis, −4 to 4).]
Properties of frequency distributions
• Skew
– The symmetry of the distribution.
– Positive skew (scores bunched at low values with the tail
pointing to high values).
– Negative skew (scores bunched at high values with the tail
pointing to low values).
• Kurtosis
– The ‘heaviness’ of the tails.
– Leptokurtic = heavy tails.
– Platykurtic = light tails.
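Skew and kurtosis can be quantified from the central moments of the scores. The sketch below uses the simple moment-based estimators, which is one common convention among several and is an assumption here, as are the made-up data sets. Positive skewness means a tail toward high values, and positive excess kurtosis means heavier tails than a normal distribution.

```python
def central_moment(scores, order):
    """Average of (score - mean) raised to the given power."""
    mean = sum(scores) / len(scores)
    return sum((x - mean) ** order for x in scores) / len(scores)

def skewness(scores):
    return central_moment(scores, 3) / central_moment(scores, 2) ** 1.5

def excess_kurtosis(scores):
    # Subtracting 3 makes a normal distribution score 0.
    return central_moment(scores, 4) / central_moment(scores, 2) ** 2 - 3

symmetric = [1, 2, 3, 4, 5]                 # hypothetical data sets
positively_skewed = [1, 1, 2, 2, 3, 10]
negatively_skewed = [1, 8, 9, 9, 10, 10]

for name, data in [("symmetric", symmetric),
                   ("positive skew", positively_skewed),
                   ("negative skew", negatively_skewed)]:
    print(name, round(skewness(data), 2), round(excess_kurtosis(data), 2))
```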
Skew
[Figure: two frequency histograms, one with positive skew and one with negative skew (vertical axis: Frequency, 0 to 4000).]
Kurtosis
[Figure: two frequency histograms, one leptokurtic and one platykurtic (vertical axis: Frequency, 0 to 9000).]
Central tendency: the mode
• Mode
– The most frequent score
• Bimodal
– Having two modes
• Multimodal
– Having several modes
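Python's standard library can return every mode of a data set, which makes the unimodal/bimodal distinction easy to check. The scores below are made up for illustration.

```python
from statistics import multimode

unimodal_scores = [1, 2, 2, 3]          # hypothetical: one most frequent score
bimodal_scores = [1, 2, 2, 3, 3, 4]     # hypothetical: two scores tie for most frequent

modes_unimodal = multimode(unimodal_scores)
modes_bimodal = multimode(bimodal_scores)
print(modes_unimodal, modes_bimodal)
```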
Central tendency: the median
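• The Median is the middle score when scores are
ordered from smallest to largest; with an even number of
scores, it is the average of the two middle scores.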
Central tendency: the mean
• Mean
– The sum of scores divided by the number
of scores.
– Number of friends of 11 Facebook users.
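The slide's data table did not survive extraction, so the 11 scores below are an assumption: a set chosen to agree with the figures quoted elsewhere in the deck (lowest score 22, highest 234, variance 3224.6). With that caveat, the mean and median could be computed as follows.

```python
from statistics import mean, median

# Assumed scores: the original table is missing; these match the quoted
# minimum, maximum and variance but are not confirmed by the source.
facebook_friends = [22, 40, 53, 57, 93, 98, 103, 108, 116, 121, 234]

mean_friends = mean(facebook_friends)       # sum of scores / number of scores
median_friends = median(facebook_friends)   # 6th of the 11 ordered scores
print(mean_friends, median_friends)
```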
The dispersion: range
• The Range
– The smallest score subtracted from the largest
– For our Facebook friends data the highest score is 234
and the lowest is 22; therefore the range is: 234 − 22 = 212
The dispersion: the interquartile range
• Quartiles
– The three values that split the sorted data into four
equal parts.
– Second Quartile = median.
– Lower quartile = median of lower half of the data
– Upper quartile = median of upper half of the data
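The range and quartile definitions above can be sketched directly. Two assumptions apply: the 11 scores are a reconstruction consistent with the quoted minimum (22), maximum (234) and variance, not the slide's own table, and with an odd number of scores the median is left out of both halves, which is one convention among several.

```python
from statistics import median

# Assumed scores (original table missing), already sorted.
facebook_friends = [22, 40, 53, 57, 93, 98, 103, 108, 116, 121, 234]
n = len(facebook_friends)

score_range = facebook_friends[-1] - facebook_friends[0]

# Split around the median; for odd n the median itself is excluded from both halves.
lower_half = facebook_friends[: n // 2]
upper_half = facebook_friends[(n + 1) // 2 :]

lower_quartile = median(lower_half)
second_quartile = median(facebook_friends)
upper_quartile = median(upper_half)
interquartile_range = upper_quartile - lower_quartile

print(score_range, lower_quartile, second_quartile, upper_quartile, interquartile_range)
```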
Sum of squared errors, SS
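• Indicates the total dispersion, or total deviance of scores
from the mean:
\[ SS = \sum_{i=1}^{N} (X_i - \bar{X})^2 \]
• More useful to work with the average dispersion, known as
the variance:
\[ s^2 = \frac{SS}{N - 1} = \frac{\sum_{i=1}^{N} (X_i - \bar{X})^2}{N - 1} \]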
Standard deviation
• The variance gives us a measure in units squared.
– In our Facebook example we would have to say
that the average error in our data was 3224.6
friends squared.
• This problem is solved by taking the square root of
the variance, which is known as the standard
deviation:
\[ s = \sqrt{\frac{\sum_{i=1}^{N} (X_i - \bar{X})^2}{N - 1}} \]
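The three dispersion measures build on one another: sum of squares, then variance (sum of squares divided by N − 1), then standard deviation (square root of the variance). The scores below are an assumption, reconstructed to be consistent with the quoted range and with the variance of 3224.6; the slide's own table is missing.

```python
from math import sqrt

# Assumed scores (original table missing); they reproduce the quoted variance.
facebook_friends = [22, 40, 53, 57, 93, 98, 103, 108, 116, 121, 234]
n = len(facebook_friends)
mean_friends = sum(facebook_friends) / n

# Total squared deviation of the scores from the mean.
sum_of_squares = sum((x - mean_friends) ** 2 for x in facebook_friends)

# Average squared deviation, in "friends squared".
variance = sum_of_squares / (n - 1)

# Back in the original units (friends).
standard_deviation = sqrt(variance)

print(sum_of_squares, round(variance, 1), round(standard_deviation, 2))
```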
Important things to remember
• The sum of squares, variance, and standard
deviation represent the same thing:
– The ‘Fit’ of the mean to the data
– The variability in the data
– How well the mean represents the observed data
– Error
Going beyond the data: z-scores
• Z-scores
– Standardising a score with respect to the other
scores in the group.
– Expresses a score in terms of how many
standard deviations it is away from the mean.
– The distribution of z-scores has a mean of 0
and SD = 1.
\[ z = \frac{X - \bar{X}}{s} \]
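As a quick check of the properties listed above, the sketch below converts some made-up scores to z-scores and confirms that the result has a mean of 0 and an SD of 1. The helper name `z_scores` is hypothetical.

```python
from statistics import mean, stdev

def z_scores(scores):
    """Express each score as a number of standard deviations from the mean."""
    m = mean(scores)
    s = stdev(scores)                  # sample SD (N - 1 in the denominator)
    return [(x - m) / s for x in scores]

scores = [2, 4, 4, 4, 5, 5, 7, 9]      # hypothetical scores
standardised = z_scores(scores)

print([round(z, 2) for z in standardised])
print(round(mean(standardised), 6), round(stdev(standardised), 6))
```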
Probability density function of a normal
distribution
[Figure: standard normal density curve (horizontal axis: z, from −4.00 to 4.00, with −1.96, 1.65 and 1.96 marked; vertical axis: Density, from 0.00 to 0.40). The area between z = −1.96 and z = 1.96 has probability .95; each tail beyond ±1.96 has probability .025.]
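The probabilities attached to these z-values can be checked with the standard library's `NormalDist`. This is only a verification sketch; the variable names are illustrative.

```python
from statistics import NormalDist

standard_normal = NormalDist(mu=0, sigma=1)

# Area between -1.96 and +1.96, and in each tail beyond them.
central_area = standard_normal.cdf(1.96) - standard_normal.cdf(-1.96)
lower_tail = standard_normal.cdf(-1.96)
upper_tail = 1 - standard_normal.cdf(1.96)

# z-value that cuts off the top 5% (the 1.65 marked on the axis, after rounding).
one_tailed_cutoff = standard_normal.inv_cdf(0.95)

print(round(central_area, 3), round(lower_tail, 3), round(upper_tail, 3))
print(round(one_tailed_cutoff, 3))
```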
Populations and samples
• Population
– The collection of units (be they people, plankton, plants,
cities, suicidal authors, etc.) to which we want to
generalize a set of findings or a statistical model.
• Sample
– A smaller (but hopefully representative) collection of units
from a population used to determine truths about that
population
Samples and populations
• Sample
– Mean and Standard Deviation (SD) describe only the sample from
which they were calculated.
• Population
– Mean and SD are intended to describe the entire population (very
rare in psychology).
• Sample to population:
– Mean and SD are obtained from a sample, but are used to estimate
the mean and SD of the population (very common in psychology).
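Standard Error
[Figure: a population with true value (parameter) µ = 3, and nine samples whose means (statistics) estimate it: M = 1, 2, 2, 3, 3, 3, 4, 4 and 5. A histogram of these sample means (horizontal axis: Sample mean, 1 to 5; vertical axis: Frequency, 0 to 3) has mean = 3 and SD = 1.22.]
\[ \sigma_{\bar{X}} = \frac{s}{\sqrt{N}} \]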
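The standard error is the standard deviation of sample means. The sketch below first reproduces the slide's figures from its nine sample means (3, 3, 3, 4, 4, 2, 2, 5, 1), then shows the usual single-sample estimate, s divided by the square root of N. The helper `standard_error` and the single sample are hypothetical.

```python
from math import sqrt
from statistics import mean, stdev

# The nine sample means shown on the slide.
sample_means = [3, 3, 3, 4, 4, 2, 2, 5, 1]
mean_of_means = mean(sample_means)
sd_of_means = stdev(sample_means)      # spread of the sample means

def standard_error(sample):
    """Estimate the standard error of the mean from one sample: s / sqrt(N)."""
    return stdev(sample) / sqrt(len(sample))

single_sample = [2, 4, 4, 4, 5, 5, 7, 9]   # hypothetical sample
estimated_se = standard_error(single_sample)

print(mean_of_means, round(sd_of_means, 2), round(estimated_se, 3))
```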
Types of hypotheses
• Null hypothesis, H0
• There is no effect.
• E.g. Big Brother contestants and members of the public will not differ in
their scores on personality disorder questionnaires
• The alternative hypothesis, H1
• AKA the experimental hypothesis
• E.g. Big Brother contestants will score higher on personality disorder
questionnaires than members of the public
Type I and Type II errors
• Type I error
– Occurs when we believe that there is a genuine effect in
our population, when in fact there isn’t.
– The probability is the α-level (usually 0.05)
• Type II error
– Occurs when we believe that there is no effect in the
population when, in reality, there is.
– The probability is the β-level (often 0.2)
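A simulation can make the α-level concrete: if many experiments are run when the null hypothesis is true, roughly 5% of them should still come out "significant" at α = .05. The sketch below assumes a simple z-test on samples from a population with known mean 0 and SD 1; the function name, sample size, and number of experiments are arbitrary choices.

```python
import random
from math import sqrt
from statistics import NormalDist

def type_i_error_rate(n_experiments=5000, sample_size=30, alpha=0.05, seed=1):
    """Simulate experiments in which there is no effect and count false alarms."""
    rng = random.Random(seed)
    critical_z = NormalDist().inv_cdf(1 - alpha / 2)    # about 1.96 for alpha = .05
    false_alarms = 0
    for _ in range(n_experiments):
        # Scores come from a population with mean 0 and SD 1: the null is true.
        sample = [rng.gauss(0, 1) for _ in range(sample_size)]
        sample_mean = sum(sample) / sample_size
        z = sample_mean / (1 / sqrt(sample_size))       # z-test with known SD = 1
        if abs(z) > critical_z:
            false_alarms += 1                           # a Type I error
    return false_alarms / n_experiments

observed_rate = type_i_error_rate()
print(observed_rate)    # expected to land near alpha
```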
Misconceptions around p-values
• Misconception 1: A significant result means that the
effect is important
– No, because significance depends on sample size.
• Misconception 2: A non-significant result means that the
null hypothesis is true
– No, a non-significant result tells us only that the effect is not
big enough to be found (given our sample size); it doesn’t tell
us that the effect size is zero.
• Misconception 3: A significant result means that the null
hypothesis is false
– No, it is logically not possible to conclude this.
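The dependence of significance on sample size can be shown with one fixed effect and two sample sizes. The sketch assumes a two-sample z-test with equal group sizes and known SD, where the test statistic is z = d × √(n / 2) for a standardized mean difference d; the function name and the numbers are illustrative.

```python
from math import sqrt
from statistics import NormalDist

def two_sided_p_value(effect_size_d, n_per_group):
    """p-value of a two-sample z-test for an observed standardized difference d."""
    z = effect_size_d * sqrt(n_per_group / 2)
    return 2 * (1 - NormalDist().cdf(abs(z)))

# The same small effect (d = 0.2) with 20 versus 1000 people per group.
p_small_sample = two_sided_p_value(0.2, n_per_group=20)
p_large_sample = two_sided_p_value(0.2, n_per_group=1000)

print(round(p_small_sample, 3), p_large_sample < 0.001)
```

The effect is identical in both cases; only the sample size changes whether it crosses the .05 threshold.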
Effect sizes
• An effect size is a standardized measure of the size
of an effect:
– Standardized = comparable across studies
– Not (as) reliant on the sample size
– Allows people to objectively evaluate the size of observed
effect.
Effect size measures
• There are several effect size measures that can be
used:
– Cohen’s d
– Pearson’s r
The correlation coefficient, r
• We’ll cover this measure later in detail …
[Figure: scatterplots illustrating r = −1, r = −0.5, r = 0, r = 0.5 and r = 1 (horizontal axis: Duration of performance (minutes); vertical axis: Time after performance before leaving (mins); both axes run up to 10).]
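Ahead of the fuller treatment promised above, here is a minimal sketch of how Pearson's r could be computed from paired scores. The function name `pearson_r` and the data are hypothetical, chosen to give the two extreme values of r.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation: shared variation divided by total variation."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cross_products = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cross_products / sqrt(ss_x * ss_y)

duration = [1, 2, 3, 4, 5]                            # hypothetical minutes
r_positive = pearson_r(duration, [2, 4, 6, 8, 10])    # perfectly increasing
r_negative = pearson_r(duration, [10, 8, 6, 4, 2])    # perfectly decreasing

print(r_positive, r_negative)
```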
Cohen’s d
\[ \hat{d} = \frac{\bar{X}_1 - \bar{X}_2}{s_p} \]
\[ s_p = \sqrt{\frac{(N_1 - 1)s_1^2 + (N_2 - 1)s_2^2}{N_1 + N_2 - 2}} \]
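Cohen's d is the difference between two group means divided by the pooled standard deviation. A minimal sketch follows; the function name `cohens_d` and the two small groups are hypothetical.

```python
from math import sqrt
from statistics import mean, variance

def cohens_d(group_1, group_2):
    """Difference between group means in pooled-standard-deviation units."""
    n1, n2 = len(group_1), len(group_2)
    # Weight each group's sample variance by its degrees of freedom.
    pooled_variance = (
        (n1 - 1) * variance(group_1) + (n2 - 1) * variance(group_2)
    ) / (n1 + n2 - 2)
    return (mean(group_1) - mean(group_2)) / sqrt(pooled_variance)

group_1 = [2, 4, 6]    # hypothetical scores: mean 4, variance 4
group_2 = [1, 3, 5]    # hypothetical scores: mean 3, variance 4
d = cohens_d(group_1, group_2)
print(d)
```

Here the pooled SD is 2 and the means differ by 1, so the groups are half a standard deviation apart.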
Effect size measures
• r = .1, d = .2 (small effect):
– the effect explains 1% of the total variance.
• r = .3, d = .5 (medium effect):
– the effect explains 9% of the total variance.
• r = .5, d = .8 (large effect):
– the effect explains 25% of the total variance.
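The percentages quoted for each benchmark are the squared correlation, r², expressed as a percentage. A short check, with illustrative variable names:

```python
# Benchmark correlations from the slide.
benchmarks = {"small": 0.1, "medium": 0.3, "large": 0.5}

# Proportion of variance explained is r squared; shown here as a whole percentage.
variance_explained = {label: round(r ** 2 * 100) for label, r in benchmarks.items()}
print(variance_explained)
```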
