Statistics 1 (FPN) QP

Statistics 1 (FPN) – Question Pool

Welcome to Success Formula Question Pool
Disclaimers
• All slides and their materials are the property of Success Formula
• You receive exclusive, free personal access when you buy the course for which the slides are made
• The slides are individually marked, and Success Formula can track to which users they belong
• No part of this slide deck may be reproduced, distributed, or transmitted (hereafter in this slide
referred together as “Shared”) in any form or by any means, including sharing the material on
platforms such as StudyDrive
• If slides are shared, Success Formula may take legal action against the sharing party in line
with European and Dutch Law (Copyright laws)
Error Bounty
• If you find any mistake in this slide deck, let us know and we will refund you the cost of the slides
• Only the first person indicating the mistake gets the refund

Answers
Question
Some people seem to like Breaking Bad, others like Prison Break. What is the percentage of people that
watch TV?
A. The Walking Dead
B. Depends on the year
C. All of them
D. Answer D because it is the best answer
Answer: C
[Annotated example slide. Labels: question topic, the question, difficulty, answers, correct answer]

Significance level
*** Always use a significance level of 0.05 unless otherwise specified ***

Stats1 – Question Pool
Probability Theory

Answers
Question
Florian wants to show Julian a new magic trick. As part of the trick, Julian has to pull a card out of a 52
card deck, 3 times in a row, each time keeping the card before pulling the next one. There are 26 red
cards and 26 black cards.
Which statement is incorrect?
A. The probability that out of the three chosen cards, there is at least one red card or at least one black
card is equal to 1
B. The outcome of the 2nd trial will influence the outcome of the 3rd trial
C. The probability of picking a queen of hearts equals the probability of picking a queen of hearts
given that in the previous trial Julian picked a 7 of spades
D. The sample space is all the possible combinations of cards that can be drawn in a sample of 3
Answer: C
1. Probability Theory

1E. Probability Theory
Question
Florian wants to show Julian a new magic trick. As part of the trick, Julian has to pull a card out of a 52
card deck, 3 times in a row, each time keeping the card before pulling the next one. There are 26 red
cards and 26 black cards. Which statement is incorrect?
Solution
A. Correct. Every card in the deck is either red or black, so any three cards Julian draws must include
at least one red card or at least one black card. The probability is therefore equal to 1
B. Correct. Every time Julian picks a card, he does not put it back, meaning that the outcome of each
trial influences the next one (the events are dependent)
C. Incorrect. P(QH) = P(QH/7S) would be correct only if the events were independent, in other
words, if Julian put his chosen card back in the deck after every trial. Without replacement,
P(QH) = 1/52 but P(QH/7S) = 1/51.
D. Correct. Julian picks 3 cards in total, so any possible combination that he can make with 3 cards is
included in the sample space
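
The dependence in statement C can be checked with exact fractions. This is a minimal sketch, assuming a standard 52-card deck in which the 7 of spades was drawn and kept on the previous trial; the variable names are illustrative only.

```python
from fractions import Fraction

DECK_SIZE = 52  # standard deck, as stated in the question

# Probability of the queen of hearts on a draw from the full deck.
p_qh = Fraction(1, DECK_SIZE)

# Probability of the queen of hearts on the next draw, given that the
# 7 of spades was drawn and kept: only 51 cards remain in the deck.
p_qh_given_7s = Fraction(1, DECK_SIZE - 1)

# If the card were put back, the deck would be restored and the
# condition would change nothing (independent trials).
p_qh_given_7s_with_replacement = Fraction(1, DECK_SIZE)

print(p_qh, p_qh_given_7s, p_qh == p_qh_given_7s)
```

The two probabilities differ only because the card is kept; with replacement they coincide, which is the independence case mentioned under C.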

Answers
Question
Suppose that 2 dice are rolled at the same time. Calculate the following probabilities:
• P(A): The sum of the two numbers is equal to 1
• P(B): The sum of the two numbers is equal to 5
• P(C): The sum of the two numbers is less than 13
A. P(A) = 0.5, P(B) = 0.23, P(C) = 0
B. P(A) = 0, P(B) = 0.111, P(C) = 1
C. P(A) = 1, P(B) = 0.12, P(C) = 0
D. The probabilities cannot be calculated
Answer: B

Question
Suppose that 2 dice are rolled at the same time.
Calculate the following probabilities:
• P(A): The sum of the two numbers is
equal to 1
• P(B): The sum of the two numbers is
equal to 5
• P(C): The sum of the two numbers is less
than 13
Sample Space:
(1,1) (1,2) (1,3) (1,4) (1,5) (1,6)
(2,1) (2,2) (2,3) (2,4) (2,5) (2,6)
(3,1) (3,2) (3,3) (3,4) (3,5) (3,6)
(4,1) (4,2) (4,3) (4,4) (4,5) (4,6)
(5,1) (5,2) (5,3) (5,4) (5,5) (5,6)
(6,1) (6,2) (6,3) (6,4) (6,5) (6,6)
Solution
No possible combination resulting from rolling 2 dice at the same time can give us a sum equal to 1,
since dice do not have the number 0.
• The smallest sum we can find is equal to 2, resulting from the combination (1,1)
• P(A) = 0
To calculate P(B), we need to identify from our sample space the combinations that yield a sum
of 5. In this case, we have 4 combinations: (1,4), (2,3), (3,2) and (4,1).
• We can use the general formula
• P(event) = number of outcomes in the event / total number of outcomes
• P(B) = 4/36 = 1/9 = 0.111
We can observe that the combination resulting in the largest sum is (6,6), with a sum of 12.
• This means that all possible combinations will yield a sum lower than 13
• P(C) is the probability of the entire sample space
• P(C) = 1
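
The counting argument above can be reproduced by enumerating the sample space. This is a small sketch, assuming two fair six-sided dice; the helper name `probability` is illustrative.

```python
from fractions import Fraction
from itertools import product

# All 36 equally likely ordered outcomes of rolling two dice.
sample_space = list(product(range(1, 7), repeat=2))

def probability(event):
    """Share of outcomes in the sample space for which event(outcome) is True."""
    favourable = [outcome for outcome in sample_space if event(outcome)]
    return Fraction(len(favourable), len(sample_space))

p_a = probability(lambda roll: sum(roll) == 1)   # impossible event
p_b = probability(lambda roll: sum(roll) == 5)   # 4 favourable outcomes
p_c = probability(lambda roll: sum(roll) < 13)   # entire sample space

print(p_a, p_b, float(p_b), p_c)
```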

Answers
Question
An experiment has four mutually exclusive outcomes, A, B, C, and D. If P(A) = 0.33, P(B) = 0.17, P(C) =
0.43, P(D) = 0.07, which of the following statements must be true?
A. All of the events are independent with each other
B. The marginal probability of A equals the conditional probability of A given D
C. The joint probability of C and B is equal to 0
D. None of the alternatives is correct
Answer: C

Question
An experiment has four mutually exclusive outcomes, A, B, C, and D. If P(A) = 0.33, P(B) = 0.17, P(C) =
0.43, P(D) = 0.07, which of the following statements must be true?
Solution
A. Incorrect. Given that all of our 4 events are mutually exclusive, they cannot happen at the same
time. Thus, we know that our events must be dependent on each other.
B. Incorrect. This is only the case when the 2 events are independent of one another [P(A) = P(A/B)].
Here A and D are mutually exclusive, so P(A/D) = 0, whereas P(A) = 0.33.
C. Correct. Our events are mutually exclusive, meaning that they cannot happen at the same time.
[P(C AND B) = 0]
D. Incorrect. C is the correct statement.

Answers
Question
Suppose we conduct a random experiment and two events, A and B are independent. Which of the
following rules can we use to prove the relationship between A and B?
A. P(A and B) = 0
B. P(A and B) = P(A) x P(B/A)
C. P(A or B) = P(A) + P(B) – P(A and B)
D. P(A)=P(A/B)
Answer: D

Question
Suppose we conduct a random experiment and two events, A and B are independent. Which of the
following rules can we use to prove the relationship between A and B?
Solution
A. Incorrect. P(A and B) = 0 is the rule for spotting disjoint events. It shows that the two events cannot
happen at the same time.
B. Incorrect. P(A and B) = P(A) x P(B/A) is the general multiplication rule
C. Incorrect. P(A or B) = P(A) + P(B) – P(A and B) is the general addition rule
D. Correct. P(A) = P(A/B) is a rule for spotting independent events, showing that the probability of
event A is not influenced by the occurrence of event B

Answers
Question
A recent survey showed that 45% of Success Formula students prefer to visit Tapijn park to relax after a
long day of studying. Also, 27% of UM students both like to go to Tapijn park and the city center to
relax. Finally, the survey showed that 40% of students said that they don’t visit the city center for some
time off. Based on the above data, determine the following probabilities:
a. PA: the probability that a randomly selected UM student visits Tapijn given that he/she also
visits the city center
b. PB: the probability that a randomly selected UM student visits Tapijn or visits the city center
A. P(A) = 0.45, P(B) = 0.27
B. P(A) = 0.88, P(B) = 0
C. P(A) = 0.18, P(B) = 0.85
D. P(A) = 0.45, P(B) = 0.78
Answer: D

Question
A recent survey showed that 45% of Success
Formula students prefer to visit Tapijn park to
relax after a long day of studying. Also, 27% of
UM students both like to go to Tapijn park and
the city center to relax. Finally, the survey
showed that 40% of students said that they don’t
visit the city center for some time off. Based on
the above data, determine the following
probabilities:
a. PA: the probability that a randomly
selected UM student visits Tapijn given
that he/she also visits the city center
b. PB: the probability that a randomly
selected UM student visits Tapijn or
visits the city center
P(Tapijn) = 0.45
P(Tapijn AND City) = 0.27
P(not City) = 0.4
P(City) = 1 − P(not City)
P(City) = 1 − 0.4 = 0.6
Solution
➢ For P(A) we are looking for P(Tapijn/City)
➢ We can first check if these 2 events are independent
• P(A AND B) = P(A) × P(B) → rule for spotting independence
• 0.27 = 0.45 × 0.6
• 0.27 = 0.27 → Tapijn and City are independent
• P(Tapijn/City) = P(Tapijn)
• P(A) = 0.45
• The conditional probability formula gives the same result: P(Tapijn/City) = P(Tapijn AND City) / P(City) = 0.27 / 0.6 = 0.45
➢ For P(B) we want P(Tapijn OR City)
➢ The joint probability of these events is not equal to 0, thus the events are non-disjoint
➢ We can use the general addition rule
• P(B) = P(Tapijn) + P(City) − P(Tapijn AND City)
• P(B) = 0.45 + 0.6 – 0.27
• P(B) = 0.78
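
The same steps can be written out as a short calculation. This is a sketch using only the survey figures from the question; the variable names are illustrative.

```python
# Survey figures from the question.
p_tapijn = 0.45
p_tapijn_and_city = 0.27
p_not_city = 0.40

# Complement rule.
p_city = 1 - p_not_city

# Conditional probability: P(Tapijn/City) = P(Tapijn AND City) / P(City).
p_a = p_tapijn_and_city / p_city

# General addition rule: P(Tapijn OR City).
p_b = p_tapijn + p_city - p_tapijn_and_city

# Independence check: does the joint probability equal the product?
independent = abs(p_tapijn_and_city - p_tapijn * p_city) < 1e-9

print(round(p_a, 2), round(p_b, 2), independent)
```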

Answers
Question
Suppose one runs a random experiment with 3 events (A, B, C). Events A and B are disjoint, C is
independent of A and dependent with B. P(B) = 0.3, P(C/B) = 0.135, P(C/A) =0.48, P(C and A) = 0.16.
Calculate the following probabilities:
a. P(C)
b. P(A and B)
c. P(B or C)
d. P(A or B)
A. P(C) = 0.48, P(A and B) = 0, P(B or C) = 0.74, P(A or B) = 0.63
B. P(C) = 0.48, P(A and B) = 0.0405, P(B or C) = 0.78, P(A or B) = 0
C. P(C) = 0.48, P(A and B) = 0, P(B or C) = 0.63, P(A or B) = 0.74
D. P(C) = 0.48, P(A and B) = 0.73, P(B or C) = 0.86, P(A or B) = 0.63
Answer: A

Question
Suppose one runs a random experiment with 3
events (A, B, C). Events A and B are disjoint, C is
independent of A and dependent with B. P(B) =
0.3, P(C/B) = 0.135, P(C/A) =0.48, P(C and A) =
0.16. Calculate the following probabilities:
a. P(C)
b. P(A and B)
c. P(B or C)
d. P(A or B)
[Graph: Venn diagram of Event A, Event B and Event C]
Solution
Since events A and C are independent we can say:
• P(C) = P(C/A)
• P(C) = 0.48
We know that events A and B are disjoint and we
also see that there is no intersection in the graph:
• P(A and B) = 0
P(B or C) = P(B) + P(C) – P(B and C)
• We do not have P(B and C) but we can find it using the multiplication rule
• P(B and C) = P(B) x P(C/B) = 0.3 x 0.135 = 0.0405
• P(B or C) = 0.3 + 0.48 – 0.0405 = 0.7395 ≈ 0.74
Since A and B are disjoint events we will use the special form of the addition rule:
• P(A or B) = P(A) + P(B)
• We can calculate P(A) using the multiplication rule for independent events
• P(C and A) = P(A) x P(C)
• → P(A) = 0.16/0.48 = 0.33
P(A or B) = 0.33 + 0.3 = 0.63
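
The chain of rules used in this solution can be sketched as follows. It uses only the values given in the question; the variable names are illustrative.

```python
# Values given in the question.
p_b = 0.30
p_c_given_b = 0.135
p_c_given_a = 0.48
p_c_and_a = 0.16

# C is independent of A, so P(C) = P(C/A).
p_c = p_c_given_a

# A and B are disjoint, so their joint probability is 0.
p_a_and_b = 0.0

# General multiplication rule, then general addition rule.
p_b_and_c = p_b * p_c_given_b
p_b_or_c = p_b + p_c - p_b_and_c

# Multiplication rule for independent events, solved for P(A).
p_a = p_c_and_a / p_c
p_a_or_b = p_a + p_b - p_a_and_b

print(p_c, p_a_and_b, round(p_b_or_c, 2), round(p_a_or_b, 2))
```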

Answers
Question
Remco decides to investigate which Dutch delicacy is most preferred by students in Maastricht. He
writes down his results in the following table. Calculate the following probabilities:
1. The probability that we randomly select a student who likes fries, given that they are a male
2. The probability that we randomly select a student who is a female, given they like fries
3. The probability that the student likes bitterballen
A. P(1) = 66.67%, P(2) = 34.78%, P(3) = 32.5%
B. P(1) = 20%, P(2) = 66.67%, P(3) = 17.5%
C. P(1) = 34.78%, P(2) = 33.33%, P(3) = 32.5%
D. P(1) = 34.78%, P(2) = 23.52%, P(3) = 17.5%
Answer: C
         Fries   Bitterballen   Stroopwaffles   Total
Male     40      35             40              115
Female   20      30             35              85
Total    60      65             80              200

Question
Remco decides to investigate which Dutch
delicacy is most preferred by students in
Maastricht. He writes down his results in the
following table. Calculate the following
probabilities:
1. The probability that we randomly select a
student who likes fries, given that they are a
male
2. The probability that we randomly select a
student who is a female, given they like fries
3. The probability that the student likes
bitterballen
Solution
P(1) = P(Fries/Male)
• It is a conditional probability so we are not working within the entire sample space
• The condition indicates the denominator
• P(1) = 40/115 = 34.78%
P(2) = P(Female/Fries)
• P(2) = 20/60 = 33.33%
P(3) = P(Bitterballen)
• It is the marginal probability within the entire sample space
• P(3) = 65/200 = 32.5%
         Fries   Bitterballen   Stroopwaffles   Total
Male     40      35             40              115
Female   20      30             35              85
Total    60      65             80              200
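
The three probabilities can be read off the table programmatically. This is a sketch with the counts from Remco's table; the dictionary layout and helper names are assumptions made for illustration.

```python
# Counts from the table (rows: sex, columns: preferred delicacy).
counts = {
    "Male": {"Fries": 40, "Bitterballen": 35, "Stroopwaffles": 40},
    "Female": {"Fries": 20, "Bitterballen": 30, "Stroopwaffles": 35},
}

def row_total(sex):
    """Number of students of the given sex."""
    return sum(counts[sex].values())

def column_total(delicacy):
    """Number of students who prefer the given delicacy."""
    return sum(row[delicacy] for row in counts.values())

grand_total = sum(row_total(sex) for sex in counts)

# Conditional probabilities: the condition fixes the denominator.
p1 = counts["Male"]["Fries"] / row_total("Male")          # P(Fries/Male)
p2 = counts["Female"]["Fries"] / column_total("Fries")    # P(Female/Fries)

# Marginal probability: evaluated over the entire sample space.
p3 = column_total("Bitterballen") / grand_total           # P(Bitterballen)

print(f"{p1:.2%} {p2:.2%} {p3:.2%}")
```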

Answers
Question
Refer to the table from the previous question. Which of the following statements is correct:
A. The probability P(Bitterballen/Female) is not evaluated across the entire sample space
B. The events of picking randomly someone that is a female and of picking randomly someone who
likes stroopwaffles are disjoint
C. The marginal probability of P(Fries) is equal to the conditional probability of P(Fries/Male)
D. The events of randomly picking a male and randomly picking someone that likes stroopwaffles are
independent
Answer: A
         Fries   Bitterballen   Stroopwaffles   Total
Male     40      35             40              115
Female   20      30             35              85
Total    60      65             80              200

Question
Refer to the table from the previous question.
Which of the following statements is correct:
Solution
A. Correct. P(Bitterballen/Female) is not evaluated across the entire sample space. Conditional
probabilities are evaluated across a subset of the entire sample space, in this case across the
subset of females.
B. Incorrect. We can see from the table that there are females that prefer stroopwaffles (n=35), so
these 2 events can happen at the same time (not disjoint)
C. Incorrect. P(Fries) ≠ P(Fries/Male)
P(Fries) = 60/200 = 0.3
P(Fries/Male) = 40/115 = 0.35
D. Incorrect. P(Male) ≠ P(Male/Stroopwaffles)
P(Male) = 115/200 = 0.575
P(Male/Stroopwaffles) = 40/80 = 0.5
         Fries   Bitterballen   Stroopwaffles   Total
Male     40      35             40              115
Female   20      30             35              85
Total    60      65             80              200

Answers
Question
The probability of meeting someone who wears eyeglasses randomly in the street is 0.55. When
meeting 4 random people, what is the probability that the number of people that you meet wearing
eyeglasses is 3 or higher?
A. P(X≥ 3) = 0.392
B. P(X≥ 3) = 0.346
C. P(X≥ 3) = 0.092
D. The probability cannot be calculated because we do not have the sample size
Answer: A

Question
The probability of meeting someone who wears
eyeglasses randomly in the street is 0.55. When
meeting 4 random people, what is the probability
that the number of people that you meet wearing
eyeglasses is 3 or higher?
Solution
[Tree diagram: four successive meetings, each branching into G (wears glasses, probability 0.55) or NG (no glasses, probability 0.45)]

Find the Right Combinations
Since we are looking for the probability of meeting 3 or more people with glasses
in our sample of 4, the right combinations are the following:
• G-G-G-G
• G-G-G-NG
• G-G-NG-G
• G-NG-G-G
• NG-G-G-G
Calculate the Probabilities
We need to calculate the probabilities using multiplication for each of the
combinations:
• G-G-G-G → 0.55 x 0.55 x 0.55 x 0.55 = 0.092
• G-G-G-NG → 0.55 x 0.55 x 0.55 x 0.45 = 0.075
• G-G-NG-G → 0.55 x 0.55 x 0.45 x 0.55 = 0.075
• G-NG-G-G → 0.55 x 0.45 x 0.55 x 0.55 = 0.075
• NG-G-G-G → 0.45 x 0.55 x 0.55 x 0.55 = 0.075
Sum Them Up
We need to add all of the probabilities we just calculated to find the overall
probability of meeting 3 or more people with glasses [P(X ≥ 3)]
• 0.092 + 0.075 + 0.075 + 0.075 + 0.075 = 0.392
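
The tree-diagram approach can be sketched by enumerating all 16 paths and keeping those with at least 3 glasses wearers. The binomial formula is included as a cross-check; it is not used on the slides and is shown here only as an equivalent shortcut. Without intermediate rounding the sum is about 0.391; the 0.392 above comes from rounding each branch to three decimals before adding.

```python
from itertools import product
from math import comb

p_glasses = 0.55   # probability that a random person wears glasses
n_people = 4       # number of people met

# Walk every path of the tree (G = glasses, NG = no glasses).
total = 0.0
for path in product(("G", "NG"), repeat=n_people):
    if path.count("G") >= 3:
        path_probability = 1.0
        for person in path:
            path_probability *= p_glasses if person == "G" else 1 - p_glasses
        total += path_probability

# Cross-check with the binomial formula for k = 3 and k = 4.
binomial = sum(
    comb(n_people, k) * p_glasses**k * (1 - p_glasses) ** (n_people - k)
    for k in (3, 4)
)

print(round(total, 4), round(binomial, 4))
```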

Answers
Question
Given the following probability distribution, what is the approximate variance of X?
A. 4.05
B. -1.66
C. 7.38
D. 15.52
Answer: D
X P(x)
0 0.4
1 0.8
2 0.32
3 0.15
4 0.54

Question
Solution
➢ First, we need to calculate the expected value in order to use it in the formula for the variance:
• µx = ∑ x · P(x) = 0 x 0.4 + 1 x 0.8 + 2 x 0.32 + 3 x 0.15 + 4 x 0.54 = 4.05
➢ We can now calculate the variance using the formula σx² = ∑ P(x) · (x − µx)²
• σx² = 0.4(0 − 4.05)² + 0.8(1 − 4.05)² + 0.32(2 − 4.05)² + 0.15(3 − 4.05)² + 0.54(4 − 4.05)²
• σx² = 6.56 + 7.44 + 1.34 + 0.17 + 0.00135
• σx² = 15.52
Given the following probability distribution, what is the variance of X?
X P(x)
0 0.4
1 0.8
2 0.32
3 0.15
4 0.54
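
The expected-value and variance formulas from this solution can be sketched as below, applied to the table exactly as printed. Note that the listed P(x) values add up to 2.21 rather than 1, so the sketch simply mirrors the slide's arithmetic and does not validate the distribution.

```python
# Values of X and P(x) exactly as listed in the table.
xs = [0, 1, 2, 3, 4]
ps = [0.4, 0.8, 0.32, 0.15, 0.54]

# Expected value: sum of x * P(x).
mean = sum(x * p for x, p in zip(xs, ps))

# Variance: sum of P(x) * (x - mean)^2.
variance = sum(p * (x - mean) ** 2 for x, p in zip(xs, ps))

print(round(mean, 2), round(variance, 2), round(sum(ps), 2))
```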

Probability Distribution

Answers
Question
Thomas takes a standardized test as part of his university application. Standardized tests allow
comparisons to be made regarding student achievement. When he received his results, he was told that
he scored -0.28 in terms of Z-scores. However, he is not sure whether that is a good or bad result.
Given that the test scores are normally distributed, what can he conclude from the result?
A. He did better than half of the participants
B. He did worse than half of the participants
C. He did worse than 28% of the participants
D. Nothing can be said because we do not have the standard deviation and the mean
Answer: B
1. Probability Distribution

1E. Probability Distribution
Question
Thomas takes a standardized test as part of his university application. Standardized tests allow
comparisons to be made regarding student achievement. When he received his results, he was told that
he scored -0.28 in terms of Z-scores. However, he is not sure whether that is a good or bad result. Given
that the test scores are normally distributed, what can he conclude from the result?
Solution
➢ Since Thomas has a Z-score equal to -0.28, he scored 0.28 standard deviations below the mean.
The negative sign indicates the direction relative to the mean. In a normal distribution the mean
splits the scores in half, with 50% of the scores below it and 50% above it. Since Thomas is on
the left side of the mean, we can say that he performed worse than half of the test takers.

[Figure: normal curve centred on µ with 50% of the scores on each side; Z = -0.28 lies to the left of µ]

Answers
Question
Lea decides to investigate the average income distribution in her hometown. She observes that the
majority of households have a low to middle income and a small minority with a high-income.
Which of the following statements is correct?
A. Scores located within 1 standard deviation to the left and right of the mean make up 68% of the
entire data set
B. A household with an income of 2.3 standard deviations above the mean is in the top 2.5% of the
population
C. The variable in question is a discrete variable
D. None of the above statements is correct
Answer: D

Question
Lea decides to investigate the average income distribution in her hometown. She observes that the
majority of households have a low to middle income and a small minority with a high-income.
Solution
➢ From the description, we can understand that the distribution of average income is right skewed,
rather than a normal distribution.
➢ Alternatives A) and B) are wrong because they rely on the rule of thumb (68%-95%-99.7%), which
can only be used for normal distributions
➢ The third alternative is wrong because the variable of average income can take infinitely many
possible values, thus the variable is continuous

Answers
Question
Alexandra decides to measure extraversion scores of students at Success Formula. The scores are well
modeled by a normal distribution with a mean of 72 and a standard deviation of 14. What is the
probability of a randomly selected person to score between 66 and 76 for extraversion?
A. 28.05%
B. 61.41%
C. 32.98%
D. 40.82%
Answer: A

Question
Alexandra decides to measure extraversion scores of students at Success Formula. The scores are well
modeled by a normal distribution with a mean of 72 and a standard deviation of 14. What is the
probability of a randomly selected person to score between 66 and 76 for extraversion?
Solution
Calculate the z-scores: z1 = (76 − 72)/14 = 0.29 and z2 = (66 − 72)/14 = −0.43
Look up probabilities in z-table: z1 = 0.29 → 61.41% and z2 = −0.43 → 33.36%
Calculate the probability that the score is between 66 and 76: 61.41% − 33.36% = 28.05%
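
The table lookup can be mimicked with Python's standard library. This is a sketch in which `statistics.NormalDist` stands in for the z-table; z-scores are rounded to two decimals as a table requires, and the unrounded result (about 27.8%) is shown for comparison.

```python
from statistics import NormalDist

mean, sd = 72, 14          # extraversion scores from the question
lower, upper = 66, 76

standard_normal = NormalDist()  # mean 0, standard deviation 1

# Table-style calculation: round each z-score to two decimals first.
z_upper = round((upper - mean) / sd, 2)
z_lower = round((lower - mean) / sd, 2)
p_table_style = standard_normal.cdf(z_upper) - standard_normal.cdf(z_lower)

# Same probability without rounding the z-scores.
scores = NormalDist(mean, sd)
p_unrounded = scores.cdf(upper) - scores.cdf(lower)

print(z_upper, z_lower, round(p_table_style, 4), round(p_unrounded, 4))
```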

Answers
Question
Suppose that Alexandra measures extraversion scores for a different population with a mean of 80 and
a standard deviation of 9. What is the probability that a randomly selected person scores higher than
91?
A. 73.89%
B. 11.12%
C. 40.57%
D. 55.63%
Answer: B

Question
Suppose that Alexandra measures extraversion scores for a different population with a mean of 80 and
a standard deviation of 9. What is the probability that a randomly selected person scores higher than
91?
Solution
Calculate the z-score: z1 = (x − µ)/σ = (91 − 80)/9 = 1.22
Look up the probability in the z-table: z1 = 1.22 → 0.8888 (this is the left-sided probability)
Calculate the probability that the score is higher than 91 (right-sided probability):
1 − 0.8888 = 0.1112 → 11.12%
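
A right-sided probability is one minus the left-sided one. This is a brief sketch, again using `statistics.NormalDist` in place of the z-table and rounding z to two decimals as the table does.

```python
from statistics import NormalDist

mean, sd, score = 80, 9, 91

# z-score rounded to two decimals, as it would be looked up in a table.
z = round((score - mean) / sd, 2)

# The table (and cdf) give the left-sided probability.
p_left = NormalDist().cdf(z)

# Right-sided probability: complement of the left-sided one.
p_right = 1 - p_left

print(z, round(p_left, 4), round(p_right, 4))
```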

Answers
Question
According to the Central Limit Theorem:
A. The sample distribution becomes normal if there is a sufficient sample size (n>25)
B. The sampling distribution becomes normal only when the population distribution is normal
C. Regardless of the shape of the population distribution, the sampling distribution will always be
normal
D. As a sample size increases, the sample mean and standard deviation will be closer in value to the
population mean µ and standard deviation σ
Answer: D

Questions
According to the Central Limit Theorem:
Solution
A. Incorrect. It is not the sample distribution that approaches normality when there is a sufficiently
large sample. It is the sampling distribution.
B. Incorrect. The sampling distribution is indeed normal when the population distribution is normal,
but it can also approach normality whenever the sample size is sufficiently large, regardless of the
population’s shape
C. Incorrect. The sampling distribution is not always normal. For a small sample size, it has a similar
shape to the population distribution and is not necessarily normal. For a large sample size, it becomes
approximately normal
D. Correct. As the sample size becomes larger, the mean of all sampled variables and the variances of
the samples become approximately equal to that of the population.

Answers
Question
Maja plans to study the effects of Omega-3 supplements on antisocial behaviour. She develops a
measurement which will be filled by her participants before and after a 2-month long trial during which
subjects will be taking daily omega-3 supplements. However, she has trouble recruiting a high number
of participants.
Given that the sample size is not large enough, which of the following statements is incorrect:
A. The sample mean is a biased estimator of the population mean
B. The shape of the sampling distribution will be similar to that of the population distribution
C. The standard error will probably be too high
D. There is a high risk of unreliable statements about population parameters
Answer: A

Question
Maja plans to study the effects of Omega-3 supplements on antisocial behaviour. She develops a
measurement which will be filled by her participants before and after a 2-month long trial during which
subjects will be taking daily omega-3 supplements. However, she has trouble recruiting a high number
of participants.
Given that the sample size is not large enough, which of the following statements is incorrect:
Solution
A. This statement is incorrect. Bias does not depend on the size of the sample. We might have an
inaccurate estimate, but if we are using the right estimator for the population parameter, the estimate is
still unbiased. An estimate will be biased if the estimator is not an appropriate one (e.g., no random
sample)
B. Correct. Since Maja has a small sample size, the sampling distribution has a similar shape to the
population distribution and is not necessarily normal.
C. Correct. Based on the C.L.T, the lower the sample size, the greater the standard error
D. Correct. Larger sample sizes allow more reliable statements about population parameters,
compared to small sample sizes.

Estimator
Something that is used in statistics to estimate some facts about the population.
→ The sample mean is an estimator of the population mean.
Bias
Bias = the difference between the expected value of the estimator and the true
value of the parameter
→ The sample mean X̄ of a simple random sample is always unbiased.
Efficiency
The accuracy of the sample mean.
→ The larger the sample size, the smaller the standard error.
→ The smaller the standard error, the more efficient the estimate.

Answers
Question
Alexithymia is a personality trait which features inability to describe, identify and experience
emotions. In a population of people with borderline alexithymia, emotional intelligence scores have a
mean of 57 and a standard deviation of 15. The population distribution is skewed to the right. Darian
takes a simple random sample of 32. What is the probability that our sample mean will be between 55
and 60?
A. 74.86%
B. 13.11%
C. 64.42%
D. The probability cannot be calculated because the population distribution is skewed
Answer: C

Question
Alexithymia is a personality trait which features
inability to describe, identify and experience
emotions. In a population of people with
borderline alexithymia, emotional intelligence
scores have a mean of 57 and a standard
deviation of 15. The population distribution is
skewed to the right. Darian takes a simple random
sample of 32. What is the probability that our
sample mean will be between 55 and 60?
µ = 57
σ = 15
n = 32
→ Central Limit Theorem applies (n > 25)
Solution
➢ Calculate Z-scores
z1 = (X̄ − µ) / (σ/√n) = (60 − 57) / (15/√32) = 1.13
z2 = (X̄ − µ) / (σ/√n) = (55 − 57) / (15/√32) = −0.75
➢ Look up probabilities in z-table
z1 = 1.13 → 87.08%
z2 = −0.75 → 22.66%
➢ Calculate the probability that the sample mean is between 55 and 60:
87.08% − 22.66% = 64.42%
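
For a sample mean, the z-score divides by the standard error σ/√n rather than by σ. This is a sketch of the calculation above, assuming the Central Limit Theorem applies as stated and using `statistics.NormalDist` in place of the z-table.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, n = 57, 15, 32      # population mean, standard deviation, sample size
lower, upper = 55, 60

# Standard error of the sample mean.
standard_error = sigma / sqrt(n)

# z-scores rounded to two decimals, as for a table lookup.
z_upper = round((upper - mu) / standard_error, 2)
z_lower = round((lower - mu) / standard_error, 2)

standard_normal = NormalDist()
p_between = standard_normal.cdf(z_upper) - standard_normal.cdf(z_lower)

print(round(standard_error, 3), z_upper, z_lower, round(p_between, 4))
```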

Answers
Question
A certain variable follows a normal population distribution. The population mean is equal to 23.48 and
the standard deviation equal to 4.657. The probability that the sample mean is higher than 24 equals
25.14%.
Calculate the sample size.
A. 49
B. 24
C. 36
D. The sample size cannot be calculated
Answer: C

Question
A certain variable follows a normal population
distribution. The population mean is equal to
23.48 and the standard deviation equal to 4.657.
The probability that the sample mean is higher
than 24 equals 25.14%.
Calculate the sample size.
µ = 23.48
σ = 4.657
P(x̄ > 24) = 25.14%
Solution
➢ We need to see for which Z-score the probability of having a sample mean higher than 24
equals 25.14%
• Since it is a right-sided probability, we need to subtract it from 1 (the table gives
left-sided probabilities)
• 1 − 0.2514 = 0.7486
• We can find 0.7486 in the table; it corresponds to a z-score of 0.67
➢ We can use the Z-formula
z = (X̄ − µ) / (σ/√n)
0.67 = (24 − 23.48) / (4.657/√n) = 0.52 / (4.657/√n)
0.67 = (0.52 × √n) / 4.657
√n = (0.67 × 4.657) / 0.52 = 6
n = 6² = 36
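
Solving the Z-formula for n can be sketched as follows. `NormalDist().inv_cdf` plays the role of the reverse table lookup; it is an assumption of this sketch, not the method on the slides, and it returns a z-score very close to 0.67.

```python
from statistics import NormalDist

mu, sigma = 23.48, 4.657   # population mean and standard deviation
threshold = 24             # sample-mean value from the question
p_above = 0.2514           # right-sided probability

# Reverse lookup: z-score with left-sided probability 1 - 0.2514 = 0.7486.
z = NormalDist().inv_cdf(1 - p_above)

# z = (threshold - mu) / (sigma / sqrt(n))  =>  sqrt(n) = z * sigma / (threshold - mu)
sqrt_n = z * sigma / (threshold - mu)
n = round(sqrt_n**2)

print(round(z, 2), round(sqrt_n, 2), n)
```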

Answers
Question
Eero develops a new brand of cherry soda and he has decided on a specific bottle design. The contents
of soda bottles are normally distributed with a mean of 400 and a standard deviation of 7. There is an
8.38% chance that the average contents of a 4-pack will exceed how many ml?
A. 400.12
B. 404.83
C. 407.31
D. 400.60
Answer: B

Question
Eero develops a new brand of cherry soda and he has decided on a specific bottle design. The contents
of soda bottles are normally distributed with a mean of 400 and a standard deviation of 7. There is an
8.38% chance that the average contents of a 4-pack will exceed how many ml?
Solution
➢ We know that the contents of the soda bottles are normally distributed, thus we can use the Z-table
➢ P(x̄ > ?) = 8.38% (right-sided probability) ⇔ 1 – 0.0838 = 0.9162 ⇔ Z = 1.38
Z = (x̄ − µ) / (σ/√n)
1.38 = (x̄ − 400) / (7/√4)
x̄ − 400 = 1.38 × 3.5 = 4.83
4.83 + 400 = x̄
x̄ = 404.83
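
The same cut-off can be obtained by inverting the Z-formula. This is a sketch in which `NormalDist().inv_cdf` replaces the reverse table lookup (an assumption of this example); it returns a z-score of about 1.38.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma = 400, 7     # contents per bottle: mean and standard deviation
pack_size = 4          # bottles averaged per pack
p_exceed = 0.0838      # right-sided probability

# Standard error of the mean contents of a 4-pack.
standard_error = sigma / sqrt(pack_size)

# z-score with left-sided probability 1 - 0.0838 = 0.9162.
z = NormalDist().inv_cdf(1 - p_exceed)

# Invert Z = (x_bar - mu) / standard_error to find the cut-off.
cutoff = mu + z * standard_error

print(round(z, 2), standard_error, round(cutoff, 2))
```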

Answers
Question
Leonie wishes to investigate homelessness experiences in Maastricht. However, there is no list of
homeless people in the city. She decides to use instead a non-random sampling method known as
snowball sampling. Leonie meets one homeless person who participates in her research and also puts
her in contact with other homeless people in the area that they know. Using this method she is able to
gather 178 participants.
Which of following statements pertaining to the population estimator is true?
A. The estimator is unbiased and efficient
B. The estimator is unbiased and not efficient
C. The estimator is biased and efficient
D. The estimator is biased and not efficient
Answer: C

Question
Leonie wishes to investigate homelessness experiences in Maastricht. However, there is no list of
homeless people in the city. She decides to use instead a non-random sampling method known as
snowball sampling. Leonie meets one homeless person who participates in her research and also puts
her in contact with other homeless people in the area that they know. Using this method she is able to
gather 178 participants.
Which of following statements pertaining to the population mean estimator is true?
Solution
➢ Leonie is using a non-random sampling method, meaning that her sample is not random. This can
lead to Leonie using an inappropriate estimator for the population mean, which would make her
estimator biased. Bias has nothing to do with the sample size
➢ Leonie has a sample size of 178 participants, which is a sufficiently large sample (C.L.T). Thus, her
estimator for the population mean will indeed be efficient. As the sample size increases, the
standard error decreases

Hypothesis Testing

Answers
Question
A researcher claims that he was able to develop a drug that enhances human attention. He will test this
hypothesis by recruiting 80 individuals with Attention Deficit Disorder (ADD). He divides evenly his
sample into 2 groups and makes sure that the groups are matched in their attention levels. He
continues by administering the drug only in group 1, keeping group 2 as a control. Finally, all
participants across both groups have to complete an Attention Test, with higher scores indicating
worse attention.
What is the researcher’s null and alternative hypothesis?
A. H0: µ1= µ2, Hα: µ1 ≠ µ2
B. H0: µ1 ≠ µ2, Hα: µ1< µ2
C. H0: µ1= µ2 Hα: µ1> µ2
D. H0: µ1= µ2 Hα: µ1< µ2
Answer: D

1E. Hypothesis Testing
Question
A researcher claims that he was able to develop a drug that enhances human attention. He will test this
hypothesis by recruiting 80 individuals with Attention Deficit Disorder (ADD). He divides evenly his
sample into 2 groups and makes sure that the groups are matched in their attention levels. He
continues by administering the drug only in group 1, keeping group 2 as a control. Finally, all
participants across both groups have to complete an Attention Test, with higher scores indicating
worse attention.
What is the researcher’s null and alternative hypothesis?
Solution
A. Incorrect. The alternative hypothesis indicates a two-sided test (Hα: µ1 ≠ µ2). The researcher wants to test
the hypothesis that the drug enhances human attention, so we are looking for a one-sided test.
B. Incorrect. The null hypothesis always suggests that there is no significant relationship between our data.
In this case, it is the hypothesis that the drug will not have an effect on the mean of group 1 (H0: µ1 =µ2)
C. Incorrect. The alternative hypothesis states that the mean of group 1 should be higher than that of group
2 after the drug administration. However, higher scores mean worse attention levels. Since the researcher
expects that the drug is beneficial, we should be expecting that group 1 has better attention levels than
group 2, thus lower scores
D. Correct. The alternative hypothesis claims that group 2 will have worse attention relative to group 1, as
seen from their higher test scores

Answers
Question
Refer back to the example in question one. The researcher is informed that the population of people
with ADD is skewed to the right. Which of the following statements is correct?
A. The researcher can still test his hypothesis because normality is not a necessary condition
B. The researcher can still test his hypothesis because his sample size is large enough
C. The researcher cannot test his hypothesis because there is no normality in the population
D. The researcher cannot test his hypothesis because his sample size is not large enough
Answer: B
2. Hypothesis Testing

Question
Refer back to the example in question one. The researcher is informed that the population of people
with ADD is skewed to the right. Which of the following statements is correct?
Solution
A. Incorrect. In order to be able to test our hypothesis, we need to make sure that we are working with
a normal sampling distribution
B. Correct. The researcher can indeed do the test because he has a large enough sample size, meaning
that the central limit theorem applies (= the sampling distribution approximates a normal
distribution as the sample size gets larger, regardless of the population distribution)
C. Incorrect. Since the central limit theorem applies, we do not need to worry about the skewed
population distribution
D. Incorrect. The sample size is large enough. The cut-off for the central limit theorem to apply is n ≥ 25

Answers
Question
Florian believes that a new Artificial Intelligence teaching method can influence student ratings
compared to using human tutors. He is however unsure about what this influence can look like because,
despite the AI’s greater efficiency, students might still prefer human interaction during their tutorials.
Florian then takes a SRS of 27 students from a population of students with a mean rating of µ=30,2 and
a standard deviation of σ=16. The sample of students take a lesson from the AI system and then give it a
rating with a mean of 24,5.
Can Florian conclude that the mean rating of the AI system is significantly different from the mean of
the normal method?
54
A. Yes, we reject the null hypothesis with the p-value of 0.0322
B. Yes, we reject the null hypothesis with the p-value of 0.0644
C. No, we cannot reject the null hypothesis with the p-value of 0.0322
D. No, we cannot reject the null hypothesis with the p-value of 0.0644
Answer: D

Question
Florian believes that a new Artificial Intelligence teaching
method can influence student ratings compared to using
human tutors. He is however unsure about what this
influence can look like because, despite the AI’s greater
efficiency, students might still prefer human interaction
during their tutorials. Florian then takes a SRS of 27 students
from a population of students with a mean rating of µ=30,2
and a standard deviation of σ=16. The sample of students
take a lesson from the AI system and then give it a rating
with a mean of 24,5. The significance level is 5%
Can Florian conclude that the mean rating of the AI system
is significantly different from the mean of the normal
method?
55
Data
H0: µ = 30.2
Hα: µ ≠ 30.2 (2-tailed test)
α = 0.05
µ = 30.2
σ = 16
n = 27
x̄ = 24.5
Solution
Ø The sample size is large enough (n = 27), so we can continue with the test
Ø We can use the Z formula to calculate the Zobs:
$Z_{obs} = \frac{\bar{x} - \mu}{\sigma / \sqrt{n}} = \frac{24.5 - 30.2}{16 / \sqrt{27}} = -1.85$
Ø Using the Z-table we see that a Zobs with a value of −1.85 is matched to a one-tailed p-value of 0.0322
Ø Since we have a 2-tailed test, we need to double our p-value:
$p\text{-value} \times 2 = 0.0322 \times 2 = 0.0644$
Ø We can then compare our p-value to the alpha: 0.0644 > 0.05
Ø The p-value is larger than the α, thus the null hypothesis cannot be rejected
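The steps above can be checked with a few lines of Python. This is only a sketch: the function name `two_sided_z_test` is made up for illustration, and the standard normal table is replaced by `statistics.NormalDist` from the standard library.

```python
from math import sqrt
from statistics import NormalDist


def two_sided_z_test(sample_mean, mu0, sigma, n):
    """Return (z_obs, two-sided p-value) for a one-sample z-test."""
    z_obs = (sample_mean - mu0) / (sigma / sqrt(n))
    # Area in one tail beyond |z_obs|, doubled because the test is two-tailed.
    p_value = 2 * NormalDist().cdf(-abs(z_obs))
    return z_obs, p_value


# Numbers from the AI teaching question: x-bar = 24.5, mu = 30.2, sigma = 16, n = 27
z_obs, p_value = two_sided_z_test(24.5, 30.2, 16, 27)
print(round(z_obs, 2), round(p_value, 3))
print("reject H0" if p_value < 0.05 else "cannot reject H0")
```

The p-value computed this way differs slightly in the fourth decimal from the slide, because the slide rounds Z to −1.85 before using the table. The conclusion is the same.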

Answers
Question
Suppose that for a two-sided test, an experimenter decides to have a significance level of 0.10.
Which of the following statements is incorrect?
56
A. The Z-critical is going to be equal to ±1.65
B. The probability of a type 1 error is equal to 10%
C. If the null hypothesis is rejected at this level, then it will also be rejected at α=0.05
D. With the current significance level, there is a lower probability of not rejecting a false null
hypothesis compared to a significance level of 0.05
Answer: C

Solution
A. Correct. For a two-sided test with α = 10%, the Z-critical becomes ±1.65 (90% of the distribution in the middle and 5% in each tail)
B. Correct. The probability of a type 1 error is always equal to the significance level of the study
• Type 1 error = α = 10%
C. Incorrect. If the null hypothesis is rejected at α = 10%, it does not necessarily mean that it will be rejected at α = 5%
• E.g., a p-value equal to 0.08 is smaller than 0.10, however it is not smaller than 0.05. Thus, the H0 would be rejected at α = 10% but not at α = 5%
D. Correct. By increasing the significance level, we make the decision criterion more lenient, making a type 2 error less likely. However, we simultaneously increase the risk of a false positive, that is, rejecting a true null hypothesis

Answers
Question
A questionnaire has been constructed to measure the level of psychopathy for incarcerated individuals.
The population is normally distributed with a mean of 44 and a standard deviation of 12. A researcher
wants to check the hypothesis that the population mean is different, so she draws a SRS of 23
individuals. The sample mean is 53.
What are the boundaries of a 90% confidence interval based on this specific sample?
58
A. [48.87, 57.13]
B. [48.14, 56.90]
C. [43.89, 54.96]
D. [49.63, 52.47]
Answer: A

Solution
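H0: µ = 44
Hα: µ ≠ 44
µ = 44
σ = 12
n = 23
X̄ = 53
Zc = 1.65 (because it is a 90% CI)
$\bar{X}_{obs} \pm Z_c \times \frac{\sigma}{\sqrt{n}} = 53 \pm 1.65 \times \frac{12}{\sqrt{23}}$
Lower bound: $53 - 1.65 \times \frac{12}{\sqrt{23}} = 53 - 1.65 \times 2.5 = 48.87$
Upper bound: $53 + 1.65 \times \frac{12}{\sqrt{23}} = 53 + 1.65 \times 2.5 = 57.13$
[48.87, 57.13]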
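As a rough check, the same interval can be computed in Python. The helper name `z_confidence_interval` is hypothetical, and the critical value is taken from `statistics.NormalDist` rather than rounded to 1.65, so the bounds differ from the slide in the second decimal.

```python
from math import sqrt
from statistics import NormalDist


def z_confidence_interval(sample_mean, sigma, n, confidence):
    """Confidence interval for the mean when sigma is known."""
    # Two-sided critical value, e.g. about 1.645 for a 90% interval.
    z_c = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z_c * sigma / sqrt(n)
    return sample_mean - margin, sample_mean + margin


# Psychopathy questionnaire: x-bar = 53, sigma = 12, n = 23, 90% CI
lower, upper = z_confidence_interval(53, 12, 23, 0.90)
print(round(lower, 2), round(upper, 2))
```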

Answers
Question
Suppose we have a 95% Confidence Interval [37.2, 42.5].
Calculate the sample mean and the standard error
60
A. X̄ = 40.05, SE = 3.39
B. X̄ = 38.74, SE = 4.63
C. X̄ = 39.85, SE = 1.35
D. X̄ = 41.40, SE = 2.22
Answer: C

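Suppose we have a 95% Confidence Interval [37.2, 42.5].
Calculate the sample mean and the standard error.
61
α = 5%
Zc = 1.96
CI [37.2, 42.5]
Confidence interval: $\bar{x}_{obs} \pm Z_c \times \frac{\sigma}{\sqrt{n}} = \bar{x} \pm 1.96 \times \frac{\sigma}{\sqrt{n}}$
Sample Mean
Ø The lower bound gives:
$37.2 = \bar{x} - 1.96 \times \frac{\sigma}{\sqrt{n}} \Rightarrow 1.96 \times \frac{\sigma}{\sqrt{n}} = \bar{x} - 37.2$
Ø Substituting this into the upper bound:
$42.5 = \bar{x} + 1.96 \times \frac{\sigma}{\sqrt{n}} = \bar{x} + \bar{x} - 37.2$
$2\bar{x} = 42.5 + 37.2 = 79.7$
$\bar{x} = \frac{79.7}{2} = 39.85$
Standard Error
Ø From the previous calculations we can see that:
$1.96 \times \frac{\sigma}{\sqrt{n}} = \bar{x} - 37.2$
Ø We already found the sample mean, so we can use it to calculate the fraction:
$1.96 \times \frac{\sigma}{\sqrt{n}} = 39.85 - 37.2 = 2.65$
$\frac{\sigma}{\sqrt{n}} = \frac{2.65}{1.96} = 1.35$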
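The derivation boils down to two facts: the sample mean is the midpoint of the interval, and the margin of error is half its width. A minimal Python sketch, with a hypothetical helper name, assuming a z-based interval:

```python
from statistics import NormalDist


def mean_and_se_from_ci(lower, upper, confidence):
    """Recover the sample mean and standard error from a z-based CI."""
    z_c = NormalDist().inv_cdf(1 - (1 - confidence) / 2)  # about 1.96 for 95%
    sample_mean = (lower + upper) / 2   # midpoint of the interval
    margin = (upper - lower) / 2        # Zc * SE
    return sample_mean, margin / z_c


sample_mean, se = mean_and_se_from_ci(37.2, 42.5, 0.95)
print(round(sample_mean, 2), round(se, 2))
```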

Answers
Question
Going back to the example of the previous question, what can be said about the null hypothesis, given
that the population mean is equal to 36.05?
62
A. The null hypothesis is accepted
B. The null hypothesis is rejected
C. The null hypothesis cannot be rejected
D. Nothing can be said about the null hypothesis with the current data
Answer: B

Solution
A. Incorrect. When doing a hypothesis test, we can either reject the null hypothesis or fail to reject
it, but we can never accept the null hypothesis. We cannot conclude that the null
hypothesis is true merely because we did not find evidence to reject it
B. Correct. We can see that for our 2-tailed test, the hypothesised population mean (36.05) is not included
within the range of the 95% CI [37.2, 42.5], so the null hypothesis is rejected
C. Incorrect. Since the population mean is not included in the confidence interval, the null hypothesis
is rejected
D. Incorrect. Statement B is correct.

64
Confidence
Interval
Ø A confidence interval is an interval estimate of µ.
Ø It shows the range of values that the population mean probably falls between:
$\bar{X} \pm Z_c \times \frac{\sigma}{\sqrt{n}}$
Interpretation
Example: 95% Confidence Interval
Ø If we drew infinitely many samples and constructed a confidence interval from each, then 95% of
those CIs would contain the population mean µ
Hypothesis
Testing
Ø We can use the confidence interval to see if the null hypothesis is rejected or
not for a two-tailed test
Ø If the population mean from the null hypothesis is located inside the interval,
then the null hypothesis cannot be rejected because the specific value is a
possible population mean
Ø If the population mean from the null hypothesis is not located inside the
interval, the null hypothesis is rejected

Answers
Question
Tobias investigates the effects of participative leadership on satisfaction levels among employees. The
sample mean is equal to 73.8. The boundaries of the 95% confidence interval are [71.4, 76.5].
Calculate the margin of error and the standard error.
65
A. ME = 5.7, SE = 1.22
B. ME = 2.4, SE = 1.22
C. ME = 2.9, SE = 3.91
D. ME = 2.4, SE = 4.75
Answer: B

Solution
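X̄ = 73.8
95% CI → [71.4, 76.5]
Zcritical = 1.96
Margin of error:
$\bar{X} \pm Z_c \times \frac{\sigma}{\sqrt{n}}$
$\bar{X} - Z_c \times \frac{\sigma}{\sqrt{n}} = 71.4$
$Z_c \times \frac{\sigma}{\sqrt{n}} = \bar{X} - 71.4 = 73.8 - 71.4 = 2.4$
Standard error:
$Z_c \times \frac{\sigma}{\sqrt{n}} = 2.4$
$\frac{\sigma}{\sqrt{n}} = \frac{2.4}{Z_c} = \frac{2.4}{1.96} = 1.22$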

Answers
Question
Kian is the HR manager for Success Formula. He noticed that the employees are lately having more
stress than usual, so he decides to evaluate their stress levels using a measurement scale (less points =
less stress). On average, the 26 employees had a stress score of 83 with a standard deviation of 17 . Kian
then decided to implement a mindfulness program with the goal of reducing stress scores by 8 points.
The significance level is 5%
What is the power of the test, given that the mindfulness program works as Kian was expecting?
67
A. 0.7734
B. 0.2266
C. 0.6066
D. 0.7123
Answer: A

Question
Kian is the HR manager for Success Formula. He noticed that the employees are lately having
more stress than usual, so he decides to evaluate their stress levels using a measurement scale
(less points = less stress). On average, the 26 employees had a stress score of 83 with a
standard deviation of 17. Kian then decided to implement a mindfulness program with the goal
of reducing stress scores by 8 points. The significance level is 5%
What is the power of the test, given that the mindfulness program works as Kian was
expecting?
H0: µ = 83
Hα: µ < 83
Zc = −1.65
α = 0.05
n = 26
σ = 17
µ = 83
µ (new) = 75
68
Answer
Ø Find the critical value
$Z_c = \frac{\bar{X}_c - \mu}{\sigma / \sqrt{n}}$
$-1.65 = \frac{\bar{X}_c - 83}{17 / \sqrt{26}}$
$-5.50 = \bar{X}_c - 83 \Rightarrow \bar{X}_c = 77.50$
Ø Solve for Z under the new mean
$Z = \frac{\bar{X}_c - \mu_{new}}{\sigma / \sqrt{n}} = \frac{77.50 - 75}{17 / \sqrt{26}} = 0.75$
Ø Find the power
• The test is left-sided, so H0 is rejected when the sample mean falls below X̄c. Using the Z-table, the
area to the left of Z = 0.75 is 0.7734. This is the probability of rejecting H0 when µ = 75
Ø Find the β
• β is the remaining area to the right of Z = 0.75: β = 1 − 0.7734 = 0.2266
Ø Check with the formula:
Power = 1 − β
Power = 1 − 0.2266 = 0.7734

69
Type II
Error
Ø Definition: We fail to reject a false null hypothesis
Ø Measured by β
Ø Calculation:
• Find the critical value where H0 would be rejected.
• $Z_c = \frac{\bar{X}_c - \mu}{\sigma / \sqrt{n}}$ → solve for X̄c
• $Z = \frac{\bar{X}_c - \mu_{new}}{\sigma / \sqrt{n}}$ → solve for Z, then look up P (β is the area on the non-rejection side of X̄c)
Power
Ø Definition: The probability that we are able to reject a false null hypothesis
Ø Calculation:
• Power = 1 − β
Illustration
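The two-step calculation can be sketched in Python for a left-sided test, where H0 is rejected when the sample mean falls below the critical value, so power is the area to the left of that cutoff under the new mean. The function name `left_tailed_power` is an illustrative assumption, and the numbers are those of the mindfulness example (µ = 83, µ(new) = 75, σ = 17, n = 26, α = 0.05).

```python
from math import sqrt
from statistics import NormalDist


def left_tailed_power(mu0, mu_new, sigma, n, alpha):
    """Return (x_crit, z, beta, power) for a left-sided one-sample z-test."""
    se = sigma / sqrt(n)
    std_normal = NormalDist()
    # Step 1: critical sample mean below which H0 is rejected.
    x_crit = mu0 + std_normal.inv_cdf(alpha) * se
    # Step 2: position of that cutoff under the new (true) mean.
    z = (x_crit - mu_new) / se
    power = std_normal.cdf(z)   # P(reject H0 | mu = mu_new)
    beta = 1 - power            # P(fail to reject H0 | mu = mu_new)
    return x_crit, z, beta, power


x_crit, z, beta, power = left_tailed_power(83, 75, 17, 26, 0.05)
print(round(x_crit, 2), round(z, 2), round(beta, 2), round(power, 2))
```

For a right-sided test the roles flip: the rejection region lies above the critical value, so power would be the area to the right of Z.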

Answers
Question
Suppose Micheal is conducting an experiment on fear conditioning. He uses a sample of 65 participants
and a significance level of 5%. Before he begins, he wants to make sure that the probability of rejecting
a true null hypothesis is as small as possible.
70
A. He should increase his sample size
B. He should increase the effect size
C. He should increase the significance level
D. None of the above
Answer: D

Solution
A. Incorrect. By increasing the sample size, we decrease the standard error and thus the probability of
not rejecting a false null hypothesis (Type II error)
B. Incorrect. Increasing the effect size is difficult in real life since researchers do not have any control
over it. Theoretically, the higher the effect size, the lower the probability of failing to reject a null
hypothesis (Type II error)
C. Incorrect. By increasing the significance level, it becomes easier to reject a null hypothesis. We
increase the probability of rejecting a true H0 hypothesis (Type I error)
D. Correct. None of the above alternatives reduces the probability of rejecting a true null hypothesis.
Rejecting a true null hypothesis is the Type I error and its probability is measured by α. We can reduce
this probability by reducing α, but this increases the probability of a Type II error (not recommended)

Stats1 - Question Pool
T-tests
72

Answers
Question
A randomly drawn sample of 60 university students undergo exam training. Before the training, their
mean score on a practice exam was 68. After the training, their mean score improved by 7 points. What
(t-)test would you employ to check if the exam training had a significant effect?
73
A. One-sample t-test
B. Paired samples t-test
C. Independent samples t-test
D. Two-sample t-test
Answer: B
1. T-tests

1E. T-tests
Solution
A. Incorrect, we compare two dependent sets of scores from the same students, not one sample against a known population value.
B. Correct, the groups are paired since we test the same sample twice (before and after the exam training).
C. Incorrect, the two groups are not independent, they are dependent.
D. Incorrect, a two-sample t-test is an independent samples t-test. The groups were dependent, not
independent.

Answers
Question
When testing a null hypothesis about a single population mean, a t-test is usually performed rather
than a z-test. A t-test is more likely to be employed because…
75
A. A t-test has more power than a z-test, leading to a more reliable result.
B. Quantitative variables can only be analysed with t-tests.
C. Z-tests are more prone to type I errors, which are to be avoided.
D. In practice, the standard deviation of a population is rarely known.
Answer: D
2. T-tests

2E. T-tests
T-tests
When to use a t-test?
When we can't use the z-scores because σ
(population standard deviation) is unknown
• We have to estimate both parameters.
• We use an extra estimate (Sx)
• The t-distribution is more dispersed relative to
the z-distribution
• A t-test is therefore less powerful than a z-test
76
Z-tests
A z-test measures how many standard errors
our sample mean (X̄) differs from the hypothesized
value of the population mean (µ).
• Makes use of the z-distribution
• More powerful than a t-test
• Most times cannot be used, since in reality
we do not know much about the
parameters of the population

Answers
Question
A researcher is interested in the effect of wearing red lipstick on the score at minigolf. They ask 40
people to wear red lipstick while playing 18 holes on the minigolf court. 70 people played the same 18
holes without wearing red lipstick. The dependent variable is the obtained score after the 18 holes (a
lower score is considered to be better). The red lipstick condition had a mean score of 47.5 and a
standard deviation of 4.3. The no-red lipstick condition had a mean score of 62 and a standard
deviation of 9.2.
Which test should the researcher use to test the null hypothesis that the score at minigolf is not
affected by wearing red lipstick?
77
A. An independent samples t-test, assuming unequal population variances.
B. An independent samples t-test, assuming equal population variances.
C. A paired samples t-test.
D. A one-sample t-test.
Answer: A
3. T-tests

3E. T-tests
Solution
A. Correct. The 2 groups are independent, and we compare their samples. The goal of the test is to
check if the 2 samples come from populations with equal means. We see that the rule of thumb
(smallest SD × 2 > bigger SD) does not hold (4.3 × 2 = 8.6 < 9.2) and the groups don't have equal
sample sizes (40 vs. 70). This means we have to do the t-test without assuming equal variances
B. Incorrect. We cannot assume equal variances because the rule of thumb is violated and the group
sizes are not equal
C. Incorrect. A paired samples t-test requires matched groups or a within-subject design.
D. Incorrect. A one-sample t-test is used when we have 1 population and want to check if its mean is
equal to a specific value.

3E. T-tests
Assumption | T-Test Concerned | How to Determine | What if Violated
Normality | All T-Tests | 1. Histogram of sample scores looks normal. 2. Sample size is large (Central Limit Theorem). | Can't do T-test
Quantitative | All T-Tests | Dependent variable is quantitative | Can't do T-test
Dependent Groups | Paired T-Test | The groups are matched | Two-Sample T-test
Independent Groups | Two-Samples T-Test | Two separate groups are measured | Paired T-test
Equal Variance | Two-Samples T-Test | 1. One sample SD is not 2x bigger than the other (Rule of Thumb). 2. Levene's Test is not significant. 3. The sample sizes are equal. | Two-Sample T-test not assuming equal variance has to be used → less powerful
79

Answers
Question
The effect of Ritalin on test performance is tested. 31 participants received a Ritalin pill while another
31 participants received a placebo. The test performance is assumed to be good if the score on the test
is high. The null hypothesis is that exam performance is the same both under Ritalin and placebo, while
the alternative hypothesis is that Ritalin leads to better test performance. The table below presents the
group statistics, computed by SPSS (equal variances assumed).
What statement is incorrect?
80
A. The means of the two populations are very similar. However, a visual inspection of the group
statistics is not enough to reject the null hypothesis.
B. The equal variances assumption is violated, thus we should not interpret the test
C. The equal variances assumption is not violated, thus we can interpret the test
D. During the t-test, we should compute the weighted average of the two standard deviations
Answer: B
4. T-tests
Variable | Condition | N | Mean | Std. Deviation | Std. Error Mean
Test score | placebo | 31 | 10.1182 | 1.9463 | .1699
Test score | Ritalin | 31 | 10.9374 | 2.2824 | .4099

4E. T-tests
Question
The effect of Ritalin on test performance is tested. 31 participants received a Ritalin pill while another
31 participants received a placebo. The test performance is assumed to be good if the score on the test
is high. The null hypothesis is that exam performance is the same both under Ritalin and placebo, while
the alternative hypothesis is that Ritalin leads to better test performance. The table below presents the
group statistics, computed by SPSS.
What statement is incorrect?
81
Solution
A. Correct. Sample means are random variables, meaning they change depending on the sample. Thus
in order to be able to make conclusions about the populations we need to make sure whether the
differences between the means are indeed significant.
B. Incorrect. The equal variances assumption is not violated. We can check this using the rule of
thumb (biggest SD < smallest SD × 2: 2.2824 < 1.9463 × 2 = 3.8926)
C. Correct. Using the rule of thumb, we can see that the smallest SD multiplied by 2 is
bigger than the bigger SD (Ritalin group), thus the assumption is not violated
D. Correct. Since the equal variances assumption is not violated, the 2 standard deviations estimate
the same population standard deviation. By computing their weighted average (pooled SD), we have
the best estimate of σ

4E. T-test
82
Checking
Equal Variances
Assumption
We can check the assumption in 2 ways
1. Rule of Thumb
– Smaller SD × 2 should be larger than the bigger SD
2. Levene's Test
– If the test is significant, the variances are unequal (H0: $\sigma_1^2 = \sigma_2^2$)
Violation of
Assumption
If this assumption is violated, we can continue with the t-test if the sample sizes
of both samples are approximately equal
Special case
If there is a violation AND the samples differ in size, we can do the t-test
but only with the following formula:
$t = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}}$
If H0: $\mu_1 = \mu_2$ → $(\mu_1 - \mu_2) = 0$
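The rule of thumb and the special-case formula can be written as two small Python helpers. This is a sketch only: the function names are illustrative, and the numbers come from the minigolf question earlier in this section (red lipstick: mean 47.5, SD 4.3, n = 40; no lipstick: mean 62, SD 9.2, n = 70).

```python
from math import sqrt


def equal_variances_plausible(sd_a, sd_b):
    """Rule of thumb: smaller SD x 2 must be larger than the bigger SD."""
    return 2 * min(sd_a, sd_b) > max(sd_a, sd_b)


def unpooled_t(mean_1, mean_2, sd_1, sd_2, n_1, n_2, null_difference=0.0):
    """t statistic without assuming equal variances (the special case)."""
    se = sqrt(sd_1**2 / n_1 + sd_2**2 / n_2)
    return ((mean_1 - mean_2) - null_difference) / se


print(equal_variances_plausible(4.3, 9.2))        # 8.6 > 9.2 is False
t_obs = unpooled_t(47.5, 62, 4.3, 9.2, 40, 70)    # H0: mu1 = mu2
print(round(t_obs, 2))
```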

Answers
Question
Natalia is a memory researcher and as part of her pilot study, she wishes to test the differences in
memory recall between severe anxiety patients and controls. She suspects that anxiety patients will
have different memory recall scores compared to controls. After a memory test, she compares the
scores of the groups. The anxiety group has a mean of 12.6 and a standard deviation of 3.38. The
control group has a mean of 13.4 and a standard deviation of 2.61. There are 70 participants in total,
equally divided into the 2 groups.
What can Natalia conclude about the null hypothesis?
83
A. The null hypothesis is not rejected with 0.10 ≤ p-value ≤ 0.15
B. The null hypothesis is rejected with 0.01 ≤ p-value ≤ 0.05
C. The null hypothesis is not rejected with 0.20 ≤ p-value ≤ 0.30
D. The null hypothesis is rejected with 0.02 ≤ p-value ≤ 0.025
Answer: C
5. T-tests

5E. T-tests
Question
Natalia is a memory researcher and as part of her
pilot study, she wishes to test the differences in
memory recall between severe anxiety patients
and controls. She suspects that anxiety patients
will have different memory recall scores
compared to controls. After a memory test, she
compares the scores of the groups. The control
group has a mean of 13.4 and a standard
deviation of 2.61. The anxiety group has a mean
of 12.6 and a standard deviation of 3.38. There
are 70 participants in total, equally divided into
the 2 groups.
What can Natalia conclude about the null
hypothesis.
H0: µ1 = µ2
Hα: µ1 ≠ µ2
n1 = n2 = 35
X̄1 = 13.4
X̄2 = 12.6
S1 = 2.61
S2 = 3.38
84
Solution
Ø Check the equal variances assumption with the rule of thumb:
Bigger SD < Smallest SD × 2
3.38 < 2.61 × 2
3.38 < 5.22 (True)
→ Equal variances assumed
Ø Since equal variances are assumed, we need to calculate the pooled standard deviation
$s_p = \sqrt{\frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{(n_1 - 1) + (n_2 - 1)}} = \sqrt{\frac{34 \times 2.61^2 + 34 \times 3.38^2}{34 + 34}} = 3.02$
Ø Next, we need to calculate the Tobs
$T = \frac{\bar{X}_1 - \bar{X}_2}{s_p \times \sqrt{\frac{1}{n_1} + \frac{1}{n_2}}} = \frac{13.4 - 12.6}{3.02 \times \sqrt{\frac{1}{35} + \frac{1}{35}}} = \frac{0.8}{3.02 \times 0.239} = 1.11$
Ø Using the t-table (df = 35 + 35 − 2 = 68) we see that the one-tailed p-value is between 0.10 and 0.15.
For a 2-tailed test, we need to double these values
0.20 ≤ p-value ≤ 0.30
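A short Python sketch of the pooled calculation, using the memory recall numbers. The helper names `pooled_sd` and `pooled_t` are made up for illustration; the p-value range itself still has to be read from a t-table, since the standard library has no t-distribution.

```python
from math import sqrt


def pooled_sd(sd_1, n_1, sd_2, n_2):
    """Weighted average of the two sample SDs (weights are the df)."""
    numerator = (n_1 - 1) * sd_1**2 + (n_2 - 1) * sd_2**2
    return sqrt(numerator / ((n_1 - 1) + (n_2 - 1)))


def pooled_t(mean_1, mean_2, sd_1, sd_2, n_1, n_2):
    """Independent samples t statistic, equal variances assumed."""
    s_p = pooled_sd(sd_1, n_1, sd_2, n_2)
    return (mean_1 - mean_2) / (s_p * sqrt(1 / n_1 + 1 / n_2))


s_p = pooled_sd(2.61, 35, 3.38, 35)
t_obs = pooled_t(13.4, 12.6, 2.61, 3.38, 35, 35)
df = 35 + 35 - 2
print(round(s_p, 2), round(t_obs, 2), df)
```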

Answers
Question
An ice cream company has two new potential flavours ready for the market. They developed a tastiness
scale scored from 0 to 30. 40 volunteers tasted flavour A and another 25 volunteers tasted flavour B.
The obtained values are: X̄A = 22.8, X̄B = 28, sA = 4.2 and sB = 1.9.
What is the 95% Confidence Interval corresponding to this t-test?
85
A. [-6.52, -3.88]
B. [-6.34, -4.59]
C. [-6.50, -4.00]
D. [-7.29, -3.91]
Answer: A
6. T-tests

6E. T-tests
Question
An ice cream company has two new potential flavours ready for the market. They developed a
tastiness scale scored from 0 to 30. 40 volunteers tasted flavour A and another 25 volunteers tasted
flavour B. The obtained values are: X̄A = 22.8, X̄B = 28, sA = 4.2 and sB = 1.9.
What is the 95% Confidence Interval corresponding to this t-test?
nA = 40
nB = 25
X̄A = 22.8
X̄B = 28
SA = 4.2
SB = 1.9
86
Solution
Ø We are dealing with 2 independent groups, thus we should use an independent samples t-test
Ø We have to decide if the assumption of equal variances is violated, in order to use the
correct formulas
Smallest SD × 2 > Bigger SD
1.9 × 2 > 4.2
3.8 > 4.2 (Not true)
Ø The equal variances assumption is violated, thus we use the special case of the t-test
$\bar{X}_1 - \bar{X}_2 \pm T \times \sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}$
$22.8 - 28 \pm 1.711 \times \sqrt{\frac{4.2^2}{40} + \frac{1.9^2}{25}}$
$-5.2 \pm 1.711 \times \sqrt{0.5854}$
$-5.2 \pm 1.711 \times 0.77$
$-5.2 \pm 1.32$
[−6.52, −3.88]

6E. T-tests
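Confidence Interval: General Formula
Observed X̄ ± $t_c$ × Standard Error
Example: Two-Sample T-Test
N = 20 (both conditions), X̄A = −2.1, X̄B = −3.5, sA = 2.05 and sB = 1.89. What is the 95% CI?
$\bar{X}_A - \bar{X}_B \pm t_c \times \left(s_p \times \sqrt{\frac{1}{n_1} + \frac{1}{n_2}}\right)$
$s_p = \sqrt{\frac{19 \times 2.05^2 + 19 \times 1.89^2}{19 + 19}} = 1.97$
$1.4 \pm 2.04 \times \left(1.97 \times \sqrt{\frac{1}{20} + \frac{1}{20}}\right) = [0.13; 2.67]$
Standard Errors of The Different T-Tests
One-Sample T-test: $\frac{s}{\sqrt{n}}$
Paired Sample T-test: $\frac{s_D}{\sqrt{n}}$
Two-Sample T-test: $s_p \times \sqrt{\frac{1}{n_1} + \frac{1}{n_2}}$
Pooled Standard Deviation: $s_p = \sqrt{\frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{(n_1 - 1) + (n_2 - 1)}}$
Two-Sample T-test, equal variance not assumed: $\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}$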
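The standard errors listed above can be collected into a few Python helpers and plugged into the general formula. All function names are illustrative assumptions, and the critical value is passed in by hand (2.04, as read from the t-table in the example) because the standard library cannot look it up.

```python
from math import sqrt


def se_one_sample(s, n):
    return s / sqrt(n)


def se_paired(s_d, n_pairs):
    return s_d / sqrt(n_pairs)


def se_pooled(s_1, n_1, s_2, n_2):
    s_p = sqrt(((n_1 - 1) * s_1**2 + (n_2 - 1) * s_2**2) / ((n_1 - 1) + (n_2 - 1)))
    return s_p * sqrt(1 / n_1 + 1 / n_2)


def se_unpooled(s_1, n_1, s_2, n_2):
    return sqrt(s_1**2 / n_1 + s_2**2 / n_2)


def confidence_interval(estimate, t_crit, se):
    """General formula: observed value +/- t_c * standard error."""
    return estimate - t_crit * se, estimate + t_crit * se


# Two-sample example from the slide: N = 20 per condition, t_c = 2.04
diff = -2.1 - (-3.5)
ci_low, ci_high = confidence_interval(diff, 2.04, se_pooled(2.05, 20, 1.89, 20))
print(round(ci_low, 2), round(ci_high, 2))
```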
87

Answers
Question
Suppose we are testing the null hypothesis that the population mean is equal to a specific value and the
test is right sided. Refer to the SPSS output.
88
A. The null hypothesis is rejected for a significance level of 2.5%
B. The null hypothesis is not rejected for a significance level of 5%
C. The degrees of freedom were found by taking the smallest sample size and subtracting 1
D. None of the alternatives is correct
Answer: A
7. T-tests
Test Value = 570
Variable | t | df | Sig. (2-tailed) | Mean Difference | 95% CI Lower | 95% CI Upper
Test score | 2.139 | 29 | 0.041 | 20.333 | 0.89 | 39.77

7E. T-tests
Solution
A. Correct. The SPSS output gives the p-value for a two-sided test. However, we have a one-tailed test
(a right-sided test means that the alternative hypothesis has the (>) symbol), and the observed t (2.139)
is positive, so it lies in the direction of Hα. Thus, we need to divide the p-value by two
(0.041/2 = 0.0205). We can now see that the corrected p-value is smaller than
0.025, thus the H0 is rejected at an α = 2.5%
B. Incorrect. The corrected p-value is smaller than 0.05 as well. Thus, the H0 is rejected at α = 5% as
well.
C. Incorrect. Since we have a one sample t-test, the formula for the degrees of freedom is N-1. It is for
an independent samples t-test, not assuming equal variances that we take the smallest n and
subtract 1 for the df
D. Incorrect. A is the correct one
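The conversion from the two-sided SPSS p-value to a one-sided one can be sketched as follows. The helper `one_sided_p` is hypothetical; it relies on the t-distribution being symmetric, and it only halves the p-value when the observed t points in the direction of the alternative hypothesis.

```python
def one_sided_p(two_sided_p, t_obs, alternative):
    """Convert a two-sided p-value; alternative is 'greater' or 'less'."""
    if alternative not in ("greater", "less"):
        raise ValueError("alternative must be 'greater' or 'less'")
    in_direction = t_obs > 0 if alternative == "greater" else t_obs < 0
    half = two_sided_p / 2
    # If t points away from the alternative, the one-sided p is the large tail.
    return half if in_direction else 1 - half


p_right = one_sided_p(0.041, 2.139, "greater")   # right-sided test
print(p_right, p_right < 0.025, p_right < 0.05)
```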

Answers
Question
A researcher wants to test whether ethnic background influences IQ scores of Dutch primary school
children. They draw a sample of 50 children with grandparents of Turkish origin and another 50
children with Dutch grandparents. Each child of Turkish descent is matched for age and sex with a Dutch
one. The group data is summarized in the table below.
A paired samples t-test was used to test this hypothesis. Which of the following tests could have yielded
the same result?
90
Group | Mean | N | Std. Deviation | Std. Error Mean
Turkish | 98.657 | 50 | 10.0023 | 1.6523
Dutch | 103.203 | 50 | 14.5602 | 2.2436
A. An independent t-test, assuming equal population variances.
B. An independent t-test, assuming unequal population variances.
C. A one-sample t-test, conducted for the difference in IQ score between matched children.
D. None of the answers above.
Answer: C
8. T-tests

8E. T-tests
Solution
A. Incorrect, the two groups are matched, so they are dependent, not independent.
B. Incorrect, the two groups are matched, so they are dependent, not independent.
C. Correct, a paired samples t-test works on the difference score within each matched pair and tests
whether the mean difference is 0. If one computes the difference for each pair and then performs a
one-sample t-test on those differences (test value 0), the calculations are identical, so they would
get the same result.
D. Incorrect. The answer is C

Answers
Question
Inspect the given output. Which answer is correct?
92
A. Levene's Test is not significant, therefore equal variances can be assumed.
B. The Tobs is equal to -2.845
C. According to the t-table, the null hypothesis is rejected
D. All answers are correct.
Answer: D
9. T-tests
[Figure: SPSS output (not reproduced)]

9E. T-tests
Solution
A. Correct. Levene's Test has the null hypothesis that the population variances are equal ($\sigma_1^2 = \sigma_2^2$).
Since we can see that the p-value is a lot larger than 0.05 (p-value = 0.582), we can say that the null
hypothesis is not rejected and that there is no violation of the equal variances assumption
B. Correct. We can calculate the Tobs by dividing the Mean difference ($\bar{x}_1 - \bar{x}_2 = -14.00$) by the Std.
Error difference ($s_p \times \sqrt{\frac{1}{n_1} + \frac{1}{n_2}} = 4.92$). This will give us $-14.00 / 4.92 = -2.845$
C. Correct. The null hypothesis in this case is rejected because the value 0 is not located in the 95% CI,
meaning that the population difference between the 2 groups cannot be 0
[Figure: SPSS output (not reproduced)]

Answers
Question
Florian is the GM of Success Formula and has recently heard that colour can influence learning
performances and outcomes. He was informed that research has shown that the colour blue leads to
better performances in tests and better recall. The classes at SF however are painted in white. Florian
decides to test if indeed the colour blue leads to better results compared to white. He gathers 38
students and assigns them to 2 groups. The groups are matched together in regards to skill, age,
motivation and more. One group takes the class in a room painted white, while the second group in a
room painted blue. The test score means afterwards are compared. The population distribution of
difference scores is normal.
Florian gets the following SPSS output. Which statement is correct?
94
10. T-tests
Paired Differences
Pair | Mean | Std. Deviation | Std. Error Mean | 95% CI Lower | 95% CI Upper | t | df | Sig. (2-tailed)
Pair 1: White - Blue | -.579 | 2.524 | .579 | -1.795 | .637 | -1.000 | 18 | .331

Answers
Question
Florian gets the following SPSS output. Which statement is correct?
95
A. There is a probability of 0.331 that the H0 is true
B. The researcher might be making a Type I error
C. The researcher might be making a Type II error.
D. Since the TOBS is not located within the 95% CI, the null hypothesis can be rejected
Answer: C
10. T-tests

10E. T-tests
Question
Florian gets the following SPSS output (next slide). Which statement is correct?
96
Solution
A. Incorrect. The p-value is 0.331 and it is defined as the probability that our data (or more extreme
data) would have occurred, given that the null hypothesis is true. The p-value does not give the
probability that H0 is true. It is a conditional probability with the condition that H0 is true
B. Incorrect. A Type 1 error is defined as rejecting a true null hypothesis. However, our p-value is larger
than 0.05, thus we did not reject the null hypothesis in the first place. The probability that we are
making a Type 1 error in this case is 0%
C. Correct. A Type 2 error is defined as not rejecting a false null hypothesis. Since the p-value is larger
than our significance level, we did not reject H0, but there is always the chance that H0 is in fact false
and we made an error
D. Incorrect. When using the CI to see if the H0 is rejected or not for a paired samples t-test, we need
to see if the value 0 is located in the interval, not the Tobs. This is because the null hypothesis states
that there is no difference.
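The numbers in the SPSS paired samples output hang together, which a few lines of Python can confirm. This is a sketch with a hypothetical helper name; df = 18 implies 19 matched pairs, and the critical value 2.101 (two-tailed, α = 0.05, df = 18) is taken from a t-table rather than computed.

```python
from math import sqrt


def paired_t_summary(mean_diff, sd_diff, n_pairs, t_crit):
    """Standard error, t statistic and CI for a paired samples t-test."""
    se = sd_diff / sqrt(n_pairs)
    t_obs = mean_diff / se   # H0: mean difference = 0
    ci = (mean_diff - t_crit * se, mean_diff + t_crit * se)
    return se, t_obs, ci


se, t_obs, (ci_low, ci_high) = paired_t_summary(-0.579, 2.524, 19, t_crit=2.101)
print(round(se, 3), round(t_obs, 2), round(ci_low, 3), round(ci_high, 3))
# 0 lies inside the interval, which matches the decision not to reject H0.
print(ci_low < 0 < ci_high)
```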

10E. T-tests
Type I Error
The null hypothesis is true but we reject it.
→ Measured by α
97
Graphical Illustration
Type II Error
The null hypothesis is false but we fail to reject it.
→ Measured by β

Stats I – Question Pool
ANOVA
98

Answers
Question
ANOVA assumes the following statistical model: Yij = µ + αi + εij, in which Yij denotes the score
of person j in group i.
Choose the incorrect statement from below:
99
A. µ1 = Yij − εij represents the mean of group 1
B. εij has a different value for each individual participant, regardless of treatment effects.
C. µ is a variable effect, specific to each participant.
D. If there is no treatment effect, αi is equal among all participants.
Answer: C
1. ANOVA

1E. ANOVA
Question
ANOVA assumes the following statistical model: 𝑌𝑖𝑗 = 𝜇 + 𝛼𝑖 + 𝜀𝑖𝑗, in which Yij denotes the score
of person j in group i.
Choose the incorrect statement from below:
100
Solution
A. Correct. The difference between an individual score and its group mean reflects the unexplained
variation caused by factors that are not controlled. It can be written as εij = Yij − µi ⇔ µi =
Yij − εij
B. Correct. Individual differences are uncontrollable factors that result in the divergence of scores of
participants within the same groups. For each participant, regardless of the treatment effects, the
individual differences/residual factors are different
C. Incorrect, µ is a constant effect. It refers to the factors that are the same in all conditions. It stays
the same for each subject.
D. Correct, if there is no treatment effect, αi = 0 for all participants.

1E. ANOVA
Main Formula
Yij = µ + αi + εij
• Yij: score of participant j in group i
• µ: constant effect
• αi: effect of group i
• εij: effect of remaining factors of participant j in group i (error)
Sum of Squares
∑(Yij − Ӯ)² = ∑ ni(Ӯi − Ӯ)² + ∑(Yij − Ӯi)²
• ∑(Yij − Ӯ)²: total sum of squares (TSS)
• ∑ ni(Ӯi − Ӯ)²: between group sum of squares (SSG)
• ∑(Yij − Ӯi)²: within group sum of squares (SSE)
TSS = SSG + SSE
101

1E. ANOVA
Example
3 different conditions with 3 participants each
     G1  G2  G3
P1   1   3   5
P2   2   4   6
P3   3   5   7
Preparation
What is the mean of each group?
Ӯ1 = (1+2+3)/3 = 2
Ӯ2 = (3+4+5)/3 = 4
Ӯ3 = (5+6+7)/3 = 6
What is the total mean?
Ӯ = (2+4+6)/3 = 4
SSG (Between Groups)
SSG = ∑ ni(Ӯi − Ӯ)²
SSG = 3*(2-4)²+3*(4-4)²+3*(6-4)²
SSG = 24
Tip: Alternative notation: αi = µi − µ
Here µi = Ӯi (mean of a single group) and µ = Ӯ (total mean).
SSW (Within Groups, the same quantity as SSE)
SSW = ∑(Yij − Ӯi)²
SSW = (1-2)²+(2-2)²+(3-2)²+(3-4)²+…+(7-6)²
SSW = 6
Tip: Alternative notation: εij = Yij − µi
Here µi is the same as Ӯi. Both describe the mean of a single group.
102
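
The sums of squares in this example can be recomputed directly from the nine scores. The sketch below is a minimal illustration in plain Python; the variable names are chosen here for readability and are not part of the course material.

```python
# Scores from the example: 3 groups (G1, G2, G3) with 3 participants each
groups = [[1, 2, 3], [3, 4, 5], [5, 6, 7]]

all_scores = [y for g in groups for y in g]
grand_mean = sum(all_scores) / len(all_scores)
group_means = [sum(g) / len(g) for g in groups]

# Between-group sum of squares: n_i * (group mean - grand mean)^2
ssg = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, group_means))
# Within-group sum of squares: (score - own group mean)^2
ssw = sum((y - m) ** 2 for g, m in zip(groups, group_means) for y in g)
# Total sum of squares: (score - grand mean)^2
tss = sum((y - grand_mean) ** 2 for y in all_scores)

print(group_means, grand_mean)
print(ssg, ssw, tss)
```

The total sum of squares equals SSG + SSW, which is the decomposition shown on the formula slide.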

Answers
Question
Participants were asked to memorise a list of words. They were divided into several groups, each using a
different memorization technique. 60 minutes later, the experimenter assessed how many words they
could still remember (the dependent variable RECALL in the output). Which statement is correct?
103
A. The experimental setting had 3 conditions.
B. The total variance equals 4.91
C. The ANOVA test is significant (𝛂= 5%).
D. All answers are correct.
Answer: D
2. ANOVA
                Sum of Squares  Mean Square
Between Groups  41.566          20.783
Within Groups   41.850          2.790
Total           83.416

2E. ANOVA
104
Question Solution
A. Correct. The degrees of freedom between
groups is given by the formula k − 1.
→ Degrees of freedom for “between
groups” is equal to “number of groups
minus 1” (k-1). In our case df = 2, so
there were k = 2 + 1 = 3 conditions
B. Correct. The total variance can be found by
the formula MS(total) = SST / (N − 1)
= 83.416 / 17 = 4.91
C. Correct. The ANOVA SPSS output has a p-
value of 0.006 for an F=7.447. The p-value is
smaller than the significance level 5%, thus
the test is significant.
D. Yes, they are all correct.
Participants were asked to memorise a list of
words. They were divided into several groups,
each using a different memorization technique.
60 minutes later, the experimenter assessed how
many words they could still remember (the
dependent variable RECALL in the output).
Which statement is correct?
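
The relationships between the cells of the ANOVA output can be verified with a few divisions. The sketch below assumes the five values shown are SSG, SSE, SST, MSG and MSE (in that order) and recovers the degrees of freedom from MS = SS / df.

```python
# Values read from the RECALL output (assumed order: SSG, SSE, SST, MSG, MSE)
ssg, sse, sst = 41.566, 41.850, 83.416
msg, mse = 20.783, 2.790

# Degrees of freedom follow from MS = SS / df
df_between = round(ssg / msg)
df_within = round(sse / mse)
df_total = df_between + df_within

k = df_between + 1                # number of conditions
total_variance = sst / df_total   # MS(total) = SST / (N - 1)
f_obs = msg / mse

print(k, df_total, round(total_variance, 2), round(f_obs, 2))
```

The F value computed from the rounded mean squares is about 7.45, in line with the F = 7.447 reported in the solution.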

Answers
Question
A sample of n= 35 participants was randomly selected from UM students pool. A baseline assessment
rated their arachnophobia. After undergoing 2 sessions of exposure therapy (to spiders), their
arachnophobia was measured again with the same scale. The researcher wants to see if the 2 sessions of
exposure therapy had a significant effect.
Should an ANOVA test be performed on this data set?
105
A. Yes, the normality assumption holds since the sample size is big enough.
B. Yes, the equal variances assumption is met because 35 participants were tested both times.
C. No, the independence assumption is violated.
D. Yes, the data is quantitative as their phobia is rated on a scale.
Answer: C
3. ANOVA

3E. ANOVA
Answers
A sample of n= 35 participants was randomly selected from UM students pool. A baseline assessment
rated their arachnophobia. After undergoing 2 sessions of exposure therapy (to spiders), their
arachnophobia was measured again with the same scale. The researcher wants to see if the 2 sessions of
exposure therapy had a significant effect.
Should an ANOVA test be performed on this data set?
106
Solution
A. The statement about normality is correct, but the main criterion for an ANOVA, independent groups,
is violated. Thus, an ANOVA is not the suitable test here.
B. Incorrect. Testing the same participants twice does not show that the variances are equal. Moreover,
the same sample is tested twice (baseline and after exposure), so we are not comparing
independent groups.
C. Correct, the same sample is tested twice (baseline and after exposure). We are not comparing
independent groups.
D. The statement about quantitative data is correct, but the main criterion for an ANOVA, independent
groups, is violated. Thus, an ANOVA is not the suitable test here.

Answers
Question
An experiment on the effect of listening to music on information retention is performed. A total sample
of 75 is divided into three equally large groups. All three groups are asked to memorize a list of words
while either (a) listening to Vivaldi, (b) listening to AC/DC, or (c) listening to crickets singing.
An analysis of variance is performed. It is concluded that the null hypothesis cannot be rejected.
What statement is correct?
107
A. MSG and MSE are both unbiased estimators of the error variance.
B. Since the null hypothesis is true, then the difference between groups is as large as difference within
groups.
C. There is no group effect.
D. All are correct
Answer: D
4. ANOVA

4E. ANOVA
Question
An experiment on the effect of listening to music on information retention is performed. A total sample
of 75 is divided into three equally large groups. All three groups are asked to memorize a list of words
while either (a) listening to Vivaldi, (b) listening to AC/DC, or (c) listening to crickets singing.
An analysis of variance is performed. It is concluded that the null hypothesis cannot be rejected.
What statement is correct?
108
Solution
A. Correct. When H0 is not rejected, it means that the difference between groups is attributed to
uncontrolled factors (error). This means that the MS(G) is an unbiased estimator of error variance.
MSE is an unbiased estimator of error variance in any case.
B. Correct. The difference between groups is measured by MSG while the difference within groups is
measured by MSE. In the case of a true null hypothesis, both MSE and MSG are unbiased estimators
of error variance, thus MSE ≈ MSG
C. Correct. The H0 for ANOVA states that the means of all groups are equal, meaning that there is no
treatment effect.
D. Correct

4E. ANOVA
109
Unbiased
Estimator
• MSE is an unbiased estimator of error variance.
Pooled Variance
• Since we already have the assumption that all populations have equal variance,
we can take the average of estimates.
Sp² = [(N1 − 1)×S1² + (N2 − 1)×S2² + … + (Nk − 1)×Sk²] / [(N1 − 1) + (N2 − 1) + … + (Nk − 1)]
Conclusion
• MSE = Sp²
• Accurate and efficient error estimate.
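
That MSE equals the pooled variance can be illustrated numerically. The sketch below reuses the three-group example data from the sums of squares example (scores 1 to 7) and computes both quantities independently; it is an illustration, not a proof.

```python
from statistics import variance

groups = [[1, 2, 3], [3, 4, 5], [5, 6, 7]]  # example data, 3 groups of 3

# Pooled variance: weighted average of the sample variances, weights N_i - 1
numerator = sum((len(g) - 1) * variance(g) for g in groups)
denominator = sum(len(g) - 1 for g in groups)
pooled_variance = numerator / denominator

# MSE from the ANOVA route: within-group sum of squares / (N - k)
ssw = sum((y - sum(g) / len(g)) ** 2 for g in groups for y in g)
n_total = sum(len(g) for g in groups)
mse = ssw / (n_total - len(groups))

print(pooled_variance, mse)
```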

4E. ANOVA
Random Variables
MSG and MSE count as random variables.
MSE and MSG as Estimators of Error Variance
If there is no group effect (H0: true) MSE as well as MSG count as unbiased estimators of the error
variance.
Relation of MSE and MSG
MSE is the error (or noise)
MSG is the error + the effect of the group.
If H0 is true and there is no effect of the group
MSE and MSG will be approximately equal.
Another way to phrase this would be, the
difference between groups is as large as
difference within groups.
110

Answers
Question
Synesthesia is a perceptual phenomenon in which there is an experience of 2 sensory/cognitive
pathways. Synesthesia has been linked to enhanced memory skills due to the increased number of
associations available. Anton wonders if there is a difference in memory recall between different synesthesia types.
He gathers 120 participants and within his sample, there are 4 different synesthesia types. Each group
has an equal number of participants. After a memorization period, Anton gives his participants a
memory test. Following an ANOVA, SSG = 167.91 and SSE = 1760.88
What can be concluded?
111
A. H0 not rejected with p-value > 0.05
B. H0 rejected with 0.025 ≤ 𝑝 − 𝑣𝑎𝑙𝑢𝑒 ≤ 0.05
C. H0 rejected with 0.01 ≤ 𝑝 − 𝑣𝑎𝑙𝑢𝑒 ≤ 0.025
D. H0 not rejected because Fobs< Fcritical
Answer: C
5. ANOVA

5E. ANOVA
Question
Synesthesia is a perceptual phenomenon in
which there is an experience of 2
sensory/cognitive pathways. Synesthesia has
been linked to enhanced memory skills due to
the increased number of associations available.
Anton wonders if there is a difference in memory
recall between different synesthesia types. He gathers 120
participants and within his sample, there are 4
different synesthesia types. Each group has an
equal number of participants. After a
memorization period, Anton gives his
participants a memory test. Following an
ANOVA, SSG = 167.91 and SSE = 1760.88
112
Solution
➢ Calculate the degrees of freedom
df(G) = k − 1 = 4 − 1 = 3
df(E) = N − k = 120 − 4 = 116
➢ Calculate the Mean Squares
MS(G) = SSG / df(G) = 167.91 / 3 = 55.97
MS(E) = SSE / df(E) = 1760.88 / 116 = 15.18
➢ Calculate the F-value
F = MS(G) / MS(E) = 55.97 / 15.18 = 3.687
➢ By taking a look at the F-table we see that for
α=0.05, the Fc(3, 116) = 2.70, which means the
null hypothesis is rejected
Fobs > Fc
➢ On the next pages we see that for α=0.01, the
Fc = 3.98, which Fobs does not exceed, while Fobs
does exceed the critical value for α=0.025
0.01 ≤ p-value ≤ 0.025
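
The steps above can be sketched as a small function. The helper name `one_way_f` is hypothetical and introduced only for this illustration; the two critical values are the ones quoted in the solution (read from an F-table), not computed here.

```python
def one_way_f(ssg, sse, k, n_total):
    """Return (df_g, df_e, ms_g, ms_e, f) for a one-way ANOVA."""
    df_g = k - 1
    df_e = n_total - k
    ms_g = ssg / df_g
    ms_e = sse / df_e
    return df_g, df_e, ms_g, ms_e, ms_g / ms_e

df_g, df_e, ms_g, ms_e, f_obs = one_way_f(ssg=167.91, sse=1760.88, k=4, n_total=120)

# Critical values quoted in the solution (read from an F-table)
f_crit_05 = 2.70
f_crit_01 = 3.98

print(df_g, df_e, round(ms_g, 2), round(ms_e, 2), round(f_obs, 3))
print("reject at 0.05:", f_obs > f_crit_05, "| reject at 0.01:", f_obs > f_crit_01)
```

H0 is rejected at α = 0.05 but not at α = 0.01, which is consistent with the bracket given for the p-value.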

Answers
Question
Based on the ANOVA output, which of the following statements are correct?
113
A. The scores on the dependent variable likely vary due to residual effects only.
B. The scores on the dependent variable likely vary due to residual effects and group effect.
C. The scores on the dependent variable likely vary due to group effect only
D. The scores on the dependent variable likely do not vary due to residual effects nor due
to the group effect.
Answer: B
6. ANOVA
Sum of Squares df Mean Square F Sig
Between Groups 126 1 126 4.4843 ?
Within Groups 1630 58 28.1034
Total 1756 59

6E. ANOVA
Question
Based on the ANOVA output, which of the following statements are correct?
114
Solution
➢ Using the F-table, we can see that for α=0.05, the Fc(1, 58) = 4.03
➢ The Fobs is bigger than the Fc, meaning that the null hypothesis is rejected
➢ There is an overall treatment effect, thus not all group means are the same
➢ However, error cannot be controlled for, so it is always there
Scores likely vary due to treatment/group effect AND error/residual factors
Between Groups 126 1 126 4.4843 ?
Total 1756 59

Answers
Question
Maja conducted a study with 5 conditions, for which 30 participants in total were recruited.
Choose the correct statement:
115
A. F = 21.801, not significant
B. F = 17.474, not significant
C. F = 19.625, significant
D. F = 18.926, significant
Answer: D
7. ANOVA
                Sum of Squares  df  Mean Square  F  Sig
Between Groups  ?               ?   ?            ?  ?
Within Groups   2244.500        ?   ?
Total           9041.367        ?

7E. ANOVA
Question
Maja conducted a study with 5 conditions, for which
30 participants in total were recruited.
Choose the correct statement:
116
Solution
1) Calculate the SS(G):
SST = SSG + SSE
SSG = SST − SSE
SSG = 9041.367 − 2244.5 = 6796.867
2) Calculate degrees of freedom:
df(G) = k − 1 = 5 − 1 = 4
df(E) = N − k = 30 − 5 = 25
df(T) = N − 1 = 30 − 1 = 29
3) Calculate Mean squares:
MS(G) = SSG / df(G) = 6796.867 / 4 = 1699.217
MS(E) = SSE / df(E) = 2244.5 / 25 = 89.780
4) Calculate F-value:
F = MSG / MSE = 1699.217 / 89.780 = 18.926
5) Use the F-table to reach your decision:
Fc(4, 25) = 2.76 ⇒ Fobs > Fc ⇒ Significant
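
Filling in the missing cells follows a fixed order: SSG first, then the degrees of freedom, then the mean squares and F. The sketch below wraps these steps in a hypothetical helper, `complete_anova_table`, which is introduced here for illustration only.

```python
def complete_anova_table(sst, sse, k, n_total):
    """Fill in a one-way ANOVA table from SST, SSE, group count and sample size."""
    ssg = sst - sse
    df_g, df_e = k - 1, n_total - k
    ms_g, ms_e = ssg / df_g, sse / df_e
    return {
        "Between Groups": {"SS": ssg, "df": df_g, "MS": ms_g, "F": ms_g / ms_e},
        "Within Groups": {"SS": sse, "df": df_e, "MS": ms_e},
        "Total": {"SS": sst, "df": n_total - 1},
    }

table = complete_anova_table(sst=9041.367, sse=2244.5, k=5, n_total=30)
for row, cells in table.items():
    print(row, {name: round(value, 3) for name, value in cells.items()})
```

The significance decision still requires the critical value from the F-table (2.76 for df 4 and 25 in the solution), which this sketch does not compute.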

Answers
Question
Micheal is a sports enthusiast. He wants to investigate which form of exercise leads to better
concentration. He recruits 75 participants and assigns them randomly to 3 groups (cardio, weights,
crossfit). He later measures their concentration levels and compares the means of the groups.
Given that Micheal ended up rejecting the null hypothesis, which of the following is correct?
117
A. There is no difference in concentration levels between groups
B. Micheal can confidently say that cardio is better than weights
C. Micheal needs an extra statistical analysis
D. There is no treatment effect
Answer: C
8. ANOVA

8E. ANOVA
Question
Micheal is a sports enthusiast. He wants to investigate which form of exercise leads to better
concentration. He recruits 75 participants and assigns them randomly to 3 groups (cardio, weights,
crossfit). He later measures their concentration levels and compares the means of the groups.
Given that Micheal ended up rejecting the null hypothesis, which of the following is correct?
118
Solution
A. Incorrect. The null hypothesis states that all group means are the same (no treatment effect). By
rejecting the null hypothesis we conclude that not all group means are the same.
B. Incorrect. By rejecting the null hypothesis, we know that not all group means are the same, however
we do not know where the difference is exactly (i.e., between which groups).
C. Correct. If we want to uncover the exact nature of the group difference, we need to conduct
multiple comparisons.
D. Incorrect. The null hypothesis was rejected, thus there is a treatment effect.

Answers
Question
Micheal did conduct multiple comparisons to examine the differences between groups. What can be
concluded based on the SPSS output?
119
9. ANOVA
Dependent Variable: Concentration scores
LSD
(I) Group  (J) Group  Mean Difference  Std. Error  Sig.   95% CI Lower Bound  95% CI Upper Bound
Cardio     Weights    0.1762           0.5102      0.730  -.8338              1.1861
Cardio     Crossfit   1.4606           0.5470      0.009  .3778               2.5435
Weights    Cardio     -.1762           0.5102      0.730  -1.1861             .8338
Weights    Crossfit   1.2844           0.5696      0.026  .1569               2.4119
Crossfit   Cardio     -1.4606          0.5470      0.009  -2.5435             -.3778
Crossfit   Weights    -1.2844          0.5696      0.026  -2.4119             -.1569

Answers
Question
Micheal did conduct multiple comparisons to examine the differences between groups. What can be
concluded based on the SPSS output?
120
A. There are 2 statistically significant comparisons
B. There is 1 statistically significant comparison
C. All three comparisons are statistically significant
D. None of the comparisons reaches significance
Answer: B
9. ANOVA

9E. ANOVA
Question
Micheal did conduct multiple comparisons to
examine the differences between groups. What
can be concluded based on the SPSS output?
121
Family-wise Type 1 error
In multiple comparisons, each comparison carries
its own α. Hence, the chance of making at least
one Type I Error across all comparisons increases
Solution
➢ While the output does show 2 comparisons
that reach significance (cardio-crossfit,
weights-crossfit), no Bonferroni correction
has been applied for the family-wise Type 1
error.
➢ By applying the Bonferroni correction
(multiply p-value by number of comparisons),
we see that only the comparison between
cardio and crossfit remains significant
(0.009 × 3 = 0.027 < 0.05, while
0.026 × 3 = 0.078 > 0.05)
Bonferroni Correction
1. Multiply p-value by number of comparisons
Or
2. Divide significance level by number of
comparisons
Number of comparisons: k(k-1)/2
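
The Bonferroni correction can be applied to the LSD output in a few lines. The sketch below uses the first option (multiplying each p-value by the number of comparisons); capping the adjusted value at 1 is a common convention and an assumption here.

```python
from itertools import combinations

groups = ["Cardio", "Weights", "Crossfit"]
# Unadjusted LSD p-values from the SPSS output
p_values = {("Cardio", "Weights"): 0.730,
            ("Cardio", "Crossfit"): 0.009,
            ("Weights", "Crossfit"): 0.026}
alpha = 0.05

pairs = list(combinations(groups, 2))
m = len(pairs)  # k(k-1)/2 comparisons

significant = []
for pair in pairs:
    adjusted = min(1.0, p_values[pair] * m)  # Bonferroni: multiply p by m, cap at 1
    if adjusted < alpha:
        significant.append(pair)
    print(pair, round(adjusted, 3), adjusted < alpha)
```

Dividing α by the number of comparisons (0.05 / 3) and comparing the raw p-values against it leads to the same decisions.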

Answers
Question
Given that the groups have equal sample sizes and the following output, which statement is correct?
122
A. The normality assumption was violated, so the test should not have been done
B. An independent samples t-test could be done instead of ANOVA
C. MSE is smaller than MSG, hence the treatment effects are significant
D. If the test is significant, multiple comparisons are the necessary next step
Answer: B
10. ANOVA
Between Groups 126 1 126 4.4843 ?
Total 1756 59

10E. ANOVA
Question
Given that the groups have equal sample sizes, which statement is correct, given the following output?
123
Solution
A. Incorrect. We can see that our sample size is 60 (N-1=59 → N=60). Given that each group has 30
participants, the CLT can be applied, thus the test is robust against a normality violation
B. Correct. Since we have just 2 groups, an independent samples t-test would be equivalent to this
ANOVA.
C. Incorrect. It might be that MSE is smaller than MSG, thus F is bigger than 1, but we always have to
rely on the p-value which tells us whether the result is actually significant
D. Incorrect. Since we only have two groups, if the test is significant, we can immediately tell between
which groups there is a difference, thus it is not a necessity to conduct multiple comparisons.
However, if we want to see what the difference looks like, we can continue on with them.
Between Groups 126 1 126 4.4843 ?
Total 1756 59

Proportions, Entire Distributions

Answers
Question
Florian is the new general manager at Success Formula, replacing Michalina. Success Formula offers
courses in Psychology, Business Economics and Law. During the time Michalina was GM, 60% of the
student population at SF attended Business Economics courses, 25% Psychology courses and 15% Law
courses. After an intense marketing campaign, Florian believes that this year, things will be different.
In a simple random sample of 275 students, 145 of them chose B/E courses, 75 chose psychology and
55 chose law. Based on the data, Florian wants to test whether the population distribution of field
choice has changed or is the same as during Michalina’s reign as GM.
Does the result from the sample give sufficient evidence?
125
A. No, the null hypothesis is not rejected with the observed value of the statistic test equal to 1.23
B. Yes, the null hypothesis is rejected with the observed value of the statistic test equal to 7.57
C. No, the null hypothesis is not rejected with the observed value of the statistic test equal to 2.50
D. Yes, the null hypothesis is rejected with the observed value of the statistic test equal to 9.93
Answer: B
1. Proportions and Entire Distributions

1E. Proportions and Entire Distributions
Question
Florian is the new general manager at Success Formula, replacing Michalina. Success Formula offers
courses in Psychology, Business Economics and Law. During the time Michalina was GM, 60% of the
student population at SF attended Business Economics courses, 25% Psychology courses and 15% Law
courses. After an intense marketing campaign, Florian believes that this year, things will be different.
In a simple random sample of 275 students, 145 of them chose B/E courses, 75 chose psychology and
55 chose law. Based on the data, Florian wants to test whether the population distribution of field
choice has changed or is the same as during Michalina’s reign as GM.
Does the result from the sample give sufficient evidence?
126
Solution
➢ We see that we have only 1 categorical variable (field of study) which has more than 2 levels (3)
➢ We want to see how well the sample distribution fits a specific model
➢ We have to use the χ² Goodness of Fit Test

Data
Model: BE(60%)-Psy(25%)-Law(15%)
N = 275
H0: Distribution within sample fits the model
Hα: Distribution within sample does not fit
model
127
Solution
➢ Calculate Expected Counts [Ec = N × P(e)]
• B/E: 275 x 0.6 = 165
• Psy: 275 x 0.25 = 68.75
• Law: 275 x 0.15 = 41.25
➢ Calculate the chi-square
χ² = Σ (OC − EC)² / EC
χ² = (145 − 165)²/165 + (75 − 68.75)²/68.75 + (55 − 41.25)²/41.25
χ² = 2.42 + 0.57 + 4.58
χ² = 7.57
➢ Check the χ² table for the p-value (df = 3 − 1 = 2)
0.02 ≤ p-value ≤ 0.025
We see that the p-value is lower than
0.05, thus the H0 that the distribution within the
sample fits the model is rejected.
Students
Business/Economics 145
Psychology 75
Law 55
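
The goodness of fit calculation can be sketched as a short function. The name `chi_square_gof` is hypothetical and used only for this illustration; the p-value lookup in the χ² table is left out.

```python
def chi_square_gof(observed, proportions):
    """Return (expected counts, chi-square statistic, df) for a goodness-of-fit test."""
    n = sum(observed)
    expected = [n * p for p in proportions]
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    return expected, chi2, len(observed) - 1

# Observed counts (B/E, Psychology, Law) and the model under H0
expected, chi2, df = chi_square_gof([145, 75, 55], [0.60, 0.25, 0.15])
print([round(e, 2) for e in expected], round(chi2, 2), df)
```

Without rounding the individual terms the statistic is about 7.58; the 7.57 on the slide comes from summing the rounded terms.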

When to Use
Data type: categorical data
→ Check how well a proposed proportion
distribution fits with an observed one.
H0: The distribution within the sample fits our expectation
Ha: The distribution within the sample does not fit our expectation
Formula
χ² = Σ (Obs − Exp)² / Exp
EC = N*p(e)
Assumptions
• Categorical Data
• Expected Counts >5
Degrees of Freedom
df = # of cells - 1
Example: Nationality of class
Dutch 0.2
German 0.5
Belgian 0.2
French 0.1
df = 4-1 = 3
128

Answers
Question
Andreia has been researching the effectiveness of dialectical behavior therapy (DBT), a type of
cognitive behavioural therapy, for the development of healthy ways to cope with stress and emotion
regulation. She wonders whether DBT has different efficiency levels for different types of populations.
She decides to take two samples, one of people exhibiting eating disorders and one of people with
substance use disorders. After several sessions, Andreia and her team, note for each subject if there was
improvement or not. Andreia is the first researcher to conduct such a study, so she does not know how
the different disorders can have an effect on improvement.
129
                                    Improvement
                                    Yes   No    Total
Disorder  Eating Disorders          148   112   260
          Substance use Disorders   173   102   275
          Total                     321   214   535

Answers
Question
Andreia has been researching the effectiveness of dialectical behavior therapy (DBT), a type of
cognitive behavioural therapy, for the development of healthy ways to cope with stress and emotion
regulation. She wonders whether DBT has different efficiency levels for different types of populations.
She decides to take two samples, one of people exhibiting eating disorders and one of people with
substance use disorders. After several sessions, Andreia and her team, note for each subject if there was
improvement or not. Andreia is the first researcher to conduct such a study, so she does not know how
the different disorders can have an effect on improvement.
130
A. The null hypothesis is not rejected with the observed value of the statistic test equal to 0.98
B. The null hypothesis is rejected with the observed value of the statistic test equal to 1.36
C. The null hypothesis is not rejected with the observed value of the statistic test equal to -1.36
D. The null hypothesis is rejected with the observed value of the statistic test equal to -2.71
Answer: C

Data
➢ We now compare 2 independent samples
➢ The dependent variable is dichotomous
➢ We have to use a 2 proportion z-test
H0: π1 = π2
Ha: π1 ≠ π2
p1 = x1 / n1 = 148 / 260 = 0.57
p2 = x2 / n2 = 173 / 275 = 0.63
π = (x1 + x2) / (n1 + n2) = (148 + 173) / (260 + 275) = 0.6
131
Solution
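➢ Calculate the Z
Z = [(p1 − p2) − (π1 − π2)] / √[π × (1 − π) × (1/n1 + 1/n2)]
Z = (0.57 − 0.63) / √[0.6 × (1 − 0.6) × (1/260 + 1/275)]
Z = −0.06 / (0.49 × 0.09) = −1.36
(0.49 and 0.09 are the rounded values of √(0.6 × 0.4) and √(1/260 + 1/275). Without
rounding the intermediate steps Z ≈ −1.41, which leads to the same conclusion.)
➢ Look at the Z-table for the p-value
P-value(z=-1.36)= 0.0869
➢ Double the p-value since it is a two-tailed test
2x0.0869 = 0.1738 > 0.05
The null hypothesis cannot be rejected.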
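
The z-test can also be run without rounding the intermediate steps. The sketch below uses only the standard library (the normal CDF is obtained from `math.erf`); the function name `two_proportion_z` is hypothetical. It gives Z ≈ −1.41 rather than the slide's −1.36, because the slide rounds the proportions and the two square roots before dividing.

```python
from math import erf, sqrt

def two_proportion_z(x1, n1, x2, n2):
    """Pooled two-proportion z-test; returns (z, two-sided p-value)."""
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    se = sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    # Standard normal CDF via the error function
    cdf = 0.5 * (1 + erf(abs(z) / sqrt(2)))
    return z, 2 * (1 - cdf)

z, p_value = two_proportion_z(148, 260, 173, 275)
print(round(z, 2), round(p_value, 3), p_value < 0.05)
```

Either way the two-sided p-value stays above 0.05, so the decision not to reject H0 is unchanged.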

When to Use
Comparing the proportion of two groups
(categorical data).
H0: p1 = p2
Ha: p1 ≠ p2 (two-sided)
Ha: p1 < p2 or Ha: p1 > p2 (one-sided)
Assumptions:
• Categorical variables
→ dichotomous
• Independent groups
• Normality
- always violated
- Central Limit Theorem
Formulas and Application
Z-score = [(p1 − p2) − 0] / SE
Estimate: p1 − p2
SE (for z-test): √[π × (1 − π)/n1 + π × (1 − π)/n2], with π the pooled proportion
Confidence Interval
p1 − p2 ± Z* × √[p1(1 − p1)/n1 + p2(1 − p2)/n2]
132

Answers
Question
Refer back to the previous question. What is the 95% confidence interval?
133
A. [0.063, 0.015]
B. [-0.143, 0.023]
C. [-0.053, 0.090]
D. [1.678, 3.683]
Answer: B

Question
Refer back to the previous question. What is the 95% confidence interval?
134
Solution
p1 − p2 ± Zc × √[p1(1 − p1)/n1 + p2(1 − p2)/n2]
(0.57 − 0.63) ± 1.96 × √[(0.57 × 0.43)/260 + (0.63 × 0.37)/275]
−0.06 ± 1.96 × 0.0423
[−0.143, 0.023]
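
The interval can be reproduced with the unpooled standard error. The sketch below uses the rounded proportions from the solution (0.57 and 0.63); the function name `diff_proportion_ci` is hypothetical.

```python
from math import sqrt

def diff_proportion_ci(p1, n1, p2, n2, z_crit=1.96):
    """CI for p1 - p2 using the unpooled standard error."""
    se = sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    diff = p1 - p2
    return diff - z_crit * se, diff + z_crit * se

# Rounded sample proportions from the solution (0.57 and 0.63)
lower, upper = diff_proportion_ci(0.57, 260, 0.63, 275)
contains_zero = lower <= 0 <= upper
print(round(lower, 3), round(upper, 3), contains_zero)
```

The interval is centred on −0.06 and contains 0, which agrees with the z-test not rejecting H0.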

Answers
Question
Nik wants to see if there is an association between the presence of neuroscientific evidence (1=no, 2=yes)
and juror verdicts (not guilty=1, not guilty due to insanity=2, guilty=3).
What can be concluded based on the table?
135
4. Proportion and Entire Distribution
                                     Neuroscientific Evidence
                                     No    Yes   Total
Verdict  Not Guilty                  32    29    61
         Not Guilty due to insanity  55    61    116
         Guilty                      10    13    23
         Total                       97    103   200

Answers
Question
Nik wants to see if there is an association between the presence of neuroscientific evidence (1=no, 2=yes)
and juror verdicts (not guilty=1, not guilty due to insanity=2, guilty=3).
What can be concluded based on the table?
136
A. The null hypothesis is not rejected with the observed value of the statistic test equal to 0.67
B. The null hypothesis is rejected with the observed value of the statistic test equal to 1.30
C. The null hypothesis is not rejected with the observed value of the statistic test equal to 0.20
D. The null hypothesis is rejected with the observed value of the statistic test equal to 0.65
Answer: A

4E. Proportion and Entire Distribution
Data
➢ We want to study the relationship of two
categorical variables
➢ We use a contingency table
➢ We use the chi-square test for contingency
tables
Expected Counts:
EC = (Total row × total column) / N
137
Solution
➢ Calculate the chi-square
χ² = Σ (OC − EC)² / EC
χ² = (32 − 29.585)²/29.585 + (55 − 56.26)²/56.26 + (10 − 11.155)²/11.155
   + (29 − 31.415)²/31.415 + (61 − 59.740)²/59.740 + (13 − 11.845)²/11.845
χ² = 0.197 + 0.028 + 0.119 + 0.186 + 0.026 + 0.113
χ² = 0.669 ≈ 0.67
➢ Calculate df
df = (#rows − 1) × (#columns − 1)
= (3 − 1) × (2 − 1) = 2
➢ Check the p-value
The p-value looks to be greater than 0.25, thus
the null hypothesis cannot be rejected.
Observed counts (expected counts in parentheses)
                            No            Yes
Not Guilty                  32 (29.585)   29 (31.415)
Not Guilty due to Insanity  55 (56.26)    61 (59.740)
Guilty                      10 (11.155)   13 (11.845)
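
The expected counts and the chi-square statistic for a contingency table follow mechanically from the row and column totals. The sketch below is a plain Python illustration; the function name `chi_square_contingency` is hypothetical.

```python
def chi_square_contingency(table):
    """Return (expected counts, chi-square, df) for a table given as a list of rows."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    # Expected count per cell: row total * column total / N
    expected = [[r * c / n for c in col_totals] for r in row_totals]
    chi2 = sum((o - e) ** 2 / e
               for obs_row, exp_row in zip(table, expected)
               for o, e in zip(obs_row, exp_row))
    df = (len(table) - 1) * (len(table[0]) - 1)
    return expected, chi2, df

# Rows: Not Guilty, Not Guilty due to insanity, Guilty; columns: No, Yes
verdicts = [[32, 29], [55, 61], [10, 13]]
expected, chi2, df = chi_square_contingency(verdicts)
print([[round(e, 3) for e in row] for row in expected], round(chi2, 2), df)
```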

When to Use
Data type: categorical data
Design: between-subjects
→ When testing if 2 or more population
distributions are equal
→ When testing the independence of categorical
variables
Formulas
χ² = Σ (Obs − Exp)² / Exp
Exp(cell) = (Total row × Total column) / Table total
Total stays the same in the table
→ Need to recalculate inner parts
df = (# of rows − 1) × (# of columns − 1)
Hypotheses
1) H0: all groups have the same distribution
Ha: At least one group has a different distribution
2) H0: there is no association between X and Y
Ha: there is an association between X and Y
Assumptions
1. Categorical data
2. Independent groups
3. Expected Counts >5
138

Answers
Question
Rate the correctness of the following 2 statements:
1. If the population is not sufficiently large, but we know that the samples have been drawn from a
normally distributed population, we can still continue with a 2-proportion z-test
2. For a x2 test for contingency tables, the alternative hypothesis can never be one sided
139
A. Both of them are correct
B. 1 is correct, 2 is incorrect
C. 1 is incorrect, 2 is correct
D. Both of them are incorrect
Answer: C

Question
Rate the correctness of the following 2 statements:
1. If the population is not sufficiently large, but we know that the samples have been drawn from a
normally distributed population, we can still continue with a 2-proportion z-test
2. For a x2 test for contingency tables, the alternative hypothesis can never be one sided
140
Solution
1. Incorrect. A dichotomous variable can never be normally distributed. As a result, there will always
be a violation of the normality assumption. In order to continue with the test, the Central Limit
Theorem needs to apply for robustness (n>25)
2. Correct. For a x2 test for contingency tables, effects get squared, thus any direction is lost in the
process. In other words, the alternative hypothesis will always be two-sided.

Answers
Question
The municipality of Maastricht wants residents to stop placing their garbage bags outside on non-pick
up days. Tim is tasked to find a way to reduce this phenomenon. He decides to test two methods, a
social media campaign informing them about how the city gets affected by this behaviour and letters
informing residents of higher monetary penalties. He takes a sample of 2000 residents, in which half of
the residents receive the social media campaign and the other half the letters. Tim wants to test which
method has a stronger effect in decreasing the number of garbage bags outside on non-pick up dates.
What test should Tim use?
141
                                        Garbage bags on non-pick up dates
                                        Before  After
Intervention  Social Media Campaign     …       …       …
              Higher Monetary Penalty   …       …       …
                                        …       …       2000

Answers
Question
the residents receive the social media campaign and the other half the letters. Tim wants to test which
method has a stronger effect in decreasing the number of garbage bags outside on non-pick up dates.
142
A. 1 Proportion Z-test
B. x2 Goodness of Fit test
C. 2 Proportion Z-test
D. Independent Samples T-test
Answer: C

Question
the residents receive the social media campaign and the other half the letters. Tim wants to test if the
tendency to throw out garbage bags on non-pick up days after an intervention is distributed differently
in these 2 groups.
143
Solution
A. 1 proportion Z-test requires 1 population/group, a dichotomous variable and we use it to examine if
the proportion of that 1 population is equal to a specific value. However in this example we have
two populations (Social media campaign, monetary fines)
B. X2 goodness of fit test requires 1 variable and it is used when we want to examine if the sample
proportion distribution fits an expected model. However this example has 2 variables (Intervention,
Garbage bags on a non-pick up date)
C. 2 proportion Z-test requires 2 populations/groups, a categorical and dichotomous variable. We use
it to determine if 2 proportions are different from one another. In this example, all those
requirements are present
D. One of the main assumptions for an independent samples t-test, is that the dependent variable is
quantitative. In this example, the variable is categorical, thus this alternative is automatically false.

Answers
Question
Julian is an aspiring writer who tries to balance his studies with his writing. However, he has been
having some difficulties lately with his schedule. He wonders if he should continue writing all year long
or whether he should write specifically during a period in which he is more productive. He analyzed the
number of pages he has written during the last year. Does the data give sufficient evidence to reject the
Null-Hypothesis that there is no difference in the number of pages Julian writes during different months?
January:335, February: 340, March: 348
April: 378, May: 289, June:300,
July: 320, August: 313, September: 227,
October: 291 November: 305, December: 316
144
A. X2 = 36.42, H0 rejected
B. X2 = 49.13, H0 rejected
C. X2 = 17.01, H0 not rejected
D. The test cannot be done
Answer: B

Question
Julian is an aspiring writer who tries to balance his
studies with his writing. However, he has been
having some difficulties lately with his schedule. He
wonders if he should continue writing all year long
or whether he should write specifically during a
period in which he is more productive. He analyzed
the number of pages he has written during the last
year. Does the data give sufficient evidence to reject
the Null-Hypothesis that there is no difference in the
number of pages Julian writes during different months?
145
Data
January:335 February: 340
March: 348 April: 378
May: 289 June:300
July: 320 August: 313,
September: 227 October: 291
November: 305 December: 316
Solution
➢ The expected counts should be calculated
first:
Ec = N × πi = 3762 × 1/12 = 313.5
*3762 is the sum of all months and we use 1/12
since there are 12 months and we expect them to
have an equal number of pages
➢ Calculate χ²
χ² = (335 − 313.5)²/313.5 + (340 − 313.5)²/313.5 + (348 − 313.5)²/313.5 + ⋯ + (316 − 313.5)²/313.5
χ² = 49.13
➢ Df = 12-1 = 11
➢ P-value <<< 0.0005
➢ H0 rejected
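
With twelve terms the sum is tedious by hand, so a short script helps to check it. The sketch below assumes equal expected counts for every month, as in the solution, and also reports which month contributes most to the statistic.

```python
pages = {"January": 335, "February": 340, "March": 348, "April": 378,
         "May": 289, "June": 300, "July": 320, "August": 313,
         "September": 227, "October": 291, "November": 305, "December": 316}

n = sum(pages.values())
expected = n / len(pages)  # H0: every month has the same expected count

# Contribution of each month to the chi-square statistic
contributions = {m: (o - expected) ** 2 / expected for m, o in pages.items()}
chi2 = sum(contributions.values())
df = len(pages) - 1

largest = max(contributions, key=contributions.get)
print(n, expected, round(chi2, 1), df, largest)
```

The statistic comes out at about 49.1, in line with the 49.13 on the slide up to rounding, and September accounts for the largest share of it.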

Answers
Question
Success Formula decides to add courses for the faculty of science (FSE). For the courses, 3 programs will
be available (weekly meetings, crash courses and privates). Past courses followed a specific distribution
(WM-45%, CC-40%, P-15%), but Florian believes that this time it will be different. He takes a sample of
the first 85 students that signed up for FSE programs and writes down the proportions.
What can be concluded about the sample distribution?
146
A. The programs follow a different distribution with a test statistic equal to 8.947
B. The programs follow the same distribution with a test statistic equal to 8.947
C. The programs follow a different distribution with a test statistic equal to 4.52
D. The programs follow the same distribution with a test statistic equal to 4.52
Answer: A
Program           Students
Weekly Meeting    29
Crash Course      34
Private           22

Data
➢ We wish to check how well a proportion distribution within a sample fits a ‘known’ model
➢ We only have 1 variable (program distribution)
➢ We need to use the $\chi^2$ Goodness of Fit Test
Expected counts:
$$E_c = N \times \pi_i$$
• Weekly meetings: $E_c = 85 \times 0.45 = 38.25$
• Crash Course: $E_c = 85 \times 0.40 = 34$
• Privates: $E_c = 85 \times 0.15 = 12.75$
Solution
➢ Calculate the chi-square (O is the observed count in each category)
$$\chi^2 = \sum \frac{(O - E_c)^2}{E_c}$$
$$\chi^2 = \frac{(29 - 38.25)^2}{38.25} + \frac{(34 - 34)^2}{34} + \frac{(22 - 12.75)^2}{12.75}$$
$$\chi^2 = 2.237 + 0 + 6.71$$
$$\chi^2 = 8.947$$
➢ Calculate the df
df = number of categories - 1
df = 3 - 1 = 2
➢ Use the $\chi^2$ table
0.01 ≤ p-value ≤ 0.02
Since the p-value is smaller than 0.05, we can
reject the null hypothesis
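
As a cross-check of the table lookup, the sketch below recomputes the statistic and the p-value. It relies on one assumption that only holds for this case: with df = 2 the upper tail of the chi-square distribution has the closed form exp(-x/2), so no statistics library is needed. For other df values this shortcut does not apply. The variable names are illustrative.

```python
import math

# Observed sign-ups and the 'known' model proportions from the question.
observed = {"Weekly Meeting": 29, "Crash Course": 34, "Private": 22}
model = {"Weekly Meeting": 0.45, "Crash Course": 0.40, "Private": 0.15}

n = sum(observed.values())

# Sum (observed - expected)^2 / expected over the three programs.
chi2 = 0.0
for program, count in observed.items():
    expected = n * model[program]
    chi2 += (count - expected) ** 2 / expected

df = len(observed) - 1

# Valid for df = 2 only: P(X > x) = exp(-x / 2).
p_value = math.exp(-chi2 / 2)

print(f"chi2 = {chi2:.3f}, df = {df}, p-value = {p_value:.4f}")
print("Reject H0" if p_value < 0.05 else "Do not reject H0")
```

The p-value falls between 0.01 and 0.02, which matches the range read from the table.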

Answers
Question
Aidan is a statistician who wishes to study the voter turnout rate amongst young people in his
hometown. He first decides to test a two-tailed alternative hypothesis about the actual value of the
voter turnout proportion. Previous studies have shown that the proportion is equal to 0.66.
Which of the following 95% Confidence Intervals is the most plausible, given that the null hypothesis
was rejected?
A. [0.66, 0.68]
B. [0.58, 0.71]
C. [1.5, 3.4]
D. [0.70, 0.75]
Answer: D

Solution
A. Incorrect. Since the null hypothesis is rejected, the proportion stated in the null hypothesis
is not a plausible value for the population proportion. This CI includes 0.66
B. Incorrect. Same reasoning as above
C. Incorrect. Proportions cannot take a value higher than 1, thus this CI is implausible
D. Correct. This CI is the only valid one that does not include the proportion from the rejected null
hypothesis
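
The reasoning behind the four options can be written as a small check. This is a sketch with a made-up helper name (`plausible_after_rejection`): an interval is kept only if it is a valid range for a proportion (between 0 and 1) and does not contain the null value of 0.66.

```python
null_value = 0.66

# The four candidate 95% confidence intervals from the question.
intervals = {
    "A": (0.66, 0.68),
    "B": (0.58, 0.71),
    "C": (1.5, 3.4),
    "D": (0.70, 0.75),
}


def plausible_after_rejection(low, high, null_value):
    """True if the CI is a valid proportion range that excludes the null value."""
    valid = 0 <= low <= high <= 1
    excludes_null = not (low <= null_value <= high)
    return valid and excludes_null


plausible = [
    label
    for label, (low, high) in intervals.items()
    if plausible_after_rejection(low, high, null_value)
]
print(plausible)
```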

Answers
Question
Refer to the SPSS output below. Which of the following statements can be correct?
A. The p-value should be divided by 2 because we have a 1-sided test
B. There is an association between the 2 variables
C. The distribution of variable A is equal at each level of variable B
D. None of the above is correct
Answer: B
                                Value    df    Asymp. Sig. (2-tailed)
Pearson Chi-Square              6.437    2     0.35
Likelihood Ratio                3.821    2     0.51
Linear-by-Linear Association    4.455    2     0.35
N of valid cases                4.411

Solution
A. Incorrect. Chi-square tests can never be one-sided because the direction is lost when we
square the deviations. The alternative hypothesis is always two-sided, thus we should not change the
p-value
B. Correct. The p-value is smaller than 0.05, thus the null hypothesis is rejected. When we wish to
study whether there is an association between 2 variables, the null hypothesis states that there is
no association
C. Incorrect. Another way to write the null hypothesis is to say that the distribution of the first
variable is equal at every level of the 2nd variable (no association). Since we rejected the null
hypothesis, the distribution of the first variable is not equal at each level of the 2nd variable.
D. Incorrect. B is correct

Crash Course
Details
• It is a perfect way to repeat the concepts
before the exam
• We cover all topics that are most likely to
come up at the exam
• 4 hour session + 30 min break
• Maximum of 8 students per session
• 50 euro for a 4 hour session
Program Concept
A comprehensive summary of the most important
topics in small groups, so that you can ask all
your questions.
We got you!
