9.pdf

Introduction to Data Analytics
Lecture: Inferential Statistics – Two sample tests
NPTEL MOOC
By
Prof. Nandan Sudarsanam, DoMS, IIT-M and
Prof. B. Ravindran, CS&E, IIT-M

What is the difference?
• Examples from last class:
• Single set of data versus two sets of data
• Think of it as dealing with two variables for the first time
One-sample situations:
- Average phosphate levels in blood should be ≤ 4.8 mg/dl
- Health department only allows 5% of the toothpastes of each brand to be out of specification (ratio of fluoride, abrasives, etc.)
- New garage is inflating repair costs for accidents. Insurance fraud is suspected.
Two-sample situations:
- Changing the temperature in a foundry process to see if the mean number of defects decreases
- Two different manufacturing processes to compare variance of finished product in each batch
- Are 10th standard girls taller than 10th standard boys in India?

Steps
• Using the rubric for this example:
• Have a null and alternate hypothesis; $H_0: \mu_1 = \mu_2$ and $H_{alt}: \mu_1 \neq \mu_2$
• Do some basic calculations/arithmetic on the data to create a single number called
the “test statistic”;
• $z = \dfrac{(\bar{x}_1 - \bar{x}_2) - d_0}{\sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}}$
• If we assume the null hypothesis to be true (and make some assumptions about the
distributions of various variables), then the ‘test statistic’ should be no different than
a single random draw from a specific probability distribution. This is the Z-distribution, i.e. the standard normal $N(0, 1^2)$
• Compute the probability, assuming the null hypothesis is true, of drawing a value at least as extreme as the “test statistic” you calculated from this theoretical distribution. This is the p-value! Use Z-tables, Excel, Matlab or R
• Low enough p-value is grounds for rejecting the null hypothesis
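The steps above can be sketched in a few lines of Python. This is a minimal illustration, not the lecture's own code: it assumes a two-sided alternative, population standard deviations that are known in advance, and the function name `two_sample_z_test` and the sample values are hypothetical.

```python
import math
from statistics import NormalDist, mean


def two_sample_z_test(sample1, sample2, sigma1, sigma2, d0=0.0):
    """Two-sided two-sample z-test with known population standard deviations."""
    n1, n2 = len(sample1), len(sample2)
    # Standard error of the difference between the two sample means.
    se = math.sqrt(sigma1**2 / n1 + sigma2**2 / n2)
    # Test statistic: observed difference minus hypothesised difference d0.
    z = ((mean(sample1) - mean(sample2)) - d0) / se
    # Two-sided p-value from the standard normal distribution N(0, 1).
    p_value = 2.0 * (1.0 - NormalDist().cdf(abs(z)))
    return z, p_value


# Hypothetical data, made up purely for illustration.
group_a = [5.1, 4.9, 5.6, 5.8, 6.0, 5.3]
group_b = [4.8, 4.7, 5.0, 5.2, 4.9, 5.1]

z, p = two_sample_z_test(group_a, group_b, sigma1=0.5, sigma2=0.5)
print(f"z = {z:.3f}, p-value = {p:.4f}")
```

The p-value is then compared with a significance level chosen beforehand (0.05 is a common choice) to decide whether to reject the null hypothesis.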

More explanation
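For all unpaired tests:
A       B
23.3    21.1
27.4    22.1
19.8    23.2
...     ...
Mean:                $\bar{X}_1$    $\bar{X}_2$
Standard deviation:  $S_1$ or $\sigma_1$    $S_2$ or $\sigma_2$

For paired t-test:
A       B       Diff
23.3    21.1    2.2
27.4    22.1    5.3
19.8    23.2    -3.4
...     ...     ...
Mean of Diff:                $\bar{d}$
Standard deviation of Diff:  $S_d$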
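The difference between the two layouts can be sketched in Python as follows. The sketch uses only the three rows visible on the slide (the elided rows are not filled in), so the numbers it prints are illustrative only; the variable names are assumptions.

```python
from statistics import mean, stdev

# Only the three rows visible on the slide; the elided rows are left out.
a = [23.3, 27.4, 19.8]
b = [21.1, 22.1, 23.2]

# Unpaired tests: summarise each column separately.
x1_bar, x2_bar = mean(a), mean(b)
s1, s2 = stdev(a), stdev(b)  # sample standard deviations (n - 1 denominator)

# Paired t-test: reduce each row to a difference, then summarise that one column.
diff = [x - y for x, y in zip(a, b)]
d_bar, s_d = mean(diff), stdev(diff)

print(f"unpaired: x1_bar={x1_bar:.2f}, x2_bar={x2_bar:.2f}, s1={s1:.2f}, s2={s2:.2f}")
print(f"paired:   d_bar={d_bar:.2f}, s_d={s_d:.2f}")
```

Note that the mean of the differences equals the difference of the means, but the standard deviation of the differences is generally not recoverable from the two column standard deviations alone, which is why the paired test needs the row-by-row pairing.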

Examples and Formulas
Two-sample test      What are you testing     Example
z-test               mean                     Calcium and placebo
t-test               mean                     Call centre
Paired t-test        mean                     Before-after, Left-right
Proportion z-test    proportion/likelihood    Defective products
F-test               standard deviation       Manufacturing process
z-test:
$z = \dfrac{(\bar{x}_1 - \bar{x}_2) - d_0}{\sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}}$
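t-test, unequal variance:
$t = \dfrac{(\bar{x}_1 - \bar{x}_2) - d_0}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}}$
$df = \dfrac{\left(\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}\right)^2}{\dfrac{\left(s_1^2 / n_1\right)^2}{n_1 - 1} + \dfrac{\left(s_2^2 / n_2\right)^2}{n_2 - 1}}$
t-test, equal variance:
$t = \dfrac{(\bar{x}_1 - \bar{x}_2) - d_0}{s_p \sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_2}}}$
$s_p = \sqrt{\dfrac{(n_1 - 1) s_1^2 + (n_2 - 1) s_2^2}{n_1 + n_2 - 2}}$
$df = n_1 + n_2 - 2$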
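Both forms of the two-sample t-test can be sketched from summary statistics alone. The function names `welch_t` and `pooled_t` and the numbers passed to them are hypothetical. The sketch stops at the test statistic and degrees of freedom, since the Python standard library has no t-distribution; the p-value would come from t-tables or a statistics package.

```python
import math


def welch_t(x1_bar, s1, n1, x2_bar, s2, n2, d0=0.0):
    """Unequal-variance t statistic and its approximate degrees of freedom."""
    v1, v2 = s1**2 / n1, s2**2 / n2  # per-sample variance of the mean
    t = ((x1_bar - x2_bar) - d0) / math.sqrt(v1 + v2)
    df = (v1 + v2) ** 2 / (v1**2 / (n1 - 1) + v2**2 / (n2 - 1))
    return t, df


def pooled_t(x1_bar, s1, n1, x2_bar, s2, n2, d0=0.0):
    """Equal-variance t statistic using the pooled standard deviation s_p."""
    sp = math.sqrt(((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / (n1 + n2 - 2))
    t = ((x1_bar - x2_bar) - d0) / (sp * math.sqrt(1 / n1 + 1 / n2))
    return t, n1 + n2 - 2


# Hypothetical summary statistics, made up purely for illustration.
t_w, df_w = welch_t(10.0, 2.0, 12, 8.5, 3.0, 15)
t_p, df_p = pooled_t(10.0, 2.0, 12, 8.5, 3.0, 15)
print(f"unequal variance: t = {t_w:.3f}, df = {df_w:.2f}")
print(f"equal variance:   t = {t_p:.3f}, df = {df_p}")
```

When the two sample sizes and the two sample standard deviations are equal, the two versions give the same t and the same degrees of freedom; they diverge as the variances or sample sizes become unbalanced.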
Paired t-test:
$t = \dfrac{\bar{d} - d_0}{s_d / \sqrt{n}}$; $df = n - 1$
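A minimal sketch of the paired t statistic, assuming a before-after design with one pair per subject. The function name `paired_t` and the measurements are hypothetical; the p-value lookup (t-tables or a statistics package) is left out.

```python
import math
from statistics import mean, stdev


def paired_t(before, after, d0=0.0):
    """Paired t statistic and degrees of freedom from two matched samples."""
    if len(before) != len(after):
        raise ValueError("paired samples must have the same length")
    # One difference per pair; the test is a one-sample t-test on these.
    diffs = [b - a for b, a in zip(before, after)]
    n = len(diffs)
    t = (mean(diffs) - d0) / (stdev(diffs) / math.sqrt(n))
    return t, n - 1


# Hypothetical before/after measurements, made up purely for illustration.
before = [72.0, 75.0, 70.0, 68.0, 74.0]
after = [70.0, 72.0, 71.0, 65.0, 71.0]

t, df = paired_t(before, after)
print(f"t = {t:.3f}, df = {df}")
```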
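Proportion z-test:
$z = \dfrac{\hat{p}_1 - \hat{p}_2}{\sqrt{\hat{p}(1 - \hat{p})\left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}}$, where $\hat{p} = \dfrac{x_1 + x_2}{n_1 + n_2}$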
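The proportion z-test can be sketched as below, assuming a two-sided alternative and that `x1` and `x2` are counts of "successes" (for example, defective items) out of `n1` and `n2` trials. The function name and the counts are hypothetical.

```python
import math
from statistics import NormalDist


def two_proportion_z_test(x1, n1, x2, n2):
    """Two-sided z-test for the difference between two proportions."""
    p1, p2 = x1 / n1, x2 / n2
    # Pooled proportion under the null hypothesis that both are equal.
    p_pool = (x1 + x2) / (n1 + n2)
    se = math.sqrt(p_pool * (1 - p_pool) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    p_value = 2.0 * (1.0 - NormalDist().cdf(abs(z)))
    return z, p_value


# Hypothetical counts: 30 defectives out of 200 versus 18 out of 200.
z, p = two_proportion_z_test(30, 200, 18, 200)
print(f"z = {z:.3f}, p-value = {p:.4f}")
```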
F-test:
$F = \dfrac{s_1^2}{s_2^2}$; $df = (n_1 - 1,\ n_2 - 1)$
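A minimal sketch of the F statistic for comparing the spread of two samples. The function name `f_statistic` and the measurements are hypothetical; the p-value would be read from F-tables or a statistics package, since the Python standard library has no F-distribution.

```python
from statistics import variance


def f_statistic(sample1, sample2):
    """Ratio of sample variances and the (numerator, denominator) df."""
    f = variance(sample1) / variance(sample2)  # both use the n - 1 denominator
    return f, (len(sample1) - 1, len(sample2) - 1)


# Hypothetical measurements from two processes, made up for illustration.
process_a = [10.2, 9.8, 10.5, 9.9, 10.1]
process_b = [10.0, 10.1, 9.9, 10.0, 10.0, 10.1]

f, df = f_statistic(process_a, process_b)
print(f"F = {f:.3f}, df = {df}")
```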
