1. Introduction to Data Analytics
Lecture: Inferential Statistics - Two-sample tests
NPTEL MOOC
By
Prof. Nandan Sudarsanam, DoMS, IIT-M and
Prof. B. Ravindran, CS&E, IIT-M
2. What is the difference?
• Examples from last class:
• Single set of data versus two sets of data
• Think of it as dealing with two variables for the first time
One-sample situations:
- Average phosphate level in blood should be ≤ 4.8 mg/dl
- Health department only allows 5% of the toothpastes of each brand to be out of specification (ratio of fluoride, abrasives, etc.)
- New garage is inflating repair costs for accidents. Insurance fraud is suspected.
Two-sample situations:
- Changing the temperature in a foundry process to see if the mean number of defects decreases
- Two different manufacturing processes to compare variance of finished product in each batch
- Are 10th standard girls taller than 10th standard boys in India?
3. Steps
• Using the rubric for this example:
• Have a null and alternate hypothesis; H0: μ1 = μ2 and Halt: μ1 ≠ μ2
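• Do some basic calculations/arithmetic on the data to create a single number called the "test statistic":
$$z = \frac{(\bar{x}_1 - \bar{x}_2) - \delta_0}{\sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}}$$
Here $\bar{x}_1, \bar{x}_2$ are the sample means, $\sigma_1, \sigma_2$ the standard deviations, $n_1, n_2$ the sample sizes, and $\delta_0$ the difference in means hypothesized under H0 (zero when H0 is μ1 = μ2).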
• If we assume the null hypothesis to be true (and make some assumptions about the distributions of various variables), then the "test statistic" should be no different from a single random draw from a specific probability distribution. This is the Z-distribution, or N(0, 1²)
• Compute the probability that a draw from this theoretical distribution is at least as extreme as the "test statistic" you calculated. This is the p-value! Use Z-tables, Excel, Matlab or R
• A low enough p-value is grounds for rejecting the null hypothesis
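The steps above can be sketched in a few lines of Python. This is a minimal illustration, not the lecture's own code: the function name `two_sample_z_test`, the `delta0` parameter and the summary numbers passed to it are all assumptions made up for the example. It also assumes the standard deviations are known and that the alternative is two-sided (Halt: μ1 ≠ μ2).

```python
from math import sqrt
from statistics import NormalDist


def two_sample_z_test(mean1, mean2, sigma1, sigma2, n1, n2, delta0=0.0):
    """Return (z, two-sided p-value) for H0: mu1 - mu2 = delta0.

    Assumes sigma1 and sigma2 are known standard deviations.
    """
    # Standard error of the difference between the two sample means.
    standard_error = sqrt(sigma1 ** 2 / n1 + sigma2 ** 2 / n2)
    # Test statistic: observed difference minus hypothesized difference.
    z = ((mean1 - mean2) - delta0) / standard_error
    # Probability of a draw from N(0, 1) at least as extreme as z.
    p_value = 2.0 * (1.0 - NormalDist().cdf(abs(z)))
    return z, p_value


# Hypothetical summary statistics, invented purely for illustration.
z, p = two_sample_z_test(mean1=23.5, mean2=22.1,
                         sigma1=3.0, sigma2=2.5, n1=40, n2=50)
print(f"z = {z:.3f}, p-value = {p:.4f}")
```

A p-value below the chosen significance level (0.05 is a common choice) would be grounds for rejecting H0. For a one-sided alternative the factor of 2 and the `abs` would be dropped.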
4. More explanation
For all unpaired tests:
A       B
23.3    21.1
27.4    22.1
19.8    23.2
.       .
.       .
.       .
.       .
μ1      μ2
σ1/n1   σ2/n2
For paired t-test:
A       B       Diff
23.3    21.1    2.2
27.4    22.1    5.3
19.8    23.2    -3.4
.       .       .
.       .       .
.       .       .
.       .       .
                d̄
                σ_d
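The contrast between the two layouts can be sketched as follows. This is a rough illustration that uses only the three rows visible in the tables; the elided rows are not reproduced, so the numbers it prints do not describe the full data set. The helper names `paired_t_statistic` and `unpaired_summary` are hypothetical. The paired statistic is the usual mean of the differences divided by its standard error, with n - 1 degrees of freedom.

```python
from math import sqrt
from statistics import mean, stdev

# Only the three rows shown in the tables; the remaining rows are elided.
a = [23.3, 27.4, 19.8]
b = [21.1, 22.1, 23.2]


def paired_t_statistic(x, y):
    """t statistic and degrees of freedom for H0: mean difference = 0."""
    # A paired test collapses the two columns into one column of differences.
    diffs = [xi - yi for xi, yi in zip(x, y)]
    n = len(diffs)
    t = mean(diffs) / (stdev(diffs) / sqrt(n))
    return t, n - 1


def unpaired_summary(x, y):
    """Per-column (mean, sample standard deviation, size) for unpaired tests."""
    return [(mean(col), stdev(col), len(col)) for col in (x, y)]


t, dof = paired_t_statistic(a, b)
print(f"paired: t = {t:.3f} with {dof} degrees of freedom")
for name, (m, s, n) in zip("AB", unpaired_summary(a, b)):
    print(f"unpaired summary {name}: mean = {m:.2f}, sd = {s:.2f}, n = {n}")
```

The paired version needs the rows of A and B to correspond to the same unit (for example, the same item measured twice); the unpaired summaries make no such assumption and treat each column on its own.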