Topic –
Chapter 9 -
Sampling and Statistical Inference
SUBJECT - Research Methodology in Civil Engineering - CE541
FACULTY GUIDE- Prof. Amit .A. Amin
PREPARED BY:-
Bhavik A. Shah (17TS809)
CIVIL ENGG. DEPARTMENT
BIRLA VISHVAKARMA MAHAVIDYALAYA ENGG. COLLEGE
VALLABH VIDYANAGAR-388120
M.TECH - TRANSPORTATION ENGINEERING
1
Table of Contents
 Introduction
 Parameter and Statistic
 Sampling and Non-Sampling Errors
 Sampling Distribution
 Degree of Freedom
 Standard Error
 Central Limit Theorem
 Finite Population Correction
 Statistical Inference
2
Introduction
 A population is the collection of all the elements of interest.
 A sample is a subset of the population.
 Sampling may be defined as the selection of some part of an aggregate or
totality on the basis of which a judgement or inference about the
aggregate or totality is made. In other words, it is the process of obtaining
information about an entire population by examining only a part of it.
3
Why sample?
 Time of researcher and those being surveyed.
 Cost to group or agency commissioning the survey.
 Confidentiality, anonymity, and other ethical issues.
 Non-interference with population. Large sample could alter the nature of
population, e.g. opinion surveys.
 Do not destroy population, e.g. crash test only a small sample of automobiles.
 Cooperation of respondents – individuals, firms, administrative agencies.
 Partial data is all that is available, e.g. fossils and historical records, climate change.
4
NEED FOR SAMPLING
 Sampling can save time and money. A sample study is usually less expensive than a
census study and produces results faster.
 Sampling may enable more accurate measurements, because a sample study is generally
conducted by trained and experienced investigators.
 Sampling remains the only way when the population contains infinitely many members.
 Sampling remains the only choice when a test involves the destruction of the item
under study.
 Sampling usually makes it possible to estimate the sampling errors and thus assists in
obtaining information concerning some characteristic of the population.
5
Parameter and Statistic
 A statistic is a characteristic of a sample, whereas a parameter is a characteristic of
a population. Measures such as the mean, median or mode, when worked out from a
sample, are called statistics because they describe the characteristics of that sample.
When the same measures describe the characteristics of a population, they are known
as parameters.
 For instance, the population mean (μ) is a parameter, whereas the sample mean (x̄) is a
statistic. Obtaining an estimate of a parameter from a statistic is the
prime objective of sampling analysis.
Parameter
= Statistic ± Its Error
6
                      Statistic                   Parameter
                      (from sample)               (from entire population)
Mean:                 X̄             estimates     μ
Standard deviation:   s             estimates     σ
Proportion:           p             estimates     π
7
Sampling and Non-Sampling Errors
 Sampling error refers to differences between the sample and the population that
exist only because of the observations that happened to be selected for the sample.
Increasing the sample size will reduce this type of error.
8
Types of Error
 Sampling errors
 Non-sampling errors
9
Sampling Errors
 Errors caused by the act of taking a sample
 They cause sample results to differ from the results of a census
 Differences between the sample and the population that exist only
because of the observations that happened to be selected for the sample
 Statistical errors are sampling errors
 We have no direct control over them; they can only be reduced by increasing the sample size
10
Non-Sampling Errors
 Not controlled by sample size
 Non-response error
 Response error
11
Non Response Error
 A non-response error occurs when units selected as part of
the sampling procedure do not respond in whole or in part.
12
Response Errors
 A response or data error is any systematic bias that occurs during data
collection, analysis or interpretation.
Respondent error (e.g., lying, forgetting, etc.)
Interviewer bias
Recording errors
Poorly designed questionnaires
Measurement error
13
Respondent error
 respondent gives an incorrect answer, e.g. due to prestige or competence
implications, or due to sensitivity or social undesirability of question
 respondent misunderstands the requirements
 lack of motivation to give an accurate answer
 “lazy” respondent gives an “average” answer
 question requires memory/recall
 proxy respondents are used, i.e. taking answers from someone other than
the respondent
14
Interviewer bias
 Different interviewers administer a survey in different ways
 Differences occur in reactions of respondents to different interviewers, e.g.
to interviewers of their own sex or own ethnic group
 Inadequate training of interviewers
 Inadequate attention to the selection of interviewers
 There is too high a workload for the interviewer
15
Measurement Error
 The question is unclear, ambiguous or difficult to answer
 The list of possible answers suggested in the recording instrument is
incomplete
 Requested information assumes a framework unfamiliar to the respondent
 The definitions used by the survey are different from those used by the
respondent (e.g. how many part-time employees do you have? See next
slide for an example)
16
Key Points on Errors
 Non-sampling errors are inevitable in production of national statistics.
Important that:-
 At planning stage, all potential non-sampling errors are listed and steps taken
to minimise them are considered.
 If data are collected from other sources, question procedures adopted for data
collection, and data verification at each step of the data chain.
 Critically view the data collected and attempt to resolve queries immediately
they arise.
 Document sources of non-sampling errors so that results presented can be
interpreted meaningfully.
17
Sampling Distributions
 Sampling Distribution of Mean
 Student’s ‘t’ Distribution
 Sampling Distribution of Proportion
 F Distribution
 Chi-square Distribution
18
Sampling distribution of mean
 Mean calculated from a sample is usually the best guess for population mean. But
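different samples give different sample means!
 It can be shown that the means of samples of size n are (at least approximately, for large n) normally distributed,
with mean μ and standard deviation σ/√n:
$\bar{x} \sim N\left(\mu, \frac{\sigma}{\sqrt{n}}\right)$
 The term $\sigma/\sqrt{n}$ is called the standard error (standard deviation of sample means).
[Figure: sampling distribution of the mean, with sample means $\bar{x}_1$, $\bar{x}_2$, $\bar{x}_3$ from different samples scattered around μ]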
19
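The claim that sample means scatter around μ with standard deviation σ/√n can be checked by simulation. The sketch below is only an illustration: the population (10,000 roughly normal values with mean 50 and SD 10), the sample size and the number of draws are all assumptions, not values from the slides.

```python
import random
import statistics


def sample_means(population, n, draws, rng):
    """Return `draws` sample means, each from a sample of size n drawn with replacement."""
    return [statistics.mean(rng.choices(population, k=n)) for _ in range(draws)]


rng = random.Random(42)  # fixed seed so the run is repeatable

# Hypothetical population (an assumption for illustration only).
population = [rng.gauss(50, 10) for _ in range(10_000)]
mu = statistics.mean(population)
sigma = statistics.pstdev(population)

n = 25
means = sample_means(population, n, draws=2_000, rng=rng)

observed_se = statistics.stdev(means)   # spread of the simulated sample means
theoretical_se = sigma / n ** 0.5       # sigma / sqrt(n)

print(f"mean of sample means = {statistics.mean(means):.2f} (population mean {mu:.2f})")
print(f"observed SE = {observed_se:.3f}, sigma/sqrt(n) = {theoretical_se:.3f}")
```

The two standard-error figures should agree closely, and the agreement tightens as the number of draws grows.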
CONT…
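The sample mean comes from the normal distribution above:
$\bar{x} \sim N\left(\mu, \frac{\sigma}{\sqrt{n}}\right)$
Knowing the properties of the normal distribution, we can be 95% sure that the sample mean is in
the range:
$\mu - 1.96\,\frac{\sigma}{\sqrt{n}} \le \bar{x} \le \mu + 1.96\,\frac{\sigma}{\sqrt{n}}$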
20
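CONT…
 If the population standard deviation is unknown and is replaced by the sample standard deviation s, it can be shown that
the standardised sample mean $(\bar{x} - \mu)/(s/\sqrt{n})$ from samples of size n is t-distributed with n − 1 degrees of
freedom
 As an estimate for the standard error we can use $s/\sqrt{n}$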
21
T-distribution
 The t-distribution is quite similar to the normal distribution, but its exact shape
depends on the sample size
 As the sample size increases, the t-distribution approaches the normal
distribution
 Critical values of the t-distribution can be calculated with Excel:
=TINV(probability, degrees of freedom)
 For the error margin of a mean, the degrees of freedom equal n − 1
(n = sample size)
 Ex. Critical value for the 95% confidence level when the sample size is 50:
=TINV(0.05, 49) returns 2.00957 (entered as =TINV(0,05;49) in locales that use a decimal comma)
22
Sampling Distribution of Proportion
 The proportion calculated from a sample is usually the best guess for the
population proportion. But different samples give different sample
proportions!
 It can be shown that proportions from sufficiently large samples of size n are approximately normally
distributed:
$p \sim N\left(\pi, \sqrt{\frac{\pi(1-\pi)}{n}}\right)$
 The standard error (standard deviation of sample proportions) is $\sqrt{\pi(1-\pi)/n}$
 As an estimate for the standard error we use $\sqrt{p(1-p)/n}$
23
Error margin for proportion
 Based on the sampling distribution of the proportion, we can be
95% sure that the population proportion π lies in the range (95% confidence
interval)
$p - 1.96\sqrt{\frac{p(1-p)}{n}} \le \pi \le p + 1.96\sqrt{\frac{p(1-p)}{n}}$
24
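The 95% error margin for a proportion, p ± 1.96·sqrt(p(1 − p)/n), is easy to compute directly. A minimal sketch follows; the function name `proportion_ci_95` and the example figures (a sample proportion of 0.40 from 400 respondents) are hypothetical and chosen only for illustration.

```python
import math


def proportion_ci_95(p, n, z=1.96):
    """Approximate 95% confidence interval for a population proportion.

    p is the sample proportion, n the sample size, z the normal critical value.
    """
    se = math.sqrt(p * (1 - p) / n)   # estimated standard error of the proportion
    margin = z * se                   # error margin
    return p - margin, p + margin


# Hypothetical survey: 40% of 400 respondents answered "yes".
low, high = proportion_ci_95(p=0.40, n=400)
print(f"95% CI for the population proportion: {low:.3f} to {high:.3f}")
```

The normal approximation behind this interval is usually considered reasonable only when n is large and p is not too close to 0 or 1.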
F distribution
25
Chi-Square Distribution
26
Degree of Freedom
 In statistics, the number of degrees of freedom is the number of values in the final
calculation of a statistic that are free to vary.
 The number of independent ways by which a dynamic system can move, without
violating any constraint imposed on it, is called number of degrees of freedom. In
other words, the number of degrees of freedom can be defined as the minimum
number of independent coordinates that can specify the position of the system
completely.
 df = n - 1
27
Standard Error
 The Standard Deviation of sampling distribution of a statistic is known as its
standard error (S.E) and is considered the key to sampling theory.
 The utility of the concept of standard error in statistical induction arises on account
of the following reasons:
 The Standard error helps in testing whether the difference between observed and
expected frequencies could arise due to chance.
 The standard error gives an idea about the reliability and precision of a sample. The
smaller the S.E., the greater the uniformity of sampling distribution and hence, greater is
the reliability of sample.
 The standard error enables us to specify the limits within which the parameters of the
population are expected to lie with a specified degree of confidence. Such an interval is
usually known as confidence interval.
28
29
Central Limit Theorem
 When sampling is from a normal population, the means of samples drawn from
such a population are themselves normally distributed. But when sampling is not
from a normal population, the size of the sample plays a critical role. When n is
small, the shape of the distribution will depend largely on the shape of the parent
population, but as n gets large (n > 30), the shape of the sampling distribution will
become more and more like a normal distribution, irrespective of the shape of the
parent population.
 The theorem which explains this sort of relationship between the shape of the
population distribution and the sampling distribution of the mean is known as the
central limit theorem.
 “The significance of the central limit theorem lies in the fact that it permits us to
use sample statistics to make inferences about population parameters without
knowing anything about the shape of the frequency distribution of that population
other than what we can get from the sample.”
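The effect described by the central limit theorem can be seen in a small simulation. The sketch below assumes a strongly right-skewed parent population (exponential with rate 1, chosen purely for illustration) and uses the skewness of the simulated sample means as a rough measure of how far their distribution is from the symmetric normal shape.

```python
import random
import statistics


def skewness(values):
    """Population skewness: mean of the cubed standardised deviations."""
    m = statistics.mean(values)
    s = statistics.pstdev(values)
    return sum(((v - m) / s) ** 3 for v in values) / len(values)


def simulate_means(n, draws, rng):
    """Means of `draws` samples of size n from an exponential(1) population."""
    return [
        statistics.mean([rng.expovariate(1.0) for _ in range(n)])
        for _ in range(draws)
    ]


rng = random.Random(7)  # fixed seed so the run is repeatable

skew_small = skewness(simulate_means(n=2, draws=4_000, rng=rng))
skew_large = skewness(simulate_means(n=30, draws=4_000, rng=rng))

print(f"skewness of sample means, n = 2:  {skew_small:.2f}")
print(f"skewness of sample means, n = 30: {skew_large:.2f}")
```

With n = 2 the sample means still inherit much of the parent population's skew; with n = 30 the skewness is much closer to zero, the value for a normal distribution.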
30
Finite Population Correction
 The Finite Population Correction Factor (FPC) is used when you sample without
replacement from more than 5% of a finite population.
 It’s needed because under these circumstances, the Central Limit Theorem doesn’t
hold and the standard error of the estimate (e.g. the mean or proportion) will be
too big.
 In basic terms, the FPC captures the difference between sampling with replacement
and sampling without replacement.
 $\mathrm{FPC} = \sqrt{\frac{N-n}{N-1}}$, where N is the population size and n is the sample size
31
CONT…
The following table of values shows how the FPC decreases for a population of 10,000
as the sample size gets larger:
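A minimal Python sketch of how such values can be recomputed from FPC = sqrt((N − n)/(N − 1)). The sample sizes below are chosen for illustration and are not taken from the original table.

```python
import math


def fpc(N, n):
    """Finite population correction factor for a sample of n from a population of N."""
    return math.sqrt((N - n) / (N - 1))


N = 10_000  # population size used on the slide
sample_sizes = (10, 100, 500, 1_000, 5_000)  # illustrative values (assumption)

for n in sample_sizes:
    share = n / N
    print(f"n = {n:>5} ({share:6.1%} of N)   FPC = {fpc(N, n):.4f}")
```

Multiplying an uncorrected standard error by the FPC gives the corrected one. The factor stays close to 1 while the sample is a small share of the population and drops noticeably once that share becomes large.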
32
Statistical Inference
Use a random sample to
learn something about a
larger population
33
Inference
 Two ways to make inferences
 Estimation of parameters
 * Point estimation (X̄ or p)
 * Interval estimation
 Hypothesis testing
34
Estimation of parameters
Population: the mean, μ, is unknown
Sample: mean X̄ = 50
Point estimate: X̄ = 50
Interval estimate: "I am 95% confident that μ is between 40 & 60"
35
Parameter
= Statistic ± Its Error
36
Sampling Distribution
[Figure: sampling distributions of the statistic (X̄ or p) over repeated samples]
37
Standard Error
Quantitative variable: $SE(\text{mean}) = \frac{s}{\sqrt{n}}$
Qualitative variable: $SE(p) = \sqrt{\frac{p(1-p)}{n}}$
38
Confidence Interval
[Figure: sampling distribution of X̄ on the Z-axis; 95% of samples (area 1 - α) fall between X̄ - 1.96 SE and X̄ + 1.96 SE, with area α/2 in each tail]
39
Confidence Interval
[Figure: sampling distribution of p on the Z-axis; 95% of samples (area 1 - α) fall between p - 1.96 SE and p + 1.96 SE, with area α/2 in each tail]
40
Example (Sample size ≥ 30)
 An epidemiologist studied the blood glucose level of a random sample of
100 patients. The mean was 170, with an SD of 10.
 SE = 10/√100 = 10/10 = 1
 Then the 95% CI is μ = X̄ ± Z × SE:
 μ = 170 ± 1.96 × 1, so 168.04 ≤ μ ≤ 171.96
42
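The blood glucose interval can be reproduced in a few lines. This is a sketch only: the helper name `mean_ci` is hypothetical, and the critical value is taken from the standard normal distribution, which is reasonable here because the sample is large (n = 100).

```python
import math
from statistics import NormalDist


def mean_ci(mean, sd, n, confidence=0.95):
    """Large-sample confidence interval for a population mean (normal critical value)."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)  # about 1.96 for 95%
    se = sd / math.sqrt(n)                              # standard error of the mean
    return mean - z * se, mean + z * se


# Values from the example: mean 170, SD 10, n = 100.
low, high = mean_ci(mean=170, sd=10, n=100)
print(f"95% CI: {low:.2f} to {high:.2f}")
```

For small samples the normal critical value would be replaced by the corresponding t critical value with n − 1 degrees of freedom.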
Hypothesis testing
 A statistical method that uses sample data to evaluate a
hypothesis about a population parameter. It is intended to
help researchers differentiate between real and random
patterns in the data.
43
What is a Hypothesis?
 An assumption
about the
population
parameter.
I assume the mean SBP of
participants is 120 mmHg
44
Null & Alternative Hypotheses
 H0, the null hypothesis, states the assumption to be tested, e.g. SBP of
participants = 120 (H0: μ = 120).
 H1, the alternative hypothesis, is the opposite of the null hypothesis, e.g. SBP of
participants ≠ 120 (H1: μ ≠ 120). It may or may not be accepted, and it is
the hypothesis that the researcher believes to be true.
45
Level of Significance, α
 Defines unlikely values of sample statistic if null hypothesis is
true. Called rejection region of sampling distribution
 Typical values are 0.01, 0.05
 Selected by the Researcher at the Start
 Provides the Critical Value(s) of the Test
46
Level of Significance and the Rejection Region
[Figure: sampling distribution centred at 0, with rejection regions of total area α lying beyond the critical value(s)]
47
Result Possibilities
H0: Innocent

Jury Trial
                 Actual Situation
Verdict          Innocent      Guilty
Innocent         Correct       Error
Guilty           Error         Correct

Hypothesis Test
                 Actual Situation
Decision         H0 True                               H0 False
Accept H0        Correct (1 - α)                       Type II Error (β), False Negative
Reject H0        Type I Error (α), False Positive      Correct, Power (1 - β)
48
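The meaning of the Type I error rate α can be illustrated by simulation: when H0 is true, a test run at level α should reject in roughly a fraction α of repeated samples. The sketch below uses a two-sided z-test with H0: μ = 120 and n = 100. The population standard deviation of 15 and the number of trials are assumptions made only for this illustration.

```python
import random
from statistics import NormalDist, mean


def z_test_rejects(sample, mu0, sigma, alpha):
    """Two-sided z-test with known sigma: True if H0 (mean = mu0) is rejected."""
    z = (mean(sample) - mu0) / (sigma / len(sample) ** 0.5)
    critical = NormalDist().inv_cdf(1 - alpha / 2)
    return abs(z) > critical


rng = random.Random(0)  # fixed seed so the run is repeatable
mu0, sigma, n, alpha = 120, 15, 100, 0.05  # sigma = 15 is a hypothetical value
trials = 4_000

# Every sample is drawn with H0 true, so each rejection is a Type I error.
false_positives = sum(
    z_test_rejects([rng.gauss(mu0, sigma) for _ in range(n)], mu0, sigma, alpha)
    for _ in range(trials)
)
type_i_rate = false_positives / trials
print(f"observed Type I error rate: {type_i_rate:.3f} (alpha = {alpha})")
```

Drawing the samples from a population whose mean differs from 120 would instead estimate the power, 1 − β, for that particular alternative.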
Hypothesis Testing: Steps
 Test the assumption that the true mean SBP of participants is 120 mmHg.
 State H0: H0: μ = 120
 State H1: H1: μ ≠ 120
 Choose α: α = 0.05
 Choose n: n = 100
 Choose test: Z, t or χ² test
49
Hypothesis Testing: Steps
 Compute Test Statistic
 Search for Critical Value
 Make Statistical Decision rule
 Express Decision
50
One sample-mean Test
 Assumptions
 Population is normally distributed
 t test statistic
$t = \frac{\text{sample mean} - \text{null value}}{\text{standard error}} = \frac{\bar{x} - \mu_0}{s/\sqrt{n}}$
51
Example Normal Body Temperature
 What is normal body temperature? Is it actually 37.6 °C (on average)?
State the null and alternative hypotheses
 H0: μ = 37.6 °C
 Ha: μ ≠ 37.6 °C
52
Example Normal Body Temp (cont)
$t = \frac{\text{sample mean} - \text{null value}}{\text{standard error}} = \frac{\bar{x} - \mu_0}{s/\sqrt{n}}$
Data: random sample of n = 18 normal body temps
37.2 36.8 38.0 37.6 37.2 36.8 37.4 38.7 37.2
36.4 36.6 37.4 37.0 38.2 37.6 36.1 36.2 37.5
Variable      n    Mean    SD     SE      t      P
Temperature   18   37.22   0.68   0.161   2.38   0.029
Summarize data with a test statistic
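The summary row can be recomputed from the 18 readings with the standard library alone. This is a sketch rather than the SPSS procedure used on the slide. The computed t comes out negative because the sample mean lies below the null value of 37.6, so the slide's 2.38 is its magnitude. The two-sided 5% critical value 2.110 for df = 17 is taken from a standard Student's t table.

```python
import math
import statistics

# The 18 body temperatures (°C) listed on the slide.
temps = [
    37.2, 36.8, 38.0, 37.6, 37.2, 36.8, 37.4, 38.7, 37.2,
    36.4, 36.6, 37.4, 37.0, 38.2, 37.6, 36.1, 36.2, 37.5,
]
mu0 = 37.6  # null value from H0

n = len(temps)
x_bar = statistics.mean(temps)
s = statistics.stdev(temps)        # sample standard deviation (n - 1 divisor)
se = s / math.sqrt(n)              # standard error of the mean
t_stat = (x_bar - mu0) / se        # one-sample t statistic

critical = 2.110                   # two-sided 5% critical value, df = 17 (t table)
reject_h0 = abs(t_stat) > critical

print(f"n = {n}, mean = {x_bar:.2f}, SD = {s:.2f}, SE = {se:.3f}, t = {t_stat:.2f}")
print("reject H0 at the 5% level" if reject_h0 else "do not reject H0 at the 5% level")
```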
53
STUDENT’S t DISTRIBUTION TABLE
Degrees of    Probability (p value)
freedom       0.10      0.05      0.01
1             6.314     12.706    63.657
5             2.015     2.571     4.032
10            1.813     2.228     3.169
17            1.740     2.110     2.898
20            1.725     2.086     2.845
24            1.711     2.064     2.797
25            1.708     2.060     2.787
∞             1.645     1.960     2.576
54
Example Normal Body Temp (cont)
 Find the p-value
 df = n – 1 = 18 – 1 = 17
 From SPSS: p-value = 0.029
 From t Table: p-value is between
0.05 and 0.01.
 Area to left of t = -2.11 equals
area to right of t = +2.11.
 The value t = 2.38 lies between the table
entries 2.110 and 2.898 in the row for df = 17,
whose column headings (p-values) are 0.05 and 0.01.
[Figure: t distribution with the critical values -2.11 and +2.11 marked on the t axis]
55
Example Normal Body Temp (cont)
 Decide whether or not the result is statistically significant based on the p-value
 Using α = 0.05 as the level of significance criterion, the results are
statistically significant because 0.029 is less than 0.05. In other words, we
can reject the null hypothesis.
 Report the Conclusion
 We can conclude, based on these data, that the mean temperature in the
human population does not equal 37.6 °C.
56
Case Study: - STATISTICAL INFERENCE OF A CASE
STUDY IN CHINA: ACTIVE PHOSPHATE REMOVAL
FROM EUTROPHIC WATER
 China is a country that exports a huge amount of duck meat. Recently, more and
more people have been raising ducks in ponds together with fish. Previous research has shown
that the yield of fish in a duck-fish integrated system pond is greater than the yield
in non-integrated system ponds.
 At the same time, the duck-fish system reduced pollution significantly.
However, polluted water still remains because of the phosphorus and
nitrate entering from the ducks (Adel K. Soliman, 2000).
57
Experimental Design and Sample
Collection
 The experiments were performed in Anhui, China. Three ponds, A, B and C, were
selected.
 Pond A is our treatment pond, where we planted the water spinach. It had ducks
and fish. Pond B is a pond with ducks and fish but without water spinach. Pond C is
the control pond with fish only. We built a floating bed of size 5 m × 1.2 m to hold the
water spinach in pond A.
58
Sample Collection
 Samples were obtained at each of these locations in the three ponds:
 A1: concentration from water within water spinach area in pond A;
 A2: concentration from water outside water spinach area in pond A;
 B1: concentration from water under duck sheds in pond B;
 B2: concentration from water away from duck sheds in pond B;
 C1: concentration from water in pond C without duck or water spinach.
59
Data Analysis Result
 When we plot the measurements from the same pond, we get Figure 3. The
observations for both ammonia-nitrogen and active phosphate from A1
decrease continuously. The observations for ammonia-nitrogen from C1 do not
show a decreasing trend.
60
Conclusion
 We performed multiple paired t-tests to compare the mean concentrations of
ammonia-nitrogen at various locations. The p-value between samples from A1 and
A2 is greater than 0.1, so there is no evidence of a real difference in the concentration of
ammonia-nitrogen at the two different locations within pond A.
 There is a real difference in the concentration of active phosphate at the two different
locations within pond B. The significance test indicates that the water near the
ducks is more polluted by active phosphate than the water elsewhere.
The p-value between samples from A1 and B2 is close to 0.0005, so there is
significant evidence that planting water spinach reduces the active phosphate
content in the water.
61
References
 C. R. Kothari, Research Methodology: Methods and Techniques, 2nd revised edition,
New Age International Publishers.
 June Luo, Ling Zu, Statistical Inference of a Case Study in China: Active
Phosphate Removal from Eutrophic Water, Department of Applied
Economics and Statistics, Clemson University.
 https://www.slideshare.net/rambhu21/sampling-and-sampling-errors-19870549/62
62
THANK YOU For Bearing.
Bhavik A. Shah (17TS809)
63

TechTAC® CFD Report Summary: A Comparison of Two Types of Tubing Anchor CatchersTechTAC® CFD Report Summary: A Comparison of Two Types of Tubing Anchor Catchers
TechTAC® CFD Report Summary: A Comparison of Two Types of Tubing Anchor Catchers
 
Introduction to Machine Learning Unit-3 for II MECH
Introduction to Machine Learning Unit-3 for II MECHIntroduction to Machine Learning Unit-3 for II MECH
Introduction to Machine Learning Unit-3 for II MECH
 
Gfe Mayur Vihar Call Girls Service WhatsApp -> 9999965857 Available 24x7 ^ De...
Gfe Mayur Vihar Call Girls Service WhatsApp -> 9999965857 Available 24x7 ^ De...Gfe Mayur Vihar Call Girls Service WhatsApp -> 9999965857 Available 24x7 ^ De...
Gfe Mayur Vihar Call Girls Service WhatsApp -> 9999965857 Available 24x7 ^ De...
 
Gurgaon ✡️9711147426✨Call In girls Gurgaon Sector 51 escort service
Gurgaon ✡️9711147426✨Call In girls Gurgaon Sector 51 escort serviceGurgaon ✡️9711147426✨Call In girls Gurgaon Sector 51 escort service
Gurgaon ✡️9711147426✨Call In girls Gurgaon Sector 51 escort service
 
9953056974 Call Girls In South Ex, Escorts (Delhi) NCR.pdf
9953056974 Call Girls In South Ex, Escorts (Delhi) NCR.pdf9953056974 Call Girls In South Ex, Escorts (Delhi) NCR.pdf
9953056974 Call Girls In South Ex, Escorts (Delhi) NCR.pdf
 
An introduction to Semiconductor and its types.pptx
An introduction to Semiconductor and its types.pptxAn introduction to Semiconductor and its types.pptx
An introduction to Semiconductor and its types.pptx
 
IVE Industry Focused Event - Defence Sector 2024
IVE Industry Focused Event - Defence Sector 2024IVE Industry Focused Event - Defence Sector 2024
IVE Industry Focused Event - Defence Sector 2024
 
Application of Residue Theorem to evaluate real integrations.pptx
Application of Residue Theorem to evaluate real integrations.pptxApplication of Residue Theorem to evaluate real integrations.pptx
Application of Residue Theorem to evaluate real integrations.pptx
 
Decoding Kotlin - Your guide to solving the mysterious in Kotlin.pptx
Decoding Kotlin - Your guide to solving the mysterious in Kotlin.pptxDecoding Kotlin - Your guide to solving the mysterious in Kotlin.pptx
Decoding Kotlin - Your guide to solving the mysterious in Kotlin.pptx
 
Heart Disease Prediction using machine learning.pptx
Heart Disease Prediction using machine learning.pptxHeart Disease Prediction using machine learning.pptx
Heart Disease Prediction using machine learning.pptx
 
Comparative Analysis of Text Summarization Techniques
Comparative Analysis of Text Summarization TechniquesComparative Analysis of Text Summarization Techniques
Comparative Analysis of Text Summarization Techniques
 
Work Experience-Dalton Park.pptxfvvvvvvv
Work Experience-Dalton Park.pptxfvvvvvvvWork Experience-Dalton Park.pptxfvvvvvvv
Work Experience-Dalton Park.pptxfvvvvvvv
 
INFLUENCE OF NANOSILICA ON THE PROPERTIES OF CONCRETE
INFLUENCE OF NANOSILICA ON THE PROPERTIES OF CONCRETEINFLUENCE OF NANOSILICA ON THE PROPERTIES OF CONCRETE
INFLUENCE OF NANOSILICA ON THE PROPERTIES OF CONCRETE
 
Architect Hassan Khalil Portfolio for 2024
Architect Hassan Khalil Portfolio for 2024Architect Hassan Khalil Portfolio for 2024
Architect Hassan Khalil Portfolio for 2024
 
Artificial-Intelligence-in-Electronics (K).pptx
Artificial-Intelligence-in-Electronics (K).pptxArtificial-Intelligence-in-Electronics (K).pptx
Artificial-Intelligence-in-Electronics (K).pptx
 
Biology for Computer Engineers Course Handout.pptx
Biology for Computer Engineers Course Handout.pptxBiology for Computer Engineers Course Handout.pptx
Biology for Computer Engineers Course Handout.pptx
 
Call Girls Narol 7397865700 Independent Call Girls
Call Girls Narol 7397865700 Independent Call GirlsCall Girls Narol 7397865700 Independent Call Girls
Call Girls Narol 7397865700 Independent Call Girls
 
POWER SYSTEMS-1 Complete notes examples
POWER SYSTEMS-1 Complete notes  examplesPOWER SYSTEMS-1 Complete notes  examples
POWER SYSTEMS-1 Complete notes examples
 
Call Girls Delhi {Jodhpur} 9711199012 high profile service
Call Girls Delhi {Jodhpur} 9711199012 high profile serviceCall Girls Delhi {Jodhpur} 9711199012 high profile service
Call Girls Delhi {Jodhpur} 9711199012 high profile service
 
Churning of Butter, Factors affecting .
Churning of Butter, Factors affecting  .Churning of Butter, Factors affecting  .
Churning of Butter, Factors affecting .
 

Sampling and statistical inference

  • 1. Topic – Chapter 9 - Sampling and Statistical Inference SUBJECT - Research Methodology in Civil Engineering - CE541 FACULTY GUIDE- Prof. Amit .A. Amin PREPARED BY:- Bhavik A. Shah (17TS809) CIVIL ENGG. DEPARTMENT BIRLA VISHVAKARMA MAHAVIDYALAYA ENGG. COLLEGE VALLABH VIDYANAGAR-388120 M.TECH - TRANSPORTATION ENGINEERING 1
  • 2. Table of Contents  Introduction  Parameter and Statistic  Sampling and Non-Sampling Errors  Sampling Distribution  Degree of Freedom  Standard Error  Central Limit Theorem  Finite Population Correction  Statistical Inference 2
  • 3. Introduction  A population is the collection of all the elements of interest.  A sample is a subset of the population.  Sampling may be defined as the selection of some part of an aggregate or totality on the basis of which a judgement or inference about the aggregate or totality is made. In other words, it is the process of obtaining information about an entire population by examining only a part of it. 3
  • 4. Why sample?  Time of researcher and those being surveyed.  Cost to group or agency commissioning the survey.  Confidentiality, anonymity, and other ethical issues.  Non-interference with population. Large sample could alter the nature of population, eg. opinion surveys.  Do not destroy population, eg. crash test only a small sample of automobiles.  Cooperation of respondents – individuals, firms, administrative agencies.  Partial data is all that is available, eg. fossils and historical records, climate change. 4
  • 5. NEED FOR SAMPLING  Sampling can save time and money. A sample study is usually less expensive than a census study and produces results relatively faster.  Sampling may enable more accurate measurements, because a sample study is generally conducted by trained and experienced investigators.  Sampling remains the only way when the population contains infinitely many members.  Sampling remains the only choice when a test involves the destruction of the item under study.  Sampling usually makes it possible to estimate the sampling errors and thus assists in obtaining information concerning some characteristic of the population.
  • 6. Parameter and Statistic  A statistic is a characteristic of a sample, whereas a parameter is a characteristic of a population. Thus, when we work out measures such as the mean, median or mode from samples, they are called statistics, for they describe the characteristics of a sample. But when such measures describe the characteristics of a population, they are known as parameters.  For instance, the population mean ($\mu$) is a parameter, whereas the sample mean ($\bar{x}$) is a statistic. Obtaining an estimate of a parameter from a statistic is the prime objective of sampling analysis. Parameter = Statistic ± Its Error
  • 8. Sampling and Non-Sampling Errors  Sampling error refers to differences between the sample and the population that exist only because of the observations that happened to be selected for the sample. Increasing the sample size will reduce this type of error.
  • 9. Types of Sampling Error  Sample Errors  Non Sample Errors 9
  • 10. Sample Errors  Errors caused by the act of taking a sample.  They cause sample results to differ from the results of a census.  They are differences between the sample and the population that exist only because of the observations that happened to be selected for the sample.  Statistical errors are sampling errors.  We have no control over them.
  • 11. Non Sample Errors  Not controlled by sample size.  Non-response error.  Response error.
  • 12. Non Response Error  A non-response error occurs when units selected as part of the sampling procedure do not respond in whole or in part. 12
  • 13. Response Errors  A response or data error is any systematic bias that occurs during data collection, analysis or interpretation. Sources include: respondent error (e.g., lying, forgetting, etc.); interviewer bias; recording errors; poorly designed questionnaires; measurement error.
  • 14. Respondent error  respondent gives an incorrect answer, e.g. due to prestige or competence implications, or due to sensitivity or social undesirability of question  respondent misunderstands the requirements  lack of motivation to give an accurate answer  “lazy” respondent gives an “average” answer  question requires memory/recall  proxy respondents are used, i.e. taking answers from someone other than the respondent 14
  • 15. Interviewer bias  Different interviewers administer a survey in different ways  Differences occur in reactions of respondents to different interviewers, e.g. to interviewers of their own sex or own ethnic group  Inadequate training of interviewers  Inadequate attention to the selection of interviewers  There is too high a workload for the interviewer 15
  • 16. Measurement Error  The question is unclear, ambiguous or difficult to answer  The list of possible answers suggested in the recording instrument is incomplete  Requested information assumes a framework unfamiliar to the respondent  The definitions used by the survey are different from those used by the respondent (e.g. how many part-time employees do you have? See next slide for an example) 16
  • 17. Key Points on Errors  Non-sampling errors are inevitable in production of national statistics. Important that:-  At planning stage, all potential non-sampling errors are listed and steps taken to minimise them are considered.  If data are collected from other sources, question procedures adopted for data collection, and data verification at each step of the data chain.  Critically view the data collected and attempt to resolve queries immediately they arise.  Document sources of non-sampling errors so that results presented can be interpreted meaningfully. 17
  • 18. Sampling Distributions  Sampling Distribution of Mean  Student’s ‘t’ Distribution  Sampling Distribution of Proportion  F Distribution  Chi-square Distribution 18
  • 19. Sampling distribution of mean  The mean calculated from a sample is usually the best guess for the population mean. But different samples give different sample means ($\bar{x}_1, \bar{x}_2, \bar{x}_3, \ldots$)!  It can be shown that sample means from samples of size n are normally distributed: $\bar{x} \sim N(\mu, \sigma/\sqrt{n})$.  The term $\sigma/\sqrt{n}$ is called the standard error (the standard deviation of the sample means).
  • 20. CONT…  The sample mean comes from the normal distribution $N(\mu, \sigma/\sqrt{n})$. Knowing the properties of the normal distribution, we can be 95% sure that the sample mean is in the range: $\mu - 1.96\,\sigma/\sqrt{n} \le \bar{x} \le \mu + 1.96\,\sigma/\sqrt{n}$
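
The 95% range for the sample mean can be checked with a few lines of Python. This is only an illustrative sketch: the function name `mean_sampling_range` and the population values (mean 100, standard deviation 15, sample size 25) are assumptions chosen for the example and do not come from the slides.

```python
import math

def mean_sampling_range(mu, sigma, n, z=1.96):
    """Return (standard error, lower, upper) for the sample mean.

    The standard error is sigma / sqrt(n); about 95% of sample means
    fall within z = 1.96 standard errors of the population mean.
    """
    se = sigma / math.sqrt(n)
    return se, mu - z * se, mu + z * se

# Hypothetical population: mu = 100, sigma = 15, samples of size n = 25.
se, low, high = mean_sampling_range(mu=100.0, sigma=15.0, n=25)
print(f"SE = {se:.2f}, 95% range for the sample mean: [{low:.2f}, {high:.2f}]")
```

Quadrupling the sample size halves the standard error, which is why the range narrows only slowly as n grows.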
  • 21. CONT…  If the population standard deviation is unknown, it can be shown that the standardized sample mean from samples of size n follows a t-distribution with n − 1 degrees of freedom.  As an estimate for the standard error we can use $s/\sqrt{n}$, where s is the sample standard deviation.
  • 22. T-distribution  The t-distribution is quite similar to the normal distribution, but its exact shape depends on the sample size.  As the sample size increases, the t-distribution approaches the normal distribution.  Critical values of the t-distribution can be calculated with Excel: =TINV(probability ; degrees of freedom)  For the error margin of a mean, the degrees of freedom equal n – 1 (n = sample size).  Ex. Critical value for the 95% confidence level when the sample size is 50: =TINV(0,05;49) returns 2,00957 (decimal-comma locale, i.e. 2.00957)
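
For readers without Excel, the same two-tailed critical value can be approximated with the Python standard library. The sketch below is an assumption-laden illustration, not the method used on the slide: the helper names `t_pdf`, `t_cdf` and `t_critical` are hypothetical, the cumulative probability is obtained by Simpson's rule, and the critical value by bisection.

```python
import math

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c - (df + 1) / 2 * math.log1p(x * x / df))

def t_cdf(x, df, steps=4000):
    """P(T <= x) for x >= 0: 0.5 plus Simpson's rule integral over [0, x]."""
    h = x / steps
    total = t_pdf(0.0, df) + t_pdf(x, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 + total * h / 3

def t_critical(alpha, df):
    """Two-tailed critical value t with P(|T| > t) = alpha, found by bisection."""
    lo, hi = 0.0, 1000.0
    target = 1 - alpha / 2
    for _ in range(60):
        mid = (lo + hi) / 2
        if t_cdf(mid, df) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

# Same inputs as the Excel example: two-tailed probability 0.05, 49 degrees of freedom.
print(round(t_critical(0.05, 49), 5))
```

Like TINV, `t_critical` takes the two-tailed probability, so 0.05 corresponds to a 95% confidence level.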
  • 23. Sampling Distribution of Proportion  The proportion calculated from a sample is usually the best guess for the population proportion. But different samples give different sample proportions!  It can be shown that proportions from samples of size n are normally distributed: $p \sim N\left(\pi, \sqrt{\pi(1-\pi)/n}\right)$  The standard error (standard deviation of sample proportions) is $\sqrt{\pi(1-\pi)/n}$  As an estimate for the standard error we use $\sqrt{p(1-p)/n}$
  • 24. Error margin for proportion  Based on the sampling distribution of the proportion, we can be 95% sure that the population proportion lies in the range (95% confidence interval): $p - 1.96\sqrt{p(1-p)/n} \le \pi \le p + 1.96\sqrt{p(1-p)/n}$
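
A short Python sketch of this error margin follows. The function name `proportion_ci` and the survey figures (sample proportion 0.40 from 600 respondents) are hypothetical values picked for illustration.

```python
import math

def proportion_ci(p, n, z=1.96):
    """95% confidence interval for a population proportion.

    Uses the estimated standard error sqrt(p * (1 - p) / n),
    where p is the sample proportion and n the sample size.
    """
    se = math.sqrt(p * (1 - p) / n)
    return p - z * se, p + z * se

# Hypothetical survey: 40% of 600 respondents answered "yes".
low, high = proportion_ci(p=0.40, n=600)
print(f"95% CI for the population proportion: [{low:.4f}, {high:.4f}]")
```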
  • 27. Degree of Freedom  In statistics, the number of degrees of freedom is the number of values in the final calculation of a statistic that are free to vary.  The number of independent ways by which a dynamic system can move, without violating any constraint imposed on it, is called number of degrees of freedom. In other words, the number of degrees of freedom can be defined as the minimum number of independent coordinates that can specify the position of the system completely.  df = n - 1 27
  • 28. Standard Error  The Standard Deviation of sampling distribution of a statistic is known as its standard error (S.E) and is considered the key to sampling theory.  The utility of the concept of standard error in statistical induction arises on account of the following reasons:  The Standard error helps in testing whether the difference between observed and expected frequencies could arise due to chance.  The standard error gives an idea about the reliability and precision of a sample. The smaller the S.E., the greater the uniformity of sampling distribution and hence, greater is the reliability of sample.  The standard error enables us to specify the limits within which the parameters of the population are expected to lie with a specified degree of confidence. Such an interval is usually known as confidence interval. 28
  • 30. Central Limit Theorem  When sampling is from a normal population, the means of samples drawn from such a population are themselves normally distributed. But when sampling is not from a normal population, the size of the sample plays a critical role. When n is small, the shape of the distribution will depend largely on the shape of the parent population, but as n gets large (n > 30), the shape of the sampling distribution will become more and more like a normal distribution, irrespective of the shape of the parent population.  The theorem which explains this relationship between the shape of the population distribution and the sampling distribution of the mean is known as the central limit theorem.  “The significance of the central limit theorem lies in the fact that it permits us to use sample statistics to make inferences about population parameters without knowing anything about the shape of the frequency distribution of that population other than what we can get from the sample.”
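
The theorem can be illustrated with a small simulation. The sketch below assumes a strongly skewed parent population (exponential with mean 1 and standard deviation 1) and samples of size n = 40; the population, the sample size, the number of draws and the function name `sample_means` are all choices made for this example, not part of the slides.

```python
import random
import statistics

def sample_means(n, draws, rng):
    """Means of `draws` independent samples of size n from an
    exponential population (mean 1, standard deviation 1)."""
    return [statistics.fmean(rng.expovariate(1.0) for _ in range(n))
            for _ in range(draws)]

rng = random.Random(0)  # fixed seed so the run is repeatable
means = sample_means(n=40, draws=5000, rng=rng)

# The theorem predicts a centre near 1 and a spread near 1 / sqrt(40).
print("mean of sample means:", statistics.fmean(means))
print("SD of sample means:  ", statistics.stdev(means))
print("sigma / sqrt(n):     ", 1 / 40 ** 0.5)
```

Although individual observations are far from normal, the sample means cluster symmetrically around the population mean with a spread close to the standard error.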
  • 31. Finite Population Correction  The Finite Population Correction Factor (FPC) is used when you sample without replacement from more than 5% of a finite population.  It is needed because under these circumstances the observations are no longer independent, so the Central Limit Theorem does not hold and the uncorrected standard error of the estimate (e.g. the mean or proportion) will be too big. The corrected standard error is the usual standard error multiplied by the FPC.  In basic terms, the FPC captures the difference between sampling with replacement and sampling without replacement.  $FPC = \sqrt{(N-n)/(N-1)}$, where N is the population size and n is the sample size.
  • 32. CONT… The following table of values shows how the FPC decreases for a population of 10,000 as the sample size gets larger: 32
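
The table itself is not part of this transcript, but the trend can be reproduced from the formula $FPC = \sqrt{(N-n)/(N-1)}$. In the sketch below the sample sizes are hypothetical choices and need not match the rows of the original table.

```python
import math

def fpc(N, n):
    """Finite population correction for a sample of size n drawn
    without replacement from a population of size N."""
    return math.sqrt((N - n) / (N - 1))

N = 10_000
# Hypothetical sample sizes; the original table's rows are not available here.
for n in (10, 100, 500, 1000, 5000):
    print(f"n = {n:>5}  FPC = {fpc(N, n):.4f}")
```

The factor stays close to 1 while the sample is a small fraction of the population and shrinks as the sample approaches the population size.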
  • 33. Statistical Inference Use a random sample to learn something about a larger population 33
  • 34. Inference  Two ways to make inference:  Estimation of parameters  * Point estimation ($\bar{X}$ or p)  * Interval estimation  Hypothesis testing
  • 35. Estimation of parameters (figure)  Population: the mean, $\mu$, is unknown.  Sample: mean $\bar{X} = 50$.  Point estimate: $\bar{X} = 50$.  Interval estimate: "I am 95% confident that $\mu$ is between 40 and 60."
  • 36. Parameter = Statistic ± Its Error 36
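  • 37. Sampling Distribution (figure: distribution of the sample statistic, $\bar{X}$ or p)
  • 38. Standard Error  Quantitative variable: $SE(\text{mean}) = s/\sqrt{n}$  Qualitative variable: $SE(p) = \sqrt{p(1-p)/n}$
  • 39. Confidence Interval (figure: sampling distribution of $\bar{X}$ on the Z-axis, centered at $\mu$; 95% of samples fall in the central area $1 - \alpha$ between $\bar{X} - 1.96\,SE$ and $\bar{X} + 1.96\,SE$, with $\alpha/2$ in each tail)
  • 40. Confidence Interval (figure: sampling distribution of p on the Z-axis, centered at $\pi$; 95% of samples fall in the central area $1 - \alpha$ between $p - 1.96\,SE$ and $p + 1.96\,SE$, with $\alpha/2$ in each tail)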
  • 42. Example (Sample size ≥ 30)  An epidemiologist studied the blood glucose level of a random sample of 100 patients. The mean was 170, with an SD of 10.  $\mu = \bar{X} \pm Z \cdot SE$  $SE = 10/\sqrt{100} = 10/10 = 1$  Then the 95% CI is $\mu = 170 \pm 1.96 \times 1$, i.e. $168.04 \le \mu \le 171.96$
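
The arithmetic of this example can be reproduced in Python. The function name `mean_ci` is a hypothetical helper; the inputs (mean 170, SD 10, n = 100) are the values given on the slide.

```python
import math

def mean_ci(xbar, sd, n, z=1.96):
    """Large-sample confidence interval for the mean: xbar ± z * sd / sqrt(n)."""
    se = sd / math.sqrt(n)
    return xbar - z * se, xbar + z * se

# Blood glucose example: mean 170, SD 10, n = 100 patients.
low, high = mean_ci(xbar=170, sd=10, n=100)
print(f"95% CI: {low:.2f} to {high:.2f}")  # 95% CI: 168.04 to 171.96
```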
  • 43. Hypothesis testing  A statistical method that uses sample data to evaluate a hypothesis about a population parameter. It is intended to help researchers differentiate between real and random patterns in the data. 43
  • 44. What is a Hypothesis?  An assumption about the population parameter. I assume the mean SBP of participants is 120 mmHg 44
  • 45. Null & Alternative Hypotheses  H0, the null hypothesis, states the assumption to be tested, e.g. SBP of participants = 120 ($H_0: \mu = 120$).  H1, the alternative hypothesis, is the opposite of the null hypothesis, e.g. SBP of participants ≠ 120 ($H_1: \mu \ne 120$). It may or may not be accepted, and it is the hypothesis that the researcher believes to be true.
  • 46. Level of Significance, $\alpha$  Defines the unlikely values of the sample statistic if the null hypothesis is true; these values form the rejection region of the sampling distribution.  Typical values are 0.01 and 0.05.  Selected by the researcher at the start.  Provides the critical value(s) of the test.
  • 47. Level of Significance and the Rejection Region (figure: sampling distribution centered at 0, with the critical value(s) marking rejection regions of total area $\alpha$)
  • 48. Result Possibilities (H0: Innocent)
      Jury trial (verdict vs. actual situation):
        Verdict      Actually innocent   Actually guilty
        Innocent     Correct             Error
        Guilty       Error               Correct
      Hypothesis test (decision vs. actual situation):
        Decision     H0 true                                     H0 false
        Accept H0    Correct ($1 - \alpha$)                      Type II error ($\beta$), false negative
        Reject H0    Type I error ($\alpha$), false positive     Correct, power ($1 - \beta$)
  • 49. Hypothesis Testing: Steps  Test the assumption that the true mean SBP of participants is 120 mmHg.  State H0: $H_0: \mu = 120$  State H1: $H_1: \mu \ne 120$  Choose $\alpha$: $\alpha = 0.05$  Choose n: n = 100  Choose the test: Z, t or $\chi^2$ test
  • 50. Hypothesis Testing: Steps  Compute Test Statistic  Search for Critical Value  Make Statistical Decision rule  Express Decision 50
  • 51. One sample-mean Test  Assumption: the population is normally distributed.  t test statistic: $t = \frac{\text{sample mean} - \text{null value}}{\text{standard error}} = \frac{\bar{x} - \mu_0}{s/\sqrt{n}}$
  • 52. Example Normal Body Temperature  What is normal body temperature? Is it actually 37.6 °C (on average)? State the null and alternative hypotheses:  $H_0: \mu = 37.6$ °C  $H_a: \mu \ne 37.6$ °C
  • 53. Example Normal Body Temp (cont)  Summarize the data with a test statistic: $t = \frac{\text{sample mean} - \text{null value}}{\text{standard error}} = \frac{\bar{x} - \mu_0}{s/\sqrt{n}}$
      Data: random sample of n = 18 normal body temps (°C): 37.2 36.8 38.0 37.6 37.2 36.8 37.4 38.7 37.2 36.4 36.6 37.4 37.0 38.2 37.6 36.1 36.2 37.5
        Variable      n    Mean    SD     SE      t      P
        Temperature   18   37.22   0.68   0.161   2.38   0.029
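
The summary row can be recomputed from the 18 temperatures with Python's `statistics` module. This is a sketch for checking the arithmetic, not the SPSS procedure used on the slides; note that the computed t is negative because the sample mean lies below the null value, and the table reports its magnitude.

```python
import math
import statistics

# The 18 body temperatures (°C) listed on the slide.
temps = [37.2, 36.8, 38.0, 37.6, 37.2, 36.8, 37.4, 38.7, 37.2,
         36.4, 36.6, 37.4, 37.0, 38.2, 37.6, 36.1, 36.2, 37.5]
mu0 = 37.6  # null value from H0

n = len(temps)
mean = statistics.fmean(temps)
sd = statistics.stdev(temps)   # sample SD with n - 1 in the denominator
se = sd / math.sqrt(n)         # estimated standard error of the mean
t = (mean - mu0) / se          # one-sample t statistic

print(f"n = {n}, mean = {mean:.2f}, SD = {sd:.2f}, SE = {se:.3f}, |t| = {abs(t):.2f}")
```

With df = 17, the magnitude of t can then be compared with the two-tailed critical values in a t table.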
  • 54. STUDENT’S t DISTRIBUTION TABLE
        Degrees of freedom   Probability (p value)
                             0.10     0.05     0.01
        1                    6.314   12.706   63.657
        5                    2.015    2.571    4.032
        10                   1.813    2.228    3.169
        17                   1.740    2.110    2.898
        20                   1.725    2.086    2.845
        24                   1.711    2.064    2.797
        25                   1.708    2.060    2.787
        ∞                    1.645    1.960    2.576
  • 55. Example Normal Body Temp (cont)  Find the p-value  df = n – 1 = 18 – 1 = 17  From SPSS: p-value = 0.029  From the t table: the p-value is between 0.05 and 0.01.  The area to the left of t = -2.11 equals the area to the right of t = +2.11.  The value t = 2.38 lies between the table entries 2.110 and 2.898 in the row for df = 17, whose column headings (p-values) are 0.05 and 0.01.
  • 56. Example Normal Body Temp (cont)  Decide whether or not the result is statistically significant based on the p-value.  Using $\alpha = 0.05$ as the level of significance criterion, the results are statistically significant because 0.029 is less than 0.05. In other words, we can reject the null hypothesis.  Report the conclusion.  We can conclude, based on these data, that the mean temperature in the human population does not equal 37.6 °C.
  • 57. Case Study: STATISTICAL INFERENCE OF A CASE STUDY IN CHINA: ACTIVE PHOSPHATE REMOVAL FROM EUTROPHIC WATER  China is a country that exports a huge amount of duck meat. Recently, more and more people raise ducks in ponds together with fish. Previous research has shown that the yield of fish in a duck-fish integrated system pond is greater than the yield in non-integrated system ponds.  At the same time, the duck-fish system reduced the pollution significantly. However, polluted water still remains because of the phosphorus and nitrate entering from the ducks (Adel K. Soliman, 2000).
  • 58. Experimental Design and Sample Collection  The experiments were performed in Anhui, China. Three ponds, A, B and C, were selected.  Pond A is the treatment pond, where water spinach was planted. It had ducks and fish. Pond B is a pond with ducks and fish but without water spinach. Pond C is the control pond, with fish only. A floating bed of size 5 m × 1.2 m was built to hold the water spinach in pond A.
  • 59. Sample Collection  The samples were obtained at each of these locations in the three ponds:  A1: concentration from water within the water spinach area in pond A;  A2: concentration from water outside the water spinach area in pond A;  B1: concentration from water under the duck sheds in pond B;  B2: concentration from water away from the duck sheds in pond B;  C1: concentration from water in pond C, without ducks or water spinach.
  • 60. Data Analysis Result  When we plot the measurements from the same pond, we get Figure 3. The observations for both ammonia-nitrogen and active phosphate from A1 decrease continuously. The observations for ammonia-nitrogen from C1 do not show a decreasing trend.
  • 61. Conclusion  We performed multiple paired t-tests to compare the mean concentrations of ammonia-nitrogen at various locations. The p-value between samples from A1 and A2 is greater than 0.1, so there is no real difference in the concentration of ammonia-nitrogen at the two locations within pond A.  There is a real difference in the concentration of active phosphate at the two locations within pond B. The significance test indicates that the water near the ducks is more polluted by active phosphate than the water elsewhere. The p-value between samples from A1 and B2 is close to 0.0005, so there is significant evidence that planting water spinach reduces the active phosphate content in the water.
  • 62. References  C. R. Kothari, Research Methodology: Methods and Techniques, 2nd revised edition, New Age International Publishers.  June Luo, Ling Zu, Statistical Inference of a Case Study in China: Active Phosphate Removal from Eutrophic Water, Department of Applied Economics and Statistics, Clemson University.  https://www.slideshare.net/rambhu21/sampling-and-sampling-errors-19870549/62
  • 63. THANK YOU For Bearing. Bhavik A. Shah (17TS809) 63