Small Sampling Theory Presentation1

  • Small Sampling Theory
    • Small sample theory: the study of statistical inference from small samples (conventionally n ≤ 30). It includes the t-distribution and the F-distribution, both of which are defined in terms of the “number of degrees of freedom”.
    • Degrees of freedom ν : Number of useful items of information generated by a sample of given size with respect to the estimation of a given population parameter.
    • OR
    • Total number of observations minus the number of independent constraints imposed on the observations.
    • n = no. of observations
    • k = no. of independent constraints
    • then n – k = no. of degrees of freedom
    • Example: X = A + B + C (10 = 2 + 3 + C, so C = 5)
    • n = 4, k = 3
    • n – k = 1, so 1 degree of freedom.
    Introduction
  • t - Distribution
    • William Sealy Gosset published the t-distribution in 1908 in Biometrika under the pen name “Student”.
    • When the sample size is larger than 30, the sampling distribution of the mean is approximately normal.
    • When the sample size is less than 30 and the population standard deviation is unknown, the sample statistic follows the t-distribution (assuming the population is normal).
    • Probability density function of the t-distribution with ν = n – 1 degrees of freedom: $f(t) = Y_0 \left(1 + \frac{t^2}{\nu}\right)^{-(\nu+1)/2}$, $-\infty < t < \infty$
    • $Y_0$ is a constant depending on n such that the area under the curve is 1.
    • The t-table gives the probability integral of the t-distribution.
  • Properties of t-Distribution
    • Ranges from – ∞ to ∞
    • Bell-shaped and symmetrical around mean zero.
    • Its shape changes as the no. of degrees of freedom changes. Hence ν is a parameter of t-distribution.
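    • Its variance is always greater than one and is defined only when ν ≥ 3: $\mathrm{Var}(t) = \frac{\nu}{\nu - 2}$
    • Compared with the normal distribution it is less peaked at the centre and higher in the tails. (Measured by kurtosis, those heavier tails make it leptokurtic, although it is often loosely described as platykurtic.)
    • It has greater dispersion than the normal distribution. As n gets larger, the t-distribution approaches the normal form.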
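The properties above can be checked numerically. The following is a minimal sketch using only the Python standard library. It assumes the usual gamma-function form of the normalising constant (the slides call it Y0 without spelling it out); the function names are illustrative, not from the slides.

```python
import math

def t_pdf(t, nu):
    """Density of Student's t with nu degrees of freedom."""
    # Normalising constant (Y0 in the slides), assumed to be the
    # standard gamma-function expression.
    y0 = math.gamma((nu + 1) / 2) / (math.sqrt(nu * math.pi) * math.gamma(nu / 2))
    return y0 * (1 + t * t / nu) ** (-(nu + 1) / 2)

def normal_pdf(z):
    """Density of the standard normal distribution."""
    return math.exp(-z * z / 2) / math.sqrt(2 * math.pi)

def t_variance(nu):
    """Variance nu / (nu - 2); only defined for nu > 2."""
    if nu <= 2:
        raise ValueError("variance is undefined for nu <= 2")
    return nu / (nu - 2)

# Peak height, tail height and variance for increasing degrees of freedom.
for nu in (3, 10, 30, 100):
    print(nu, round(t_pdf(0.0, nu), 4), round(t_pdf(3.0, nu), 4), round(t_variance(nu), 4))
print("normal", round(normal_pdf(0.0), 4), round(normal_pdf(3.0), 4))
```

For small ν the peak at t = 0 is lower and the value at t = 3 is higher than for the normal curve, and both columns move toward the normal values as ν grows.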
  • Steps involved in testing of hypothesis.
    • Establish a null hypothesis.
    • Suggest an alternative hypothesis.
    • Calculate the t value.
    • Find the degrees of freedom.
    • Set up a suitable significance level.
    • From the t-table, find the critical value of t using α (the significance level, i.e. the risk of a Type I error) and ν, the degrees of freedom.
    • If the absolute value of the calculated t is less than the critical value from the table, the null hypothesis is not rejected; otherwise it is rejected in favour of the alternative hypothesis.
  • Applications of t - distribution
    • Test of hypothesis about the population mean.
    • Test of hypothesis about the difference between two means.
    • Test of hypothesis about the difference between two means with dependent samples.
    • Test of hypothesis about the coefficient of correlation.
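  • (1) Test of hypothesis about the population mean (σ unknown and small sample size)
    • Null hypothesis: $H_0: \mu = \mu_0$
    • t value is given as: $t = \frac{\bar{x} - \mu_0}{s / \sqrt{n}}$
    • Standard deviation of the sample is given as: $s = \sqrt{\frac{\sum (x_i - \bar{x})^2}{n - 1}}$
    • Degrees of freedom = n – 1
    • Find the table value at the specified significance level and d.f.
    • If the absolute value of the calculated t is more than the table value, the null hypothesis is rejected.
    • 100(1 – α)% confidence interval for the population mean: $\bar{x} \pm t_{\alpha/2,\,n-1} \frac{s}{\sqrt{n}}$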
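As an illustration of the one-sample procedure, here is a minimal sketch in standard-library Python. It assumes the usual statistic t = (x̄ − μ0)/(s/√n) with s computed using the n − 1 divisor. The data are hypothetical, and the critical value 2.262 (two-sided 5% level, 9 d.f.) is read from a t-table and passed in by hand rather than computed.

```python
import math
from statistics import mean, stdev

def one_sample_t(sample, mu0, t_crit):
    """Return (t, degrees of freedom, reject H0?, confidence interval)."""
    n = len(sample)
    x_bar = mean(sample)
    s = stdev(sample)              # sample standard deviation, n - 1 divisor
    se = s / math.sqrt(n)          # standard error of the mean
    t = (x_bar - mu0) / se
    ci = (x_bar - t_crit * se, x_bar + t_crit * se)
    return t, n - 1, abs(t) > t_crit, ci

# Hypothetical measurements; H0: mu = 48.
weights = [48, 52, 50, 49, 51, 53, 47, 50, 52, 48]
t, df, reject, ci = one_sample_t(weights, mu0=48, t_crit=2.262)
print(f"t = {t:.3f}, d.f. = {df}, reject H0: {reject}")
print(f"95% CI for the mean: ({ci[0]:.3f}, {ci[1]:.3f})")
```

Passing the table value as a parameter keeps the sketch dependency-free; a statistics library could supply it instead.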
  • (2) Test of hypothesis about the difference between two means
    • When the population variances are unknown, the t-test takes one of two forms:
    • (a) when the variances are equal;
    • (b) when the variances are not equal.
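  • (a) Case of equal variances
    • Null hypothesis: $H_0: \mu_1 = \mu_2$
    • t value is given as: $t = \frac{\bar{x}_1 - \bar{x}_2}{s_p \sqrt{\frac{1}{n_1} + \frac{1}{n_2}}}$
    • where the pooled variance is $s_p^2 = \frac{(n_1 - 1) s_1^2 + (n_2 - 1) s_2^2}{n_1 + n_2 - 2}$
    • and $s_j^2 = \frac{\sum_i (x_{ji} - \bar{x}_j)^2}{n_j - 1}$ for j = 1, 2.
    • Degrees of freedom: $n_1 + n_2 - 2$
    • Find the table value at the specified significance level and d.f.
    • If the absolute value of the calculated t is more than the table value, the null hypothesis is rejected.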
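The equal-variance case can be sketched as follows, assuming the standard pooled-variance form of the statistic: the two sample variances are combined, weighted by their degrees of freedom. The samples are hypothetical and the table value 2.306 (two-sided 5% level, 8 d.f.) is entered by hand.

```python
import math
from statistics import mean, variance

def pooled_t(x1, x2):
    """Two-sample t statistic assuming equal population variances."""
    n1, n2 = len(x1), len(x2)
    s1_sq, s2_sq = variance(x1), variance(x2)      # unbiased, n - 1 divisor
    # Pooled variance: weighted average of the two sample variances.
    sp_sq = ((n1 - 1) * s1_sq + (n2 - 1) * s2_sq) / (n1 + n2 - 2)
    se = math.sqrt(sp_sq * (1 / n1 + 1 / n2))      # standard error of the difference
    t = (mean(x1) - mean(x2)) / se
    return t, n1 + n2 - 2, se

# Hypothetical samples.
group_a = [12, 14, 16, 18, 20]
group_b = [10, 12, 14, 16, 18]
t, df, se = pooled_t(group_a, group_b)
t_crit = 2.306                                     # from a t-table, 8 d.f.
print(f"t = {t:.3f}, d.f. = {df}, reject H0: {abs(t) > t_crit}")
```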
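  • (b) Case of unequal variances
    • When the population variances are not equal, we use the unbiased estimators $s_1^2$ and $s_2^2$ to replace $\sigma_1^2$ and $\sigma_2^2$.
    • Here, the sampling distribution has larger variability than the population variability.
    • t value: $t = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}}$
    • Degrees of freedom: $\nu = \frac{\left(\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}\right)^2}{\frac{(s_1^2/n_1)^2}{n_1 - 1} + \frac{(s_2^2/n_2)^2}{n_2 - 1}}$
    • Find the table value at the specified significance level and d.f.
    • If the absolute value of the calculated t is more than the table value, the null hypothesis is rejected.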
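For the unequal-variance case, the sketch below assumes the Welch form of the statistic with the Welch-Satterthwaite approximation for the degrees of freedom, which is the usual textbook choice and may not be the exact formula shown on the original slide. The data are hypothetical.

```python
import math
from statistics import mean, variance

def welch_t(x1, x2):
    """Two-sample t statistic without assuming equal variances."""
    n1, n2 = len(x1), len(x2)
    v1 = variance(x1) / n1          # s1^2 / n1
    v2 = variance(x2) / n2          # s2^2 / n2
    t = (mean(x1) - mean(x2)) / math.sqrt(v1 + v2)
    # Welch-Satterthwaite approximation; generally not an integer.
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t, df

# Hypothetical samples with clearly different spreads.
sample_1 = [20, 22, 24, 26, 28]
sample_2 = [2, 8, 14, 20, 26]
t, df = welch_t(sample_1, sample_2)
print(f"t = {t:.4f}, approximate d.f. = {df:.3f}")
```

Because the approximate d.f. is fractional, a common conservative choice is to round it down before consulting the table (here to 4, where the two-sided 5% value is 2.776).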
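  • Confidence interval for the difference between two means
    • Two samples of sizes $n_1$ and $n_2$ are randomly and independently drawn from two normally distributed populations with unknown but equal variances.
    • The 100(1 – α)% confidence interval for $\mu_1 - \mu_2$ is given by: $(\bar{x}_1 - \bar{x}_2) \pm t_{\alpha/2,\,n_1+n_2-2}\; s_p \sqrt{\frac{1}{n_1} + \frac{1}{n_2}}$, where $s_p^2 = \frac{(n_1 - 1) s_1^2 + (n_2 - 1) s_2^2}{n_1 + n_2 - 2}$ is the pooled variance.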
  • (3) Test of hypothesis about the difference between two means with dependent samples (paired t-test)
    • The samples are dependent: each observation in one sample is associated with a particular observation in the second sample.
    • Observations in the two samples should be collected as matched pairs.
    • The two samples must have the same number of units.
    • Instead of two samples we can take one random sample of pairs; the two measurements associated with a pair are related to each other. Example: before-and-after experiments, or observations matched by size or some other criterion.
    • Null hypothesis: $H_0: \mu_1 = \mu_2$, i.e. the mean difference is zero.
    • t value is given as: $t = \frac{\bar{d}}{s_d / \sqrt{n}}$
    • where the mean of the differences is $\bar{d} = \frac{\sum d_i}{n}$,
    • and the standard deviation of the differences is $s_d = \sqrt{\frac{\sum (d_i - \bar{d})^2}{n - 1}}$
    • Degrees of freedom = n – 1, where n is the number of pairs.
    • Find the table value at the specified significance level and d.f.
    • If the absolute value of the calculated t is more than the table value, the null hypothesis is rejected.
    • Confidence interval for the mean of the differences: $\bar{d} \pm t_{\alpha/2,\,n-1} \frac{s_d}{\sqrt{n}}$
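The paired procedure reduces to a one-sample test on the differences. A minimal sketch, assuming the usual statistic t = d̄/(s_d/√n); the before/after readings are hypothetical and the table value 2.571 (two-sided 5% level, 5 d.f.) is supplied by hand.

```python
import math
from statistics import mean, stdev

def paired_t(first, second, t_crit):
    """Return (t, degrees of freedom, reject H0?, CI for the mean difference)."""
    if len(first) != len(second):
        raise ValueError("paired samples must have the same number of units")
    d = [a - b for a, b in zip(first, second)]   # one difference per pair
    n = len(d)
    d_bar = mean(d)
    s_d = stdev(d)                               # n - 1 divisor
    se = s_d / math.sqrt(n)
    t = d_bar / se
    ci = (d_bar - t_crit * se, d_bar + t_crit * se)
    return t, n - 1, abs(t) > t_crit, ci

# Hypothetical before/after readings on the same six subjects.
before = [72, 76, 70, 68, 74, 71]
after = [70, 72, 69, 65, 72, 71]
t, df, reject, ci = paired_t(before, after, t_crit=2.571)
print(f"t = {t:.3f}, d.f. = {df}, reject H0: {reject}")
print(f"95% CI for the mean difference: ({ci[0]:.3f}, {ci[1]:.3f})")
```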
  • (4) Testing of hypothesis about coefficient of correlation.
    • Case 1: testing the hypothesis that the population coefficient of correlation equals zero, i.e., $H_0: \rho = 0$
    • Case 2: testing the hypothesis that the population coefficient of correlation equals some value other than zero, i.e., $H_0: \rho = \rho_0$
    • Case 3: testing the hypothesis for the difference between two independent correlation coefficients.
  • Case 1: testing the hypothesis that the population coefficient of correlation equals zero, i.e., $H_0: \rho = 0$
    • Null hypothesis: there is no correlation in the population, i.e., $H_0: \rho = 0$
    • t value is given as: $t = \frac{r \sqrt{n - 2}}{\sqrt{1 - r^2}}$, where r is the sample correlation coefficient.
    • Degrees of freedom: n – 2
    • Find the table value at the specified significance level and d.f.
    • If the absolute value of the calculated t is more than the table value, the null hypothesis is rejected and we conclude that there is a linear relationship between the variables.
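A short sketch of Case 1, assuming the standard statistic t = r√(n − 2)/√(1 − r²). The values r = 0.6 and n = 18 are hypothetical, and the table value 2.120 (two-sided 5% level, 16 d.f.) is entered by hand.

```python
import math

def correlation_t(r, n):
    """t statistic and degrees of freedom for testing H0: rho = 0."""
    if not -1 < r < 1:
        raise ValueError("r must lie strictly between -1 and 1")
    t = r * math.sqrt(n - 2) / math.sqrt(1 - r * r)
    return t, n - 2

t, df = correlation_t(r=0.6, n=18)   # hypothetical sample correlation
t_crit = 2.120                       # from a t-table, 16 d.f.
print(f"t = {t:.3f}, d.f. = {df}, reject H0: {abs(t) > t_crit}")
```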
  • Case 2: testing the hypothesis that the population coefficient of correlation equals some value other than zero, i.e., $H_0: \rho = \rho_0$
    • When ρ ≠ 0, a test based on the t-distribution is not appropriate; Fisher’s z-transformation is used instead.
    • $z = 0.5 \log_e \frac{1 + r}{1 - r}$
    • OR
    • $z = 1.1513 \log_{10} \frac{1 + r}{1 - r}$
    • z is approximately normally distributed with mean $z_\rho = 0.5 \log_e \frac{1 + \rho}{1 - \rho}$
    • Standard deviation: $\sigma_z = \frac{1}{\sqrt{n - 3}}$
    • The approximation is more reliable when the sample is not too small (at least 10).
    • Null hypothesis: $H_0: \rho = \rho_0$
    • Test statistic: $Z = \frac{z - z_{\rho_0}}{1 / \sqrt{n - 3}} = (z - z_{\rho_0}) \sqrt{n - 3}$
    • which follows approximately the standard normal distribution.
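A sketch of Case 2 using the transformation and standard deviation stated above. It assumes the test statistic is the difference of the transformed values divided by 1/√(n − 3); the sample values (r = 0.75, ρ0 = 0.5, n = 28) are hypothetical.

```python
import math

def fisher_z(r):
    """Fisher's z-transformation, 0.5 * ln((1 + r) / (1 - r))."""
    return 0.5 * math.log((1 + r) / (1 - r))

def correlation_z_test(r, rho0, n):
    """Approximately standard normal statistic for H0: rho = rho0."""
    return (fisher_z(r) - fisher_z(rho0)) * math.sqrt(n - 3)

z_stat = correlation_z_test(r=0.75, rho0=0.5, n=28)
print(f"Z = {z_stat:.3f}, reject H0 at 5%: {abs(z_stat) > 1.96}")

# The base-10 form with the constant 1.1513 agrees to about four decimals.
print(round(fisher_z(0.5), 4), round(1.1513 * math.log10((1 + 0.5) / (1 - 0.5)), 4))
```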
  • Case 3: testing the hypothesis for the difference between two independent correlation coefficients
    • To compare two correlation coefficients derived from two separate samples, compare the difference of the two corresponding values of z with the standard error of that difference.
    • Formula used: $Z = \frac{z_1 - z_2}{\sqrt{\frac{1}{n_1 - 3} + \frac{1}{n_2 - 3}}}$
    • If the absolute value of this statistic is greater than 1.96, the difference is significant at the 5% significance level.
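Case 3 can be sketched the same way. The block assumes that each transformed coefficient has variance 1/(n − 3), so the standard error of the difference is √(1/(n1 − 3) + 1/(n2 − 3)); the two sample correlations are hypothetical.

```python
import math

def fisher_z(r):
    """Fisher's z-transformation, 0.5 * ln((1 + r) / (1 - r))."""
    return 0.5 * math.log((1 + r) / (1 - r))

def compare_correlations(r1, n1, r2, n2):
    """Approximately standard normal statistic for H0: rho1 = rho2."""
    se_diff = math.sqrt(1 / (n1 - 3) + 1 / (n2 - 3))
    return (fisher_z(r1) - fisher_z(r2)) / se_diff

# Hypothetical correlations from two independent samples of 28 pairs each.
z_stat = compare_correlations(r1=0.75, n1=28, r2=0.5, n2=28)
print(f"Z = {z_stat:.4f}, significant at 5%: {abs(z_stat) > 1.96}")
```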
  • The F - Distribution
    • Named in honour of R.A. Fisher, who studied it in 1924.
    • It is defined in terms of the ratio of the variances of two normally distributed populations, so it is sometimes also called the variance ratio.
    • F-distribution: $F = \frac{s_1^2 / \sigma_1^2}{s_2^2 / \sigma_2^2}$
    • where $s_1^2$, $s_2^2$ are unbiased estimators of $\sigma_1^2$, $\sigma_2^2$ respectively.
    • Degrees of freedom: $\nu_1 = n_1 - 1$, $\nu_2 = n_2 - 1$
    • If $\sigma_1^2 = \sigma_2^2$, then $F = s_1^2 / s_2^2$
    • It depends on $\nu_1$ and $\nu_2$ for the numerator and denominator respectively, so $\nu_1$ and $\nu_2$ are the parameters of the F-distribution.
    • For different values of $\nu_1$ and $\nu_2$ we get different distributions.
  • Probability density function
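    • Probability density function of the F-distribution: $f(F) = \frac{\Gamma\left(\frac{\nu_1 + \nu_2}{2}\right)}{\Gamma\left(\frac{\nu_1}{2}\right) \Gamma\left(\frac{\nu_2}{2}\right)} \left(\frac{\nu_1}{\nu_2}\right)^{\nu_1/2} F^{\nu_1/2 - 1} \left(1 + \frac{\nu_1}{\nu_2} F\right)^{-(\nu_1 + \nu_2)/2}$, for F > 0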
  • Properties of F-distribution
    • It is positively skewed, and its skewness decreases as $\nu_1$ and $\nu_2$ increase.
    • The value of F is always positive or zero, since variances are squares. So its value lies between 0 and ∞.
    • Mean and variance of the F-distribution:
    • Mean $= \frac{\nu_2}{\nu_2 - 2}$, for $\nu_2 > 2$
    • Variance $= \frac{2 \nu_2^2 (\nu_1 + \nu_2 - 2)}{\nu_1 (\nu_2 - 2)^2 (\nu_2 - 4)}$, for $\nu_2 > 4$
    • The shape of the F-distribution depends upon the number of degrees of freedom.
    • Areas in the left-hand tail of the distribution can be found by taking the reciprocal of the F values for the right-hand tail, with the degrees of freedom of the numerator and the denominator interchanged. This is known as the reciprocal property:
    • $F_{1-\alpha,\,\nu_1,\,\nu_2} = \frac{1}{F_{\alpha,\,\nu_2,\,\nu_1}}$
    • So we can find lower-tail F values from the corresponding upper-tail F values, which are given in the appendix.
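The reciprocal property can be verified numerically without tables. The sketch below assumes the standard gamma-function form of the F density and integrates it with Simpson's rule; the function names, the step count and the choice of 5 and 8 degrees of freedom are illustrative assumptions.

```python
import math

def f_pdf(x, v1, v2):
    """Density of the F-distribution; this sketch assumes v1 > 2."""
    if x <= 0:
        return 0.0
    log_c = (math.lgamma((v1 + v2) / 2) - math.lgamma(v1 / 2) - math.lgamma(v2 / 2)
             + (v1 / 2) * math.log(v1 / v2))
    return math.exp(log_c + (v1 / 2 - 1) * math.log(x)
                    - ((v1 + v2) / 2) * math.log1p(v1 * x / v2))

def f_cdf(x, v1, v2, steps=4000):
    """P(F <= x) by Simpson's rule on [0, x]; steps must be even."""
    h = x / steps
    total = f_pdf(0.0, v1, v2) + f_pdf(x, v1, v2)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * f_pdf(i * h, v1, v2)
    return total * h / 3

def f_mean(v2):
    return v2 / (v2 - 2)                      # requires v2 > 2

def f_variance(v1, v2):
    return 2 * v2 ** 2 * (v1 + v2 - 2) / (v1 * (v2 - 2) ** 2 * (v2 - 4))  # v2 > 4

x = 0.3
lower_tail = f_cdf(x, 5, 8)                   # area to the left of x for F(5, 8)
upper_tail = 1 - f_cdf(1 / x, 8, 5)           # area to the right of 1/x for F(8, 5)
print(lower_tail, upper_tail)                 # the two areas should agree
print(f_mean(8), f_variance(5, 8))
```

The two printed areas agreeing is the reciprocal property in probability form: a lower-tail point of F(ν1, ν2) is the reciprocal of the matching upper-tail point of F(ν2, ν1).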
  • Testing of hypothesis for equality of two variances
    • It is based on the variances of two independently selected random samples drawn from two normal populations.
    • Null hypothesis $H_0: \sigma_1^2 = \sigma_2^2$
    • $F = \frac{s_1^2 / \sigma_1^2}{s_2^2 / \sigma_2^2}$, which under $H_0$ reduces to $F = \frac{s_1^2}{s_2^2}$
    • Place the larger sample variance in the numerator.
    • Degrees of freedom: $\nu_1 = n_1 - 1$ for the numerator and $\nu_2 = n_2 - 1$ for the denominator.
    • Find the table value using $\nu_1$ and $\nu_2$.
    • If the calculated F value exceeds the table F value, the null hypothesis is rejected.
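A minimal sketch of the variance-ratio test, with the larger sample variance placed in the numerator as described. The samples are hypothetical, and the table value 6.39 (upper 5% point of F with 4 and 4 d.f.) is entered by hand.

```python
from statistics import variance

def variance_ratio(x1, x2):
    """F statistic with the larger sample variance in the numerator.

    Returns (F, numerator d.f., denominator d.f.).
    """
    s1_sq, s2_sq = variance(x1), variance(x2)   # unbiased, n - 1 divisor
    if s1_sq >= s2_sq:
        return s1_sq / s2_sq, len(x1) - 1, len(x2) - 1
    return s2_sq / s1_sq, len(x2) - 1, len(x1) - 1

# Hypothetical samples.
sample_1 = [20, 22, 24, 26, 28]
sample_2 = [2, 8, 14, 20, 26]
f_value, v1, v2 = variance_ratio(sample_1, sample_2)
f_table = 6.39                                  # from an F-table, (4, 4) d.f.
print(f"F = {f_value:.2f} with ({v1}, {v2}) d.f., reject H0: {f_value > f_table}")
```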
  • Confidence interval for the ratio of two variances
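    • The 100(1 – α)% confidence interval for the ratio of the variances of two normally distributed populations is given by:
    • $\frac{s_1^2 / s_2^2}{F_{\alpha/2,\,\nu_1,\,\nu_2}} < \frac{\sigma_1^2}{\sigma_2^2} < \frac{s_1^2 / s_2^2}{F_{1-\alpha/2,\,\nu_1,\,\nu_2}}$
    • where $F_{p,\,\nu_1,\,\nu_2}$ denotes the value with area p to its right, so that $F_{\alpha/2}$ is the upper-tail point and $F_{1-\alpha/2}$ the lower-tail point.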
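The interval can be computed from upper-tail table values alone. In this sketch the lower limit divides the observed ratio by the upper α/2 point of F(ν1, ν2), and the upper limit uses the reciprocal property, multiplying by the upper α/2 point of F(ν2, ν1). The parameter names are illustrative, the sample variances are hypothetical, and 9.60 is the upper 2.5% table point for (4, 4) d.f.

```python
def variance_ratio_ci(s1_sq, s2_sq, f_upper_v1_v2, f_upper_v2_v1):
    """Confidence interval for sigma1^2 / sigma2^2.

    f_upper_v1_v2: upper alpha/2 point of F(v1, v2), read from a table.
    f_upper_v2_v1: upper alpha/2 point of F(v2, v1), read from a table.
    """
    ratio = s1_sq / s2_sq
    lower = ratio / f_upper_v1_v2
    # Dividing by the lower-tail point 1 / F(v2, v1) equals multiplying by F(v2, v1).
    upper = ratio * f_upper_v2_v1
    return lower, upper

# Hypothetical sample variances from two samples of size 5 each.
lower, upper = variance_ratio_ci(40.0, 10.0, f_upper_v1_v2=9.60, f_upper_v2_v1=9.60)
print(f"95% CI for the variance ratio: ({lower:.4f}, {upper:.4f})")
```

If the interval contains 1, equality of the two population variances is not rejected at the corresponding two-sided level.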