ANOVA

When measurement data are influenced by several kinds of effects operating simultaneously, or when more than two means of independent samples are involved, the analysis of variance (ANOVA) technique is used.

- One-way ANOVA: the data are classified on the basis of one criterion.
- Two-way ANOVA: the data are classified on the basis of two criteria.
- Three-or-more-way ANOVA.

Procedure: an estimate of the population variance based on the variation between the samples (mean square between samples) is calculated; an estimate of the population variance based on the variation within the samples (mean square within samples) is calculated; the ratio of the two variances (the F-ratio) is calculated and an inference is drawn.

Examples:
- A manufacturer wants to know which type of packaging is best among three types of packaging.
- A study compared the effects of four 1-month point-of-purchase promotions on sales. The unit sales for five stores using all four promotions in different months follow. Calculate the F-ratio. At the 0.01 level of significance, do the promotions produce different effects on sales?
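The one-way ANOVA procedure described above (mean square between, mean square within, then their ratio) can be sketched as follows. This is a minimal illustration, not a worked solution: the function name and the sales figures are hypothetical, and the groups are treated as independent samples classified on a single criterion.

```python
from statistics import mean

def one_way_anova_f(groups):
    """Return (F-ratio, df_between, df_within) for a list of samples."""
    k = len(groups)                                  # number of samples
    n_total = sum(len(g) for g in groups)            # total observations
    grand_mean = sum(sum(g) for g in groups) / n_total

    # Variation between the samples: sample means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation within the samples: observations around their own sample mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)

    df_between = k - 1
    df_within = n_total - k
    ms_between = ss_between / df_between             # mean square between
    ms_within = ss_within / df_within                # mean square within
    return ms_between / ms_within, df_between, df_within

# Hypothetical unit sales for three packaging types (not data from these notes).
packaging_sales = [[1, 2, 3], [2, 3, 4], [5, 6, 7]]
f_ratio, df_b, df_w = one_way_anova_f(packaging_sales)
print(f"F = {f_ratio:.2f} with ({df_b}, {df_w}) degrees of freedom")
```

The computed F-ratio is then compared with the tabulated F value for the same degrees of freedom at the chosen level of significance.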
Chi Square Test

- It is a non-parametric test.
- It tests a certain hypothesis regarding a population ratio.
- It measures the deviation of the sample from the population ratio.

Applications of Chi Square Test are -
  o Testing Goodness of Fit
  o Testing Association or Dependence
  o Testing for Homogeneity

Formula for theoretical distribution is -

Exercise Question No. 1: In the following table, test whether blood group is associated with severity of disease. Choose the 5% level of significance.

Exercise Question No. 2: For the following table data, test whether use of fertilizer is associated with ownership of farm. alpha = 0.05
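A test of association such as the two exercises above can be sketched as follows, assuming the standard Pearson statistic: the sum over all cells of (observed - expected)^2 / expected, with each expected frequency taken as (row total x column total) / grand total. The function name and the 2 x 2 counts are hypothetical. The critical value 3.841 is the tabulated chi-square value for 1 degree of freedom at the 5% level.

```python
def chi_square_statistic(table):
    """Return (chi-square, degrees of freedom) for a contingency table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)

    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected frequency if the two classifications are independent.
            expected = row_totals[i] * col_totals[j] / grand_total
            chi2 += (observed - expected) ** 2 / expected

    df = (len(table) - 1) * (len(table[0]) - 1)
    return chi2, df

# Hypothetical 2 x 2 table (rows: owner / tenant, columns: uses / does not use fertilizer).
observed = [[10, 20], [20, 10]]
chi2, df = chi_square_statistic(observed)

CRITICAL_5PCT_DF1 = 3.841  # tabulated value for df = 1, alpha = 0.05
reject_h0 = chi2 > CRITICAL_5PCT_DF1
print(f"chi-square = {chi2:.3f}, df = {df}, reject H0: {reject_h0}")
```

For tables with more rows or columns the critical value has to be looked up for the corresponding degrees of freedom.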
Fundamentals

Variable Types

Broadly there are two types of variables - Qualitative and Quantitative variables. Each of these can be further subdivided into two categories.

There are two types of Qualitative variables - (1) Nominal Variables (2) Ordinal Variables

Nominal Variable: Nominal variables are used for naming or labeling.
Examples are - Gender, jersey number of a player, etc.

Ordinal Variable: Ordinal variables are used to order observations on the basis of the intensity or strength of the property they possess. Values are ordered on the basis of non-numeric criteria.
Examples are - Intensity of pain (mild, moderate, severe); smoking status (heavy or light); beauty, intelligence, etc.

There are two types of Quantitative variables - (1) Discrete Variables (2) Continuous Variables

Discrete Variable: A quantitative variable is discrete when it results from counting. It takes on zero or positive integer values.
Examples are - The number of male children in a family with three children (0, 1, 2 or 3); the number of spots on the up-face of a die; the number of red blood cells in a cubic millimeter of blood.

Continuous Variable: A quantitative variable is continuous when it results from measuring. The accuracy of a continuous variable depends on the refinement of the measurement process. Theoretically it can take an infinite number of possible values.
Examples are - Birth weight of a newborn infant; amount of carbon monoxide in a person’s lungs; level of cholesterol in a cubic millimeter of blood.

Types of Tests

(1) Non-Parametric and (2) Parametric Tests
Non-Parametric Tests - These assume that the variables are measured on a nominal or ordinal scale. They do not make any assumption about the shape of the population from which the samples are drawn, and hence are also known as “distribution free tests” or, more commonly, “non-parametric tests”.
Examples are - Sign test, Chi square test, Kruskal-Wallis test, Rank correlation, etc.

Parametric Tests - These are based on certain parameters. They assume that the samples are either large or come from a normally distributed population.
Examples are - Student’s t-test, F-test, Z-test, etc.

Hypothesis

There are two types of statistical hypothesis - (1) Null hypothesis (2) Alternative hypothesis
The null hypothesis is the hypothesis of no difference and is generally represented by H0. The alternative hypothesis is the opposite of the null hypothesis and is represented by Ha.
e.g. H0: Sample mean = Population mean

Choosing Level of Significance

A statistical hypothesis test provides a process for accepting or rejecting the null hypothesis (H0), or equivalently rejecting or accepting the alternative hypothesis (Ha), while knowing the error rate associated with the decision.
The null hypothesis (H0) is either true or false, and there are two possible decisions, accept or reject H0, creating four possible outcomes of a statistical hypothesis test.

Decision     | H0 is true         | H0 is false
Accept H0    | Correct decision   | Type II error (β)
Reject H0    | Type I error (α)   | Correct decision

A Type I error occurs when a true null hypothesis is rejected, or equivalently when a false alternative hypothesis is accepted, i.e. a random variation has been mistaken for a “real” difference.
The probability of a Type I error (α) is called the level of significance and it is decided prior to the experiment.
A significance level of, say, 1% implies that the researcher is running the risk of wrongly rejecting a true null hypothesis in 1 out of every 100 occasions.
It is possible to test a hypothesis at any level of significance.
A Type II error occurs when the null hypothesis is accepted although the alternative hypothesis is true. Its probability is denoted by β.
(1 - β) is known as the power of the test.

Test for Difference of Means

To test whether there is a significant difference between the means of two samples, or between a sample mean and the population mean, the test for difference of means is applied. There are five situations, viz.
 1. When the sample is large and the difference between the means of sample and population is to be tested
 2. When the sample is large and the difference between the means of two independent samples is to be tested
 3. When the sample is small and the difference between the means of sample and population is to be tested
 4. When the sample is small and the difference between the means of two independent samples is to be tested
 5. Paired t-test

The formulas for these tests are taught in class. Given below are the problems related to these situations. Solve them.

When sample is large and to test difference between means of sample and population

Exercise Question No. 3
In a survey on hearing levels of school children with normal hearing it was found that, at the frequency of 500 cycles per second, 62 children tested in the sound proof room had a mean hearing threshold of 15.5 decibels with a standard deviation of 6.5. 76 comparable children who were tested in the field had a mean threshold of 20.0 decibels with a standard deviation of 7.1. Test if there is any difference between the hearing levels recorded in the sound proof room and in the field.

When sample is large and to test difference between means of two independent samples

Exercise Question No. 4
A potential buyer wants to purchase bulbs in bulk and wants to decide which company's bulbs, A or B, he should purchase.
Which of the two brands is of better quality?

When sample is small and to test difference between means of sample and population

Exercise Question No. 5
A health survey in a few villages revealed that the normal serum protein value of children in that locality is 7.0 g/100 ml. A group of 16 children who received high protein food for a period of 6 months had the serum protein values given in the table. Can we consider that the mean serum protein level of those who were fed on the high protein diet is different from that of the general population?

When sample is small and to test difference between means of two independent samples

Exercise Question No. 6
In a feeding trial, 17 children were given a high protein food supplement to their normal diet and 15 comparable children were kept on the normal diet. They were kept on this feeding for 7 months. At the end of the study the change (initial - final) in the Hb level of the two groups was assessed (data given in table). Does it provide any evidence to say that the change in the Hb level of the children who received high protein food is different from that of the control group?
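For the large-sample comparison of two independent means, the summary figures quoted in Exercise Question No. 3 are enough to compute a test statistic. The sketch below assumes the usual large-sample z statistic, with the sample standard deviations standing in for the population values. The function name is hypothetical and the class formulas may be laid out differently.

```python
from math import sqrt
from statistics import NormalDist

def two_sample_z(mean1, sd1, n1, mean2, sd2, n2):
    """Large-sample z statistic and two-sided p-value for two independent means."""
    # Standard error of the difference between the two sample means.
    standard_error = sqrt(sd1 ** 2 / n1 + sd2 ** 2 / n2)
    z = (mean1 - mean2) / standard_error
    # Two-sided p-value from the standard normal distribution.
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value

# Summary figures from Exercise Question No. 3:
# sound proof room: n = 62, mean = 15.5, SD = 6.5
# field:            n = 76, mean = 20.0, SD = 7.1
z, p = two_sample_z(15.5, 6.5, 62, 20.0, 7.1, 76)
print(f"z = {z:.2f}, two-sided p = {p:.4f}")
```

Here |z| is about 3.88, which exceeds the 5% two-sided critical value of 1.96, so the null hypothesis of no difference would be rejected at that level.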
Paired Sample t-test

Exercise Question No. 7
Twelve pre-school children were given a supplement of multi-purpose food for a period of four months. Their skin fold thickness (in mm) was measured before the commencement of the programme and also at the end. Test if there is any change in skin fold thickness.
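A paired test of this kind works on the before/after differences for each child. The sketch below assumes the usual paired t statistic, the mean difference divided by its standard error, with n - 1 degrees of freedom. The function name and the four pairs of measurements are hypothetical, since the exercise data are not reproduced here.

```python
from math import sqrt
from statistics import mean, stdev

def paired_t(before, after):
    """Return (t statistic, degrees of freedom) for paired measurements."""
    # One difference per subject: after minus before.
    differences = [a - b for b, a in zip(before, after)]
    n = len(differences)
    # stdev() uses the n - 1 divisor, as required for a sample.
    standard_error = stdev(differences) / sqrt(n)
    return mean(differences) / standard_error, n - 1

# Hypothetical skin fold thickness in mm (not the exercise data).
before = [10, 12, 9, 11]
after = [12, 13, 11, 14]
t, df = paired_t(before, after)
print(f"t = {t:.3f} with {df} degrees of freedom")
```

The computed t is compared with the tabulated t value for n - 1 degrees of freedom at the chosen level of significance.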
Time Series Analysis

The term “Time Series” refers to any group of statistical information accumulated at regular intervals.
A time series is an arrangement of statistical data in chronological order, in accordance with its time of occurrence.

Examples are
- series relating to prices,
- production and consumption of various commodities,
- agricultural and industrial production,
- national income and foreign exchange reserves,
- investments, sales and profits of business houses,
- bank deposits and clearings,
- prices and dividends of shares in stock exchange markets

Time series analysis is used to detect patterns of change in statistical information over regular intervals of time. We project these patterns to arrive at an estimate for the future.

There are four kinds of variation or change involved in time series analysis
- Secular trend
- Cyclical fluctuation
- Seasonal variation
- Irregular or random variations

Secular trend
- The value of the variable tends to increase or decrease over a long period of time.
- E.g. the steady increase in the cost of living recorded by the Consumer Price Index.
- For an individual year the cost of living varies a great deal, but if we examine a long-term period we see that the trend is towards a steady increase.

Cyclical fluctuation
- Example: the business cycle.
- The business cycle hits a peak above the trend line.
- Business activity hits a low point below the trend line.
- The time between hitting peaks and falling to low points is at least 1 year and can be as many as 15-20 years.
- Cyclical movements do not follow any regular pattern but move in a somewhat unpredictable manner.

Seasonal variation
- Seasonal variation involves patterns of change within a year that tend to be repeated from year to year.
- E.g. a physician can expect a substantial increase in the number of flu cases every winter.
- Sale of ice cream.
- Sale of crackers in the festive season.
- Because of their regular pattern, they are useful in forecasting the future.

Irregular variations
- In many situations the value of a variable is completely unpredictable, changing in a random manner. Irregular variation describes such movements.
- They are the results of some unexpected events.

In most instances a time series will contain several of these components. Thus, the overall variation in a single time series can be described in terms of these four different kinds of variation.

Trend Analysis

Of the four components of a time series, the secular trend represents the long-term direction of the series.
- To describe the trend component we can fit a trend line by the method of least squares.
- Reasons for studying secular trends
  - Describing historical patterns
  - Projecting past patterns or trends into the future
  - Eliminating the trend component from the series
- Trends can be linear or curvilinear.
- Fitting the linear trend

Equation for estimating a straight line
Y = a + bX
where a = intercept; b = slope of the line; Y = dependent variable and X = time

Equation for estimating a and b when time is coded
Equation for a
Example
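A minimal sketch of fitting the linear trend Y = a + bX by least squares, assuming time is coded as deviations from the mean year so that the coded values sum to zero. Under that assumption the least-squares estimates reduce to b = (sum of xY) / (sum of x squared) and a = mean of Y, where x is the coded time. The function names, years and values are hypothetical.

```python
def fit_coded_linear_trend(years, values):
    """Fit Y = a + b*x by least squares, where x = year - mean year (coded time)."""
    mean_year = sum(years) / len(years)
    coded = [year - mean_year for year in years]      # coded times sum to zero
    b = sum(x * y for x, y in zip(coded, values)) / sum(x * x for x in coded)
    a = sum(values) / len(values)                     # intercept equals mean of Y
    return a, b, mean_year

def forecast(a, b, mean_year, year):
    """Project the fitted trend line to a given year."""
    return a + b * (year - mean_year)

# Hypothetical annual series (not data from these notes).
years = [2001, 2002, 2003, 2004, 2005]
values = [10, 13, 13, 17, 17]
a, b, mean_year = fit_coded_linear_trend(years, values)
next_year = forecast(a, b, mean_year, 2006)
print(f"a = {a:.1f}, b = {b:.1f}, forecast for 2006 = {next_year:.1f}")
```

With an even number of periods the coded values come out as half-units (for example -1.5, -0.5, 0.5, 1.5). The same formulas still apply, because the coded values continue to sum to zero.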