2.
Statistics <ul><li>Statistics is the systematic and scientific treatment of quantitative measurements. </li></ul><ul><li>Statistics may be called the science of counting. </li></ul><ul><li>Statistics is concerned with the collection, classification (or organization), presentation and analysis of data that are measurable in numerical terms. </li></ul>
3.
Stages of Statistical Investigation <ul><li>Collection of Data </li></ul><ul><li>Organization of data </li></ul><ul><li>Presentation of data </li></ul><ul><li>Analysis </li></ul><ul><li>Interpretation of Results </li></ul>
4.
Statistics <ul><li>It is divided into two major parts: Descriptive and Inferential Statistics. </li></ul><ul><li>Descriptive statistics is a set of methods for describing the data we have collected, i.e. summarizing the data. </li></ul><ul><li>Inferential statistics is a set of methods used to make a generalization, estimate, prediction or decision, i.e. to draw conclusions about a distribution (population) from a sample. </li></ul>
5.
Collection of Data <ul><li>Data can be collected in two ways: </li></ul><ul><li>>>> Primary Data Collection </li></ul><ul><li>Data collected by a particular person or organization for their own use. </li></ul><ul><li>>>> Secondary Data Collection </li></ul><ul><li>Data collected by some other person or organization, which the investigator obtains for their own use. </li></ul>
6.
Methods of Primary data collection <ul><li>Direct personal interview </li></ul><ul><li>Data through questionnaire </li></ul><ul><li>Indirect investigation </li></ul><ul><li>Etc. </li></ul>
7.
Methods of Secondary data collection <ul><li>Data collected through newspapers & periodicals. </li></ul><ul><li>Data collected from research papers. </li></ul><ul><li>Data collected from government officials. </li></ul><ul><li>Data collected from bodies such as NGOs, the UN, UNESCO, WHO, ILO, UNICEF etc. </li></ul><ul><li>Other published resources </li></ul>
8.
Classification of data <ul><li>Classification is a process of arranging data into sequences and groups according to their common characteristics or separating them into different but related parts. </li></ul><ul><li>It is a process of arranging data into various homogeneous classes and subclasses according to some common characteristics. </li></ul>
9.
Presentation of Data <ul><li>Data should be presented in such a manner, so that it may be easily understood and grasped, and the conclusion may be drawn promptly from the data presented. e.g. </li></ul><ul><li>>>> Histogram </li></ul><ul><li>>>> Frequency polygon & curve </li></ul><ul><li>>>> Pie Chart </li></ul><ul><li>>>> Ogives </li></ul><ul><li>>>> Pictogram & Cartogram </li></ul><ul><li>>>> Bar Chart </li></ul>
10.
Variables <ul><li>Discrete Variable </li></ul><ul><li>e.g. No. of books, tables, chairs </li></ul><ul><li>Continuous Variable </li></ul><ul><li>e.g. Height, Weight </li></ul><ul><li>Quantitative Variable </li></ul><ul><li>One that can be measured on a scale </li></ul><ul><li>Qualitative Variable </li></ul><ul><li>One that cannot be measured on a scale </li></ul>
11.
Frequency Distribution <ul><li>The observations can be recorded in three ways: </li></ul><ul><li>1. Individual Series </li></ul><ul><li>Data are recorded for each individual member. </li></ul><ul><li>2. Discrete Series </li></ul><ul><li>The variable can assume only values separated by intervals (jumps). </li></ul><ul><li>3. Continuous Series </li></ul><ul><li>The variable may take any value, integer or fraction. </li></ul>
12.
Statistics functions & Uses <ul><li>It simplifies complex data </li></ul><ul><li>It provides techniques for comparison </li></ul><ul><li>It studies relationships </li></ul><ul><li>It helps in formulating policies </li></ul><ul><li>It helps in forecasting </li></ul><ul><li>It is helpful for the common man </li></ul><ul><li>Statistical methods combined with the speed of computers can do wonders, e.g. SPSS, STATA, MATLAB, MINITAB etc. </li></ul>
13.
Scope of Statistics <ul><li>In Business Decision Making </li></ul><ul><li>In Medical Sciences </li></ul><ul><li>In Actuarial Science </li></ul><ul><li>In Economic Planning </li></ul><ul><li>In Agricultural Sciences </li></ul><ul><li>In Banking & Insurance </li></ul><ul><li>In Politics & Social Science </li></ul>
14.
Distrust & Misuse of Statistics <ul><li>Statistics is like clay, from which one can make a God or a Devil. </li></ul><ul><li>Statistics are liars of the first order. </li></ul><ul><li>Statistics can prove or disprove anything. </li></ul>
15.
Measure of Central Tendency <ul><li>It is a single value that represents the entire mass of data. Generally, such a value lies in the central part of the distribution. </li></ul><ul><li>It facilitates comparison & decision-making </li></ul><ul><li>There are three main types of measure </li></ul><ul><li>1. Arithmetic mean </li></ul><ul><li>2. Median </li></ul><ul><li>3. Mode </li></ul>
16.
Arithmetic Mean <ul><li>This single representative value can be determined by: </li></ul><ul><li>A.M. = Sum of observations / No. of observations </li></ul><ul><li>Properties: </li></ul><ul><li>1. The sum of the deviations from the AM is always zero. </li></ul><ul><li>2. If every value of the variable is increased or decreased by a constant, the AM increases or decreases by the same constant. </li></ul>
17.
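Arithmetic Mean (contd..) <ul><li>3. If every value of the variable is multiplied or divided by a constant, the AM is multiplied or divided by the same constant. </li></ul><ul><li>4. The sum of squares of deviations is minimum when taken from the AM. </li></ul><ul><li>5. The combined AM of two related groups with sizes n<sub>1</sub>, n<sub>2</sub> and means X̄<sub>1</sub>, X̄<sub>2</sub> is defined as X̄<sub>12</sub> = (n<sub>1</sub>X̄<sub>1</sub> + n<sub>2</sub>X̄<sub>2</sub>) / (n<sub>1</sub> + n<sub>2</sub>), and similarly for more groups. </li></ul>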
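The five properties of the arithmetic mean can be checked numerically. The sketch below uses made-up values (not from the slides) and assumes the usual size-weighted formula (n1·x̄1 + n2·x̄2) / (n1 + n2) for the combined mean.

```python
from statistics import mean

data = [4, 8, 15, 16, 23, 42]   # illustrative values, not from the slides
am = mean(data)                 # 108 / 6 = 18

# Property 1: deviations from the AM sum to zero.
deviation_sum = sum(x - am for x in data)

# Property 2: adding a constant shifts the AM by that constant.
shifted_am = mean([x + 10 for x in data])

# Property 3: multiplying by a constant scales the AM by that constant.
scaled_am = mean([3 * x for x in data])

# Property 4: the sum of squared deviations is smallest about the AM.
def sum_sq_dev(center):
    return sum((x - center) ** 2 for x in data)

# Property 5: combined AM of two groups, weighted by group size (assumed formula).
group_a, group_b = data[:3], data[3:]
combined_am = (
    len(group_a) * mean(group_a) + len(group_b) * mean(group_b)
) / (len(group_a) + len(group_b))

print(am, deviation_sum, shifted_am, scaled_am, combined_am)
print(sum_sq_dev(am), sum_sq_dev(am - 1), sum_sq_dev(am + 1))
```

Shifting the centre by one unit in either direction increases the sum of squared deviations, which is what property 4 claims.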
18.
Median <ul><li>The median is that value of the variable which divides the group into two equal parts, one part comprising all values greater than the median and the other all values less than it. </li></ul><ul><li>Determination of Median </li></ul><ul><li>>>> Arrange the data in ascending or descending order </li></ul><ul><li>>>> Find the value of the ((N+1)/2)th item. </li></ul>
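The two steps above can be sketched as follows. The function name `median_by_position` is hypothetical, and averaging the two middle items when (N+1)/2 is not a whole number is the usual convention, assumed here because the slide does not spell it out.

```python
def median_by_position(values):
    """Median via the (N+1)/2-th item of the ordered data."""
    ordered = sorted(values)              # step 1: arrange the data
    n = len(ordered)
    position = (n + 1) / 2                # step 2: 1-based position of the median
    lower = ordered[int(position) - 1]
    if position.is_integer():             # odd N: a single middle item
        return lower
    upper = ordered[int(position)]        # even N: average the two middle items
    return (lower + upper) / 2

print(median_by_position([7, 3, 9, 1, 5]))       # odd number of items
print(median_by_position([7, 3, 9, 1, 5, 11]))   # even number of items
```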
19.
Mode <ul><li>The mode is the value that occurs most often in the series. </li></ul><ul><li>It is the value around which the items tend to be most heavily concentrated. </li></ul><ul><li>It is an important average when we talk about the “most common size of shoe or shirt”. </li></ul>
20.
Relationship among Mean, Median & Mode <ul><li>For a symmetric distribution: </li></ul><ul><li> Mode = Median = Mean </li></ul><ul><li>The empirical relationship between mean, median and mode for a moderately asymmetric distribution is: </li></ul><ul><li>Mode = 3 Median – 2 Mean </li></ul>
21.
Skewness <ul><li>Mode: Peak of the curve. </li></ul><ul><li>Median: Divides the area under the curve into two equal parts. </li></ul><ul><li>Mean: Center of gravity of the curve. </li></ul><ul><li>For a positively skewed distribution: </li></ul><ul><li>Mean > Median > Mode </li></ul><ul><li>For a negatively skewed distribution: </li></ul><ul><li>Mean < Median < Mode </li></ul>
22.
Dispersion or Variation <ul><li>The average alone does not give a full picture of the distribution, so a further measure is needed to describe it better. </li></ul><ul><li>The extent or degree to which data tend to spread around an average is called dispersion or variation. </li></ul>
23.
Objectives <ul><li>For judging the reliability of averages. </li></ul><ul><li>Comparison of distributions </li></ul><ul><li>Useful for controlling variability </li></ul><ul><li>Useful in further analysis </li></ul>
24.
Measure of Dispersion <ul><li>Range </li></ul><ul><li>Interquartile Range </li></ul><ul><li>Mean Deviation </li></ul><ul><li>Standard Deviation </li></ul>
25.
Range <ul><li>Range is the difference between the largest and the smallest observation. </li></ul><ul><li>Range = L - S </li></ul><ul><li>It is easy to calculate and gives a quick, rough picture of the variation in the data. </li></ul><ul><li>It is a crude measure & is not based on all the observations. </li></ul>
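A minimal sketch of Range = L - S, using made-up scores (not from the slides). Adding a single extreme value shows why the range is called a crude measure: it depends only on the two end observations.

```python
def value_range(values):
    """Range = largest observation (L) minus smallest observation (S)."""
    return max(values) - min(values)

scores = [12, 7, 3, 15, 9]        # illustrative data
with_outlier = scores + [90]      # one extreme value added

print(value_range(scores))        # 15 - 3
print(value_range(with_outlier))  # 90 - 3
```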
26.
Correlation Analysis <ul><li>Correlation denotes the degree of interdependence between variables or the tendency of simultaneous variation between variables. </li></ul><ul><li>Types of Correlation: </li></ul><ul><li>Positive & Negative </li></ul><ul><li>Linear & Non-linear </li></ul><ul><li>Multiple & Partial </li></ul>
27.
Positive & Negative Correlation <ul><li>Positive </li></ul><ul><li>Income Vs Expenditure </li></ul><ul><li>Agricultural Production Vs Rainfall </li></ul><ul><li>Sales Vs Advertising Expenditure </li></ul><ul><li>Cost of raw material Vs Cost of Industrial Production </li></ul><ul><li>Negative </li></ul><ul><li>Price Vs Consumption </li></ul><ul><li>Day temperature Vs Sale of Woolen clothes </li></ul>
28.
Measure of Correlation <ul><li>Scatter Diagram Method </li></ul><ul><li>Karl Pearson’s Coefficient of Correlation </li></ul><ul><li>Spearman’s Coefficient of Rank Correlation </li></ul><ul><li>Concurrent Deviation Method </li></ul>
29.
Scatter Diagram Method <ul><li>It is a graphical method of studying the correlation between variables. </li></ul><ul><li>Here the pairs of observations are plotted as points in a 2-D space. </li></ul><ul><li>The pattern formed by these points gives an idea of the relationship between the variables. </li></ul>
30.
Karl-Pearson’s coefficient of correlation (r) <ul><li>The value of r lies between -1 and +1, i.e. -1 ≤ r ≤ +1 </li></ul><ul><li>The coefficient of correlation is independent of change of origin and scale. </li></ul><ul><li>The coefficient ‘r’ is symmetric: r<sub>xy</sub> = r<sub>yx</sub> </li></ul><ul><li>The probable error of ‘r’ is used to interpret its estimated value. </li></ul>
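The slide lists properties of r without its formula. The sketch below assumes the standard product-moment definition r = Σ(x - x̄)(y - ȳ) / √[Σ(x - x̄)² · Σ(y - ȳ)²] and uses made-up data to check symmetry and independence of origin and scale.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Product-moment correlation coefficient (standard definition, assumed)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

xs = [1, 2, 3, 4, 5]   # illustrative data
ys = [2, 4, 5, 4, 5]

r_xy = pearson_r(xs, ys)
r_yx = pearson_r(ys, xs)                               # symmetry
r_transformed = pearson_r([10 + 2 * x for x in xs],    # change of origin and scale
                          [y / 4 - 1 for y in ys])

print(round(r_xy, 4), round(r_yx, 4), round(r_transformed, 4))
```

Note that invariance holds for positive scale factors; multiplying one variable by a negative constant flips the sign of r.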
31.
Spearman’s Coefficient of Rank Correlation <ul><li>Karl-Pearson’s method deals with the relationship between quantitative variables, whereas Spearman’s coefficient is suitable for qualitative variables that can only be ranked. For example, when two judges rank the participants in a contest, it measures the agreement between the ranks given by the two judges. </li></ul>
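The judges example can be sketched with the standard formula for untied ranks, ρ = 1 - 6Σd² / [n(n² - 1)], where d is the difference between the two ranks given to each participant. The formula is not stated on the slide and the ranks below are made up.

```python
def spearman_rho(ranks_a, ranks_b):
    """Spearman's rank correlation for untied ranks (standard formula, assumed)."""
    n = len(ranks_a)
    sum_d_squared = sum((a - b) ** 2 for a, b in zip(ranks_a, ranks_b))
    return 1 - (6 * sum_d_squared) / (n * (n ** 2 - 1))

judge_1 = [1, 2, 3, 4, 5]   # ranks given to five participants (illustrative)
judge_2 = [2, 1, 4, 3, 5]

rho = spearman_rho(judge_1, judge_2)
print(round(rho, 4))
```

Identical rankings give ρ = 1 and exactly reversed rankings give ρ = -1.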
32.
Concurrent Deviation Method <ul><li>This is the simplest method, in which only the direction of change is taken into consideration rather than the magnitude of variation. </li></ul><ul><li>It quickly gives a general idea of the correlation between variables. </li></ul>
33.
Regression Analysis <ul><li>It is concerned with the formulation and determination of an algebraic expression for the relationship between variables. </li></ul><ul><li>For this purpose we use regression lines. </li></ul><ul><li>These regression lines are used for predicting the value of one variable from that of the other. </li></ul>
34.
Regression Analysis contd.. <ul><li>Here the variable whose value is to be predicted is called the dependent (explained) variable and the variable used for prediction is called the independent (explanatory) variable. </li></ul><ul><li>This method was first introduced by “ Sir Francis Galton ”. </li></ul><ul><li>It helps in prediction & estimation. </li></ul>
35.
Properties of Regression Lines & Coefficient <ul><li>The regression line Y on X is used to estimate the best value of Y (Dep.) for a given value of X (Indep.). </li></ul><ul><li>The regression line X on Y is used to estimate the best value of X (Dep.) for a given value of Y (Indep.). </li></ul><ul><li>Both the regression coefficients are independent of change of origin, but not of change of scale. </li></ul>
36.
Properties of Regression Lines & Coefficient (contd..) <ul><li>The relation between r, b<sub>yx</sub> and b<sub>xy</sub> is </li></ul><ul><li>r = ±√(b<sub>yx</sub> · b<sub>xy</sub>), where r takes the sign common to both coefficients </li></ul><ul><li>Both the regression coefficients must have the same sign. </li></ul><ul><li>Both the regression coefficients cannot be numerically greater than one simultaneously. </li></ul><ul><li>A regression coefficient denotes a rate of change, i.e. b<sub>yx</sub> measures the change in Y for a unit change in X. </li></ul>
37.
Properties of Regression Lines & Coefficient (contd..) <ul><li>Both lines cut each other at (X̄, Ȳ), the point of the means. </li></ul><ul><li>If r = 0, the two lines are perpendicular to each other. </li></ul><ul><li>If the regression lines are identical, the correlation between the variables is perfect. </li></ul>
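A small numerical check of these properties, assuming the standard least-squares coefficients b_yx = Σ(x - x̄)(y - ȳ) / Σ(x - x̄)² and b_xy = Σ(x - x̄)(y - ȳ) / Σ(y - ȳ)², which the slides do not state. The data are made up.

```python
from math import sqrt, copysign

xs = [1, 2, 3, 4, 5]   # illustrative data
ys = [2, 4, 5, 4, 5]
n = len(xs)
mean_x, mean_y = sum(xs) / n, sum(ys) / n

sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
sxx = sum((x - mean_x) ** 2 for x in xs)
syy = sum((y - mean_y) ** 2 for y in ys)

b_yx = sxy / sxx   # slope of the line of Y on X
b_xy = sxy / syy   # slope of the line of X on Y

# r is the square root of the product, with the sign shared by both coefficients.
r = copysign(sqrt(b_yx * b_xy), b_yx)

def predict_y(x):
    """Line of Y on X: passes through (mean_x, mean_y) with slope b_yx."""
    return mean_y + b_yx * (x - mean_x)

def predict_x(y):
    """Line of X on Y: passes through (mean_x, mean_y) with slope b_xy."""
    return mean_x + b_xy * (y - mean_y)

print(b_yx, b_xy, round(r, 4))
print(predict_y(mean_x), predict_x(mean_y))   # both lines meet at the means
```

Here b_xy equals 1 and b_yx is below 1, consistent with the rule that both coefficients cannot exceed one in magnitude at the same time.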
38.
Standard Error of Estimate <ul><li>It provides a measure of the scatter of the observations about an average line. The standard error of estimate of Y on X is: </li></ul><ul><li>S<sub>Y.X</sub> = √[ Σ (Y - Y<sub>est</sub>)<sup>2</sup> / N ] </li></ul>
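The formula above can be sketched as follows. The data are made up, and the fitted line of Y on X is assumed to be the usual least-squares line through the means.

```python
from math import sqrt

xs = [1, 2, 3, 4, 5]   # illustrative data
ys = [2, 4, 5, 4, 5]
n = len(xs)
mean_x, mean_y = sum(xs) / n, sum(ys) / n

# Least-squares slope of Y on X (assumed fitting method).
slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sum(
    (x - mean_x) ** 2 for x in xs
)

# Estimated Y for each observed X.
y_est = [mean_y + slope * (x - mean_x) for x in xs]

# S_Y.X = sqrt( sum (Y - Y_est)^2 / N ), dividing by N as on the slide.
std_error = sqrt(sum((y - ye) ** 2 for y, ye in zip(ys, y_est)) / n)

print(round(std_error, 4))
```

Some texts divide by N - 2 instead of N; the sketch follows the slide's version.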
39.
Probability <ul><li>Probability is a concept that numerically measures the degree of uncertainty or certainty of the occurrence of an event, i.e. the chance that the event occurs. </li></ul><ul><li>The probability of an event A is </li></ul><ul><li>P(A) = (No. of favorable cases) / (Total no. of cases) </li></ul>
40.
Probability <ul><li>If P(A)=0, Impossible Event </li></ul><ul><li>If P(A)=1, Sure Event </li></ul><ul><li>0≤P(A)≤1 </li></ul><ul><li>P(A)= Probability of occurrence </li></ul><ul><li>P(Ā)= Probability of Non-occurrence </li></ul><ul><li>P(A) + P(Ā) = 1 </li></ul>
41.
Some Keywords <ul><li>Equally Likely Events : Events whose chances of occurrence are the same in an experiment. </li></ul><ul><li>Mutually Exclusive Events : Events such that the occurrence of any one of them prevents the occurrence of the others in the same experiment. </li></ul><ul><li>Sample Space : The set of all possible outcomes. </li></ul>
42.
Some Keywords <ul><li>Independent Events: Two or more events such that the occurrence of one does not affect the occurrence of the others. </li></ul><ul><li>Dependent Events: Events such that the occurrence of one influences the occurrence of the other. </li></ul>
43.
Classical or A Priori Probability <ul><li>If a trial results in ‘n’ exhaustive, mutually exclusive and equally likely cases and ‘m’ of them are favorable to the happening of an event E, then the probability ‘P’ of the happening of E is given by: </li></ul><ul><li>P(E) = m / n </li></ul>
44.
Empirical or A Posteriori Probability <ul><li>The classical definition requires that ‘n’ is finite and that all cases are equally likely. </li></ul><ul><li>These conditions are very restrictive and cannot cover all situations. </li></ul><ul><li>The empirical definition does not require them: the probability is estimated as the relative frequency of the event over a large number of trials. </li></ul>
45.
Fundamental rule of counting <ul><li>If an event can occur in ‘m’ ways and, following it, a second event can occur in ‘n’ ways, then these two events in succession can occur in ‘m x n’ ways. </li></ul><ul><li>E.g. A tricolor can be formed out of 6 colors in 6 x 5 x 4 = 120 ways. </li></ul><ul><li>The no. of 3-letter words with no repeated letter from the 26 letters of the alphabet is 26 x 25 x 24 = 15600. </li></ul>
46.
Permutations <ul><li>The different arrangements that can be made out of a given no. of things by taking some or all at a time are called permutations. </li></ul><ul><li>P(n,r) = n! / (n-r)! </li></ul><ul><li>E.g. permutations made with the letters a, b, c by taking two at a time: </li></ul><ul><li>P(3,2) = 6 </li></ul><ul><li>ab, ba, ac, ca, bc, cb </li></ul>
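The formula and the a, b, c example can be reproduced with the standard library. The helper name `n_permutations` is hypothetical; `itertools.permutations` enumerates the arrangements themselves.

```python
from itertools import permutations
from math import factorial

def n_permutations(n, r):
    """P(n, r) = n! / (n - r)!"""
    return factorial(n) // factorial(n - r)

# Enumerate the arrangements of a, b, c taken two at a time.
arrangements = ["".join(p) for p in permutations("abc", 2)]

print(n_permutations(3, 2), arrangements)
print(n_permutations(6, 3))    # tricolor from 6 colors
print(n_permutations(26, 3))   # 3-letter words without repeated letters
```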
47.
Combinations <ul><li>A combination of ‘n’ different objects taken ‘r’ at a time is a selection of ‘r’ out of the ‘n’ objects with no attention given to the order of arrangement. </li></ul><ul><li>C(n,r) = n! / [r! (n-r)!] </li></ul><ul><li>E.g. From 5 boys & 6 girls, a group of 3 having 2 boys & 1 girl can be formed in C(5,2) x C(6,1) = 60 ways. </li></ul>
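A short check of the group-selection example, using `math.comb` for C(n, r) and a brute-force enumeration as a cross-check. The labels for the boys and girls are made up.

```python
from itertools import combinations
from math import comb

# Formula: choose 2 of 5 boys, then 1 of 6 girls.
ways_by_formula = comb(5, 2) * comb(6, 1)

# Cross-check by enumerating every group of 3 and keeping those with 2 boys.
people = [f"B{i}" for i in range(1, 6)] + [f"G{i}" for i in range(1, 7)]
ways_by_enumeration = sum(
    1
    for group in combinations(people, 3)
    if sum(name.startswith("B") for name in group) == 2
)

print(ways_by_formula, ways_by_enumeration)
```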
48.
Example <ul><li>A coin is tossed three times. Find the probability of getting: </li></ul><ul><li>Exactly one head </li></ul><ul><li>Exactly two heads </li></ul><ul><li>One or two heads </li></ul>
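One way to work this example is to list the sample space and count favorable cases, assuming a fair coin so that all 8 outcomes are equally likely. The helper `probability` is a hypothetical name.

```python
from fractions import Fraction
from itertools import product

# Sample space: all sequences of H/T over three tosses (8 outcomes).
outcomes = list(product("HT", repeat=3))

def probability(event):
    """Classical probability: favorable cases / total cases."""
    favorable = sum(1 for outcome in outcomes if event(outcome))
    return Fraction(favorable, len(outcomes))

p_one_head = probability(lambda o: o.count("H") == 1)
p_two_heads = probability(lambda o: o.count("H") == 2)
p_one_or_two = probability(lambda o: o.count("H") in (1, 2))

print(p_one_head, p_two_heads, p_one_or_two)
```

The last result equals the sum of the first two because "exactly one head" and "exactly two heads" are mutually exclusive.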
49.
Example <ul><li>One card is randomly drawn from a pack of 52 cards. Find the probability that </li></ul><ul><li>The drawn card is red </li></ul><ul><li>The drawn card is an ace </li></ul><ul><li>The drawn card is a red king </li></ul><ul><li>The drawn card is red or a king </li></ul>
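A sketch of this example that builds a standard 52-card deck and counts favorable cards, assuming every card is equally likely to be drawn. The tuple layout `(rank, suit)` is an illustrative choice.

```python
from fractions import Fraction

ranks = ["A", "2", "3", "4", "5", "6", "7", "8", "9", "10", "J", "Q", "K"]
suit_colour = {"hearts": "red", "diamonds": "red", "clubs": "black", "spades": "black"}
deck = [(rank, suit) for rank in ranks for suit in suit_colour]   # 52 cards

def probability(event):
    """Favorable cards / total cards."""
    return Fraction(sum(1 for card in deck if event(card)), len(deck))

def is_red(card):
    return suit_colour[card[1]] == "red"

def is_king(card):
    return card[0] == "K"

p_red = probability(is_red)
p_ace = probability(lambda card: card[0] == "A")
p_red_king = probability(lambda card: is_red(card) and is_king(card))
p_red_or_king = probability(lambda card: is_red(card) or is_king(card))

print(p_red, p_ace, p_red_king, p_red_or_king)
```

The last value also follows from P(red) + P(king) - P(red king) = 26/52 + 4/52 - 2/52.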
50.
Example <ul><li>A bag contains 3 red, 6 white and 7 blue balls. Two balls are drawn at random. Find the probability that </li></ul><ul><li>Both the balls are white. </li></ul><ul><li>Both the balls are blue. </li></ul><ul><li>One ball is red & other is white. </li></ul><ul><li>One ball is white & other is blue. </li></ul>
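This example can be worked by enumerating every unordered pair of balls, assuming the two balls are drawn together (without replacement), which the slide does not state explicitly.

```python
from fractions import Fraction
from itertools import combinations

balls = ["R"] * 3 + ["W"] * 6 + ["B"] * 7            # 16 balls in the bag
pairs = list(combinations(range(len(balls)), 2))      # C(16, 2) = 120 pairs

def probability(colours):
    """Probability that the drawn pair has exactly the given two colours."""
    wanted = sorted(colours)
    favorable = sum(1 for i, j in pairs if sorted((balls[i], balls[j])) == wanted)
    return Fraction(favorable, len(pairs))

p_both_white = probability(("W", "W"))
p_both_blue = probability(("B", "B"))
p_red_white = probability(("R", "W"))
p_white_blue = probability(("W", "B"))

print(p_both_white, p_both_blue, p_red_white, p_white_blue)
```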
51.
Addition Theorem <ul><li>For any two events A and B, the probability of the occurrence of A or B is given by: </li></ul><ul><li>P(A ∪ B) = P(A) + P(B) – P(A ∩ B) </li></ul><ul><li>If A & B are mutually exclusive then </li></ul><ul><li> P(A ∩ B) = 0 </li></ul><ul><li>P(A ∪ B) = P(A) + P(B) </li></ul>
52.
Multiplication or Conditional Probability <ul><li>The probability of an event B when it is known that the event A has already occurred: </li></ul><ul><li> P(B|A) = P(A ∩ B) / P(A), if P(A) > 0 </li></ul><ul><li>i.e. P(A ∩ B) = P(A) · P(B|A) </li></ul><ul><li>If A and B are independent events: </li></ul><ul><li> P(A ∩ B) = P(A) · P(B) </li></ul>
53.
Example <ul><li>A bag contains 25 balls numbered from 1 to 25. Two balls are drawn at random from the bag with replacement. Find the probability of selecting: </li></ul><ul><li>Both odd numbers. </li></ul><ul><li>One odd & one even. </li></ul><ul><li>At least one odd. </li></ul><ul><li>No odd numbers. </li></ul><ul><li>Both even numbers. </li></ul>
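Because the balls are drawn with replacement, every ordered pair of numbers from 1 to 25 is equally likely, so the example can be worked by enumerating all 625 pairs. The helper `probability` is a hypothetical name.

```python
from fractions import Fraction
from itertools import product

draws = list(product(range(1, 26), repeat=2))   # 25 x 25 ordered pairs

def probability(event):
    return Fraction(sum(1 for d in draws if event(d)), len(draws))

def odd_count(draw):
    """Number of odd values in the pair."""
    return sum(value % 2 for value in draw)

p_both_odd = probability(lambda d: odd_count(d) == 2)
p_one_odd_one_even = probability(lambda d: odd_count(d) == 1)
p_at_least_one_odd = probability(lambda d: odd_count(d) >= 1)
p_no_odd = probability(lambda d: odd_count(d) == 0)   # same event as "both even"

print(p_both_odd, p_one_odd_one_even, p_at_least_one_odd, p_no_odd)
```

With 13 odd and 12 even numbers, the first result matches (13/25)², as the multiplication rule for independent events predicts.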
54.
Example <ul><li>Five men in a company of 20 are graduates. If 3 men are picked at random, what is the probability that they are all graduates? What is the probability that at least one is a graduate? </li></ul>
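A sketch of this example using combinations, assuming the 3 men are picked without replacement. "At least one graduate" is computed through its complement, "no graduate".

```python
from fractions import Fraction
from math import comb

total_men, graduates, picked = 20, 5, 3
total_ways = comb(total_men, picked)                      # C(20, 3)

# All three picked men are graduates.
p_all_graduates = Fraction(comb(graduates, picked), total_ways)

# At least one graduate = 1 - P(all three are non-graduates).
p_no_graduate = Fraction(comb(total_men - graduates, picked), total_ways)
p_at_least_one = 1 - p_no_graduate

print(p_all_graduates, p_at_least_one)
```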
55.
Example <ul><li>The probability that A hits a target is 1/3 and the probability that B hits the target is 2/5. What is the probability that the target will be hit if each of A and B shoots at the target? </li></ul>
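A sketch of this example, assuming the two shots are independent (the slide does not say so, but the problem cannot be solved without it). The target is hit unless both miss; the addition theorem gives the same answer.

```python
from fractions import Fraction

p_a = Fraction(1, 3)   # A hits the target
p_b = Fraction(2, 5)   # B hits the target

# Complement route: the target is missed only if both A and B miss.
p_hit_via_complement = 1 - (1 - p_a) * (1 - p_b)

# Addition theorem with independence: P(A or B) = P(A) + P(B) - P(A) * P(B).
p_hit_via_addition = p_a + p_b - p_a * p_b

print(p_hit_via_complement, p_hit_via_addition)
```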
56.
Expected Value of Probability <ul><li>Let X be a random variable with the following distribution: </li></ul><ul><li>X : x<sub>1</sub>, x<sub>2</sub>, x<sub>3</sub>, … </li></ul><ul><li>P(X) : P(x<sub>1</sub>), P(x<sub>2</sub>), P(x<sub>3</sub>), … </li></ul><ul><li>Expected Value is given by: </li></ul><ul><li>E(X) = Σ x<sub>i</sub> · P(x<sub>i</sub>) </li></ul>
57.
Example <ul><li>A player tosses two coins. If two heads show he wins Rs. 4; if one head shows he wins Rs. 2; but if two tails show he pays Rs. 3 as a penalty. Calculate the expected value of the game to him. </li></ul><ul><li>Solution: </li></ul><ul><li>E(X) = (-3)(¼) + (2)(½) + (4)(¼) = 1.25 </li></ul>
58.
Example <ul><li>An insurance company sells a particular life insurance policy with a face value of Rs. 1000 and a yearly premium of Rs. 20. If 0.2% of the policy holders can be expected to die in the course of a year, what would be the company’s expected earning per policy holder per year? </li></ul><ul><li>E(X) = (-980)(0.002) + (20)(0.998) = 18 </li></ul>
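Both expected-value examples above follow E(X) = Σ x · P(x) and can be checked with exact fractions. The function name `expected_value` and the dictionary layout (outcome mapped to probability) are illustrative choices.

```python
from fractions import Fraction

def expected_value(distribution):
    """E(X) = sum of x * P(x); the probabilities must add up to 1."""
    assert sum(distribution.values()) == 1
    return sum(x * p for x, p in distribution.items())

# Coin game: pay 3 on two tails, win 2 on one head, win 4 on two heads.
coin_game = {-3: Fraction(1, 4), 2: Fraction(1, 2), 4: Fraction(1, 4)}

# Insurance: the company nets 20 - 1000 = -980 if the holder dies, else +20.
insurance = {-980: Fraction(2, 1000), 20: Fraction(998, 1000)}

e_coin = expected_value(coin_game)
e_insurance = expected_value(insurance)

print(float(e_coin), float(e_insurance))
```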