Math 102- Statistics

Published on

Dec 15, 2011 with Ma'am Daisy

  1. 1. INTRODUCTION TO STATISTICS AND STATISTICAL INFERENCE Teaching Basic Statistics
  2. 2. Session 1. TEACHING BASIC STATISTICS
  3. 3. Realities about Statistics <ul><li>“There are three kinds of lies: lies, damned lies, and statistics.” – Mark Twain </li></ul><ul><li>One cannot get by without statistics. </li></ul><ul><li>“Statistics are like bikinis. What they reveal is suggestive, but what they conceal is vital.” – Aaron Levenstein </li></ul>Session 1. TEACHING BASIC STATISTICS
  4. 4. Definition of Statistics <ul><ul><li>plural sense: numerical facts, e.g. CPI, peso-dollar exchange rate </li></ul></ul><ul><ul><li>singular sense: scientific discipline consisting of theory and methods for processing numerical information that one can use when making decisions in the face of uncertainty. </li></ul></ul>Session 1. TEACHING BASIC STATISTICS
  5. 5. History of Statistics <ul><li>The term statistics came from the Latin phrase “ratio status”, which means the study of practical politics or the statesman’s art. </li></ul><ul><li>In the middle of the 18th century, the German term statistik (due to Achenwall) came into use, defined as “the political science of several countries”. </li></ul><ul><li>From statistik came statistics, defined as a statement in figures and facts of the present condition of a state. </li></ul>Session 1. TEACHING BASIC STATISTICS
  6. 6. Application of Statistics <ul><li>Diverse applications </li></ul><ul><ul><li>“During the 20th Century statistical thinking and methodology have become the scientific framework for literally dozens of fields including education, agriculture, economics, biology, and medicine, and with increasing influence recently on the hard sciences such as astronomy, geology, and physics. In other words, we have grown from a small obscure field into a big obscure field.” – Brad Efron </li></ul></ul>Session 1. TEACHING BASIC STATISTICS
  7. 7. Application of Statistics <ul><li>Comparing the effects of five kinds of fertilizers on the yield of a particular variety of corn </li></ul><ul><li>Determining the income distribution of Ateneo students under CHED </li></ul><ul><li>Comparing the effectiveness of two diet programs </li></ul><ul><li>Prediction of daily temperatures </li></ul><ul><li>Evaluation of student performance </li></ul>Session 1. TEACHING BASIC STATISTICS
  8. 8. Two Aims of Statistics <ul><li>Statistics aims to uncover structure in data, to explain variation… </li></ul><ul><li>Descriptive </li></ul><ul><li>Inferential </li></ul>Session 1. TEACHING BASIC STATISTICS
  9. 9. Areas of Statistics <ul><li>Descriptive statistics </li></ul><ul><li>methods concerned with collecting, describing, and analyzing a set of data without drawing conclusions (or inferences) about a larger group </li></ul><ul><li>Inferential statistics </li></ul><ul><li>methods concerned with the analysis of a subset of data leading to predictions or inferences about the entire set of data </li></ul>Session 1. TEACHING BASIC STATISTICS
  10. 10. Examples of Descriptive Statistics <ul><li>Presenting the Philippine population by constructing a graph indicating the total number of Filipinos counted during the last census by age group and sex </li></ul><ul><li>The Department of Social Welfare and Development (DSWD) cited statistics showing an increase in the number of child abuse cases during the past five years. </li></ul>Session 1. TEACHING BASIC STATISTICS
  11. 11. Examples of Inferential Statistics <ul><li>A new milk formulation designed to improve the psychomotor development of infants was tested on randomly selected infants. Based on the results, it was concluded that the new milk formulation is effective in improving the psychomotor development of infants. </li></ul>Session 1. TEACHING BASIC STATISTICS
  12. 12. Inferential Statistics [Diagram: a smaller set (n units/observations) taken from a larger set (N units/observations); inferences and generalizations run from the smaller set to the larger set.] Session 1. TEACHING BASIC STATISTICS
  13. 13. Key Definitions <ul><li>A variable is a characteristic observed or measured on every unit of the universe. </li></ul><ul><li>A population is the set of all possible values of the variable. </li></ul>Session 1. TEACHING BASIC STATISTICS
  14. 14. Key Definitions <ul><li>Parameters are numerical measures that describe the population or universe of interest. They are usually denoted by Greek letters: μ (mu), σ (sigma), ρ (rho), λ (lambda), τ (tau), θ (theta), α (alpha), and β (beta). </li></ul><ul><li>Statistics are numerical measures of a sample. </li></ul>Session 1. TEACHING BASIC STATISTICS
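The difference between a parameter and a statistic can be sketched in Python. The population below is simulated and purely hypothetical (it does not come from the slides), and the names `population`, `sample`, `mu`, and `x_bar` are illustrative choices.

```python
import random
import statistics

random.seed(102)  # fixed seed so the sketch is repeatable

# Hypothetical population: N = 1000 simulated measurements.
population = [random.gauss(50, 10) for _ in range(1000)]

# Parameter: computed from every unit in the population.
mu = statistics.fmean(population)

# Statistic: computed from a random sample of n = 50 units.
sample = random.sample(population, 50)
x_bar = statistics.fmean(sample)

print(f"population mean (parameter) mu = {mu:.2f}")
print(f"sample mean (statistic) x_bar = {x_bar:.2f}")
```

The sample mean will usually be close to, but not equal to, the population mean; inferential statistics is concerned with how far apart the two are likely to be.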
  15. 15. Types of Variables <ul><li>Qualitative variable </li></ul><ul><ul><li>non-numerical values </li></ul></ul><ul><li>Quantitative variable </li></ul><ul><ul><li>numerical values </li></ul></ul><ul><ul><ul><li>Discrete </li></ul></ul></ul><ul><ul><ul><ul><li>countable </li></ul></ul></ul></ul><ul><ul><ul><li>Continuous </li></ul></ul></ul><ul><ul><ul><ul><li>measurable </li></ul></ul></ul></ul>Session 1. TEACHING BASIC STATISTICS
  16. 16. Levels of Measurement <ul><li>Nominal </li></ul><ul><ul><li>Numbers or symbols used to classify </li></ul></ul><ul><li>Ordinal scale </li></ul><ul><ul><li>Accounts for order; no indication of distance between positions </li></ul></ul><ul><li>Interval scale </li></ul><ul><ul><li>Equal intervals; no absolute zero </li></ul></ul><ul><li>Ratio scale </li></ul><ul><ul><li>Has absolute zero </li></ul></ul>Session 1. TEACHING BASIC STATISTICS
  17. 17. Session 1. TEACHING BASIC STATISTICS <ul><li>NOMINAL SCALE </li></ul><ul><ul><li>a nominal scale consists of a set of categories that have different names </li></ul></ul><ul><ul><li>measurements on a nominal scale label and categorize observations, but do not make any quantitative distinctions between observations. Variables measured at the nominal scale: </li></ul></ul><ul><ul><ul><ul><ul><li>Gender (1= male, 0=female) </li></ul></ul></ul></ul></ul><ul><ul><ul><ul><ul><li>ZIP code (7000=Philippines, …) </li></ul></ul></ul></ul></ul><ul><ul><ul><ul><ul><li>Plate numbers of vehicles (JK3429, MC001, …) </li></ul></ul></ul></ul></ul><ul><ul><ul><ul><ul><li>Course (Biology, Mathematics, History, …) </li></ul></ul></ul></ul></ul><ul><ul><ul><ul><ul><li>Race (Asian, American, …) </li></ul></ul></ul></ul></ul><ul><ul><ul><ul><ul><li>Eye color (Brown, Blue, …) </li></ul></ul></ul></ul></ul>
  18. 18. Session 1. TEACHING BASIC STATISTICS <ul><li>ORDINAL SCALE </li></ul><ul><ul><li>consists of a set of categories that are organized in an ordered sequence </li></ul></ul><ul><ul><li>measurements on an ordinal scale rank observations in terms of size </li></ul></ul><ul><ul><ul><li>variables that can be measured at the ordinal scale: </li></ul></ul></ul><ul><ul><ul><ul><li>Ranks in a race (first, second, third, …) </li></ul></ul></ul></ul><ul><ul><ul><ul><li>Sizes of shirts (small, medium, large, …) </li></ul></ul></ul></ul><ul><ul><ul><ul><li>Order of birth (first child, second child, third child, …) </li></ul></ul></ul></ul><ul><ul><ul><ul><li>Socio-economic status (lower, middle, upper, …) </li></ul></ul></ul></ul><ul><ul><ul><ul><li>Difficulty level of a test (easy, average, difficult, …) </li></ul></ul></ul></ul><ul><ul><ul><ul><li>Degree of agreement (SD, D, A, SA) </li></ul></ul></ul></ul>
  19. 19. Session 1. TEACHING BASIC STATISTICS <ul><li>INTERVAL SCALE </li></ul><ul><ul><li>consists of ordered categories that are all intervals of exactly the same size </li></ul></ul><ul><ul><li>equal differences between numbers on the scale reflect equal differences in magnitude; however, ratios of magnitudes are not meaningful. </li></ul></ul><ul><ul><ul><li>Variables measured at the interval scale: </li></ul></ul></ul><ul><ul><ul><ul><ul><li>Temperature (in °F or °C) </li></ul></ul></ul></ul></ul><ul><ul><ul><ul><ul><li>IQ </li></ul></ul></ul></ul></ul><ul><ul><ul><ul><ul><li>SAT scores </li></ul></ul></ul></ul></ul>
  20. 20. Session 1. TEACHING BASIC STATISTICS <ul><li>RATIO SCALE </li></ul><ul><ul><li>is an interval scale with the additional feature of an absolute zero point </li></ul></ul><ul><ul><li>Ratios of numbers do reflect ratios of magnitude </li></ul></ul><ul><ul><ul><li>Variables measured at the ratio scale: </li></ul></ul></ul><ul><ul><ul><ul><ul><li>Age (16, 20, 28, …) </li></ul></ul></ul></ul></ul><ul><ul><ul><ul><ul><li>Height (165cm, 154cm, 144cm, …) </li></ul></ul></ul></ul></ul><ul><ul><ul><ul><ul><li>Reaction time (20sec, 43sec, 37sec, …) </li></ul></ul></ul></ul></ul><ul><ul><ul><ul><ul><li>Number of siblings (2, 5, 8, …) </li></ul></ul></ul></ul></ul><ul><ul><ul><ul><ul><li>Hours spent on studying for an exam (0, 2, 3, …) </li></ul></ul></ul></ul></ul>
  21. 21. Methods of Presenting Data <ul><li>Textual </li></ul><ul><li>Tabular </li></ul><ul><li>Graphical </li></ul>Session 1. TEACHING BASIC STATISTICS
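  22. 22. Summary Measures <ul><li>Location </li></ul><ul><ul><li>Maximum, Minimum </li></ul></ul><ul><ul><li>Central Tendency: Mean, Median, Mode </li></ul></ul><ul><ul><li>Percentile, Quartile, Decile </li></ul></ul><ul><li>Variation </li></ul><ul><ul><li>Range, Interquartile Range, Variance, Standard Deviation, Coefficient of Variation </li></ul></ul><ul><li>Skewness </li></ul><ul><li>Kurtosis </li></ul>Session 1. TEACHING BASIC STATISTICS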
  23. 23. Measures of Location <ul><li>A Measure of Location summarizes a data set by giving a “typical value” within the range of the data values that describes its location relative to the entire data set. </li></ul><ul><li>Some Common Measures: </li></ul><ul><li>Minimum, Maximum </li></ul><ul><li>Central Tendency </li></ul><ul><li>Percentiles, Deciles, Quartiles </li></ul>Session 1. TEACHING BASIC STATISTICS
  24. 24. Maximum and Minimum <ul><li>Minimum is the smallest value in the data set, denoted as MIN . </li></ul><ul><li>Maximum is the largest value in the data set, denoted as MAX . </li></ul>Session 1. TEACHING BASIC STATISTICS
  25. 25. Measure of Central Tendency <ul><li>A single value that is used to identify the “center” of the data </li></ul><ul><ul><li>it is thought of as a typical value of the distribution </li></ul></ul><ul><ul><li>precise yet simple </li></ul></ul><ul><ul><li>most representative value of the data </li></ul></ul>Session 1. TEACHING BASIC STATISTICS
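  26. 26. Mean <ul><li>Most common measure of the center </li></ul><ul><li>Also known as the arithmetic average </li></ul><ul><li>Sample mean: $\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i$ </li></ul><ul><li>Population mean: $\mu = \frac{1}{N}\sum_{i=1}^{N} x_i$ </li></ul>Session 1. TEACHING BASIC STATISTICS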
  27. 27. Properties of the Mean <ul><li>may not be an actual observation in the data set </li></ul><ul><li>applicable to data measured at the interval level or higher </li></ul><ul><li>easy to compute </li></ul><ul><li>every observation contributes to the value of the mean </li></ul>Session 1. TEACHING BASIC STATISTICS
  28. 28. Properties of the Mean <ul><li>subgroup means can be combined to come up with a group mean </li></ul><ul><li>easily affected by extreme values </li></ul>[Figure: two dot plots, one on an axis from 0 to 10 and one on an axis extended to 14; labels: Mean = 5 and Mean = 6.] Session 1. TEACHING BASIC STATISTICS
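The property that subgroup means can be combined into a group mean amounts to a weighted average, with each subgroup mean weighted by its size. A minimal sketch, where the function name `combined_mean` and the two class sections are hypothetical:

```python
def combined_mean(means, sizes):
    """Weighted average of subgroup means, weighted by subgroup size."""
    total = sum(m * n for m, n in zip(means, sizes))
    return total / sum(sizes)

# Hypothetical sections: 20 students averaging 80 and 30 students averaging 90.
print(combined_mean([80, 90], [20, 30]))  # 86.0
```

Simply averaging 80 and 90 would give 85, which understates the group mean because the larger section has the higher average.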
  29. 29. Median <ul><li>Divides the observations into two equal parts </li></ul><ul><ul><li>If n is odd, the median is the middle number. </li></ul></ul><ul><ul><li>If n is even, the median is the average of the 2 middle numbers. </li></ul></ul><ul><li>Sample median denoted as </li></ul><ul><li>while population median is denoted as </li></ul>Session 1. TEACHING BASIC STATISTICS
  30. 30. Properties of a Median <ul><li>may not be an actual observation in the data set </li></ul><ul><li>applicable to data measured at the ordinal level or higher </li></ul><ul><li>a positional measure; not affected by extreme values </li></ul>[Figure: two dot plots, one on an axis from 0 to 10 and one on an axis extended to 14; label: Median = 5.] Session 1. TEACHING BASIC STATISTICS
  31. 31. Mode <ul><li>the value that occurs most frequently </li></ul><ul><li>nominal average </li></ul><ul><li>computation of the mode for ungrouped or raw data </li></ul>[Figure: two dot plots, one on an axis from 0 to 14 labeled Mode = 9 and one on an axis from 0 to 6 labeled No Mode.] Session 1. TEACHING BASIC STATISTICS
  32. 32. Properties of a Mode <ul><li>can be used for qualitative as well as quantitative data </li></ul><ul><li>may not be unique </li></ul><ul><li>not affected by extreme values </li></ul><ul><li>may not exist </li></ul>Session 1. TEACHING BASIC STATISTICS
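The contrast between the three measures can be checked with Python's standard `statistics` module. The two small data sets below are hypothetical; they were chosen only so that the means come out to 5 and 6, matching the labels on the slides' figures.

```python
import statistics

without_extreme = [1, 3, 5, 7, 9]   # hypothetical data
with_extreme = [1, 3, 5, 7, 14]     # same data, largest value replaced by 14

for data in (without_extreme, with_extreme):
    print(data, "mean =", statistics.mean(data), "median =", statistics.median(data))

# multimode returns every most-frequent value, so ties are visible.
print(statistics.multimode(["brown", "blue", "brown", "green"]))  # ['brown']
print(statistics.multimode([1, 1, 2, 2, 3]))                      # [1, 2]
```

The extreme value moves the mean from 5 to 6 while the median stays at 5. The mode works on qualitative data and need not be unique. When every value occurs exactly once, `multimode` returns all of the values, which corresponds to what the slides call "no mode".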
  33. 33. Mean, Median & Mode <ul><li>Use the mean when: </li></ul><ul><li>sampling stability is desired </li></ul><ul><li>other measures are to be computed </li></ul>Session 1. TEACHING BASIC STATISTICS
  34. 34. Mean, Median & Mode <ul><li>Use the median when: </li></ul><ul><li>the exact midpoint of the distribution is desired </li></ul><ul><li>there are extreme observations </li></ul>Session 1. TEACHING BASIC STATISTICS
  35. 35. Mean, Median & Mode <ul><li>Use the mode when: </li></ul><ul><li>the "typical" value is desired </li></ul><ul><li>the dataset is measured on a nominal scale </li></ul>Session 1. TEACHING BASIC STATISTICS
  36. 36. Percentiles <ul><li>Numerical measures that give the position of a data value relative to the entire data set. </li></ul><ul><li>Divide an array (raw data arranged in increasing or decreasing order of magnitude) into 100 equal parts. </li></ul><ul><li>The jth percentile, denoted as $P_j$, is the data value in the data set that separates the bottom j% of the data from the top (100 − j)%. </li></ul>Session 1. TEACHING BASIC STATISTICS
  37. 37. EXAMPLE <ul><li>Suppose LJ was told that, relative to the other scores on a certain test, his score was at the 95th percentile. </li></ul><ul><li>This means that 95% of those who took the test had scores less than or equal to LJ’s score, while 5% had scores higher than LJ’s. </li></ul>Session 1. TEACHING BASIC STATISTICS
  38. 38. Deciles <ul><li>Divide an array into ten equal parts, each part having ten percent of the distribution of the data values, denoted by $D_j$. </li></ul><ul><li>The 1st decile is the 10th percentile; the 2nd decile is the 20th percentile, and so on. </li></ul>Session 1. TEACHING BASIC STATISTICS
  39. 39. Quartiles <ul><li>Divide an array into four equal parts, each part having 25% of the distribution of the data values, denoted by $Q_j$. </li></ul><ul><li>The 1st quartile is the 25th percentile; the 2nd quartile is the 50th percentile, which is also the median; and the 3rd quartile is the 75th percentile. </li></ul>Session 1. TEACHING BASIC STATISTICS
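The correspondence between percentiles, deciles, and quartiles can be verified with `statistics.quantiles` from Python's standard library. This is only a sketch: the scores 1 to 99 are hypothetical, and the library's default "exclusive" interpolation is one of several conventions, so other software may report slightly different cut points on other data.

```python
import statistics

# Hypothetical test scores 1, 2, ..., 99, already arranged as an array.
scores = list(range(1, 100))

# Cut points from the standard library (default "exclusive" method).
percentiles = statistics.quantiles(scores, n=100)  # P1 .. P99
deciles = statistics.quantiles(scores, n=10)       # D1 .. D9
quartiles = statistics.quantiles(scores, n=4)      # Q1 .. Q3

print(deciles[0] == percentiles[9])     # D1 is P10
print(quartiles[0] == percentiles[24])  # Q1 is P25
print(quartiles[1] == percentiles[49] == statistics.median(scores))  # Q2 is P50 is the median
print(quartiles[2] == percentiles[74])  # Q3 is P75
```

Each comparison prints `True` for this data set.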
  40. 40. Measures of Variation <ul><li>A measure of variation is a single value that is used to describe the spread of the distribution </li></ul><ul><ul><li>A measure of central tendency alone does not uniquely describe a distribution </li></ul></ul>Session 1. TEACHING BASIC STATISTICS
  41. 41. A look at dispersion… [Figure: three dot plots on a common axis from 11 to 21, each with Mean = 15.5: Data A (s = 3.338), Data B (s = 0.9258), Data C (s = 4.57).] Session 1. TEACHING BASIC STATISTICS
  42. 42. Two Types of Measures of Dispersion <ul><li>Absolute Measures of Dispersion: </li></ul><ul><li>Range </li></ul><ul><li>Inter-quartile Range </li></ul><ul><li>Variance </li></ul><ul><li>Standard Deviation </li></ul><ul><li>Relative Measure of Dispersion: </li></ul><ul><li>Coefficient of Variation </li></ul>Session 1. TEACHING BASIC STATISTICS
  43. 43. Range (R) <ul><li>The difference between the maximum and minimum values in a data set, i.e. R = MAX – MIN </li></ul><ul><li>Example: Pulse rates of 15 male residents of a certain village: 54 58 58 60 62 65 66 71 74 75 77 78 80 82 85 </li></ul><ul><li>R = 85 – 54 = 31 </li></ul>Session 1. TEACHING BASIC STATISTICS
  44. 44. Some Properties of the Range <ul><li>The larger the value of the range, the more dispersed the observations are. </li></ul><ul><li>It is quick and easy to understand. </li></ul><ul><li>A rough measure of dispersion. </li></ul>Session 1. TEACHING BASIC STATISTICS
  45. 45. Inter-Quartile Range (IQR) <ul><li>The difference between the third quartile and the first quartile, i.e. IQR = $Q_3 - Q_1$ </li></ul><ul><li>Example: Pulse rates of 15 residents of a certain village: 54 58 58 60 62 65 66 71 74 75 77 78 80 82 85 </li></ul><ul><li>IQR = 78 – 60 = 18 </li></ul>Session 1. TEACHING BASIC STATISTICS
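The range and inter-quartile range of the pulse-rate example can be reproduced in a few lines of Python. This sketch assumes the quartiles are located with the standard library's default "exclusive" method, which places $Q_1$ and $Q_3$ at the 4th and 12th ordered values for n = 15 and so agrees with the figures on the slides.

```python
import statistics

pulse_rates = [54, 58, 58, 60, 62, 65, 66, 71, 74, 75, 77, 78, 80, 82, 85]

# Range: largest value minus smallest value.
data_range = max(pulse_rates) - min(pulse_rates)

# Quartiles, then the inter-quartile range.
q1, q2, q3 = statistics.quantiles(pulse_rates, n=4)
iqr = q3 - q1

print("R =", data_range)
print("Q1 =", q1, "Q3 =", q3, "IQR =", iqr)
```

The range uses only the two most extreme values, while the IQR ignores the lowest and highest quarters of the data, which is why it is less influenced by extreme observations.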
  46. 46. Some Properties of IQR <ul><li>Reduces the influence of extreme values. </li></ul><ul><li>Not as easy to calculate as the Range. </li></ul>Session 1. TEACHING BASIC STATISTICS
  47. 47. Variance <ul><li>important measure of variation </li></ul><ul><li>shows variation about the mean </li></ul><ul><ul><li>Population variance: $\sigma^2 = \frac{1}{N}\sum_{i=1}^{N} (x_i - \mu)^2$ </li></ul></ul><ul><ul><li>Sample variance: $s^2 = \frac{1}{n-1}\sum_{i=1}^{n} (x_i - \bar{x})^2$ </li></ul></ul>Session 1. TEACHING BASIC STATISTICS
  48. 48. Standard Deviation (SD) <ul><li>most important measure of variation </li></ul><ul><li>square root of the variance </li></ul><ul><li>has the same units as the original data </li></ul><ul><li>Population SD: $\sigma = \sqrt{\frac{1}{N}\sum_{i=1}^{N} (x_i - \mu)^2}$ </li></ul><ul><ul><ul><li>Sample SD: $s = \sqrt{\frac{1}{n-1}\sum_{i=1}^{n} (x_i - \bar{x})^2}$ </li></ul></ul></ul>Session 1. TEACHING BASIC STATISTICS
  49. 49. Computation of Standard Deviation <ul><li>Data: 10 12 14 15 17 18 18 24 </li></ul><ul><li>n = 8, Mean = 16 </li></ul>Session 1. TEACHING BASIC STATISTICS
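The computation for this data set can be traced step by step in Python. The sketch assumes the sample formula with divisor n − 1, which is the convention consistent with the Team B example later in the deck (s = 4.0); the variable names are illustrative.

```python
import math
import statistics

data = [10, 12, 14, 15, 17, 18, 18, 24]
n = len(data)
mean = sum(data) / n  # 128 / 8 = 16.0

# Squared deviation of each observation from the mean.
squared_deviations = [(x - mean) ** 2 for x in data]
sum_sq = sum(squared_deviations)   # 130.0

variance = sum_sq / (n - 1)        # sample variance, divisor n - 1
sd = math.sqrt(variance)           # sample standard deviation

print(f"mean = {mean}, sum of squared deviations = {sum_sq}")
print(f"s^2 = {variance:.4f}, s = {sd:.4f}")
```

The result agrees with `statistics.stdev(data)`, which uses the same n − 1 divisor; `statistics.pstdev` would divide by n instead.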
  50. 50. Session 1. TEACHING BASIC STATISTICS Remarks on Standard Deviation <ul><li>If there is a large amount of variation, then on average, the data values will be far from the mean. Hence, the SD will be large. </li></ul><ul><li>If there is only a small amount of variation, then on average, the data values will be close to the mean. Hence, the SD will be small. </li></ul>
  51. 51. Comparing Standard Deviation [Figure: three dot plots on a common axis from 11 to 21, each with Mean = 15.5: Data A (s = 3.338), Data B (s = 0.9258), Data C (s = 4.57).] Session 1. TEACHING BASIC STATISTICS
  52. 52. Comparing Standard Deviation <ul><li>Example: Team A - Heights of five marathon players in inches: 65″, 65″, 65″, 65″, 65″ </li></ul><ul><li>Mean = 65″, s = 0″ </li></ul>Session 1. TEACHING BASIC STATISTICS
  53. 53. Comparing Standard Deviation <ul><li>Example: Team B - Heights of five marathon players in inches: 62″, 67″, 66″, 70″, 60″ </li></ul><ul><li>Mean = 65″, s = 4.0″ </li></ul>Session 1. TEACHING BASIC STATISTICS
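The two team examples can be checked side by side. This sketch assumes Team A consists of five heights of 65 inches each, as the slide title states, and uses the sample standard deviation (divisor n − 1).

```python
import statistics

team_a = [65, 65, 65, 65, 65]   # assumed: five identical heights
team_b = [62, 67, 66, 70, 60]

for name, heights in (("Team A", team_a), ("Team B", team_b)):
    print(name, "mean =", statistics.mean(heights), "s =", statistics.stdev(heights))
```

Both teams have the same mean height, so the mean alone cannot distinguish them; only the standard deviation shows that Team B's heights vary while Team A's do not.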
  54. 54. Properties of Standard Deviation <ul><li>It is the most widely used measure of dispersion. (Chebychev’s Inequality) </li></ul><ul><li>It is based on all the items and is rigidly defined. </li></ul><ul><li>It is used to test the reliability of measures calculated from samples. </li></ul><ul><li>The standard deviation is sensitive to the presence of extreme values. </li></ul><ul><li>It is not easy to calculate by hand (unlike the range). </li></ul>Session 1. TEACHING BASIC STATISTICS
  55. 55. Coefficient of Variation (CV) <ul><li>measure of relative variation </li></ul><ul><li>usually expressed in percent </li></ul><ul><li>shows variation relative to the mean </li></ul><ul><li>used to compare 2 or more groups </li></ul><ul><li>Formula: $CV = \frac{s}{\bar{x}} \times 100\%$ </li></ul>Session 1. TEACHING BASIC STATISTICS
  56. 56. Comparing CVs <ul><li>Stock A: Average Price = P50 </li></ul><ul><li>SD = P5 </li></ul><ul><li>CV = 10% </li></ul><ul><li>Stock B: Average Price = P100 </li></ul><ul><li>SD = P5 </li></ul><ul><li>CV = 5% </li></ul>Session 1. TEACHING BASIC STATISTICS
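The stock comparison can be reproduced with a one-line function. This sketch assumes the CV is the standard deviation divided by the mean, expressed in percent, which is the definition that yields the 10% and 5% figures above; the function name is illustrative.

```python
def coefficient_of_variation(sd, mean):
    """Return the CV in percent: standard deviation relative to the mean."""
    return 100 * sd / mean

stock_a = coefficient_of_variation(sd=5, mean=50)
stock_b = coefficient_of_variation(sd=5, mean=100)

print(f"Stock A: CV = {stock_a}%")
print(f"Stock B: CV = {stock_b}%")
```

Both stocks have the same standard deviation of P5, but relative to its average price Stock A is twice as variable as Stock B.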
  57. 57. Measure of Skewness <ul><li>Describes the degree of departure of the distribution of the data from symmetry. </li></ul><ul><li>The degree of skewness is measured by the coefficient of skewness, denoted as SK and computed as, </li></ul>Session 1. TEACHING BASIC STATISTICS
  58. 58. What is Symmetry? <ul><li>A distribution is said to be symmetric about the mean if the distribution to the left of the mean is the “mirror image” of the distribution to the right of the mean. A symmetric distribution has SK = 0, since its mean is equal to its median and its mode. </li></ul>Session 1. TEACHING BASIC STATISTICS
  59. 59. Measure of Skewness <ul><li>positively skewed </li></ul><ul><li>negatively skewed </li></ul>Session 1. TEACHING BASIC STATISTICS
  60. 60. Measure of Kurtosis <ul><li>Describes the extent of peakedness or flatness of the distribution of the data. </li></ul><ul><li>Measured by coefficient of kurtosis ( K ) computed as, </li></ul>Session 1. TEACHING BASIC STATISTICS
  61. 61. Measure of Kurtosis <ul><li>K = 0: mesokurtic </li></ul><ul><li>K > 0: leptokurtic </li></ul><ul><li>K < 0: platykurtic </li></ul>Session 1. TEACHING BASIC STATISTICS
