Math 102- Statistics

Dec 15, 2011 with Ma'am Daisy

Usage Rights: CC Attribution-NonCommercial-NoDerivs License

Math 102- Statistics Presentation Transcript

  • INTRODUCTION TO STATISTICS AND STATISTICAL INFERENCE Teaching Basic Statistics
  • Session 1. TEACHING BASIC STATISTICS
  • Realities about Statistics
    • “There are three kinds of lies: lies, damned lies, and statistics.” – Mark Twain
    • One cannot go about without statistics.
    • “Statistics are like bikinis. What they reveal is suggestive, but what they conceal is vital.” – Aaron Levenstein
  • Definition of Statistics
      • plural sense: numerical facts, e.g. CPI, peso-dollar exchange rate
      • singular sense: scientific discipline consisting of theory and methods for processing numerical information that one can use when making decisions in the face of uncertainty.
  • History of Statistics
    • The term statistics came from the Latin phrase “ratio status”, which means the study of practical politics or the statesman’s art.
    • In the middle of the 18th century, the term statistik (a term due to Achenwall) was used, a German term defined as “the political science of several countries”.
    • From statistik it became statistics, defined as a statement in figures and facts of the present condition of a state.
  • Application of Statistics
    • Diverse applications
      • “During the 20th Century statistical thinking and methodology have become the scientific framework for literally dozens of fields including education, agriculture, economics, biology, and medicine, and with increasing influence recently on the hard sciences such as astronomy, geology, and physics. In other words, we have grown from a small obscure field into a big obscure field.” – Brad Efron
  • Application of Statistics
    • Comparing the effects of five kinds of fertilizers on the yield of a particular variety of corn
    • Determining the income distribution of Ateneo students under CHED
    • Comparing the effectiveness of two diet programs
    • Prediction of daily temperatures
    • Evaluation of student performance
  • Two Aims of Statistics
    • Statistics aims to uncover structure in data, to explain variation…
    • Descriptive
    • Inferential
  • Areas of Statistics
    • Descriptive statistics
    • methods concerned with collecting, describing, and analyzing a set of data without drawing conclusions (or inferences) about a larger group
    • Inferential statistics
    • methods concerned with the analysis of a subset of data leading to predictions or inferences about the entire set of data
  • Examples of Descriptive Statistics
    • Presenting the Philippine population by constructing a graph indicating the total number of Filipinos counted during the last census by age group and sex
    • The Department of Social Welfare and Development (DSWD) cited statistics showing an increase in the number of child abuse cases during the past five years.
  • Examples of Inferential Statistics
    • A new milk formulation designed to improve the psychomotor development of infants was tested on randomly selected infants. Based on the results, it was concluded that the new milk formulation is effective in improving the psychomotor development of infants.
  • Inferential Statistics
    • [Figure: a smaller set (n units/observations) drawn from a larger set (N units/observations); inferences and generalizations about the larger set are made from the smaller set.]
  • Key Definitions
    • A variable is a characteristic observed or measured on every unit of the universe.
    • A population is the set of all possible values of the variable.
  • Key Definitions
    • Parameters are numerical measures that describe the population or universe of interest. They are usually denoted by Greek letters: μ (mu), σ (sigma), ρ (rho), λ (lambda), τ (tau), θ (theta), α (alpha) and β (beta).
    • Statistics are numerical measures of a sample.
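To make the parameter/statistic distinction concrete, here is a minimal Python sketch. The population is made-up data (hypothetical heights in cm), and the names `population`, `sample`, `mu`, and `x_bar` are illustrative assumptions, not taken from the slides.

```python
import random
import statistics

random.seed(102)  # fixed seed so the sketch is repeatable

# Hypothetical population of N = 1000 heights in cm (made-up data).
N = 1000
population = [random.gauss(160, 8) for _ in range(N)]

# Parameter: a numerical measure of the whole population.
mu = statistics.fmean(population)

# Statistic: the same measure computed on a random sample of n units.
n = 50
sample = random.sample(population, n)
x_bar = statistics.fmean(sample)

print(f"population mean (parameter) = {mu:.2f}")
print(f"sample mean (statistic)     = {x_bar:.2f}")
```

The two printed values will usually be close but not identical; that gap is what inferential methods try to quantify.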
  • Types of Variables
    • Qualitative variable
      • non-numerical values
    • Quantitative variable
      • numerical values
        • Discrete
          • countable
        • Continuous
          • measurable
  • Levels of Measurement
    • Nominal
      • Numbers or symbols used to classify
    • Ordinal scale
      • Accounts for order; no indication of distance between positions
    • Interval scale
      • Equal intervals; no absolute zero
    • Ratio scale
      • Has absolute zero
  • NOMINAL SCALE
      • a nominal scale consists of a set of categories that have different names
      • measurements on a nominal scale label and categorize observations, but do not make any quantitative distinctions between observations.
      • Variables measured at the nominal scale:
            • Gender (1= male, 0=female)
            • ZIP code (7000=Philippines, …)
            • Plate numbers of vehicles (JK3429, MC001, …)
            • Course (Biology, Mathematics, History, …)
            • Race (Asian, American, …)
            • Eye color (Brown, Blue, …)
  • ORDINAL SCALE
      • consists of a set of categories that are organized in an ordered sequence
      • measurements on an ordinal scale rank observations in terms of size
        • variables that can be measured at the ordinal scale:
          • Ranks in a race (first, second, third, …)
          • Sizes of shirts (small, medium, large, …)
          • Order of birth (first child, second child , third child , …)
          • Socio-economic status (lower, middle, upper, …)
          • Difficulty level of a test (easy, average, difficult, …)
          • Degree of agreement (SD, D, A, SA)
  • INTERVAL SCALE
      • consists of ordered categories that are all intervals of exactly the same size
      • equal differences between numbers on the scale reflect equal differences in magnitude; however, ratios of magnitudes are not meaningful.
        • Variables measured at the interval scale:
            • Temperature (in °F or °C)
            • IQ
            • SAT scores
  • RATIO SCALE
      • is an interval scale with the additional feature of an absolute zero point
      • Ratios of numbers do reflect ratios of magnitude
        • Variables measured at the ratio scale:
            • Age (16, 20, 28, …)
            • Height (165cm, 154cm, 144cm, …)
            • Reaction time (20sec, 43sec, 37sec, …)
            • Number of siblings (2, 5, 8, …)
            • Hours spent on studying for an exam (0, 2, 3, …)
  • Methods of Presenting Data
    • Textual
    • Tabular
    • Graphical
  • Summary Measures
    • Location
      • Maximum, Minimum
      • Central Tendency: Mean, Median, Mode
      • Percentile, Quartile, Decile
    • Variation
      • Range
      • Interquartile Range
      • Variance
      • Standard Deviation
      • Coefficient of Variation
    • Skewness
    • Kurtosis
  • Measures of Location
    • A Measure of Location summarizes a data set by giving a “typical value” within the range of the data values that describes its location relative to the entire data set.
    • Some Common Measures:
      • Minimum, Maximum
      • Central Tendency
      • Percentiles, Deciles, Quartiles
  • Maximum and Minimum
    • Minimum is the smallest value in the data set, denoted as MIN .
    • Maximum is the largest value in the data set, denoted as MAX .
  • Measure of Central Tendency
    • A single value that is used to identify the “center” of the data
      • it is thought of as a typical value of the distribution
      • precise yet simple
      • most representative value of the data
  • Mean
    • Most common measure of the center
    • Also known as arithmetic average
    • Sample Mean: $\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i$
    • Population Mean: $\mu = \frac{1}{N}\sum_{i=1}^{N} x_i$
  • Properties of the Mean
    • may not be an actual observation in the data set
    • can be applied to data measured at the interval level or higher
    • easy to compute
    • every observation contributes to the value of the mean
  • Properties of the Mean
    • subgroup means can be combined to come up with a group mean
    • easily affected by extreme values
    • [Figure: two dot plots, on axes from 0 to 10 and from 0 to 14, with panel labels Mean = 6 and Mean = 5.]
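The property that subgroup means can be combined can be sketched in Python as a size-weighted average. The two sections and their scores are made-up data, and `combined_mean` is a hypothetical helper name.

```python
def combined_mean(sizes, means):
    """Group mean from subgroup sizes and subgroup means (weighted by size)."""
    total = sum(n * m for n, m in zip(sizes, means))
    return total / sum(sizes)

# Hypothetical example: two class sections of different sizes.
section_a = [80, 90]          # n = 2, mean = 85
section_b = [70, 75, 80, 95]  # n = 4, mean = 80

sizes = [len(section_a), len(section_b)]
means = [sum(section_a) / len(section_a), sum(section_b) / len(section_b)]

group_mean = combined_mean(sizes, means)
direct_mean = sum(section_a + section_b) / (len(section_a) + len(section_b))

print(group_mean, direct_mean)  # the two values agree
print(sum(means) / len(means))  # 82.5: the unweighted average of means differs
```

Weighting by subgroup size matters: simply averaging the two subgroup means would overstate the influence of the smaller section.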
  • Median
    • Divides the observations into two equal parts
      • If n is odd, the median is the middle number.
      • If n is even, the median is the average of the 2 middle numbers.
    • Sample median denoted as
    • while population median is denoted as
  • Properties of a Median
    • may not be an actual observation in the data set
    • can be applied to data measured at the ordinal level or higher
    • a positional measure; not affected by extreme values
    • [Figure: two dot plots, on axes from 0 to 10 and from 0 to 14, with Median = 5.]
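A small Python sketch of how an extreme value moves the mean but not the median. The data points behind the slide figures are not recoverable from the transcript, so the two lists below are assumed illustrative values chosen to give the same means (5 and 6) and median (5).

```python
import statistics

# Assumed illustrative data (the plotted points are not in the transcript).
base = [1, 3, 5, 7, 9]
with_extreme = [1, 3, 5, 7, 14]  # largest value replaced by a more extreme one

# Without the extreme value: mean 5, median 5.
print(statistics.mean(base), statistics.median(base))

# With the extreme value: mean rises to 6, median stays at 5.
print(statistics.mean(with_extreme), statistics.median(with_extreme))
```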
  • Mode
    • occurs most frequently
    • nominal average
    • computation of the mode for ungrouped or raw data
    • [Figure: two dot plots, one on an axis from 0 to 14 with Mode = 9 and one on an axis from 0 to 6 with No Mode.]
  • Properties of a Mode
    • can be used for qualitative as well as quantitative data
    • may not be unique
    • not affected by extreme values
    • may not exist
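The three measures of central tendency can be computed with Python's standard library. This is a sketch, not the slides' own method: `modes` is a hypothetical helper, and treating "every value equally frequent" as "no mode" is an assumed convention that matches the "No Mode" panel above. The first data set is the one the deck uses for its standard deviation computation.

```python
import statistics
from collections import Counter

def modes(values):
    """Return all most-frequent values, or [] when every value is equally frequent."""
    counts = Counter(values)
    top = max(counts.values())
    if len(counts) > 1 and all(c == top for c in counts.values()):
        return []  # no value stands out, so there is no mode
    return [v for v, c in counts.items() if c == top]

data = [10, 12, 14, 15, 17, 18, 18, 24]

print(statistics.mean(data))    # 16
print(statistics.median(data))  # 16.0, the average of the two middle values
print(modes(data))              # [18]

print(modes([0, 1, 2, 3, 4, 5, 6]))                    # []: the mode may not exist
print(modes(["red", "blue", "red", "blue", "green"]))  # two modes, qualitative data
```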
  • Mean, Median & Mode
    • Use the mean when:
    • sampling stability is desired
    • other measures are to be computed
  • Mean, Median & Mode
    • Use the median when:
    • the exact midpoint of the distribution is desired
    • there are extreme observations
  • Mean, Median & Mode
    • Use the mode when:
    • the "typical" value is desired
    • the dataset is measured on a nominal scale
  • Percentiles
    • Numerical measures that give the position of a data value relative to the entire data set.
    • Divide an array (raw data arranged in increasing or decreasing order of magnitude) into 100 equal parts.
    • The jth percentile, denoted as P_j, is the data value in the data set that separates the bottom j% of the data from the top (100 − j)%.
  • EXAMPLE
    • Suppose LJ was told that relative to the other scores on a certain test, his score was the 95th percentile.
    • This means that 95% of those who took the test had scores less than or equal to LJ’s score, while 5% had scores higher than LJ’s.
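The interpretation in this example can be checked with a short Python sketch. The 20 test scores and LJ's score of 91 are made-up values, and `percent_at_or_below` is a hypothetical helper.

```python
def percent_at_or_below(scores, score):
    """Percentage of scores that are less than or equal to `score`."""
    at_or_below = sum(1 for s in scores if s <= score)
    return 100 * at_or_below / len(scores)

# Hypothetical test scores for 20 examinees (made-up data).
scores = [52, 55, 58, 60, 61, 63, 65, 66, 68, 70,
          71, 73, 75, 77, 78, 80, 82, 85, 91, 97]
lj_score = 91

print(percent_at_or_below(scores, lj_score))  # 95.0
```

Here 19 of the 20 scores are at or below 91, so 95% of examinees scored no higher than LJ and 5% scored higher.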
  • Deciles
    • Divide an array into ten equal parts, each part having ten percent of the distribution of the data values, denoted by D_j.
    • The 1st decile is the 10th percentile; the 2nd decile is the 20th percentile, and so on.
  • Quartiles
    • Divide an array into four equal parts, each part having 25% of the distribution of the data values, denoted by Q_j.
    • The 1st quartile is the 25th percentile; the 2nd quartile is the 50th percentile, which is also the median; and the 3rd quartile is the 75th percentile.
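The relationships among percentiles, deciles, and quartiles can be checked in Python with `statistics.quantiles`. The data are the pulse rates from the deck's Range example. Textbooks differ on how to interpolate between data values; this sketch relies on the library's default ("exclusive") method, which is an assumption about convention rather than something the slides specify.

```python
import statistics

# Pulse rates of 15 residents (from the Range example in this deck).
pulse = [54, 58, 58, 60, 62, 65, 66, 71, 74, 75, 77, 78, 80, 82, 85]

quartiles = statistics.quantiles(pulse, n=4)      # Q1, Q2, Q3
deciles = statistics.quantiles(pulse, n=10)       # D1 ... D9
percentiles = statistics.quantiles(pulse, n=100)  # P1 ... P99

q1, q2, q3 = quartiles
print(q1 == percentiles[24])  # Q1 is the 25th percentile
print(q2 == percentiles[49] == deciles[4] == statistics.median(pulse))  # Q2 = P50 = D5 = median
print(q3 == percentiles[74])  # Q3 is the 75th percentile
```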
  • Measures of Variation
    • A measure of variation is a single value that is used to describe the spread of the distribution
      • A measure of central tendency alone does not uniquely describe a distribution
  • A look at dispersion…
    • [Figure: three dot plots on an axis from 11 to 21, each with Mean = 15.5: Data A (s = 3.338), Data B (s = 0.9258), Data C (s = 4.57).]
  • Two Types of Measures of Dispersion
    • Absolute Measures of Dispersion:
      • Range
      • Inter-quartile Range
      • Variance
      • Standard Deviation
    • Relative Measure of Dispersion:
      • Coefficient of Variation
  • Range (R)
    • The difference between the maximum and minimum value in a data set, i.e. R = MAX – MIN
    • Example: Pulse rates of 15 male residents of a certain village
      • 54 58 58 60 62 65 66 71 74 75 77 78 80 82 85
      • R = 85 – 54 = 31
  • Some Properties of the Range
    • The larger the value of the range, the more dispersed the observations are.
    • It is quick and easy to understand.
    • A rough measure of dispersion.
  • Inter-Quartile Range (IQR)
    • The difference between the third quartile and first quartile, i.e. IQR = Q_3 – Q_1
    • Example: Pulse rates of 15 residents of a certain village
      • 54 58 58 60 62 65 66 71 74 75 77 78 80 82 85
      • IQR = 78 – 60 = 18
  • Some Properties of IQR
    • Reduces the influence of extreme values.
    • Not as easy to calculate as the Range.
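The range and IQR examples can be reproduced in Python. The quartiles come from `statistics.quantiles` with its default method, which happens to give the slide's Q_1 = 60 and Q_3 = 78 for these 15 values; other quartile conventions may give slightly different results. The modified data set with 120 is a made-up variation to show how each measure reacts to an extreme value.

```python
import statistics

pulse = [54, 58, 58, 60, 62, 65, 66, 71, 74, 75, 77, 78, 80, 82, 85]

def data_range(values):
    """R = MAX - MIN."""
    return max(values) - min(values)

def iqr(values):
    """IQR = Q3 - Q1, using the library's default quartile method."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    return q3 - q1

print(data_range(pulse))  # 31
print(iqr(pulse))         # 18.0

# Made-up variation: replace the largest value with an extreme one.
pulse_extreme = pulse[:-1] + [120]
print(data_range(pulse_extreme))  # 66: the range more than doubles
print(iqr(pulse_extreme))         # 18.0: the IQR is unchanged
```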
  • Variance
    • important measure of variation
    • shows variation about the mean
      • Population variance: $\sigma^2 = \frac{1}{N}\sum_{i=1}^{N} (x_i - \mu)^2$
      • Sample variance: $s^2 = \frac{1}{n-1}\sum_{i=1}^{n} (x_i - \bar{x})^2$
  • Standard Deviation (SD)
    • most important measure of variation
    • square root of Variance
    • has the same units as the original data
    • Population SD: $\sigma = \sqrt{\frac{1}{N}\sum_{i=1}^{N} (x_i - \mu)^2}$
    • Sample SD: $s = \sqrt{\frac{1}{n-1}\sum_{i=1}^{n} (x_i - \bar{x})^2}$
  • Computation of Standard Deviation
    • Data: 10 12 14 15 17 18 18 24
    • n = 8, Mean = 16
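The slide's own worked computation is not in the transcript, so the following Python sketch reconstructs the steps under one assumption: the data are a sample, so the divisor is n − 1 (this agrees with the Team B example later in the deck, where s = 4.0). The population version is shown only for comparison.

```python
import math
import statistics

data = [10, 12, 14, 15, 17, 18, 18, 24]
n = len(data)
mean = sum(data) / n  # 16.0

# Step 1: squared deviations from the mean.
squared_devs = [(x - mean) ** 2 for x in data]

# Step 2: sum of squared deviations.
sum_sq = sum(squared_devs)  # 130.0

# Step 3: divide by n - 1 (sample variance), then take the square root.
sample_var = sum_sq / (n - 1)
sample_sd = math.sqrt(sample_var)

# For comparison only: divisor n, if the 8 values were the whole population.
population_sd = math.sqrt(sum_sq / n)

print(round(sample_sd, 4))      # 4.3095
print(round(population_sd, 4))  # 4.0311
```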
  • Remarks on Standard Deviation
    • If there is a large amount of variation, then on average, the data values will be far from the mean. Hence, the SD will be large.
    • If there is only a small amount of variation, then on average, the data values will be close to the mean. Hence, the SD will be small.
  • Comparing Standard Deviation
    • [Figure: three dot plots on an axis from 11 to 21, each with Mean = 15.5: Data A (s = 3.338), Data B (s = 0.9258), Data C (s = 4.57).]
  • Comparing Standard Deviation
    • Example: Team A - Heights of five marathon players in inches
      • 65”, 65”, 65”, 65”, 65”
      • Mean = 65”, s = 0
  • Comparing Standard Deviation
    • Example: Team B - Heights of five marathon players in inches
      • 62”, 67”, 66”, 70”, 60”
      • Mean = 65”, s = 4.0”
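The two team examples can be verified with a few lines of Python, using the sample standard deviation (`statistics.stdev`, divisor n − 1). Team A is entered as five heights of 65 inches, matching the stated team size.

```python
import statistics

team_a = [65, 65, 65, 65, 65]  # heights in inches
team_b = [62, 67, 66, 70, 60]

for name, heights in (("Team A", team_a), ("Team B", team_b)):
    mean = statistics.mean(heights)
    s = statistics.stdev(heights)  # sample SD
    print(f"{name}: mean = {mean}, s = {s:.1f}")
```

Both teams have the same mean height, but only the standard deviation shows that Team B's players differ from one another while Team A's do not.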
  • Properties of Standard Deviation
    • It is the most widely used measure of dispersion. (Chebychev’s Inequality)
    • It is based on all the items and is rigidly defined.
    • It is used to test the reliability of measures calculated from samples.
    • The standard deviation is sensitive to the presence of extreme values.
    • It is not easy to calculate by hand (unlike the range).
  • Coefficient of Variation (CV)
    • measure of relative variation
    • usually expressed in percent
    • shows variation relative to mean
    • used to compare 2 or more groups
    • Formula: $CV = \frac{SD}{\text{Mean}} \times 100\%$
  • Comparing CVs
    • Stock A: Average Price = P50
    • SD = P5
    • CV = 10%
    • Stock B: Average Price = P100
    • SD = P5
    • CV = 5%
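The comparison above follows from dividing each standard deviation by its mean. A minimal Python sketch, where `coefficient_of_variation` is a hypothetical helper name:

```python
def coefficient_of_variation(sd, mean):
    """CV in percent: the standard deviation expressed relative to the mean."""
    return 100 * sd / mean

cv_a = coefficient_of_variation(sd=5, mean=50)   # Stock A
cv_b = coefficient_of_variation(sd=5, mean=100)  # Stock B

print(cv_a, cv_b)  # 10.0 5.0
```

Both stocks have the same SD of P5, yet Stock A is relatively twice as variable because its average price is half that of Stock B.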
  • Measure of Skewness
    • Describes the degree of departures of the distribution of the data from symmetry.
    • The degree of skewness is measured by the coefficient of skewness, denoted as SK and computed as,
  • What is Symmetry?
    • A distribution is said to be symmetric about the mean if the distribution to the left of the mean is the “mirror image” of the distribution to the right of the mean. Likewise, a symmetric distribution has SK = 0 since its mean is equal to its median and its mode.
  • Measure of Skewness
    • positively skewed
    • negatively skewed
  • Measure of Kurtosis
    • Describes the extent of peakedness or flatness of the distribution of the data.
    • Measured by coefficient of kurtosis ( K ) computed as,
  • Measure of Kurtosis
    • K = 0: mesokurtic
    • K > 0: leptokurtic
    • K < 0: platykurtic
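The formulas for SK and K did not survive in this transcript, so the sketch below is only one possible reading. It assumes the moment-based coefficients, with kurtosis reported as "excess" kurtosis (the moment ratio minus 3) so that K = 0 corresponds to a mesokurtic shape; the slides may instead use a different definition, such as a Pearson coefficient of skewness. The function names and both data sets are hypothetical.

```python
def central_moment(values, k):
    """k-th central moment: the average of (x - mean)**k."""
    mean = sum(values) / len(values)
    return sum((x - mean) ** k for x in values) / len(values)

def skewness(values):
    """Moment coefficient of skewness, m3 / m2**1.5 (assumed definition)."""
    return central_moment(values, 3) / central_moment(values, 2) ** 1.5

def excess_kurtosis(values):
    """Moment coefficient of kurtosis minus 3 (assumed definition)."""
    return central_moment(values, 4) / central_moment(values, 2) ** 2 - 3

symmetric = [1, 2, 3, 4, 5]      # made-up symmetric data
right_tailed = [1, 2, 3, 4, 10]  # made-up data with a long right tail

print(skewness(symmetric))             # 0.0: symmetric
print(skewness(right_tailed) > 0)      # True: positively skewed
print(excess_kurtosis(symmetric) < 0)  # True: flatter than mesokurtic (platykurtic)
```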