General Studies
              Community Dentistry 1
                Statistical Inference
                          Lecture 6
             Dr Nizam Abdullah




    Contents

              Review of descriptive statistics
              The normal curve
              Introduction to inferential
              statistics



          © The University of Adelaide, School of Dentistry




Descriptive statistics

    Central tendency
               Mean
               Median (50th Percentile)
               Mode
    Dispersion
        Standard deviation (SD) / Variance
        Inter-quartile range (IQR) (3rd quartile – 1st
        quartile)
        Range (Maximum – Minimum)
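As a rough illustration of the measures listed above, the sketch below computes each one with Python's standard `statistics` module. The `ages` list is a made-up set of values chosen for illustration, not data from the lecture.

```python
import statistics

# Hypothetical ages in years (illustrative values, not the lecture's dataset).
ages = [18, 18, 19, 20, 21, 22, 24, 29]

# Central tendency
mean = statistics.mean(ages)
median = statistics.median(ages)      # the 50th percentile
mode = statistics.mode(ages)          # the most frequent value

# Dispersion
sd = statistics.stdev(ages)           # sample SD (divides by n - 1)
variance = statistics.variance(ages)  # sample variance, equal to SD squared
q1, q2, q3 = statistics.quantiles(ages, n=4)  # quartile cut points
iqr = q3 - q1                         # 3rd quartile minus 1st quartile
value_range = max(ages) - min(ages)   # maximum minus minimum

print(f"mean={mean}, median={median}, mode={mode}")
print(f"SD={sd:.2f}, variance={variance:.2f}, IQR={iqr}, range={value_range}")
```

Note that `statistics.quantiles` supports more than one interpolation method, so quartiles from other software may differ slightly.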




      Distribution of a variable
         Another important aspect of the description of a
         variable is the shape of its distribution, which tells
         you the frequency of values from different ranges
         of the variable.
         Typically, a researcher is interested in how well
         the distribution can be approximated by the normal
         distribution.
         The normal distribution can be used to determine
         how far the sample is likely to be off from the
         overall population, i.e. how big a ‘margin of error’
         there is likely to be.
         Simple descriptive statistics can provide some
         information relevant to this issue.




Distribution of a variable (cont.)
       A variable is said to be a normally distributed
       variable or to have a normal distribution if its
       distribution has the shape of a normal curve - the
       normal curve is a kind of bell-shaped curve.
       A normal distribution (and hence a normal curve) is
       completely determined by its mean and standard
       deviation - the mean and standard deviation are
       called the parameters of the normal curve.
       The normal curve is symmetric and centered about
       the mean.
       The standard deviation determines the spread of
       the curve. The larger the standard deviation, the
       flatter and more spread out the curve will be.




     Normal curve (cont.)
    The mean, median, and mode all have the same value.








Different shapes of the Normal curve

                    Standard deviation changes the relative width of the
                    distribution; the larger the standard deviation, the wider
                    the curve.








                Properties of normal distribution
    • Bell-shaped curve
    • Symmetrical about its mean (mirror image to each side)
    • Mean and median are equal.
    • One side of the mean is 50% of the area.
    • The area between mean-1SD and mean+1SD is 68% (Mean±1SD = 30, 50).
    • The area between mean-2SD and mean+2SD is 95% (Mean±2SD = 20, 60).
    • The area between mean-3SD and mean+3SD is 99.7% (Mean±3SD = 10, 70).

    e.g. (Age) Mean = 40, SD = 10
    Therefore, Mean±1SD = 30, 50
    Between 30 yr and 50 yr old, there will be 68% of the group.

    [Figure: Age distribution of Village A; histogram by 5-year age group (0-4 to 80+) with the 50%, 68%, 95% and 99.7% areas marked]
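The percentages above can be checked numerically. The sketch below uses `statistics.NormalDist` from the Python standard library with the slide's example values (mean 40, SD 10); the helper name `share_within` is introduced here for illustration only.

```python
from statistics import NormalDist

# Age example from the slide: mean = 40 years, SD = 10 years.
age = NormalDist(mu=40, sigma=10)

def share_within(k):
    """Proportion of a normal distribution lying within k SDs of its mean."""
    upper = age.cdf(age.mean + k * age.stdev)
    lower = age.cdf(age.mean - k * age.stdev)
    return upper - lower

for k in (1, 2, 3):
    low = age.mean - k * age.stdev
    high = age.mean + k * age.stdev
    print(f"{low:.0f} to {high:.0f} yr: {share_within(k):.1%}")
```

The exact areas are slightly different from the rounded 68%, 95% and 99.7% quoted on the slide, which is why the rule is usually described as approximate.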




Normal curve: 68-95-99.7 rule

      • 68% of the observations fall within one standard deviation
        of the mean (µ − σ to µ + σ).
      • 95% of the observations fall within two standard deviations
        of the mean (µ − 2σ to µ + 2σ).
      • 99.7% of the observations fall within three standard
        deviations of the mean (µ − 3σ to µ + 3σ).




     Distributions: Negative
     In a negatively skewed distribution, the mode is at the
     top of the curve, the median is lower than it, and the
     mean is lower than the median.
     The result is a ‘tail’ towards the more negative side of
     the graph.


                                                               Negative skewness
                                                               (tail to left = left skewed)
                                                               Median < Mode
                                                               Mean < Median








Distributions: Positive
     In a positively skewed distribution, the mode is at the
     top of the curve, the median is higher than it, and the
     mean is higher than the median.
     The result is a ‘tail’ towards the more positive side of
     the graph.



                                                               Positive skewness :
                                                               (tail to right = right skewed)
                                                               Median > Mode
                                                               Mean > Median








    Example dataset
      First Year BDS students enrolled in EBD1.
      Response to survey: n= 90 (out of 119), or 76%.
      Variables:
       – Age: quantitative variable measured on a ratio scale
       – Sex: qualitative variable measured on a nominal
         scale, i.e. variable with categories male or female
       – Height: quantitative variable measured on a ratio scale
       – Weight: quantitative variable measured on a ratio
         scale

      Variables measured at a higher level can always be
      converted to a lower level, but not vice versa.
      For example, observations of actual age (ratio
      scale) can be converted to categories of older and
      younger (ordinal scale). Similarly for height and
      weight.




Data spreadsheet
                            Case    Age    Sex    Height    Weight
                               1     21      2       165        45
                               2     22      2       170        53
                               3     18      1         .        74
                               4     20      2       165        44
                               5     19      1       175        70
                               6     19      2       163        53
                               7     24      2       163        49
                               8     18      1       170        60
                               9     29      2       178        70
                              10     28      2       163        58
                              11     18      2       177        72
                              12     38      2       164        65
                              13     23      2       161        65
                              14     20      2       178        63
                              15     29      2       159        54
                               :      :      :         :         :




               Frequency distribution of height variable

    Height (cm)    Frequency
    150-155             4
    155-160             9
    160-165            21
    165-170            24
    170-175            10
    175-180            10
    180-185             7
    185-190             3
    190-195             1
    Total              89

    Mode = 165cm, 170cm
    Mean = 169.3cm
    Median = 168cm

    [Figure: histogram of height (cm), axis from 150 to 195 in 5 cm steps]




Frequency distribution of weight variable

    Weight (kg)   Freq.
    40-45            4
    45-50            9
    50-55           15
    55-60           16
    60-65           13
    65-70            7
    70-75           11
    75-80            4
    80-85            3
    85-90            3
    90-95            2
    95-100           1
    100-105          1
    105-110          0
    110-115          0
    115-120          0
    120-125          1
    Total           90

    Mean = 62.6 kg
    Median = 60 kg

    [Figure: histogram of weight (kg), axis from 40 to 125 in 5 kg steps]




                  Frequency distribution of age variable

                                                        Mode      <    Median <        Mean
                                                        18 yrs         19 yrs          20 yrs
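The ordering Mode < Median < Mean is the signature of a positively skewed variable. As an illustrative check, the sketch below uses only the 15 ages visible in the data spreadsheet, so its values differ from the full-sample figures above (18, 19 and 20 years), but the same ordering appears.

```python
import statistics

# Ages of cases 1-15 as listed in the data spreadsheet (a subset of the 90 respondents).
ages = [21, 22, 18, 20, 19, 19, 24, 18, 29, 28, 18, 38, 23, 20, 29]

mode = statistics.mode(ages)
median = statistics.median(ages)
mean = statistics.mean(ages)

print(f"mode={mode}, median={median}, mean={mean:.1f}")

# A long tail of older students pulls the mean above the median.
positively_skewed = mode < median < mean
print("Mode < Median < Mean:", positively_skewed)
```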








Descriptive statistics


        Variable       Freq          Min        Max      Range     SD
        Age             90           17.0       38.0        21.0   3.4

        Height          89          152.0     191.0         39.0   8.8

        Weight          90           40.0     120.0         80.0 14.5



              Variable       Category         Freq            %
              Sex            Male              32          35.6

                             Female            58          64.4

                             Total             90        100.0





        What is Inferential Statistics?
    It is the statistical technique used to generalise a result
    from a sample (a statistic) to the population it was drawn
    from (a parameter).

    Population (Village A): mean µ = ? (unknown)
    Sample: mean x̄ = 10.14

    Using the sample mean x̄ to draw a conclusion about the
    population mean µ is called “Inferential Statistics”.




Statistical inference

     Inferential statistics are used to draw
     inferences about a population from a sample.

     For example, the average number of decayed
     teeth in children aged 5 years can be
     estimated using observations from a sample
     of 5-year-olds.








     Selecting a sample from a population

     How can a sample that is representative of the
     population of interest be selected?

     Answer: by random selection

     When a random sample is drawn from the population
     of interest, every member of the population has the
     same probability, or chance, of being selected in the
     sample.

     For this reason, random samples are considered to
     be unbiased.
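A simple random sample of this kind can be drawn with Python's standard `random` module. The sketch below is illustrative only: the population of 1,000 numbered children and the sample size of 100 are assumptions, not figures from the lecture.

```python
import random

# Hypothetical sampling frame: ID numbers for a population of 1,000 children.
population = list(range(1, 1001))

# A fixed seed is used only so that the example is repeatable.
rng = random.Random(2024)

# Draw 100 members without replacement; every member of the
# population has the same chance of ending up in the sample.
sample = rng.sample(population, k=100)

print(sorted(sample)[:10])
```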





Two types of Inferential Statistics

              Parameter Estimation

              Hypothesis testing








         1. Parameter estimation

         • Parameter estimation takes two
           forms:
             • 1.       Point estimation
             • 2.       Interval estimation








Definition

     • A point estimate is a single numerical
       value used to estimate the
       corresponding population parameter
     • An interval estimate consists of two
       numerical values defining a range of
       values that, with a specific degree of
       confidence, we feel includes the
       parameter being estimated




     Parameter estimation

      A point estimate gives the estimate of the population
      parameter as a single number, e.g. the sample mean,
      median, variance or standard deviation.
      An interval estimate gives a range of values within
      which the population parameter is thought to lie: a
      confidence interval, defined by the lower and upper
      limits of that range.
      Point and interval estimates let us infer the true
      value of an unknown population parameter using
      information from a random sample of that
      population.





Confidence intervals (cont.)
        Example
           Suppose a paper reports that, among a sample of 2,823
           5–6-year-old children living in Sharjah, the mean number of
           decayed teeth is 0.81 (SD = 1.66) with a 95% confidence
           interval of (0.75, 0.87).

        Interpretation
            The 95% confidence interval is a range of plausible values
            for the mean number of decayed teeth in the population of
            5–6-year-old children living in Sharjah.
            Because only a sample of children was used, the exact
            population mean cannot be known for certain.
            Hence, the 95% confidence interval indicates the margin of
            imprecision due to sampling error.
            Strictly, the 95% refers to the method: if the study were
            repeated many times, about 95% of the intervals calculated
            in this way would contain the true population mean.





           1. Estimation (CI)
     Population: µ = ?
     Sample: x̄ = 10.14

     CI = x̄ ± {t(α/2) × (Standard Error)}
     95% CI = x̄ ± {t(0.025) × (S.E.)}
     95% CI = 10.14 ± {1.98 × (0.43)}
     (t(0.025) ≈ 1.98 for n − 1 = 99 degrees of freedom;
      it approaches 1.96 in large samples)







1. Estimation (CI)
         Population: µ = ?
         Sample: x̄ = 10.14, s.d. = 4.3, n = 100

         S.E. = s.d. / √n = 4.3 / √100 = 0.43

         CI = x̄ ± {t(α/2) × (Standard Error)}
         95% CI = x̄ ± {t(0.025) × (S.E.)}
         95% CI = 10.14 ± {1.98 × (0.43)}
                  (t(0.025) ≈ 1.98 for n − 1 = 99 degrees of freedom;
                   it approaches 1.96 in large samples)
         95% CI = 10.14 ± 0.8514
         95% CI = 9.29, 10.99
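The arithmetic of this interval can be reproduced in a few lines. The sketch below is a minimal illustration; the function name `confidence_interval` is introduced here, and the multipliers are passed in as table values rather than computed. The margin of 0.8514 corresponds to a multiplier of about 1.98 (the t value for 99 degrees of freedom); the large-sample value 1.96 gives a margin of 0.8428 and an interval of about 9.30 to 10.98.

```python
import math

def confidence_interval(mean, sd, n, multiplier):
    """Return (lower, upper) limits: mean ± multiplier × standard error."""
    standard_error = sd / math.sqrt(n)
    margin = multiplier * standard_error
    return mean - margin, mean + margin

# Sample values from the slide: mean 10.14, s.d. 4.3, n = 100.
# 1.98 is the assumed t table value for 95% confidence with 99 degrees of freedom.
low_t, high_t = confidence_interval(10.14, 4.3, 100, 1.98)
print(f"95% CI with 1.98: {low_t:.2f}, {high_t:.2f}")

# The same calculation with the large-sample multiplier 1.96.
low_z, high_z = confidence_interval(10.14, 4.3, 100, 1.96)
print(f"95% CI with 1.96: {low_z:.2f}, {high_z:.2f}")
```

A wider confidence level only changes the multiplier; the standard error stays the same.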




               1. Estimation (CI)
         Population: µ = ?
         Sample: x̄ = 10.14

         95% CI = 9.29, 10.99
         We are 95% confident that the mean of the population lies
         between 9.29 and 10.99.

         99% CI = 9.02, 11.26
         For 99% confidence, replace the 95% multiplier with the
         99% multiplier (2.58 in large samples).




95% Confidence interval formula

     Estimate ± 1.96 × (Std. Dev / √n)
     where the estimate is e.g. the mean, and Std. Dev / √n is the Std. error

      Standard deviation vs. Standard error of the statistic
            These two statistics are used for very different
            purposes.
            Standard deviation is a measure of spread of a set of
            observations.
            Standard error measures sampling error and is used
            to indicate the precision of a statistic, i.e. how close
            the statistic is to the parameter it is estimating.




        Standard error example

      Estimate ± 1.96 × (Std. Dev / √n), with Estimate = 0.81

      Std Error = 1.66 / √2823 = 0.03

      Standard error of the mean
            In a sample of 2,823 5–6-year-old children living in Sharjah, the
            mean number of decayed teeth is 0.81 and the std deviation is 1.66.
            The standard error is approximately 0.03.
            So the sample mean of 0.81 is our estimate of the population
            mean, and we expect estimates from samples of this size to be
            off by about 0.03, on average.
            The standard error of the sample mean gives an indication of the
            extent to which the sample mean deviates from the population
            mean.
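The standard error for the Sharjah example, and the 95% confidence interval of (0.75, 0.87) reported for it, can be reproduced as follows. This is a minimal sketch of the formula above using only the reported summary figures; the variable names are chosen here for illustration.

```python
import math

# Reported summary figures for the Sharjah sample.
n = 2823      # number of children
mean = 0.81   # mean number of decayed teeth
sd = 1.66     # standard deviation

# Standard error of the mean: SD divided by the square root of n.
standard_error = sd / math.sqrt(n)

# 95% confidence interval: estimate ± 1.96 × standard error.
lower = mean - 1.96 * standard_error
upper = mean + 1.96 * standard_error

print(f"SE = {standard_error:.2f}")
print(f"95% CI = ({lower:.2f}, {upper:.2f})")
```

The standard deviation (1.66) describes the spread between children, whereas the much smaller standard error describes the precision of the mean, because it shrinks as n grows.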




2. Hypothesis Testing








              What is hypothesis
              testing?
     In Estimation, we estimate a population parameter
     from a sample statistic

     In Hypothesis testing, we answer a specific
     question about a population parameter








Hypothesis testing

     • A (statistical) hypothesis is a statement of
       belief about population parameters
     • It is a predominant feature of quantitative
       research in oral health & health care
       research in general
     • Researchers can test a hypothesis to see
       whether the collected data support or
       refute it






     2 types of hypotheses

     • The null hypothesis, symbolized by
       Ho; proposes no relationship
       between 2 variables or no effect in
       the population
     • The alternative hypothesis,
       symbolized by Ha; is a statement
       that disagrees with the null
       hypothesis.




• If the null hypothesis is rejected as a result
       of sample evidence, then the alternative
       hypothesis is concluded
     • If the evidence is insufficient to reject, the
       null hypothesis is retained, but not
       accepted
     • Traditionally, researchers do not accept the
       null hypothesis from current evidence;
       they state that it cannot be rejected




     Example

     A toothpaste company claims that their toothpaste
     contains, on average, 1100 ppm of fluoride.
     Suppose we are interested in testing this claim. We
     will randomly sample 100 tubes (i.e., n=100) of
     toothpaste from this company and under identical
     conditions calculate the average fluoride content (in
     ppm) for this sample.
     From the sample of 100 tubes of toothpaste, the
     average fluoride content was found to be x̄ = 1035 ppm.
     Could this sample have been drawn from a
     population with mean fluoride content of µ = 1,100 ppm
     (known variance σ² = 200)?




Basic steps in hypothesis testing
     1.   Propose a research question (identify the parameter
          of interest).
     2.   State the null hypothesis, H0 and alternative
          hypotheses, HA
     3.   Define a threshold value for declaring a P-value
          significant. The threshold is called the significance
          level of the test; it is denoted by alpha (α) and is
          commonly set to 0.05.
     4.   Select the appropriate statistical test to compute the
          P-value.







      Basic steps in hypothesis testing (cont.)

          5.   Compare the P-value of your test to the chosen
               level of significance. Can the null hypothesis be
               rejected?
          6.   If P-value < α , conclude that the difference is
               statistically significant and decide to reject the
               null hypothesis.
               If P-value ≥ α, conclude that the difference is
               not statistically significant and decide not to
               reject the null hypothesis.
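Steps 5 and 6 amount to a simple comparison, which can be sketched as a small function. The name `decide` and the wording of the returned messages are choices made for this illustration.

```python
def decide(p_value, alpha=0.05):
    """Compare a P-value with the significance level alpha (steps 5 and 6)."""
    if not 0.0 <= p_value <= 1.0:
        raise ValueError("a P-value must lie between 0 and 1")
    if p_value < alpha:
        return "reject H0 (statistically significant)"
    return "do not reject H0 (not statistically significant)"

print(decide(0.01))   # P-value below alpha
print(decide(0.08))   # P-value above alpha
```

The significance level is fixed before the P-value is computed (step 3), so `alpha` is an input to the decision rather than something adjusted afterwards.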








Example

      A toothpaste company (X) claims that their
      toothpaste contains, on average, 1100 ppm of
      fluoride.
      What is the research question?




                                                                X








     What is hypothesis testing?
     Research Q: Is the mean fluoride content in
     toothpaste X 1100ppm?
        Ans: Yes or No
     1) Null hypothesis: The mean fluoride content
     in toothpaste X is equal to 1100ppm
          Ho: µ = 1100

      2) Alternative hypothesis : The mean fluoride
      content in toothpaste X is not equal to 1100ppm
                    Ha: µ ≠ 1100







Define the significance level (commonly
         set at 0.05)
           Select the appropriate test to compute the P value

           At the end of the hypothesis testing, we
           will get a P value.

           If the P value is less than 0.05, we
           reject the Null Hypothesis (Ho).

           If the P value is more than or equal to
           0.05, we cannot reject the Null
           Hypothesis (Ho).




     Q: Is the fluoride content in toothpaste X 1100ppm?

     Ans: Yes or No
        Ho: µ = 1100 Ha: µ ≠ 1100                   x̄ = 1035; variance = 200; n = 100
       In the above example, if we get P = .01, we reject the null
       hypothesis (Ho).
       We then conclude in favour of the Alternative Hypothesis (Ha):
       “the mean fluoride content in toothpaste X is different from
       1100ppm”.


        Alternatively, we may report as ……
        “the mean fluoride content is significantly different from
        1100ppm”.
         Note: (1) The second conclusion is more commonly used in the
         literature.





Q: Is the fluoride content in toothpaste X 1100ppm?

     Ans: Yes or No
         Ho: µ = 1100 Ha: µ ≠ 1100                   x̄ = 1035; variance = 200; n = 100
     In the above example, if we get P = .08, we CANNOT reject the null
     hypothesis (Ho).
     We retain the Null Hypothesis (Ho): the data do not show that the
     mean fluoride content in toothpaste X is different from 1100ppm.
     We report this as ……
     “the mean fluoride content is NOT significantly different from
     1100ppm”.







         What is P value?
        Q: Is the mean fluoride content in toothpaste X
        1100ppm?
     Ans: Yes or No
         Ho: µ = 1100 Ha: µ ≠ 1100                 x̄ = 1035; variance = 200; n = 100
     If the P value is less than 0.05, we reject the Null Hypothesis.

     The P value is the probability of getting a result at least as
     extreme as the one observed if the Null Hypothesis were true.

     Example: P value = 0.01. It means that …
     If the Null Hypothesis were true, a result this extreme would occur
     only 1% of the time, so we conclude as the Alternative Hypothesis
     (“significantly different”).

     We normally accept less than a 5% risk of wrongly rejecting a true
     Null Hypothesis.
     That is why the cut-off point for the P value is 0.05.




What is P value?
     Q: Is the mean fluoride content in toothpaste X
        1100ppm?
     Ans: Yes or No
          Ho: µ = 1100 Ha: µ ≠ 1100                 x̄ = 1035; variance = 200; n = 100
     If the P value is less than 0.05, we reject the Null Hypothesis.

     The P value is the probability of getting a result at least as
     extreme as the one observed if the Null Hypothesis were true.

     Example: P value = 0.2. It means that …
     If the Null Hypothesis were true, a result this extreme would still
     occur 20% of the time, which is too often to count as evidence
     against it.

     Therefore, we can’t conclude that it is “significantly different”. We
     have to conclude that “the difference is not significant”.




        What is P value?
          Q: Is the mean fluoride content in toothpaste X
          1100ppm?
     Ans: Yes or No
          Ho: µ = 1100 Ha: µ ≠ 1100                 x̄ = 1035; variance = 200; n = 100
     If the P value is less than 0.05, we reject the Null Hypothesis.

     It means that we have set the cut-off point at P less than 0.05 to
     reject the Ho.

     We say this as …
     We set the “Alpha” at 0.05.

     Rejecting a Null Hypothesis that is actually true is called
     “Type I error” or “Alpha error”; Alpha is the risk of this error
     that we are willing to accept.




The use of P-values in hypothesis testing

     Definition
         The P-value is the smallest level of significance
         that would lead to rejection of the null hypothesis
         H0. (The p-value is the observed significance level.)
         All statistical tests produce a P-value.

         P-values answer the question: ‘Is there a
         statistically significant difference between study
         groups?’








      P-values
        Most scientific articles report a P-value associated
        with a test. Generally, the P-value is compared to a
        significance level (α) of 0.05 or 0.01 in order to
        determine whether or not the result is statistically
        significant.
        Decision rules:
        If P-value ≤ α then reject H0 at level α (a statistically
        significant result).
        If P-value > α then do not reject H0 at level α (not
        statistically significant).
        Example:
        If P-value<0.05, this indicates that, if H0 were true, results
        at least as extreme as those observed would occur less than
        5% of the time. We reject H0 and conclude that the result is
        significant.




Example (cont.)
     So, our hypotheses are:
        H0: µ = 1,100
        HA: µ ≠ 1,100
     The P-value for this test was found to be 0.0006.

     What is your conclusion?
     Since P-value < 0.05, we reject H0 in favour of HA, i.e.
     we reject the original assumption that the sample was
     drawn from a population where µ = 1,100 and σ² = 200.
      We say that there is a significant difference between
      the sample mean and the population mean at the 5%
      level, i.e. if H0 were true, a result this extreme would
      occur less than 5% of the time (here, about 0.06% of
      the time).
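The slides do not show how the P-value was computed. The sketch below is one possible reconstruction using a one-sample z test, and it rests on an assumption: that the quoted figure of 200 is the population standard deviation rather than the variance. Under that reading z = −3.25, the one-sided tail area is about 0.0006 and the two-sided P-value is about 0.0012. If 200 is taken literally as the variance, z is about −46 and the P-value is indistinguishable from zero. In every case P < 0.05 and H0 is rejected. The function name `z_test` is introduced here for illustration.

```python
import math
from statistics import NormalDist

def z_test(sample_mean, mu0, sigma, n):
    """One-sample z test; returns (z, one-sided P, two-sided P)."""
    standard_error = sigma / math.sqrt(n)
    z = (sample_mean - mu0) / standard_error
    one_sided = NormalDist().cdf(-abs(z))   # area in one tail beyond |z|
    return z, one_sided, 2 * one_sided

# Toothpaste example: x̄ = 1035, µ0 = 1100, n = 100.
# ASSUMPTION: sigma = 200 treats the quoted 200 as a standard deviation.
z, p_one_sided, p_two_sided = z_test(1035, 1100, 200, 100)

print(f"z = {z:.2f}")
print(f"one-sided P = {p_one_sided:.4f}")
print(f"two-sided P = {p_two_sided:.4f}")
```

Because HA is stated as µ ≠ 1,100, the two-sided value is the one that matches the hypotheses as written.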




      Types of error
      When we sample, we select cases from a population
      of interest. Due to chance variations in selecting the
      sample’s few cases from the population’s many
      possible cases, the sample will deviate from the
      defined population’s true nature by a certain amount.
      This is called sampling error.
      Therefore, inferences from samples to populations
      are always probabilistic, meaning we can never be
      100% certain that our inference was correct.
      Drawing the wrong conclusion is called an error of
      inference.
      There are two types of errors of inference defined in
      terms of the null hypothesis: Type 1 error and Type 2
      error.




Types of error (cont.)
     Possibilities related to decisions about H0:

                                          Actual situation
         Investigator’s decision     H0 true                 H0 false
         Accept H0                   (correct decision)      Type II Error
         Reject H0                   Type I Error            (correct decision)
                                     probability = α
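The entry “Type I Error, probability = α” can be illustrated by simulation: when H0 is true and we test at α = 0.05, about 5% of samples lead to a wrong rejection. The sketch below is illustrative only; the population (mean 0, SD 1), the sample size, the number of trials and the function name `type_i_error_rate` are all assumptions made for this example.

```python
import math
import random
from statistics import NormalDist

def type_i_error_rate(trials=2000, n=30, alpha=0.05, seed=1):
    """Share of samples that wrongly reject H0 when H0 is in fact true."""
    rng = random.Random(seed)      # fixed seed so the example is repeatable
    mu0, sigma = 0.0, 1.0          # hypothetical population; H0: mu = mu0 is true
    standard_error = sigma / math.sqrt(n)
    rejections = 0
    for _ in range(trials):
        sample = [rng.gauss(mu0, sigma) for _ in range(n)]
        z = (sum(sample) / n - mu0) / standard_error
        p_value = 2 * NormalDist().cdf(-abs(z))   # two-sided P-value
        if p_value < alpha:
            rejections += 1        # a Type I error: H0 rejected although true
    return rejections / trials

rate = type_i_error_rate()
print(f"Observed Type I error rate: {rate:.3f}")
```

The observed rate varies a little from run to run with different seeds, but it stays close to the chosen α.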








      Types of error (cont.)


        Type 1 and Type 2 errors can be quite difficult to
        understand, so let’s look at a few examples to help
        you grasp the concept.
        Let’s hypothesise that two groups of dental patients
        are equal in their knowledge of preventive hygiene
        behaviours.
        Now consider the following four scenarios. For
        each, determine whether or not an error has been
        made and, if so, what type of error.






Types of error (cont.)
     1. You accept the null hypothesis when the groups are
        really equal in oral self-care knowledge.
     Answer:
     2. You reject the null hypothesis when the groups are
        really equal in oral self-care knowledge.
     Answer:
     3. You reject the null hypothesis when the groups are
        really different in their oral self-care knowledge.
     Answer:
     4. You accept the null hypothesis when one group has
        much more oral self-care knowledge than the other.
     Answer:




          Types of error (cont.)
     1. You accept the null hypothesis when the groups are
        really equal in oral self-care knowledge.
     Answer: Correct decision
     2. You reject the null hypothesis when the groups are
        really equal in oral self-care knowledge.
     Answer: Type 1 error
     3. You reject the null hypothesis when the groups are
        really different in their oral self-care knowledge.
     Answer: Correct decision
     4. You accept the null hypothesis when one group has
        much more oral self-care knowledge than the other.
     Answer: Type 2 error
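
The four answers above follow mechanically from two yes/no facts: whether H0 is actually true, and whether the investigator rejected it. A small sketch of that logic, using a hypothetical helper name `classify_decision` that does not come from the lecture:

```python
def classify_decision(h0_is_true, h0_rejected):
    """Label the outcome of a decision about the null hypothesis H0."""
    if h0_is_true and h0_rejected:
        return "Type 1 error"   # rejected a true H0
    if not h0_is_true and not h0_rejected:
        return "Type 2 error"   # accepted a false H0
    return "Correct decision"


# The four scenarios from the slide, as (H0 actually true?, H0 rejected?).
scenarios = [
    (True, False),   # 1. accept H0, groups really equal
    (True, True),    # 2. reject H0, groups really equal
    (False, True),   # 3. reject H0, groups really different
    (False, False),  # 4. accept H0, groups really different
]
answers = [classify_decision(truth, rejected) for truth, rejected in scenarios]
for number, answer in enumerate(answers, start=1):
    print(f"{number}. {answer}")
```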




Statistical vs Practical significance

     Statistical significance does not necessarily imply that
     the true difference in population means is of sufficient
     magnitude to be of clinical importance.
     Significance tests tell us whether a difference is
     statistically significant but significance tests do not tell
     us whether the difference is of practical importance.
     In clinical practice we usually need to know the
     presence and size of any difference.








     Statistical vs Practical significance (cont.)

     P-values only tell you how likely it is that a difference at
     least as large as the one observed would arise by chance alone
     if the null hypothesis were true.
     As the sample size increases and the variance
     decreases, small differences in mean values may
     provide statistically significant results.
     Whether these ‘statistically significant’ differences are
     of any practical or clinical significance requires
     judgement on the part of the clinician.
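
The effect of sample size can be illustrated with a short sketch. It assumes a two-sample z-test with equal group sizes and a known common standard deviation, and the numbers (a fixed difference of 1 unit, SD of 10) are made up for illustration only. The observed difference never changes, yet the P-value shrinks as the groups grow.

```python
import math


def two_sample_p_value(mean_difference, sd, n_per_group):
    """Two-sided z-test p-value for a difference between two group means."""
    # Standard error of the difference, assuming equal n and a common known SD.
    standard_error = sd * math.sqrt(2.0 / n_per_group)
    z = mean_difference / standard_error
    return math.erfc(abs(z) / math.sqrt(2.0))


sample_sizes = [10, 100, 1000, 10000]  # hypothetical group sizes
p_values = [two_sample_p_value(1.0, 10.0, n) for n in sample_sizes]
for n, p in zip(sample_sizes, p_values):
    print(f"n per group = {n:>5}: p = {p:.4g}")
```

With these assumed values the same small difference is far from significant in the smallest groups and clearly significant in the largest, which is why the P-value alone cannot indicate clinical importance.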








Statistical vs. Practical significance
     example
     Consider a study comparing a new antihypertensive medication (A)
     with a standard antihypertensive medication (B).
     (Suppose drug A has additional side effects and is more expensive
     than drug B.)
     Results
      1. Blood pressures of patients receiving A were significantly lower
         than those on B (p-value = 0.0001).
      2. Difference in blood pressure between the groups was 5 mmHg.

     Interpretation
      1. If there were truly no difference between the drugs, the
         probability of finding a difference this large or larger by
         chance would be 0.01%, or 1 in 10,000.
      2. But, given the small difference found between the groups, we
         might consider this difference to be too small to offset the
         difficulties, side effects and expense associated with drug A.
      3. The effect is smaller than clinically meaningful, so we have
         statistical significance but not clinical/practical significance.
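
The two judgements in this example can be kept separate in code. The sketch below is illustrative only: the slide does not state what size of blood pressure difference counts as clinically meaningful, so the 10 mmHg threshold is a hypothetical assumption, as is the function name `interpret_result`.

```python
def interpret_result(p_value, difference, alpha=0.05,
                     min_important_difference=10.0):
    """Return (statistically significant?, clinically significant?).

    min_important_difference is a hypothetical threshold in mmHg;
    in practice the clinician has to choose it.
    """
    statistically_significant = p_value < alpha
    clinically_significant = abs(difference) >= min_important_difference
    return statistically_significant, clinically_significant


# Values from the example: p-value = 0.0001, difference = 5 mmHg.
stat_sig, clin_sig = interpret_result(p_value=0.0001, difference=5.0)
print(f"Statistically significant: {stat_sig}")
print(f"Clinically significant:    {clin_sig}")
```

Keeping the threshold as an explicit parameter reflects the point made earlier that practical importance is a matter of clinical judgement, not something the significance test supplies.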
Ebd1 lecture 6&7 2010

  • 1. General Studies Community Dentistry 1 Statistical Inference Lecture 6 Dr Nizam Abdullah Contents Review of descriptive statistics The normal curve Introduction to inferential statistics © The University of Adelaide, School of Dentistry 1
  • 2. Descriptive statistics Central tendency Mean Median (50th Percentile) Mode Dispersion Standard deviation (SD) / Variance Inter-quartile range (IQR) (3rd quartile – 1st quartile) Range (Maximum – Minimum) © The University of Adelaide, School of Dentistry Distribution of a variable Another important aspect of the description of a variable is the shape of its distribution, which tells you the frequency of values from different ranges of the variable. Typically, a researcher is interested in how well the distribution can be approximated by the normal distribution. The normal distribution can be used to determine how far the sample is likely to be off from the overall population, i.e. how big a ‘margin of error’ there is likely to be. Simple descriptive statistics can provide some information relevant to this issue. © The University of Adelaide, School of Dentistry 2
  • 3. Distribution of a variable (cont.) A variable is said to be a normally distributed variable or to have a normal distribution if its distribution has the shape of a normal curve - the normal curve is a kind of bell-shaped curve. A normal distribution (and hence a normal curve) is completely determined by its mean and standard deviation - the mean and standard deviation are called the parameters of the normal curve. The normal curve is symmetric and centered about the mean. The standard deviation determines the spread of the curve. The larger the standard deviation, the flatter and more spread out the curve will be. © The University of Adelaide, School of Dentistry Normal curve (cont.) The mean, median, and mode all have the same value. © The University of Adelaide, School of Dentistry 3
  • 4. Different shapes of the Normal curve Standard deviation changes the relative width of the distribution; the larger the standard deviation, the wider the curve. © The University of Adelaide, School of Dentistry Properties of normal distribution Age distribution of Village A • Bell-shaped curve 45 • Symmetrical about its mean (mirror image 40 35 to each side) 30 25 68% Mean and median are equal. 20 15 50% 95%50% 99.7% One side of the mean is 50% of the area. 10 5 0 The area between mean-1SD and mean+1SD is 68% (Mean±1SD=30, 50). 80+ 0 -4 5 -9 1 0 -1 4 1 5 -1 9 2 0 -2 4 2 5 -2 9 3 0 -3 4 3 5 -3 9 4 0 -4 4 4 5 -4 9 5 0 -5 4 5 5 -5 9 6 0 -6 4 6 5 -6 9 7 0 -7 4 7 5 -7 9 The area between mean-2SD and e.g. : (Age) Mean = 40, SD = 10 mean+2SD is 95% (Mean±2SD=20, 60). Therefore, Mean±1SD = 30, 50 Between 30 yr and 50 yr old, The area between mean-3SD and there will be 68% of the group. mean+3SD is 99.7% (Mean±3SD=10, 70). © The University of Adelaide, School of Dentistry 4
  • 5. Normal curve: 68-95-99.7 rule 68% of the observations fall within 68% one standard deviation of the mean -σ µ +σ 95% of the observations fall within two standard deviations 95% of the mean -2σ µ +2σ 99.7% of the observations fall within three standard 99.7% deviations of the mean -3σ µ +3σ © The University of Adelaide, School of Dentistry Distributions: Negative In a negatively skewed distribution, the mode is at the top of the curve, the median is lower than it, and the mean is lower than the median. The result is a ‘tail’ towards the more negative side of the graph. Negative skewness (tail to left = left skewed) Median < Mode Mean < Median © The University of Adelaide, School of Dentistry 5
  • 6. Distributions: Positive In a positively skewed distribution, the mode is at the top of the curve, the median is higher than it, and the mean is higher than the median. The result is a ‘tail’ towards the more positive side of the graph. Positive skewness : (tail to right = right skewed) Median > Mode Mean > Median © The University of Adelaide, School of Dentistry Example dataset First Year BDS students enrolled in EBD1. Response to survey: n= 90 (out of 119), or 76%. Variables: – Age: quantitative variable measured on a ratio scale – Sex: qualitative variable measured on a nominal scale, i.e. variable with categories male or female – Height: quantitative variable measured on a ratio scale – Weight: quantitative variable measured on a ratio scale Variables measured at a higher level can always be converted to a lower level, but not vice versa. For example, observations of actual age (ratio scale) can be converted to categories of older and younger (ordinal scale). Similarly for height and weight.The University of Adelaide, School of Dentistry © 6
  • 7. Data spreadsheet Case Age Sex Height Weight 1 21 2 165 45 2 22 2 170 53 3 18 1. 74 4 20 2 165 44 5 19 1 175 70 6 19 2 163 53 7 24 2 163 49 8 18 1 170 60 9 29 2 178 70 10 28 2 163 58 11 18 2 177 72 12 38 2 164 65 13 23 2 161 65 14 20 2 178 63 15 29 2 159 54 : : : : : : University of Adelaide, School: of Dentistry : © The : : Frequency distribution of height variable Height (cm) Frequency Mode = 165cm, 150-155 4 170cm 155-160 9 Mean = 169.3cm 160-165 21 Median = 168cm 165-170 24 170-175 10 175-180 10 180-185 7 185-190 3 190-195 1 Total 89 150 155 160 165 170 175 180 185 190 195 © The University of Adelaide, School of Dentistry 7
  • 8. Frequency distribution of weight variable Weight (kg) Freq. 40-45 4 45-50 9 Mean = 62.6 kg 50-55 15 Median = 60 kg 55-60 16 60-65 13 65-70 7 70-75 11 75-80 4 80-85 3 85-90 3 90-95 2 95-100 1 100-105 1 105-110 0 110-115 0 115-120 0 120-125 1 40 45 50 55 60 65 70 75 80 85 90 95 100 105 120 125 Total 90 © The University of Adelaide, School of Dentistry Frequency distribution of age variable Mode < Median < Mean 18 yrs 19 yrs 20 yrs © The University of Adelaide, School of Dentistry 8
  • 9. Descriptive statistics Variable Freq Min Max Range SD Age 90 17.0 38.0 21.0 3.4 Height 89 152.0 191.0 39.0 8.8 Weight 90 40.0 120.0 80.0 14.5 Variable Category Freq % Sex Male 32 35.6 Female 58 64.4 Total 90 100.0 © The University of Adelaide, School of Dentistry What is Inferential Statistics ? It is the Statistical Technique/Method used to infer the result of the sample (statistic) to the population (parameter). Population (Village A) µ=? The technique is called “Inferential Statistics” Sample x = 10.14 © The University of Adelaide, School of Dentistry 9
  • 10. Statistical inference Inferential statistics are used to draw inferences about a population from a sample. For example, the average number of decayed teeth in children aged 5 years can be estimated using observations from a sample of 5-year-olds. © The University of Adelaide, School of Dentistry Selecting a sample from a population How can a sample that is representative of the population of interest be selected? Answer: by random selection When a random sample is drawn from the population of interest, every member of the population has the same probability, or chance, of being selected in the sample. For this reason, random samples are considered to be unbiased. © The University of Adelaide, School of Dentistry 10
  • 11. Two types of Inferential Statistics Parameter Estimation Hypothesis testing © The University of Adelaide, School of Dentistry 1. Parameter estimation • Parameter estimation takes two forms: • 1. Point estimation • 2. Interval estimation © The University of Adelaide, School of Dentistry 11
  • 12. Definition • A point estimate is a single numerical value used to estimate the corresponding population parameter • An interval estimate consists of two numerical values defining a range of values that, with a specific degree of confidence, we feel includes the parameter being estimated © The University of Adelaide, School of Dentistry Parameter estimation Point estimate is when an estimate of the population parameter is given as a single number, e.g. sample mean, median, variance, standard deviation. Interval estimation involves more than one point; it consists of a range of values within which the population parameter is thought to be, a confidence interval which contains the upper and lower limits of the range of values. Point and interval estimates let us infer the true value of an unknown population parameter using information from a random sample of that population. © The University of Adelaide, School of Dentistry 12
  • 13. Confidence intervals (cont.) Example Suppose a paper reports that, among a sample of 2,823 5–6-year-old children living in Sharjah, the mean number of decayed teeth is 0.81 (SD = 1.66) with a 95% confidence interval of (0.75, 0.87). Interpretation The 95% confidence interval is the range in the mean number of decayed teeth we would expect in a population of 6-year-old children living in Sharjah. Because only a sample of children were used, the exact population mean cannot be known for certain. Hence, the 95% confidence interval indicates the margin of imprecision due to sampling error. Or, alternatively, you could think of it as the range in which there is a 95% chance that the true population mean lies. © The University of Adelaide, School of Dentistry 1. Estimation (CI) Population µ=? CI = x ± {ta/2 * (Standard Error)} Sample x = 10.14 95% CI = x ± { t0.025 * ( S.E )} 95% CI = 10.14 ± {1.96 * (0.43)} © The University of Adelaide, School of Dentistry 13
  • 14. 1. Estimation (CI) Population µ=? CI = x ± {tα/2 * (Standard Error)} 95% CI = x ± { t0.025 * ( S.E )} Sample 95% CI = 10.14 ± {1.96 * (0.43)} x = 10.14 95% CI = 10.14 ± 0.8514 s.d = 4.3 n = 100 s.d 95% CI = 9.29, 10.99 S.E = n 4.3 S.E = = 0.43 100 © The University of Adelaide, School of Dentistry 1. Estimation (CI) Population µ=? 95% CI = 9.29, 10.99 Sample x = 10.14 We are 95% sure that mean of the population will lie between 9.29 and 10.99. 99% CI = 9.02, 11.26 For 99% replace 1.96 with 2.58 © The University of Adelaide, School of Dentistry 14
  • 15. 95% Confidence interval formula ⎛ Std . Dev ⎞ Estimate ± 1.96 ∗ ⎜ ⎟ Std. error e.g. Mean ⎝ n ⎠ Standard deviation vs. Standard error of the statistic These two statistics are used for very different purposes. Standard deviation is a measure of spread of a set of observations. Standard error measures sampling error and is used to indicate the precision of a statistic, i.e. how close the statistic is to of Adelaide, School of Dentistry estimating. © The University the parameter it is Standard error example ⎛ Std . Dev ⎞ 0.81 Estimate ± 1.96 ∗ ⎜ ⎟ ⎝ n ⎠ 1.66 Std Error = = 0.03 Standard error of the mean 2823 In a sample of 2,823 5–6-year-old children living in Sharjah, the mean number of decayed teeth is 0.81 and std deviation is 1.66. The standard error is approximately 0.03. So, we expect, on average, observed sample means of 0.81, but, when we’re wrong, we expect to be off by about 0.03 points, on average. Standard error of the sample mean gives an indication of the extent to which the sample mean deviates from the population mean. © The University of Adelaide, School of Dentistry 15
  • 16. 2. Hypothesis Testing © The University of Adelaide, School of Dentistry What is hypothesis testing? In Estimation, we estimate a population parameter from a sample statistic In Hypothesis testing, we answer to a specific question related to a population parameter © The University of Adelaide, School of Dentistry 16
  • 17. Hypothesis testing • A (statistical) hypothesis is a statement of belief about population parameters • It is a predominant feature of quantitative research in oral health & health care research in general • Researchers can test a hypothesis to see whether the collected data support or refute such hypothesis © The University of Adelaide, School of Dentistry 2 types of hypotheses • The null hypothesis, symbolized by Ho; proposes no relationship between 2 variables or no effect in the population • The alternative hypothesis, symbolized by Ha; is a statement that disagrees with the null hypothesis. © The University of Adelaide, School of Dentistry 17
  • 18. • If the null hypothesis is rejected as a result of sample evidence, then the alternative hypothesis is concluded • If the evidence is insufficient to reject, the null hypothesis is retained, but not accepted • Traditionally researches do not accept the null hypothesis from current evidence; they state that it cannot be rejected © The University of Adelaide, School of Dentistry Example A toothpaste company claims that their toothpaste contains, on average, 1100 ppm of fluoride. Suppose we are interested in testing this claim. We will randomly sample 100 tubes (i.e., n=100) of toothpaste from this company and under identical conditions calculate the average fluoride content (in ppm) for this sample. From the sample of 100 tubes of toothpaste, the average ppm was found to be 1035 (= X ). Could this sample have been drawn from a population with mean fluoride content of µ=1,100 (known variance σ2=200). © The University of Adelaide, School of Dentistry 18
  • 19. Basic steps in hypothesis testing 1. Propose a research question (identify the parameter of interest). 2. State the null hypothesis, H0 and alternative hypotheses, HA 3. Define a threshold value for declaring a P-value significant. The threshold is called the significance level of the test is denoted by alpha (α) and is commonly set to 0.05. 4. Select the appropriate statistical test to compute the P-value. © The University of Adelaide, School of Dentistry Basic steps in hypothesis testing (cont.) 5. Compare the P-value of your test to the chosen level of significance. Can the null hypothesis be rejected? 6. If P-value < α , conclude that the difference is statistically significant and decide to reject the null hypothesis. If P-value ≥ α, conclude that the difference is not statistically significant and decide not to reject the null hypothesis. © The University of Adelaide, School of Dentistry 19
  • 20. Example A toothpaste company (X) claims that their toothpaste contains, on average, 1100 ppm of fluoride. What is the research question? X © The University of Adelaide, School of Dentistry What is hypothesis testing? Research Q: Is the mean fluoride content in toothpaste X 1100ppm? Ans: Yes or No 1) Null hypothesis: The mean fluoride content in toothpaste X is equal to 1100ppm Ho: µ = 1100 2) Alternative hypothesis : The mean fluoride content in toothpaste X is not equal to 1100ppm Ha: µ ≠ 1100 © The University of Adelaide, School of Dentistry 20
  • 21. Define the p value (commonly set at 0.05 Select appropriate test to compute the p value At the end of the hypothesis testing, we will get a P value. If the P value is less than 0.05, we reject the Null Hypothesis (Ho). If the P value is more than or equal to 0.05, we cannot reject the Null Hypothesis (Ho). © The University of Adelaide, School of Dentistry Q: Is the fluoride content in toothpaste X 1100ppm? Ans: Yes or No Ho: µ = 1100 Ha: µ ≠ 1100 x = 1035; varince 200; n = 100 In above example, if we get P=.01, we reject the null hypothesis (Ho), then …… We conclude as Alternative Hypothesis (Ha) … “the mean fluoride content in toothpaste X is different from 1100ppm”. Alternatively, we may report as …… “the mean fluoride content is significantly different from 1100ppm”. Note: (1) The second conclusion is more commonly used in the literature. © The University of Adelaide, School of Dentistry 21
  • 22. Q: Is the fluoride content in toothpaste X 1100 ppm? Ans: Yes or No
      Ho: µ = 1100; Ha: µ ≠ 1100
      x̄ = 1035; variance = 200; n = 100
      In the above example, if we get P = .08, we CANNOT reject the null hypothesis (Ho). We therefore do not conclude in favour of the Alternative Hypothesis (Ha). We report: “the mean fluoride content is NOT significantly different from 1100 ppm”. This does not prove that the mean is exactly 1100 ppm; it only means the sample does not provide enough evidence of a difference.
      What is P value?
      If the P value is less than 0.05, we reject the Null Hypothesis. The P value is the probability of obtaining a result at least as extreme as the one observed if the Null Hypothesis were true. A small P value therefore means a small risk of error when we reject the Null Hypothesis and conclude in favour of the Alternative Hypothesis.
      Example: P value = 0.01. It means that, if Ho were true, a result at least this extreme would occur only 1% of the time, so we conclude in favour of the Alternative Hypothesis (“significantly different”). We normally allow less than a 5% risk of wrongly rejecting a true Ho. That is why the cut-off point for the P value is 0.05.
  • 23. What is P value?
      Q: Is the mean fluoride content in toothpaste X 1100 ppm? Ans: Yes or No
      Ho: µ = 1100; Ha: µ ≠ 1100
      x̄ = 1035; variance = 200; n = 100
      If the P value is less than 0.05, we reject the Null Hypothesis. The P value is the probability of obtaining a result at least as extreme as the one observed if the Null Hypothesis were true.
      Example: P value = 0.2. It means that, if Ho were true, a result at least this extreme would occur 20% of the time, which is more than the 5% we allow. Therefore, we cannot conclude that the mean is “significantly different”. We have to conclude that “the difference is not significant”.
      What is P value?
      Rejecting the Null Hypothesis when the P value is less than 0.05 means that we have set the cut-off point at P less than 0.05 to reject Ho. We say that we set the “Alpha” at 0.05, because the type of error that we have been talking about (rejecting a Null Hypothesis that is actually true) is called “Type I error” or “Alpha error”.
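The link between alpha and Type I error can be checked with a small simulation. The sketch below repeatedly draws samples from a population in which Ho really is true and counts how often a two-sided z-test rejects it at alpha = 0.05. The population standard deviation (10), sample size (25), number of trials and random seed are arbitrary values chosen here for illustration, not figures from the lecture.

```python
import math
import random


def type_one_error_rate(alpha=0.05, mu0=1100.0, sigma=10.0, n=25,
                        trials=2000, seed=0):
    """Estimate how often a true H0 is rejected (hypothetical settings)."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(trials):
        # Draw a sample from a population where H0 (mu = mu0) is true.
        sample = [rng.gauss(mu0, sigma) for _ in range(n)]
        mean = sum(sample) / n
        z = (mean - mu0) / (sigma / math.sqrt(n))
        p_value = math.erfc(abs(z) / math.sqrt(2))  # two-sided P-value
        if p_value < alpha:
            rejections += 1  # a Type I error: H0 rejected although true
    return rejections / trials


print(f"Estimated Type I error rate: {type_one_error_rate():.3f}")
```

The estimated rate should come out close to 0.05, which is what setting alpha at 0.05 means: about 5% of samples drawn under a true Ho lead to a wrong rejection.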
  • 24. The use of P-values in hypothesis testing
      Definition: The P-value is the smallest level of significance that would lead to rejection of the null hypothesis H0. (The P-value is the observed significance level.) All statistical tests produce a P-value. P-values answer the question: ‘Is there a statistically significant difference between study groups?’
      P-values
      Most scientific articles report a P-value associated with a test. Generally, the P-value is compared to a significance level (α) of 0.05 or 0.01 in order to determine whether or not the result is statistically significant.
      Decision rules: If P-value ≤ α then reject H0 at level α (a statistically significant result). If P-value > α then do not reject H0 at level α (not statistically significant).
      Example: If P-value < 0.05, this indicates that, if H0 were true, results at least as extreme as those observed would occur by chance less than 5% of the time. We reject H0 and conclude that the result is significant.
  • 25. Example (cont.)
      So, our hypotheses are: H0: µ = 1,100; HA: µ ≠ 1,100. The P-value for this test was found to be 0.0006. What is your conclusion?
      Since P-value < 0.05, we reject H0 in favour of HA, i.e. we reject the original assumption that the sample was drawn from a population where µ = 1,100 and σ² = 200. We say that there is a significant difference between the sample mean and the hypothesised population mean at the 5% level, i.e. if H0 were true, a result at least this extreme would occur by chance less than 5% of the time (here, 0.06% of the time).
      Types of error
      When we sample, we select cases from a population of interest. Due to chance variations in selecting the sample’s few cases from the population’s many possible cases, the sample will deviate from the defined population’s true nature by a certain amount. This is called sampling error. Therefore, inferences from samples to populations are always probabilistic, meaning we can never be 100% certain that our inference was correct. Drawing the wrong conclusion is called an error of inference. There are two types of errors of inference defined in terms of the null hypothesis: Type 1 error and Type 2 error.
  • 26. Types of error (cont.)
      Possibilities related to decisions about H0:

      Investigator’s decision    Actual situation: H0 true         Actual situation: H0 false
      Accept H0                  Correct decision                  Type II error
      Reject H0                  Type I error (probability = α)    Correct decision

      Types of error (cont.)
      Type 1 and Type 2 errors can be quite difficult to understand, so let’s look at a few examples to help you grasp the concept. Let’s hypothesise that two groups of dental patients are equal in their knowledge of preventive hygiene behaviours. Now consider the following four scenarios. For each, determine whether or not an error has been made and, if so, what type of error.
  • 27. Types of error (cont.)
      1. You accept the null hypothesis when the groups are really equal in oral self-care knowledge. Answer: Correct decision
      2. You reject the null hypothesis when the groups are really equal in oral self-care knowledge. Answer: Type 1 error
      3. You reject the null hypothesis when the groups are really different in their oral self-care knowledge. Answer: Correct decision
      4. You accept the null hypothesis when one group has much more oral self-care knowledge than the other. Answer: Type 2 error
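The four scenarios above reduce to two yes/no facts: whether the null hypothesis is actually true, and whether the investigator rejected it. A minimal sketch follows; the function name and argument names are invented here for illustration.

```python
def classify_decision(h0_is_true: bool, h0_rejected: bool) -> str:
    """Name the outcome of a test decision (hypothetical helper)."""
    if h0_is_true and h0_rejected:
        return "Type 1 error"   # rejected a true null hypothesis
    if not h0_is_true and not h0_rejected:
        return "Type 2 error"   # accepted a false null hypothesis
    return "Correct decision"


# The four scenarios, in order: (groups really equal?, H0 rejected?)
scenarios = [(True, False), (True, True), (False, True), (False, False)]
for number, (h0_is_true, h0_rejected) in enumerate(scenarios, start=1):
    print(number, classify_decision(h0_is_true, h0_rejected))
```

In practice the investigator never knows the first argument, which is why the error rates have to be controlled in advance through the choice of α and the study design.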
  • 28. Statistical vs Practical significance
      Statistical significance does not necessarily imply that the true difference in population means is of sufficient magnitude to be of clinical importance. Significance tests tell us whether a difference is statistically significant, but they do not tell us whether the difference is of practical importance. In clinical practice we usually need to know both the presence and the size of any difference.
      Statistical vs Practical significance (cont.)
      P-values only inform you about how likely it is that a difference at least as large as the one observed could be attributable to chance. As the sample size increases or the variance decreases, small differences in mean values may provide statistically significant results. Whether these ‘statistically significant’ differences are of any practical or clinical significance requires judgement on the part of the clinician.
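The effect of sample size on the P-value can be illustrated with a short sketch. It holds a small observed difference fixed and recomputes a two-sided z-test P-value as n grows. The difference (1 unit), the standard deviation (10) and the sample sizes are hypothetical values chosen for illustration, and a known standard deviation is assumed.

```python
import math


def two_sided_p(difference, sigma, n):
    """Two-sided z-test P-value for a fixed observed mean difference."""
    z = difference / (sigma / math.sqrt(n))
    return math.erfc(abs(z) / math.sqrt(2))


# The same small difference, tested with increasingly large samples.
for n in (10, 100, 1000, 10000):
    p = two_sided_p(difference=1.0, sigma=10.0, n=n)
    verdict = "significant" if p < 0.05 else "not significant"
    print(f"n = {n:>5}: P = {p:.3g} ({verdict})")
```

With these values the difference is not significant at n = 10 or n = 100 but becomes significant at n = 1000, even though its size, and hence its practical importance, has not changed.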
  • 29. Statistical vs. Practical significance example
      Consider a study comparing a new antihypertensive medication (A) with a standard antihypertensive medication (B). (Suppose drug A has additional side effects and is more expensive than drug B.)
      Results
      1. Blood pressures of patients receiving A were significantly lower than those on B (P-value = 0.0001).
      2. The difference in blood pressure between the groups was 5 mmHg.
      Interpretation
      1. The probability that a difference as large as the one found, or larger, is attributable to chance is 0.01%, or 1 in 10,000.
      2. But, given the small difference found between the groups, we might consider this difference to be too small to offset the difficulties, side effects and expense associated with drug A.
      3. The effect is smaller than is clinically meaningful, so we have statistical significance but not clinical/practical significance.