Basic Statistics Overview
Dr. N. Mohana
Assistant Professor Senior
Division of Mathematics, School of Advanced Sciences
Vellore Institute of Technology, Chennai
Preface
• The purpose of this presentation is to help you
determine which statistical tests are appropriate
for analyzing your data for your resident research
project. It does not represent a comprehensive
overview of all statistical tests and methods.
• Your data may need to be analyzed using
different statistical tests than are presented here,
but this presentation focuses on the most
common techniques.
Outline
• Descriptive Statistics
– Frequencies & percentages
– Means & standard deviations
• Inferential Statistics
– Correlation
– T-tests
– Chi-square
– Logistic Regression
Types of Statistics/Analyses
Descriptive Statistics
– Frequencies
– Basic measurements
Inferential Statistics
– Hypothesis Testing
– Correlation
– Confidence Intervals
– Significance Testing
– Prediction
Describing a phenomenon
How many? How much?
BP, HR, BMI, IQ, etc.
Inferences about a phenomenon
Proving or disproving theories
Associations between phenomena
If sample relates to the larger
population
E.g., Diet and health
Descriptive Statistics
Descriptive statistics can be used to summarize
and describe a single variable
• Frequencies (counts) & Percentages
– Use with categorical (nominal) data
• Levels, types, groupings, yes/no, Drug A vs. Drug B
• Means & Standard Deviations
– Use with continuous (interval/ratio) data
• Height, weight, cholesterol, scores on a test
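As a minimal sketch of the first bullet, frequencies and percentages for a categorical variable can be computed as below. The helper name `frequency_table` and the treatment data are made up for illustration.

```python
from collections import Counter


def frequency_table(values):
    """Return {category: (count, percent)} for a list of categorical values."""
    counts = Counter(values)
    total = len(values)
    return {cat: (n, 100.0 * n / total) for cat, n in counts.items()}


# Hypothetical treatment assignments for 8 patients (made-up data).
treatment = ["Drug A", "Drug B", "Drug A", "Drug A",
             "Drug B", "Drug A", "Drug B", "Drug A"]

table = frequency_table(treatment)
for category, (count, percent) in sorted(table.items()):
    print(f"{category}: n = {count} ({percent:.1f}%)")
```

Means and standard deviations would not be meaningful for a variable like this one, which is why counts and percentages are reported instead.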
Frequencies & Percentages
Look at the different ways we can display frequencies and
percentages for this data:
[Figure: the same data displayed as a table, a bar chart, and a pie chart.
Annotations: AKA frequency distributions; good if more than 20 observations]
Distributions
The distribution of scores or values can also be
displayed using Box and Whiskers Plots and Histograms
Continuous → Categorical
It is possible to take
continuous data
(such as hemoglobin
levels) and turn it
into categorical data
by grouping values
together. Then we
can calculate
frequencies and
percentages for each
group.
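The grouping step described above might look like the following sketch. The cut-offs (10 and 14 g/dL), the group labels, and the hemoglobin values are assumptions chosen only for illustration, not clinical thresholds.

```python
from collections import Counter


def categorize_hemoglobin(hgb_g_dl):
    """Assign a hemoglobin value (g/dL) to an illustrative group.

    The cut-offs are assumptions for this sketch, not clinical guidance.
    """
    if hgb_g_dl < 10.0:
        return "low"
    if hgb_g_dl < 14.0:
        return "middle"
    return "high"


# Made-up continuous measurements.
hemoglobin = [8.9, 12.4, 13.1, 15.2, 9.7, 14.0, 11.8, 16.3, 12.9, 13.6]

# Continuous -> categorical, then frequencies and percentages per group.
groups = [categorize_hemoglobin(v) for v in hemoglobin]
counts = Counter(groups)
percentages = {g: 100.0 * n / len(groups) for g, n in counts.items()}

for group in ("low", "middle", "high"):
    print(f"{group}: n = {counts[group]} ({percentages[group]:.0f}%)")
```

Because the raw values are kept, the cut-offs can be changed later and the groups recomputed.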
Continuous → Categorical
[Figure: Distribution of Glasgow Coma Scale Scores]
Even though this is continuous data, it is being treated as “nominal”
as it is broken down into groups or categories
Tip: It is usually better to collect continuous data and then break it
down into categories for data analysis as opposed to collecting data
that fits into preconceived categories.
Ordinal Level Data
Frequencies and percentages can be computed
for ordinal data
– Examples: High School/Some College/College
Graduate/Graduate School
[Figure: bar chart of responses in the categories Strongly Agree, Agree,
Disagree, and Strongly Disagree; vertical axis from 0 to 60]
Interval/Ratio Data
We can compute frequencies and percentages
for interval and ratio level data as well
– Examples: Age, Temperature, Height, Weight,
Many Clinical Serum Levels
[Figure: Distribution of Injury Severity Score in a population of patients]
Interval/Ratio Distributions
The distribution of interval/ratio data often
forms a “bell shaped” curve.
– Many phenomena in life are approximately normally
distributed (age, height, weight, IQ).
Interval & Ratio Data
Measures of central tendency and measures of dispersion are often
computed with interval/ratio data
• Measures of Central Tendency (aka, the “Middle Point”)
– Mean, Median, Mode
– If your frequency distribution shows outliers, you might want to use
the median instead of the mean
• Measures of Dispersion (aka, How “spread out” the data are)
― Variance, standard deviation, standard error of the mean
― Describe how “spread out” a distribution of scores is
― High numbers for variance and standard deviation may mean that
scores are “all over the place” and do not necessarily fall close to the
mean
In research, means are usually presented along with standard deviations or
standard errors.
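A minimal sketch of these measures, using Python's standard `statistics` module on made-up scores. It also illustrates why the median may be preferred when there are outliers: a single extreme value shifts the mean much more than the median.

```python
import math
import statistics

# Made-up test scores for 8 people.
scores = [2, 4, 4, 4, 5, 5, 7, 9]

# Measures of central tendency.
mean = statistics.mean(scores)
median = statistics.median(scores)
mode = statistics.mode(scores)

# Measures of dispersion (sample variance and standard deviation, n - 1 denominator).
variance = statistics.variance(scores)
sd = statistics.stdev(scores)
sem = sd / math.sqrt(len(scores))  # standard error of the mean

print(f"mean = {mean}, median = {median}, mode = {mode}")
print(f"variance = {variance:.2f}, SD = {sd:.2f}, SEM = {sem:.2f}")

# Adding one outlier: the mean moves a lot, the median barely changes.
with_outlier = scores + [90]
mean_with_outlier = statistics.mean(with_outlier)
median_with_outlier = statistics.median(with_outlier)
print(f"with outlier: mean = {mean_with_outlier:.2f}, median = {median_with_outlier}")
```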
INFERENTIAL STATISTICS
Inferential statistics can be used to test theories
(hypotheses), determine associations between
variables, determine if findings are statistically
significant, and judge whether or not we can
generalize from our sample to the entire population
The types of inferential statistics we will go over:
• Correlation
• T-tests/ANOVA
• Chi-square
• Logistic Regression
Type of Data & Analysis
• Analysis of Continuous Data
– Correlation
– T-tests
• Analysis of Categorical/Nominal Data
– Chi-square
– Logistic Regression
Correlation
• When to use it?
– When you want to know about the association or relationship
between two continuous variables
• Ex) food intake and weight; drug dosage and blood pressure; air temperature and
metabolic rate, etc.
• What does it tell you?
– If a linear relationship exists between two variables, and how strong that
relationship is
• What do the results look like?
– The correlation coefficient = Pearson’s r
– Ranges from -1 to +1
– See next slide for examples of correlation results
Correlation
Guide for interpreting strength of correlations:
• 0 – 0.25 = Little or no relationship
• 0.25 – 0.50 = Fair degree of relationship
• 0.50 – 0.75 = Moderate degree of relationship
• 0.75 – 1.0 = Strong relationship
• 1.0 = Perfect correlation
Correlation
• How do you interpret it?
– If r is positive, high values of one variable are associated with high values
of the other variable (both go in SAME direction - ↑↑ OR ↓↓)
• Ex) Diastolic blood pressure tends to rise with age, thus the two variables are
positively correlated
– If r is negative, low values of one variable are associated with high values
of the other variable (opposite direction - ↑↓ OR ↓ ↑)
• Ex) Heart rate tends to be lower in persons who exercise frequently,
thus the two variables are negatively correlated
– Correlation of 0 indicates NO linear relationship
• How do you report it?
– “Diastolic blood pressure was positively correlated with age (r = .75, p < .05).”
Tip: Correlation does NOT equal causation!!! Just because two variables are highly correlated, this
does NOT mean that one CAUSES the other!!!
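As a rough sketch, Pearson’s r can be computed directly from its definition, and the interpretation guide from the Correlation slides can be turned into a small lookup. The function names, the age and blood pressure values, and the handling of the boundary values (0.25, 0.50, 0.75, which the guide lists in two ranges each) are assumptions made for this example. The p-value is not computed here; a statistics package would normally report it.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def describe_strength(r):
    """Label |r| using the guide above (boundary handling is an assumption)."""
    size = abs(r)
    if math.isclose(size, 1.0):
        return "perfect correlation"
    if size < 0.25:
        return "little or no relationship"
    if size < 0.50:
        return "fair degree of relationship"
    if size < 0.75:
        return "moderate degree of relationship"
    return "strong relationship"


# Made-up ages (years) and diastolic blood pressures (mmHg).
age = [30, 40, 50, 60, 70]
diastolic_bp = [70, 74, 79, 82, 88]

r = pearson_r(age, diastolic_bp)
direction = "positive" if r > 0 else "negative"
print(f"r = {r:.2f} ({direction}, {describe_strength(r)})")
```

Even a strong r in data like these would only describe an association; it says nothing about whether age causes the change in blood pressure.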
T-tests
• When to use them?
– Paired t-tests: When comparing the MEANS of a continuous variable in
two non-independent samples (i.e., measurements on the same people
before and after a treatment)
• Ex) Is diet X effective in lowering serum cholesterol levels in a sample of 12
people?
• Ex) Do patients who receive drug X have lower blood pressure after
treatment than they did before treatment?
– Independent samples t-tests: To compare the MEANS of a
continuous variable in TWO independent samples (i.e., two different
groups of people)
• Ex) Do people with diabetes have the same Systolic Blood Pressure as
people without diabetes?
• Ex) Do patients who receive a new drug treatment have lower blood
pressure than those who receive a placebo?
Tip: If you have more than 2 groups, use ANOVA, which compares the means of 3 or more groups
T-tests
• What does a t-test tell you?
– If there is a statistically significant difference
between the mean score (or value) of two groups
(either the same group of people before and after
or two different groups of people)
• How do you interpret it?
– By looking at corresponding p-value
• If p < .05, means are significantly different from each
other
• If p ≥ .05, means are not significantly different from
each other
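The two kinds of t-test can be sketched as follows. This example computes only the t statistic and its degrees of freedom; the p-value comes from the t distribution, which is not in Python's standard library, so in practice it would be read from a t table or from statistics software. The independent-samples version uses the pooled-variance (Student's) form, which assumes similar variances in the two groups. All function names and data are made up for illustration.

```python
import math
import statistics


def paired_t(before, after):
    """t statistic and degrees of freedom for a paired t-test."""
    diffs = [a - b for b, a in zip(before, after)]
    n = len(diffs)
    standard_error = statistics.stdev(diffs) / math.sqrt(n)
    return statistics.mean(diffs) / standard_error, n - 1


def independent_t(group1, group2):
    """t statistic and degrees of freedom for a pooled-variance t-test."""
    n1, n2 = len(group1), len(group2)
    pooled_variance = (
        (n1 - 1) * statistics.variance(group1)
        + (n2 - 1) * statistics.variance(group2)
    ) / (n1 + n2 - 2)
    standard_error = math.sqrt(pooled_variance * (1 / n1 + 1 / n2))
    difference = statistics.mean(group1) - statistics.mean(group2)
    return difference / standard_error, n1 + n2 - 2


# Paired: made-up serum cholesterol in the same 6 people before and after a diet.
before = [220, 240, 210, 250, 230, 245]
after = [205, 228, 204, 235, 222, 230]
t_paired, df_paired = paired_t(before, after)
print(f"paired: t = {t_paired:.2f}, df = {df_paired}")

# Independent: made-up systolic blood pressure in two different groups of people.
new_drug = [128, 131, 125, 134, 129]
placebo = [138, 142, 135, 140, 137]
t_indep, df_indep = independent_t(new_drug, placebo)
print(f"independent: t = {t_indep:.2f}, df = {df_indep}")
```

A negative t here simply reflects the direction of subtraction (after minus before, or new drug minus placebo).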
How do you report t-test results?
“As can be seen in Figure 1, specialty candidates had significantly
higher scores on questions dealing with treatment than residency
candidates (t = [insert t-value from stats output], p < .001).”
“As can be seen in Figure 1, children’s mean reading
performance was significantly higher on the post-tests in
all four grades (t = [insert from stats output], p < .05).”
Chi-square
• When to use it?
– When you want to know if there is an association between two
categorical (nominal) variables (i.e., between an exposure and
outcome)
• Ex) Smoking (yes/no) and lung cancer (yes/no)
• Ex) Obesity (yes/no) and diabetes (yes/no)
• What does a chi-square test tell you?
– If the observed frequencies of occurrence in each group are
significantly different from expected frequencies (i.e., a
difference of proportions)
Chi-square
• What do the results look like?
– Chi-square test statistic = χ²
• How do you interpret it?
– Usually, the higher the chi-square statistic, the
greater likelihood the finding is significant, but you
must look at the corresponding p-value to
determine significance
Tip: Chi-square requires an expected count of 5 or more in each cell of a 2x2 table and in at least 80% of
cells in larger tables. No cell can have an expected count of zero.
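A minimal sketch of the chi-square calculation for a 2x2 table: expected counts come from the row and column totals, and the statistic sums (observed − expected)² / expected over the four cells. For a 2x2 table (1 degree of freedom) the p-value can be obtained from the complementary error function, which is what the code does; no continuity correction is applied. The function name and the smoking counts are made up for illustration.

```python
import math


def chi_square_2x2(table):
    """Chi-square statistic, p-value (df = 1), and expected counts for a 2x2 table."""
    (a, b), (c, d) = table
    n = a + b + c + d
    row_totals = [a + b, c + d]
    col_totals = [a + c, b + d]
    # Expected count for each cell = row total * column total / grand total.
    expected = [[r * col / n for col in col_totals] for r in row_totals]
    chi2 = sum(
        (obs - exp) ** 2 / exp
        for obs_row, exp_row in zip(table, expected)
        for obs, exp in zip(obs_row, exp_row)
    )
    # With 1 degree of freedom, P(chi-square > x) = erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value, expected


# Made-up counts: rows = smoker / non-smoker, columns = lung cancer yes / no.
observed = [[30, 70],
            [10, 90]]

chi2, p_value, expected = chi_square_2x2(observed)
smallest_expected = min(min(row) for row in expected)

print(f"chi-square = {chi2:.2f}, p = {p_value:.4f}")
print(f"smallest expected count = {smallest_expected:.0f}")
```

Checking the smallest expected count corresponds to the cell-size rule in the tip above.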
How do you report chi-square?
“Distribution of obesity by gender showed
that 171 (38.9%) and 75 (17%) of women
were overweight and obese (Type I &II),
respectively. Whilst 118 (37.3%) and 12
(3.8%) of men were overweight and obese
(Type I & II), respectively (Table-II).
The Chi square test shows that these
differences are statistically significant
(p<0.001).”
Logistic Regression
• When to use it?
– When you want to measure the strength and direction of
the association between two variables, where the
dependent or outcome variable is categorical (e.g., yes/no)
– When you want to predict the likelihood of an outcome
while controlling for confounders
• Ex) examine the relationship between health behavior (smoking,
exercise, low-fat diet) and arthritis (arthritis vs. no arthritis)
• Ex) Predict the probability of stroke in relation to gender while
controlling for age or hypertension
• What does it tell you?
– The odds of an event occurring: the probability of the
outcome event occurring divided by the probability of it
not occurring
Logistic Regression
• What do the results look like?
– Odds Ratios (OR) & 95% Confidence Intervals (CI)
• How do you interpret the results?
– Significance can be inferred by looking at confidence intervals:
• If the confidence interval does not cross 1 (e.g., 0.04 – 0.08 or 1.50 – 3.49),
then the result is significant
– If OR > 1 → The outcome is that many times MORE likely to occur
(strictly, the ODDS of the outcome are that many times higher)
• The independent variable may be a RISK FACTOR
• 1.50 = 50% more likely to experience event or 50% more at risk
• 2.0 = twice as likely
• 1.33 = 33% more likely
– If OR < 1 → The outcome is LESS likely to occur
• The independent variable may be a PROTECTIVE FACTOR
• 0.50 = 50% less likely to experience the event
• 0.75 = 25% less likely
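The ideas of odds, odds ratios, and confidence intervals can be illustrated with a single 2x2 table. This is a simplified sketch, not a logistic regression: it gives a crude (unadjusted) odds ratio with no control for confounders, and the 95% confidence interval uses the common log-based approximation with z = 1.96. The function names and counts are made up for illustration.

```python
import math


def odds(probability):
    """Odds = probability of the event / probability of no event."""
    return probability / (1 - probability)


def odds_ratio_ci(a, b, c, d, z=1.96):
    """Crude odds ratio with an approximate 95% CI from a 2x2 table.

    a = exposed with outcome,   b = exposed without outcome,
    c = unexposed with outcome, d = unexposed without outcome.
    """
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR), then back-transform the interval limits.
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


def crosses_one(lower, upper):
    """True if the confidence interval includes 1 (not significant)."""
    return lower <= 1.0 <= upper


# Made-up counts: smokers 30 with / 70 without the outcome,
# non-smokers 10 with / 90 without.
odds_ratio, lower, upper = odds_ratio_ci(30, 70, 10, 90)
significant = not crosses_one(lower, upper)

print(f"odds at probability 0.5 = {odds(0.5)}")
print(f"OR = {odds_ratio:.2f}, 95% CI {lower:.2f} to {upper:.2f}")
print("significant" if significant else "not significant")
```

A logistic regression fitted in statistics software would report the same kind of output, but with each odds ratio adjusted for the other variables in the model.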
How do you report Logistic Regression?
“Table 3 shows the effects of both statins and fibrates adjusted for the concomitant
conditions on the risk of peripheral neuropathy. With the exception of connective tissue
disease, significant increased risks were observed for all the other concomitant
conditions. Odds ratios associated with both statins and fibrates were also significant.”
[Figure: annotated results table from the quoted report. Annotations:
Confidence Interval crosses 1 → NOT SIGNIFICANT; 49% increased risk;
those taking lipid lowering drugs had greater risk for neuropathy;
control variables]
Summary of Statistical Tests
• Correlation
– Type of data needed: Two continuous variables
– Test statistic: Pearson’s r
– Example: Are blood pressure and weight correlated?
• T-tests/ANOVA
– Type of data needed: Means from a continuous variable taken from two or more groups
– Test statistic: Student’s t
– Example: Do normal weight (group 1) patients have lower blood pressure than obese
patients (group 2)?
• Chi-square
– Type of data needed: Two categorical variables
– Test statistic: Chi-square (χ²)
– Example: Are obese individuals (obese vs. not obese) significantly more likely to
have a stroke (stroke vs. no stroke)?
• Logistic Regression
– Type of data needed: A dichotomous variable as the outcome
– Test statistic: Odds Ratios (OR) & 95% Confidence Intervals (CI)
– Example: Does obesity predict stroke (stroke vs. no stroke) when controlling for other
variables?
Summary
• Descriptive statistics can be used with nominal, ordinal, interval
and ratio data
• Frequencies and percentages describe categorical data and
means and standard deviations describe continuous variables
• Inferential statistics can be used to determine associations
between variables and predict the likelihood of outcomes or
events
• Inferential statistics tell us if our findings are significant and if we
can infer from our sample to the larger population
Next Steps
• Think about the data that you have collected
or will collect as part of your research project
– What is your research question?
– What are you trying to get your data to “say”?
– Which statistical tests will best help you answer
your research question?
– Contact the research coordinator to discuss how
to analyze your data!
References
• Essential Medical Statistics. Kirkwood & Sterne, 2nd Edition.
2003
• http://ocw.tufts.edu/Content/1/lecturenotes/193325
• http://stattrek.com/AP-Statistics-1/Association.aspx?Tutorial=AP
• http://udel.edu/~mcdonald/statcentral.html
• Background to Statistics for Non-Statisticians. Powerpoint
Lecture. Dr. Craig Jackson , Prof. Occupational Health
Psychology , Faculty of Education, Law & Social Sciences, BCU.
ww.hcc.uce.ac.uk/craigjackson/Basic%20Statistics.ppt.
