STATISTICAL ANALYSIS
 Statistical analysis means investigating trends,
patterns, and relationships using quantitative data. It is
an important research tool used by scientists,
governments, businesses, and other organizations.
 To draw valid conclusions, statistical analysis requires
careful planning from the very start of the research
process. You need to specify your hypotheses and make
decisions about your research design, sample size, and
sampling procedure.
 After collecting data from your sample, you can organize
and summarize the data using descriptive statistics.
Then, you can use inferential statistics to formally test
hypotheses and make estimates about the population.
Finally, you can interpret and generalize your findings.
STEP 1: WRITE YOUR HYPOTHESES AND PLAN
YOUR RESEARCH DESIGN
 The goal of research is often to investigate a
relationship between variables within a population.
You start with a prediction, and use statistical
analysis to test that prediction.
 A statistical hypothesis is a formal way of writing a
prediction about a population. Every research
prediction is rephrased into null and alternative
hypotheses that can be tested using sample data.
 While the null hypothesis always predicts no effect
or no relationship between variables, the alternative
hypothesis states your research prediction of an
effect or relationship.
 Example: Statistical hypotheses to test an effect
 Null hypothesis: A 5-minute meditation
exercise will have no effect on math test scores in
teenagers.
 Alternative hypothesis: A 5-minute meditation
exercise will improve math test scores in teenagers.
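In symbols, the two hypotheses from the example can be sketched as follows. The notation is an assumption made here for illustration and does not come from the slides: \mu_{\text{med}} stands for the mean math test score of teenagers who do the meditation exercise, and \mu_{\text{ctrl}} for the mean score of those who do not.

```latex
H_0:\ \mu_{\text{med}} - \mu_{\text{ctrl}} = 0
\qquad
H_1:\ \mu_{\text{med}} - \mu_{\text{ctrl}} > 0
```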
PLANNING YOUR RESEARCH DESIGN
A research design is your overall strategy for data
collection and analysis. It determines the statistical tests
you can use to test your hypothesis later on.
 In an experimental design, you can assess a cause-
and-effect relationship (e.g., the effect of meditation on
test scores) using statistical tests of comparison or
regression.
 In a correlational design, you can explore relationships
between variables (e.g., parental income and GPA)
without any assumption of causality using correlation
coefficients and significance tests.
 In a descriptive design, you can study the
characteristics of a population or phenomenon (e.g., the
prevalence of anxiety in U.S. college students) using
statistical tests to draw inferences from sample data.
Your research design also concerns whether you’ll
compare participants at the group level or individual
level, or both.
 In a between-subjects design, you compare the
group-level outcomes of participants who have
been exposed to different treatments (e.g., those
who performed a meditation exercise vs those who
didn’t).
 In a within-subjects design, you compare
repeated measures from participants who have
participated in all treatments of a study (e.g., scores
from before and after performing a meditation
exercise).
MEASURING VARIABLES
When planning a research design, you
should operationalize your variables and decide
exactly how you will measure them.
For statistical analysis, it’s important to consider
the level of measurement of your variables, which
tells you what kind of data they contain:
 Categorical data represents groupings. These may
be nominal (e.g., gender) or ordinal (e.g. level of
language ability).
 Quantitative data represents amounts. These may
be on an interval scale (e.g. test score) or a ratio
scale (e.g. age).
 Many variables can be measured at different levels of
precision. For example, age data can be quantitative (8
years old) or categorical (young). If a variable is coded
numerically (e.g., level of agreement from 1–5), it
doesn’t automatically mean that it’s quantitative instead
of categorical.
 Identifying the measurement level is important for
choosing appropriate statistics and hypothesis tests. For
example, you can calculate a mean score with
quantitative data, but not with categorical data.
 In a research study, along with measures of your
variables of interest, you’ll often collect data on relevant
participant characteristics.
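As a rough illustration of why the measurement level matters, the sketch below picks a summary statistic that suits each level. All data values and variable names are hypothetical, and the pairing of statistic to level is a common convention rather than a rule stated in the slides.

```python
import statistics

# Hypothetical data at three levels of measurement.
gender = ["female", "male", "female", "female"]   # nominal categories
agreement = [1, 2, 2, 4, 5, 5, 5]                 # ordinal codes (1-5 scale)
test_scores = [62.0, 71.5, 80.0, 90.5]            # quantitative amounts

# Nominal: categories can only be counted, so report the mode.
gender_mode = statistics.mode(gender)

# Ordinal: the order is meaningful, so the median is also defensible,
# but the codes are labels, not amounts.
agreement_median = statistics.median(agreement)

# Quantitative: arithmetic on the values is meaningful, so a mean is fine.
score_mean = statistics.mean(test_scores)

print(gender_mode, agreement_median, score_mean)
```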
STEP 2: COLLECT DATA FROM A SAMPLE
Sampling for statistical analysis
There are two main approaches to selecting a
sample.
 Probability sampling: every member of the
population has a chance of being selected for the
study through random selection.
 Non-probability sampling: some members of the
population are more likely than others to be
selected for the study because of criteria such as
convenience or voluntary self-selection.
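Probability sampling can be sketched with a simple random sample, in which every member of a sampling frame has the same chance of selection. The function name, the population of 500 ID numbers, and the sample size of 50 are all hypothetical choices for illustration.

```python
import random

def simple_random_sample(population, n, seed=None):
    """Draw n distinct members, each with the same chance of selection."""
    rng = random.Random(seed)  # a fixed seed makes the draw reproducible
    return rng.sample(population, n)

# Hypothetical sampling frame: participant IDs 1 to 500.
population_ids = list(range(1, 501))
sample_ids = simple_random_sample(population_ids, 50, seed=42)
print(sorted(sample_ids)[:5])
```

A convenience sample, by contrast, would simply take whoever is easiest to reach (for example, the first 50 IDs), which is why it may not represent the population.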
Create an appropriate sampling procedure
Based on the resources available for your research,
decide on how you’ll recruit participants.
 Will you have resources to advertise your study
widely, including outside of your university setting?
 Will you have the means to recruit a diverse sample
that represents a broad population?
 Do you have time to contact and follow up with
members of hard-to-reach groups?
Calculate sufficient sample size
To calculate a sufficient sample size, you need to decide on or estimate the following components:
 Significance level (alpha): the risk of rejecting a
true null hypothesis that you are willing to take,
usually set at 5%.
 Statistical power: the probability of your study
detecting an effect of a certain size if there is one,
usually 80% or higher.
 Expected effect size: a standardized indication of
how large the expected result of your study will be,
usually based on other similar studies.
 Population standard deviation: an estimate of the
population parameter based on a previous study or
a pilot study of your own.
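The components above can be combined in a sample size formula. The sketch below uses one common normal-approximation formula for comparing the means of two equal-sized groups with a two-sided test; this formula is an assumption brought in for illustration, not one given in the slides. The effect size is standardized (the expected difference in means divided by the population standard deviation), and the function name is hypothetical.

```python
import math
from statistics import NormalDist

def sample_size_per_group(effect_size, alpha=0.05, power=0.80):
    """Approximate n per group for a two-sided comparison of two means."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # critical value for alpha
    z_power = NormalDist().inv_cdf(power)          # quantile for desired power
    n = 2 * ((z_alpha + z_power) / effect_size) ** 2
    return math.ceil(n)  # round up to a whole participant

# Medium standardized effect (0.5), alpha = 5%, power = 80%.
n_needed = sample_size_per_group(0.5)
print(n_needed)  # 63 per group under this approximation
```

Dedicated power-analysis software may give slightly larger numbers because it uses the t distribution rather than the normal approximation.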
STEP 3: SUMMARIZE YOUR DATA WITH
DESCRIPTIVE STATISTICS
Inspect your data
There are various ways to inspect your data,
including the following:
 Organizing data from each variable in frequency
distribution tables.
 Displaying data from a key variable in a bar
chart to view the distribution of responses.
 Visualizing the relationship between two variables
using a scatter plot.
 By visualizing your data in tables and graphs, you
can assess whether your data follow a skewed or
normal distribution and whether there are any
outliers or missing data.
 A normal distribution means that your data are
symmetrically distributed around a center where
most values lie, with the values tapering off at the
tail ends.
 In contrast, a skewed distribution is asymmetric
and has more values on one end than the other.
The shape of the distribution is important to keep in
mind because only some descriptive statistics
should be used with skewed distributions.
 Extreme outliers can also produce misleading
statistics, so you may need a systematic approach
to dealing with these values.
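The inspection steps above can be sketched in a few lines. The responses are made-up data, and the 1.5 x IQR fence used to flag outliers is one common convention assumed here, not a rule from the slides.

```python
from collections import Counter
import statistics

# Hypothetical responses with one extreme value.
responses = [3, 4, 4, 5, 5, 5, 6, 6, 7, 25]

# Frequency distribution table: value -> count.
frequency_table = Counter(responses)
for value in sorted(frequency_table):
    print(f"{value:>3}  {frequency_table[value]}")

# A mean well above the median hints at a right-skewed distribution.
mean_value = statistics.mean(responses)
median_value = statistics.median(responses)

# Flag values outside the 1.5 x IQR fences as potential outliers.
q1, _, q3 = statistics.quantiles(responses, n=4)
iqr = q3 - q1
low_fence, high_fence = q1 - 1.5 * iqr, q3 + 1.5 * iqr
outliers = [x for x in responses if x < low_fence or x > high_fence]
print(mean_value, median_value, outliers)
```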
CALCULATE MEASURES OF CENTRAL
TENDENCY
 Measures of central tendency describe where most
of the values in a data set lie. Three main measures
of central tendency are often reported:
 Mode: the most popular response or value in the
data set.
 Median: the value in the exact middle of the data
set when ordered from low to high.
 Mean: the sum of all values divided by the number
of values.
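The three measures can be computed directly with Python's standard statistics module. The scores below are hypothetical.

```python
import statistics

# Hypothetical quiz scores.
scores = [2, 3, 3, 5, 7]

score_mode = statistics.mode(scores)      # most frequent value
score_median = statistics.median(scores)  # middle value when sorted
score_mean = statistics.mean(scores)      # sum of values / number of values

print(score_mode, score_median, score_mean)  # 3 3 4
```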
CALCULATE MEASURES OF VARIABILITY
Measures of variability tell you how spread out the
values in a data set are. Four main measures of
variability are often reported:
 Range: the highest value minus the lowest value of
the data set.
 Interquartile range: the range of the middle half of
the data set.
 Standard deviation: the average distance between
each value in your data set and the mean.
 Variance: the square of the standard deviation.
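A minimal sketch of the four measures, using made-up data. Two details are assumptions of this sketch: the quartiles come from the default method of statistics.quantiles (other software may use slightly different quartile rules), and the standard deviation and variance are the sample versions, which divide by n - 1.

```python
import statistics

# Hypothetical data set.
values = [2, 4, 4, 5, 7, 8]

value_range = max(values) - min(values)          # highest minus lowest

q1, _, q3 = statistics.quantiles(values, n=4)    # quartile cut points
interquartile_range = q3 - q1                    # spread of the middle half

sample_sd = statistics.stdev(values)             # sample standard deviation
sample_variance = statistics.variance(values)    # square of the sample SD

print(value_range, interquartile_range, sample_sd, sample_variance)
```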
STEP 4: TEST HYPOTHESES OR MAKE ESTIMATES
WITH INFERENTIAL STATISTICS
 A number that describes a sample is called
a statistic, while a number describing a population
is called a parameter. Using inferential statistics,
you can make conclusions about population
parameters based on sample statistics.
Researchers often use two main methods
(simultaneously) to make inferences in statistics.
 Estimation: calculating population parameters
based on sample statistics.
 Hypothesis testing: a formal process for testing
research predictions about the population using
samples.
ESTIMATION
You can make two types of estimates of population
parameters from sample statistics:
 A point estimate: a value that represents your best
guess of the exact parameter.
 An interval estimate: a range of values that
represent your best guess of where the parameter
lies.
 If your aim is to infer and report population characteristics
from sample data, it’s best to use both point and interval
estimates in your paper.
 You can consider a sample statistic a point estimate for the
population parameter when you have a representative sample
(e.g., in a wide public opinion poll, the proportion of a sample
that supports the current government is taken as the
population proportion of government supporters).
 There’s always error involved in estimation, so you should
also provide a confidence interval as an interval estimate to
show the variability around a point estimate.
 A confidence interval uses the standard error and the z score
from the standard normal distribution to convey where you’d
generally expect to find the population parameter most of the
time.
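The description above translates into the calculation point estimate ± z × standard error. The sketch below applies it to a sample mean; the function name and the scores are hypothetical. It follows the slides in using a z score, although for small samples a critical value from the t distribution is usually preferred.

```python
import math
import statistics
from statistics import NormalDist

def mean_confidence_interval(sample, confidence=0.95):
    """Return (lower, upper) bounds of a z-based interval for the mean."""
    mean = statistics.mean(sample)                       # point estimate
    standard_error = statistics.stdev(sample) / math.sqrt(len(sample))
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)   # about 1.96 for 95%
    return mean - z * standard_error, mean + z * standard_error

# Hypothetical test scores from a sample of 10 participants.
scores = [72, 85, 78, 90, 66, 81, 77, 88, 74, 79]
lower, upper = mean_confidence_interval(scores)
print(round(lower, 2), round(upper, 2))
```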
HYPOTHESIS TESTING
 Using data from a sample, you can test
hypotheses about relationships between variables
in the population. Hypothesis testing starts with the
assumption that the null hypothesis is true in the
population, and you use statistical tests to assess
whether the null hypothesis can be rejected or not.
Statistical tests determine where your sample data
would lie on an expected distribution of sample data if
the null hypothesis were true. These tests give two
main outputs:
 A test statistic tells you how much your data differs
from the null hypothesis of the test.
 A p value tells you the likelihood of obtaining
results at least as extreme as yours if the null
hypothesis is actually true in the population.
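To make the two outputs concrete, here is a sketch of a one-sample z test, which compares a sample mean with a hypothesized population mean when the population standard deviation is treated as known. The function name and all numbers are hypothetical.

```python
import math
from statistics import NormalDist

def one_sample_z_test(sample_mean, null_mean, population_sd, n):
    """Return the z statistic and the two-sided p value."""
    standard_error = population_sd / math.sqrt(n)
    # Test statistic: distance from the null value in standard errors.
    z = (sample_mean - null_mean) / standard_error
    # p value: chance of a z at least this extreme if the null is true.
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value

# Hypothetical: sample mean 52 from n = 100, null mean 50, SD 10.
z_stat, p_value = one_sample_z_test(52, 50, 10, 100)
print(z_stat, round(p_value, 4))  # 2.0 0.0455
```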
Statistical tests come in three main varieties:
 Comparison tests assess group differences in
outcomes.
 Regression tests assess cause-and-effect
relationships between variables.
 Correlation tests assess relationships between
variables without assuming causation.
Parametric tests
 Parametric tests make powerful inferences about
the population based on sample data. But to use
them, some assumptions must be met, and only
some types of variables can be used. If your data
violate these assumptions, you can perform
appropriate data transformations or use alternative
non-parametric tests instead.
 A regression models the extent to which changes
in a predictor variable result in changes in
outcome variable(s).
 A simple linear regression includes one predictor
variable and one outcome variable.
 A multiple linear regression includes two or more
predictor variables and one outcome variable.
REGRESSION MODELS
 Regression models describe the relationship
between variables by fitting a line to the observed
data. Linear regression models use a straight line,
while logistic and nonlinear regression models use
a curved line. Regression allows you to estimate
how a dependent variable changes as the
independent variable(s) change.
Simple linear regression is used to estimate the
relationship between two quantitative variables.
You can use simple linear regression when you want
to know:
 How strong the relationship is between two
variables (e.g. the relationship between rainfall and
soil erosion).
 The value of the dependent variable at a certain
value of the independent variable (e.g. the amount
of soil erosion at a certain level of rainfall).
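Both questions can be answered from a least-squares fit. The sketch below computes the slope, intercept, and correlation coefficient by hand; the function name and the rainfall and erosion values are made up for illustration.

```python
import math

def fit_simple_linear_regression(x, y):
    """Return (slope, intercept, r) of the least-squares line y = a + b*x."""
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r = sxy / math.sqrt(sxx * syy)  # strength of the linear relationship
    return slope, intercept, r

# Hypothetical rainfall (independent) and soil erosion (dependent) values.
rainfall = [1, 2, 3, 4, 5]
erosion = [2, 4, 5, 4, 5]

slope, intercept, r = fit_simple_linear_regression(rainfall, erosion)
predicted_erosion = intercept + slope * 6  # prediction at rainfall = 6
print(slope, intercept, r, predicted_erosion)
```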
ASSUMPTIONS OF SIMPLE LINEAR REGRESSION
 Simple linear regression is a parametric test, meaning
that it makes certain assumptions about the data. These
assumptions are:
 Homogeneity of variance (homoscedasticity): the
size of the error in our prediction doesn’t change
significantly across the values of the independent
variable.
 Independence of observations: the observations in
the dataset were collected using statistically valid
sampling methods, and there are no hidden
relationships among observations.
 Normality: the data follow a normal distribution (in regression, this is checked on the residuals).
Comparison tests usually compare the means of
groups. These may be the means of different groups
within a sample (e.g., a treatment and control group),
the means of one sample group taken at different
times (e.g., pretest and posttest scores), or a sample
mean and a population mean.
 A t test is for exactly 1 or 2 groups when the sample
is small (30 or less).
 A z test is for exactly 1 or 2 groups when the
sample is large.
 An ANOVA is for 3 or more groups.
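As an illustration of a comparison test, the sketch below computes the test statistic for an independent two-sample t test with pooled variance (equal variances assumed). The groups and scores are hypothetical. Converting t to a p value requires the t distribution, which Python's standard library does not provide, so the sketch stops at the statistic and its degrees of freedom.

```python
import math
import statistics

def pooled_t_statistic(group_a, group_b):
    """Return (t, degrees of freedom) for two independent groups."""
    n_a, n_b = len(group_a), len(group_b)
    var_a = statistics.variance(group_a)
    var_b = statistics.variance(group_b)
    df = n_a + n_b - 2
    # Pooled variance: weighted average of the two sample variances.
    pooled_var = ((n_a - 1) * var_a + (n_b - 1) * var_b) / df
    standard_error = math.sqrt(pooled_var * (1 / n_a + 1 / n_b))
    t = (statistics.mean(group_a) - statistics.mean(group_b)) / standard_error
    return t, df

# Hypothetical scores for a meditation group and a control group.
meditation = [5, 6, 7, 8, 9]
control = [3, 4, 5, 6, 7]

t_stat, df = pooled_t_statistic(meditation, control)
print(t_stat, df)
```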
STEP 5: INTERPRET YOUR RESULTS
Statistical significance
 In hypothesis testing, statistical significance is the
main criterion for forming conclusions. You compare
your p value to a set significance level (usually
0.05) to decide whether your results are statistically
significant or non-significant.
 Statistically significant results are considered
unlikely to have arisen solely due to chance. There
is only a very low chance of such a result occurring
if the null hypothesis is true in the population.
Effect size
 A statistically significant result doesn’t necessarily
mean that there are important real life applications
or clinical outcomes for a finding.
 In contrast, the effect size indicates the practical
significance of your results. It’s important to report
effect sizes along with your inferential statistics for
a complete picture of your results. You should also
report interval estimates of effect sizes if you’re
writing an APA style paper.
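One widely used effect size for a difference between two group means is Cohen's d, the difference in means divided by the pooled standard deviation. The slides do not name a specific measure, so the choice of Cohen's d, the function name, and the scores below are assumptions made for illustration.

```python
import math
import statistics

def cohens_d(group_a, group_b):
    """Standardized mean difference using the pooled standard deviation."""
    n_a, n_b = len(group_a), len(group_b)
    pooled_var = (
        (n_a - 1) * statistics.variance(group_a)
        + (n_b - 1) * statistics.variance(group_b)
    ) / (n_a + n_b - 2)
    mean_difference = statistics.mean(group_a) - statistics.mean(group_b)
    return mean_difference / math.sqrt(pooled_var)

# Hypothetical scores for a meditation group and a control group.
meditation = [5, 6, 7, 8, 9]
control = [3, 4, 5, 6, 7]

d = cohens_d(meditation, control)
print(round(d, 3))
```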
Decision errors
 Type I and Type II errors are mistakes made in
research conclusions. A Type I error means
rejecting the null hypothesis when it’s actually true,
while a Type II error means failing to reject the null
hypothesis when it’s false.
 You can aim to minimize the risk of these errors by
selecting an optimal significance level and ensuring
high power. However, there’s a trade-off between
the two errors, so a fine balance is necessary.
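A small simulation can show what the significance level means for Type I errors. In the sketch below the null hypothesis is true by construction (every sample is drawn from a population with mean 0 and standard deviation 1), so each rejection is a Type I error, and the rejection rate should land near alpha. The function name, sample size, number of trials, and seed are arbitrary choices for illustration.

```python
import math
import random
from statistics import NormalDist

def type_i_error_rate(alpha=0.05, n=30, trials=2000, seed=1):
    """Share of z tests that reject a null hypothesis that is true."""
    rng = random.Random(seed)
    critical_z = NormalDist().inv_cdf(1 - alpha / 2)
    rejections = 0
    for _ in range(trials):
        # Draw a sample from a population where the null is true.
        sample = [rng.gauss(0, 1) for _ in range(n)]
        z = (sum(sample) / n) / (1 / math.sqrt(n))
        if abs(z) > critical_z:
            rejections += 1  # a Type I error
    return rejections / trials

rate = type_i_error_rate()
print(rate)  # expected to be close to 0.05
```

Lowering alpha reduces this rate but, for a fixed sample size, makes real effects harder to detect, which is the trade-off with Type II errors.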
FREQUENTIST VERSUS BAYESIAN STATISTICS
 Traditionally, frequentist statistics emphasizes null
hypothesis significance testing and always starts
with the assumption of a true null hypothesis.
 However, Bayesian statistics has grown in
popularity as an alternative approach in the last few
decades. In this approach, you use previous
research to continually update your hypotheses
based on your expectations and observations.
 A Bayes factor compares the relative strength of
evidence for the null versus the alternative
hypothesis rather than making a conclusion about
rejecting the null hypothesis or not.
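Written as a formula, a Bayes factor is the ratio of how well each hypothesis predicts the observed data. The notation below is an assumption for illustration: D is the observed data, and the subscript 10 means the alternative hypothesis is in the numerator.

```latex
BF_{10} = \frac{P(D \mid H_1)}{P(D \mid H_0)}
```

Values above 1 favor the alternative hypothesis, and values below 1 favor the null.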
Beyond homoscedasticity, independence, and normality, linear regression makes one additional assumption:
 The relationship between the independent and
dependent variable is linear: the line of best fit
through the data points is a straight line (rather than
a curve or some sort of grouping factor).
 Multiple linear regression is used to estimate the
relationship between two or more independent
variables and one dependent variable. You can
use multiple linear regression when you want to
know:
 How strong the relationship is between two or more
independent variables and one dependent variable
(e.g. how rainfall, temperature, and amount of
fertilizer added affect crop growth).
 The value of the dependent variable at a certain
value of the independent variables (e.g. the
expected yield of a crop at certain levels of rainfall,
temperature, and fertilizer addition).
ASSUMPTIONS OF MULTIPLE LINEAR
REGRESSION
Multiple linear regression makes all of the same
assumptions as simple linear regression:
 Homogeneity of variance (homoscedasticity):
the size of the error in our prediction doesn’t
change significantly across the values of the
independent variables.
 Independence of observations: the observations
in the dataset were collected using statistically valid
sampling methods, and there are no hidden
relationships among observations.
In multiple linear regression, it is possible that some
of the independent variables are actually correlated
with one another, so it is important to check these
before developing the regression model. If two
independent variables are too highly correlated (r² >
~0.6), then only one of them should be used in the
regression model.
 Normality: the data follow a normal distribution (in regression, this is checked on the residuals).
 Linearity: the line of best fit through the data points
is a straight line, rather than a curve or some sort of
grouping factor.
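The check for correlated independent variables described above can be sketched as a pairwise comparison against the r² > ~0.6 rule of thumb. The predictor names follow the crop example, but the values and function name are made up for illustration.

```python
import math
from itertools import combinations

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical values for three independent variables.
predictors = {
    "rainfall": [1, 2, 3, 4, 5],
    "temperature": [1, 3, 2, 5, 4],
    "fertilizer": [5, 1, 4, 2, 3],
}

# Flag every pair whose squared correlation exceeds the 0.6 threshold.
flagged = [
    (a, b)
    for a, b in combinations(predictors, 2)
    if pearson_r(predictors[a], predictors[b]) ** 2 > 0.6
]
print(flagged)  # [('rainfall', 'temperature')]
```

For a flagged pair, only one of the two variables would be kept in the regression model.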
REVISTA DE BIOLOGIA E CIÊNCIAS DA TERRA ISSN 1519-5228 - Artigo_Bioterra_V24_...
 
GenBio2 - Lesson 1 - Introduction to Genetics.pptx
GenBio2 - Lesson 1 - Introduction to Genetics.pptxGenBio2 - Lesson 1 - Introduction to Genetics.pptx
GenBio2 - Lesson 1 - Introduction to Genetics.pptx
 
STOPPED FLOW METHOD & APPLICATION MURUGAVENI B.pptx
STOPPED FLOW METHOD & APPLICATION MURUGAVENI B.pptxSTOPPED FLOW METHOD & APPLICATION MURUGAVENI B.pptx
STOPPED FLOW METHOD & APPLICATION MURUGAVENI B.pptx
 
Volatile Oils Pharmacognosy And Phytochemistry -I
Volatile Oils Pharmacognosy And Phytochemistry -IVolatile Oils Pharmacognosy And Phytochemistry -I
Volatile Oils Pharmacognosy And Phytochemistry -I
 
Observational constraints on mergers creating magnetism in massive stars
Observational constraints on mergers creating magnetism in massive starsObservational constraints on mergers creating magnetism in massive stars
Observational constraints on mergers creating magnetism in massive stars
 
CHROMATOGRAPHY PALLAVI RAWAT.pptx
CHROMATOGRAPHY  PALLAVI RAWAT.pptxCHROMATOGRAPHY  PALLAVI RAWAT.pptx
CHROMATOGRAPHY PALLAVI RAWAT.pptx
 
Harmful and Useful Microorganisms Presentation
Harmful and Useful Microorganisms PresentationHarmful and Useful Microorganisms Presentation
Harmful and Useful Microorganisms Presentation
 
Let’s Say Someone Did Drop the Bomb. Then What?
Let’s Say Someone Did Drop the Bomb. Then What?Let’s Say Someone Did Drop the Bomb. Then What?
Let’s Say Someone Did Drop the Bomb. Then What?
 

Statistical Analysis in a Nutshell

  • 2. Statistical analysis means investigating trends, patterns, and relationships using quantitative data. It is an important research tool used by scientists, governments, businesses, and other organizations. To draw valid conclusions, statistical analysis requires careful planning from the very start of the research process. You need to specify your hypotheses and make decisions about your research design, sample size, and sampling procedure. After collecting data from your sample, you can organize and summarize the data using descriptive statistics. Then, you can use inferential statistics to formally test hypotheses and make estimates about the population. Finally, you can interpret and generalize your findings.
  • 3. STEP 1: WRITE YOUR HYPOTHESES AND PLAN YOUR RESEARCH DESIGN The goal of research is often to investigate a relationship between variables within a population. You start with a prediction, and use statistical analysis to test that prediction. A statistical hypothesis is a formal way of writing a prediction about a population. Every research prediction is rephrased into null and alternative hypotheses that can be tested using sample data. While the null hypothesis always predicts no effect or no relationship between variables, the alternative hypothesis states your research prediction of an effect or relationship.
  • 4. Example: Statistical hypotheses to test an effect. Null hypothesis: A 5-minute meditation exercise will have no effect on math test scores in teenagers. Alternative hypothesis: A 5-minute meditation exercise will improve math test scores in teenagers.
  • 5. PLANNING YOUR RESEARCH DESIGN A research design is your overall strategy for data collection and analysis. It determines the statistical tests you can use to test your hypothesis later on. In an experimental design, you can assess a cause-and-effect relationship (e.g., the effect of meditation on test scores) using statistical tests of comparison or regression. In a correlational design, you can explore relationships between variables (e.g., parental income and GPA) without any assumption of causality using correlation coefficients and significance tests. In a descriptive design, you can study the characteristics of a population or phenomenon (e.g., the prevalence of anxiety in U.S. college students) using statistical tests to draw inferences from sample data.
  • 6. Your research design also concerns whether you’ll compare participants at the group level or individual level, or both. In a between-subjects design, you compare the group-level outcomes of participants who have been exposed to different treatments (e.g., those who performed a meditation exercise vs those who didn’t). In a within-subjects design, you compare repeated measures from participants who have participated in all treatments of a study (e.g., scores from before and after performing a meditation exercise).
  • 7. MEASURING VARIABLES When planning a research design, you should operationalize your variables and decide exactly how you will measure them. For statistical analysis, it’s important to consider the level of measurement of your variables, which tells you what kind of data they contain: Categorical data represents groupings. These may be nominal (e.g., gender) or ordinal (e.g., level of language ability). Quantitative data represents amounts. These may be on an interval scale (e.g., test score) or a ratio scale (e.g., age).
  • 8. Many variables can be measured at different levels of precision. For example, age data can be quantitative (8 years old) or categorical (young). If a variable is coded numerically (e.g., level of agreement from 1–5), it doesn’t automatically mean that it’s quantitative instead of categorical. Identifying the measurement level is important for choosing appropriate statistics and hypothesis tests. For example, you can calculate a mean score with quantitative data, but not with categorical data. In a research study, along with measures of your variables of interest, you’ll often collect data on relevant participant characteristics.
  • 9. STEP 2: COLLECT DATA FROM A SAMPLE Sampling for statistical analysis There are two main approaches to selecting a sample. Probability sampling: every member of the population has a chance of being selected for the study through random selection. Non-probability sampling: some members of the population are more likely than others to be selected for the study because of criteria such as convenience or voluntary self-selection.
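As a minimal sketch of probability sampling, the snippet below draws a simple random sample from a sampling frame. The frame (ID numbers 1 to 1,000), the sample size of 50, and the seed are all hypothetical values chosen for illustration.

```python
import random

# Hypothetical sampling frame: ID numbers for 1,000 population members.
population = list(range(1, 1001))

# A fixed seed makes the draw reproducible; any seed would do.
rng = random.Random(42)

# Simple random sampling without replacement: every member of the frame
# has the same chance of ending up in the sample.
sample = rng.sample(population, k=50)

print(len(sample), "members selected, for example:", sorted(sample)[:5])
```

A convenience sample, by contrast, would take whichever members are easiest to reach, so selection chances would differ between members.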
  • 10. Create an appropriate sampling procedure Based on the resources available for your research, decide on how you’ll recruit participants. Will you have resources to advertise your study widely, including outside of your university setting? Will you have the means to recruit a diverse sample that represents a broad population? Do you have time to contact and follow up with members of hard-to-reach groups?
  • 11. Calculate sufficient sample size A sample size calculation depends on four key inputs: Significance level (alpha): the risk of rejecting a true null hypothesis that you are willing to take, usually set at 5%. Statistical power: the probability of your study detecting an effect of a certain size if there is one, usually 80% or higher. Expected effect size: a standardized indication of how large the expected result of your study will be, usually based on other similar studies. Population standard deviation: an estimate of the population parameter based on a previous study or a pilot study of your own.
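To show how these inputs combine, here is a rough sketch of a sample size calculation. It assumes a two-sided comparison of two group means and uses the common normal-approximation formula n = 2 * ((z_alpha + z_power) / d)^2 per group, where d is the standardized effect size (expected mean difference divided by the population standard deviation). The function name and the example effect size of 0.5 are illustrative, not taken from the slides.

```python
from math import ceil
from statistics import NormalDist


def sample_size_per_group(effect_size, alpha=0.05, power=0.80):
    """Approximate participants needed per group (normal approximation)."""
    standard_normal = NormalDist()
    # Critical value for a two-sided test at the chosen significance level.
    z_alpha = standard_normal.inv_cdf(1 - alpha / 2)
    # Quantile corresponding to the desired statistical power.
    z_power = standard_normal.inv_cdf(power)
    return ceil(2 * ((z_alpha + z_power) / effect_size) ** 2)


# Hypothetical medium standardized effect (d = 0.5), 5% alpha, 80% power.
n = sample_size_per_group(effect_size=0.5)
print(n)  # 63 per group under this approximation
```

Smaller expected effects need far larger samples: with d = 0.2 the same function returns 393 per group. Dedicated power-analysis tools based on the t distribution give slightly larger numbers.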
  • 12. STEP 3: SUMMARIZE YOUR DATA WITH DESCRIPTIVE STATISTICS Inspect your data There are various ways to inspect your data, including the following: Organizing data from each variable in frequency distribution tables. Displaying data from a key variable in a bar chart to view the distribution of responses. Visualizing the relationship between two variables using a scatter plot.
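A frequency distribution table can be sketched in a few lines. The responses below are made-up answers to a 1–5 agreement item, and the text bars are only a stand-in for a proper bar chart.

```python
from collections import Counter

# Hypothetical responses to a 1-5 agreement item.
responses = [3, 4, 4, 5, 2, 3, 4, 1, 4, 3, 5, 4]

# Count how often each response value occurs.
freq = Counter(responses)

# Print value, count, share of all responses, and a crude text bar.
for value in sorted(freq):
    share = freq[value] / len(responses)
    print(f"{value}: {freq[value]:>2}  ({share:.0%})  {'#' * freq[value]}")
```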
  • 13. By visualizing your data in tables and graphs, you can assess whether your data follow a skewed or normal distribution and whether there are any outliers or missing data. A normal distribution means that your data are symmetrically distributed around a center where most values lie, with the values tapering off at the tail ends.
  • 14.
  • 15. In contrast, a skewed distribution is asymmetric and has more values on one end than the other. The shape of the distribution is important to keep in mind because only some descriptive statistics should be used with skewed distributions. Extreme outliers can also produce misleading statistics, so you may need a systematic approach to dealing with these values.
  • 16. CALCULATE MEASURES OF CENTRAL TENDENCY Measures of central tendency describe where most of the values in a data set lie. Three main measures of central tendency are often reported: Mode: the most frequent response or value in the data set. Median: the value in the exact middle of the data set when ordered from low to high. Mean: the sum of all values divided by the number of values.
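The three measures can be computed with Python's standard `statistics` module. The test scores below are invented, with one deliberately high value to show how an outlier pulls the mean away from the median.

```python
from statistics import mean, median, multimode

# Hypothetical test scores; 98 is an outlier at the high end.
scores = [55, 60, 60, 65, 70, 75, 98]

# multimode returns every most-frequent value, so ties are not hidden.
modes = multimode(scores)
middle = median(scores)
average = mean(scores)

print("mode(s):", modes)   # [60]
print("median:", middle)   # 65
print("mean:", average)    # 69
```

Here the mean (69) sits above the median (65) because of the single high score, which is one reason the median is often preferred for skewed data.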
  • 17. CALCULATE MEASURES OF VARIABILITY Measures of variability tell you how spread out the values in a data set are. Four main measures of variability are often reported: Range: the highest value minus the lowest value of the data set. Interquartile range: the range of the middle half of the data set. Standard deviation: roughly the typical distance between each value in your data set and the mean (the square root of the variance). Variance: the square of the standard deviation.
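A short sketch of the four measures, again on invented scores. It assumes sample (n - 1) formulas for the standard deviation and variance, and the default "exclusive" quartile method of `statistics.quantiles`; other software may use slightly different quartile conventions.

```python
from statistics import quantiles, stdev, variance

# Hypothetical test scores.
scores = [55, 60, 60, 65, 70, 75, 98]

# Range: highest value minus lowest value.
data_range = max(scores) - min(scores)

# Quartiles cut the ordered data into four parts; the IQR spans the middle half.
q1, q2, q3 = quantiles(scores, n=4)
iqr = q3 - q1

# Sample standard deviation and variance (divide by n - 1).
sd = stdev(scores)
var = variance(scores)

print("range:", data_range)            # 43
print("interquartile range:", iqr)     # 15.0
print("standard deviation:", round(sd, 2))
print("variance:", round(var, 2))
```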
  • 18. STEP 4: TEST HYPOTHESES OR MAKE ESTIMATES WITH INFERENTIAL STATISTICS A number that describes a sample is called a statistic, while a number describing a population is called a parameter. Using inferential statistics, you can make conclusions about population parameters based on sample statistics.
  • 19. Researchers often use two main methods (simultaneously) to make inferences in statistics. Estimation: calculating population parameters based on sample statistics. Hypothesis testing: a formal process for testing research predictions about the population using samples.
  • 20. ESTIMATION You can make two types of estimates of population parameters from sample statistics: A point estimate: a value that represents your best guess of the exact parameter. An interval estimate: a range of values that represent your best guess of where the parameter lies.
  • 21. If your aim is to infer and report population characteristics from sample data, it’s best to use both point and interval estimates in your paper. You can consider a sample statistic a point estimate for the population parameter when you have a representative sample (e.g., in a wide public opinion poll, the proportion of a sample that supports the current government is taken as the population proportion of government supporters). There’s always error involved in estimation, so you should also provide a confidence interval as an interval estimate to show the variability around a point estimate. A confidence interval uses the standard error and the z score from the standard normal distribution to convey where you’d generally expect to find the population parameter most of the time.
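The point and interval estimates described here can be sketched as follows. The function name and the scores are hypothetical, and the interval uses the z score as the slide describes; for small samples a critical value from the t distribution is the more usual choice.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev


def z_confidence_interval(values, confidence=0.95):
    """Return the sample mean and a z-based confidence interval for it."""
    n = len(values)
    point_estimate = mean(values)
    # Standard error of the mean: sample SD divided by sqrt(n).
    standard_error = stdev(values) / sqrt(n)
    # z score that leaves (1 - confidence) / 2 in each tail, about 1.96 for 95%.
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z * standard_error
    return point_estimate, (point_estimate - margin, point_estimate + margin)


# Hypothetical test scores from a sample of ten participants.
scores = [72, 75, 78, 80, 81, 83, 85, 86, 88, 92]
point, (low, high) = z_confidence_interval(scores)
print(f"point estimate: {point:.1f}, 95% CI: ({low:.1f}, {high:.1f})")
```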
  • 22. HYPOTHESIS TESTING Using data from a sample, you can test hypotheses about relationships between variables in the population. Hypothesis testing starts with the assumption that the null hypothesis is true in the population, and you use statistical tests to assess whether the null hypothesis can be rejected or not.
  • 23. Statistical tests determine where your sample data would lie on an expected distribution of sample data if the null hypothesis were true. These tests give two main outputs: A test statistic tells you how much your data differs from the null hypothesis of the test. A p value tells you the likelihood of obtaining results at least as extreme as yours if the null hypothesis is actually true in the population.
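As an illustration of these two outputs, the sketch below runs a one-sample z test. It assumes the population standard deviation is known, and all numbers (a sample mean of 78 from 64 students, a null mean of 75, a standard deviation of 10) are invented.

```python
from math import sqrt
from statistics import NormalDist


def one_sample_z_test(sample_mean, null_mean, sigma, n):
    """Return the z test statistic and the two-sided p value."""
    # Test statistic: distance from the null value in standard-error units.
    z = (sample_mean - null_mean) / (sigma / sqrt(n))
    # Two-sided p value: chance of a statistic at least this extreme
    # in either direction if the null hypothesis is true.
    p = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p


z, p = one_sample_z_test(sample_mean=78.0, null_mean=75.0, sigma=10.0, n=64)
print(f"z = {z:.2f}, p = {p:.4f}")  # z = 2.40, p is about 0.016
```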
  • 24. Statistical tests come in three main varieties: Comparison tests assess group differences in outcomes. Regression tests assess how changes in predictor variables relate to changes in an outcome variable; whether this can be read as cause and effect depends on the research design. Correlation tests assess relationships between variables without assuming causation.
  • 25. Parametric tests Parametric tests make powerful inferences about the population based on sample data. But to use them, some assumptions must be met, and only some types of variables can be used. If your data violate these assumptions, you can perform appropriate data transformations or use alternative non-parametric tests instead.
  • 26. A regression models the extent to which changes in a predictor variable result in changes in an outcome variable. A simple linear regression includes one predictor variable and one outcome variable. A multiple linear regression includes two or more predictor variables and one outcome variable.
  • 27. REGRESSION MODELS Regression models describe the relationship between variables by fitting a line to the observed data. Linear regression models use a straight line, while logistic and nonlinear regression models use a curved line. Regression allows you to estimate how a dependent variable changes as the independent variable(s) change.
  • 28. Simple linear regression is used to estimate the relationship between two quantitative variables. You can use simple linear regression when you want to know: How strong the relationship is between two variables (e.g., the relationship between rainfall and soil erosion). The value of the dependent variable at a certain value of the independent variable (e.g., the amount of soil erosion at a certain level of rainfall).
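Both questions can be answered with an ordinary least-squares fit, sketched here by hand so the arithmetic is visible. The rainfall and erosion values are made up, and `fit_line` is an illustrative helper, not a library function.

```python
from statistics import mean


def fit_line(x, y):
    """Least-squares slope, intercept, and R-squared for y = a + b * x."""
    x_bar, y_bar = mean(x), mean(y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    syy = sum((yi - y_bar) ** 2 for yi in y)
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    # R-squared: share of the variation in y explained by the line.
    r_squared = sxy ** 2 / (sxx * syy)
    return slope, intercept, r_squared


# Hypothetical data: rainfall (mm) and soil erosion (arbitrary units).
rainfall = [10, 20, 30, 40, 50]
erosion = [1.2, 1.9, 3.2, 3.8, 5.1]

slope, intercept, r_squared = fit_line(rainfall, erosion)
print(f"erosion = {intercept:.3f} + {slope:.3f} * rainfall, R^2 = {r_squared:.3f}")

# Predicted erosion at a rainfall level inside the observed range.
predicted = intercept + slope * 35
print(f"predicted erosion at 35 mm: {predicted:.3f}")
```

R-squared addresses the first question (strength of the relationship), and plugging a rainfall value into the fitted line addresses the second.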
  • 29. ASSUMPTIONS OF SIMPLE LINEAR REGRESSION Simple linear regression is a parametric test, meaning that it makes certain assumptions about the data. These assumptions are: Homogeneity of variance (homoscedasticity): the size of the error in our prediction doesn’t change significantly across the values of the independent variable. Independence of observations: the observations in the dataset were collected using statistically valid sampling methods, and there are no hidden relationships among observations. Normality: the data (strictly, the residuals of the model) follow a normal distribution.
  • 30. Comparison tests usually compare the means of groups. These may be the means of different groups within a sample (e.g., a treatment and control group), the means of one sample group taken at different times (e.g., pretest and posttest scores), or a sample mean and a population mean. A t test is for exactly 1 or 2 groups when the sample is small (30 or fewer). A z test is for exactly 1 or 2 groups when the sample is large. An ANOVA is for 3 or more groups.
  • 31. STEP 5: INTERPRET YOUR RESULTS Statistical significance In hypothesis testing, statistical significance is the main criterion for forming conclusions. You compare your p value to a set significance level (usually 0.05) to decide whether your results are statistically significant or non-significant. Statistically significant results are considered unlikely to have arisen solely due to chance. There is only a very low chance of such a result occurring if the null hypothesis is true in the population.
  • 32. Effect size A statistically significant result doesn’t necessarily mean that there are important real life applications or clinical outcomes for a finding. In contrast, the effect size indicates the practical significance of your results. It’s important to report effect sizes along with your inferential statistics for a complete picture of your results. You should also report interval estimates of effect sizes if you’re writing an APA style paper.
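The slides do not name a specific effect size measure. As one common choice for comparing two group means, the sketch below computes Cohen's d (the mean difference divided by the pooled standard deviation). The two groups of scores are invented.

```python
from math import sqrt
from statistics import mean, variance


def cohens_d(group_a, group_b):
    """Standardized mean difference using the pooled sample variance."""
    n_a, n_b = len(group_a), len(group_b)
    pooled_variance = (
        (n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)
    ) / (n_a + n_b - 2)
    return (mean(group_a) - mean(group_b)) / sqrt(pooled_variance)


# Hypothetical math scores for a meditation group and a control group.
meditation = [78, 82, 85, 88, 90]
control = [72, 75, 80, 83, 85]

d = cohens_d(meditation, control)
print(f"Cohen's d = {d:.2f}")
```

By Cohen's widely used rule of thumb, values around 0.2, 0.5, and 0.8 are read as small, medium, and large effects, though what counts as practically important depends on the field.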
  • 33. Decision errors Type I and Type II errors are mistakes made in research conclusions. A Type I error means rejecting the null hypothesis when it’s actually true, while a Type II error means failing to reject the null hypothesis when it’s false. You can aim to minimize the risk of these errors by selecting an optimal significance level and ensuring high power. However, there’s a trade-off between the two errors, so a fine balance is necessary.
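A small simulation can make the link between the significance level and Type I errors concrete. The sketch below repeatedly samples from a population where the null hypothesis is true (mean 0, standard deviation 1) and counts how often a two-sided z test rejects it anyway. The sample size, number of trials, and seed are arbitrary choices.

```python
import random
from math import sqrt
from statistics import NormalDist


def type_i_error_rate(alpha=0.05, n=30, trials=2000, seed=1):
    """Share of simulated studies that reject a null hypothesis that is true."""
    rng = random.Random(seed)
    z_critical = NormalDist().inv_cdf(1 - alpha / 2)
    rejections = 0
    for _ in range(trials):
        # Every sample comes from a population whose true mean is 0.
        sample = [rng.gauss(0.0, 1.0) for _ in range(n)]
        # z statistic with known sigma = 1, so the standard error is 1 / sqrt(n).
        z = (sum(sample) / n) / (1.0 / sqrt(n))
        if abs(z) > z_critical:
            rejections += 1  # a false positive, i.e. a Type I error
    return rejections / trials


rate = type_i_error_rate()
print(f"observed Type I error rate: {rate:.3f}")
```

The observed rate should land close to the chosen alpha of 0.05. Lowering alpha reduces false positives but, for a fixed sample size, also lowers power and so raises the Type II error risk.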
  • 34. FREQUENTIST VERSUS BAYESIAN STATISTICS Traditionally, frequentist statistics emphasizes null hypothesis significance testing and always starts with the assumption of a true null hypothesis. However, Bayesian statistics has grown in popularity as an alternative approach in the last few decades. In this approach, you use previous research to continually update your hypotheses based on your expectations and observations. A Bayes factor compares the relative strength of evidence for the null versus the alternative hypothesis rather than making a conclusion about rejecting the null hypothesis or not.
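To make the idea of a Bayes factor concrete, here is a deliberately simple sketch for binomial data. It assumes a point null hypothesis (success probability 0.5) and, under the alternative, a uniform prior on the success probability; both are illustrative modelling choices, and the counts are invented.

```python
from math import comb


def bayes_factor_10(successes, n, null_p=0.5):
    """Evidence for H1 (p unknown, uniform prior) relative to H0 (p = null_p)."""
    # Probability of the observed count under the point null hypothesis.
    likelihood_null = (
        comb(n, successes) * null_p ** successes * (1 - null_p) ** (n - successes)
    )
    # Averaging the binomial likelihood over a uniform prior on p
    # gives exactly 1 / (n + 1) for any observed count.
    marginal_alternative = 1 / (n + 1)
    return marginal_alternative / likelihood_null


# Hypothetical result: 15 of 20 students improved after the exercise.
bf = bayes_factor_10(successes=15, n=20)
print(f"BF10 = {bf:.2f}")  # about 3.2: the data favor H1 roughly 3 to 1
```

Values above 1 favor the alternative and values below 1 favor the null. With 10 improvements out of 20, the same function returns about 0.27, which is evidence for the null rather than a mere failure to reject it.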
  • 35. Linear regression makes one additional assumption: The relationship between the independent and dependent variable is linear: the line of best fit through the data points is a straight line (rather than a curve or some sort of grouping factor).
  • 36. Multiple linear regression is used to estimate the relationship between two or more independent variables and one dependent variable. You can use multiple linear regression when you want to know: How strong the relationship is between two or more independent variables and one dependent variable (e.g., how rainfall, temperature, and amount of fertilizer added affect crop growth). The value of the dependent variable at a certain value of the independent variables (e.g., the expected yield of a crop at certain levels of rainfall, temperature, and fertilizer addition).
  • 37. ASSUMPTIONS OF MULTIPLE LINEAR REGRESSION Multiple linear regression makes all of the same assumptions as simple linear regression: Homogeneity of variance (homoscedasticity): the size of the error in our prediction doesn’t change significantly across the values of the independent variable. Independence of observations: the observations in the dataset were collected using statistically valid methods, and there are no hidden relationships among variables.
  • 38. In multiple linear regression, it is possible that some of the independent variables are actually correlated with one another, so it is important to check these before developing the regression model. If two independent variables are too highly correlated (r² > ~0.6), then only one of them should be used in the regression model. Normality: the data (strictly, the residuals of the model) follow a normal distribution. Linearity: the line of best fit through the data points is a straight line, rather than a curve or some sort of grouping factor.
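The correlation check described here can be sketched as a pairwise screen of the predictors. The 0.6 cutoff comes from the slide; the predictor values and the helper `r_squared` are invented for illustration.

```python
from itertools import combinations
from statistics import mean


def r_squared(x, y):
    """Squared Pearson correlation between two equally long sequences."""
    x_bar, y_bar = mean(x), mean(y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    syy = sum((yi - y_bar) ** 2 for yi in y)
    return sxy ** 2 / (sxx * syy)


# Hypothetical predictor measurements from five field plots.
predictors = {
    "rainfall": [10, 20, 30, 40, 50],
    "temperature": [12, 19, 33, 41, 48],
    "fertilizer": [5, 1, 4, 2, 3],
}

# Flag every pair of predictors whose squared correlation exceeds the cutoff.
flagged = [
    (a, b)
    for a, b in combinations(predictors, 2)
    if r_squared(predictors[a], predictors[b]) > 0.6
]
print(flagged)  # [('rainfall', 'temperature')]
```

In this made-up data set, rainfall and temperature move almost in lockstep, so only one of the two would be kept as a predictor.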