Summary of tools used for
analysis
Compiled by:
Subodh Khanal
Asst. Professor
Paklihawa Campus
Institute of Agriculture and Animal Science
Steps
• Check your cases (individuals) and variables (characteristics).
• You may have entered values as numeric (numbers) or string
(characters).
• Make sure the variable type is set to numeric if you have entered numbers.
• Check whether each variable is categorical or continuous.
• Categorical variables are measured on a nominal or ordinal scale.
• Continuous variables are measured on an interval or ratio scale.
Some examples of nominal (refer level of
measurement slide for detail)
• Treatments: 1 (control), 2 (normal diet), 3 (improved diet)
• Gender
• Yes/no
• Ethnicity
• Name of country
• Infection: complicated
• Color
Some examples of ordinal scales
• Fertility: high, medium, low
• Education level
• Agreement (strongly agree, disagree)
• Ratings of a movie (5 star, 1 star)
• Feelings
• Satisfaction
Some examples of interval scale
• Temperature (degree Celsius)
• Marks (percentage)
• Time
Ratio
• Yield
• Temperature (kelvin)
• Income
When to use frequencies?
• Nominal or ordinal variables
• Categorical options
• See the valid percent in the output
When to use descriptive?
• Open-ended continuous variable
• See mean, standard deviation, standard error, skewness and kurtosis
• Calculate z scores for skewness and kurtosis (skewness / its standard error and kurtosis / its standard error)
• Both should be within -1.96 to 1.96
• Also save the standardized values (z scores) of the variable
• A standardized value beyond ±2.5 is treated as an outlier and beyond ±3.2 as an extreme outlier.
• Also see the histogram and the normality of the curve.
• Remember to use Explore and also see the box plot and the Q-Q plot.
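The skewness and kurtosis checks on this slide can be sketched in Python. This is only an illustration under stated assumptions: the function names `skew_kurtosis_z` and `classify_z` are hypothetical, and the formulas are the usual small-sample estimators of skewness, excess kurtosis and their standard errors, which are assumed here to correspond to the values SPSS reports. The 2.5 and 3.2 cut-offs are the rules of thumb quoted above.

```python
import math

def skew_kurtosis_z(values):
    """Return (skewness / SE, kurtosis / SE) for a sample with n > 3."""
    n = len(values)
    mean = sum(values) / n
    # Central moments (divided by n)
    m2 = sum((x - mean) ** 2 for x in values) / n
    m3 = sum((x - mean) ** 3 for x in values) / n
    m4 = sum((x - mean) ** 4 for x in values) / n
    g1 = m3 / m2 ** 1.5          # population skewness
    g2 = m4 / m2 ** 2 - 3        # population excess kurtosis
    # Small-sample corrected estimators
    skew = g1 * math.sqrt(n * (n - 1)) / (n - 2)
    kurt = (n - 1) / ((n - 2) * (n - 3)) * ((n + 1) * g2 + 6)
    # Standard errors of the two estimators
    se_skew = math.sqrt(6 * n * (n - 1) / ((n - 2) * (n + 1) * (n + 3)))
    se_kurt = 2 * se_skew * math.sqrt((n * n - 1) / ((n - 3) * (n + 5)))
    return skew / se_skew, kurt / se_kurt

def classify_z(z):
    """Label one standardized value using the slide's rule-of-thumb cut-offs."""
    size = abs(z)
    if size > 3.2:
        return "extreme outlier"
    if size > 2.5:
        return "outlier"
    return "ok"

z_skew, z_kurt = skew_kurtosis_z([1, 2, 3, 4, 5])
print(z_skew, z_kurt, classify_z(2.7))
```

If both ratios fall between -1.96 and 1.96, the variable would pass this particular normality screen; the histogram, box plot and Q-Q plot are still worth checking.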
When to use chi square?
• Nonparametric test
• Both dependent and independent variables are categorical.
• Put the independent variable in columns.
• Remember: the variable whose effect you are examining is the independent
variable, and the variable on which the effect is imposed is the dependent variable.
• Select kappa
• Select phi and Cramér's V
• Phi is for a 2x2 table, Cramér's V for all others.
• See values of kappa, phi and Cramér's V in next slide.
• In the output, see Pearson's chi-square, Fisher's exact test and the likelihood ratio
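To make the link between the chi-square statistic and Cramér's V concrete, here is a small hand computation in Python. It is a sketch only: the function names and the example table are hypothetical, and it computes the Pearson chi-square statistic without any continuity correction. For a 2x2 table the value of Cramér's V equals the absolute value of phi.

```python
import math

def expected_counts(table):
    """Expected cell counts under independence: row total * column total / N."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]

def chi_square(table):
    """Pearson chi-square statistic: sum of (observed - expected)^2 / expected."""
    expected = expected_counts(table)
    return sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(table, expected)
        for o, e in zip(obs_row, exp_row)
    )

def cramers_v(table):
    """Cramér's V = sqrt(chi2 / (N * (min(rows, columns) - 1)))."""
    n = sum(sum(row) for row in table)
    k = min(len(table), len(table[0])) - 1
    return math.sqrt(chi_square(table) / (n * k))

# Hypothetical 2x2 table: rows = outcome, columns = group
table = [[10, 20], [20, 10]]
print(chi_square(table), cramers_v(table))
```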
Data requirements for chi square test
• Two categorical variables.
• Two or more categories (groups) for each variable.
• Independence of observations.
  • There is no relationship between the subjects in each group.
  • The categorical variables are not "paired" in any way (e.g. pre-test/post-test
    observations).
• Relatively large sample size.
  • Expected frequencies for each cell are at least 1.
  • Expected frequencies should be at least 5 for the majority (80%) of the cells.
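The two expected-frequency rules above can be checked before running the test. The following is a minimal sketch; `expected_rule_ok` is a hypothetical helper name and the tables are made-up examples.

```python
def expected_counts(table):
    """Expected cell counts under independence: row total * column total / N."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]

def expected_rule_ok(table):
    """True if every expected count is >= 1 and at least 80% of cells are >= 5."""
    cells = [e for row in expected_counts(table) for e in row]
    share_at_least_5 = sum(e >= 5 for e in cells) / len(cells)
    return min(cells) >= 1 and share_at_least_5 >= 0.8

print(expected_rule_ok([[10, 20], [20, 10]]))  # large counts
print(expected_rule_ok([[1, 2], [2, 1]]))      # very small counts
```

When the check fails, the chi-square approximation is doubtful and an exact test is the safer choice.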
When we try to compare proportions of a categorical outcome
according to different independent groups, we can consider several
statistical tests such as chi-squared test, Fisher's exact test, or z-test.
• Fisher's exact test is in practice applied mainly to small samples, but
it is valid for all sample sizes.
• While the chi-squared test relies on an approximation, Fisher's exact test is
an exact test.
• Especially when more than 20% of cells have expected frequencies < 5, we
need to use Fisher's exact test because the approximation is
inadequate.
• In SPSS, unless you have the SPSS Exact Tests module, you can only perform
a Fisher's exact test on a 2×2 table, and these results are presented by
default.
• https://www.socscistatistics.com/tests/chisquare2/default2.aspx performs
Fisher's exact test for 5×5 tables
• If the expected count is more than 5, see Pearson's chi-square.
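For a 2×2 table, Fisher's exact test can be computed directly from the hypergeometric distribution, which shows why it needs no large-sample approximation. The sketch below is illustrative only: `fisher_exact_2x2` is a hypothetical name, and the two-sided p value uses the common convention of summing the probabilities of all tables that are no more likely than the observed one (assumed here, other conventions exist).

```python
from math import comb

def fisher_exact_2x2(a, b, c, d):
    """Two-sided Fisher's exact p value for the table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    n = row1 + row2

    def prob(x):
        # Probability that the top-left cell equals x with all margins fixed
        return comb(row1, x) * comb(row2, col1 - x) / comb(n, col1)

    p_observed = prob(a)
    lowest, highest = max(0, col1 - row2), min(row1, col1)
    # Sum over every table that is as likely as, or less likely than, the observed one
    return sum(
        prob(x)
        for x in range(lowest, highest + 1)
        if prob(x) <= p_observed * (1 + 1e-9)
    )

print(fisher_exact_2x2(3, 1, 1, 3))
```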
So here is how you do it
• Go to Analyze
• Descriptive Statistics
• Click Crosstabs
• Select variables for rows
(dependent) and columns
(independent)
• Click Exact
• Select Exact (time limit per test)
• Click Continue
• Click OK
• As 100% of cells have an expected count less than 5, see Fisher's exact test.
• To use the chi-square test, at least 80% of cells must have an expected count
of 5 or more (i.e. no more than 20% of cells may have an expected count less than 5).
• The likelihood ratio (G-test) is also an option in this case, but Fisher's exact
test is more common.
One sample t test
• The One Sample t Test determines whether the sample mean is
statistically different from a known or hypothesized population mean.
The One Sample t Test is a parametric test.
• Also known as the single sample t test.
• The variable used is called the test variable.
• In a One Sample t Test, the test variable is compared against a "test
value", which is a known or hypothesized value of the mean in the
population.
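The comparison of a sample mean with a test value can be sketched as follows. This is a minimal illustration, not SPSS output: `one_sample_t` is a hypothetical name, the data are made up, and only the t statistic and degrees of freedom are computed (the p value would then be read from the t distribution with n - 1 degrees of freedom).

```python
import math
import statistics

def one_sample_t(sample, test_value):
    """Return (t, df) for H0: population mean == test_value."""
    n = len(sample)
    mean = statistics.mean(sample)
    sd = statistics.stdev(sample)          # sample standard deviation (n - 1)
    standard_error = sd / math.sqrt(n)
    return (mean - test_value) / standard_error, n - 1

# Hypothetical yields compared against a test value of 4
t, df = one_sample_t([2, 4, 6, 8, 10], 4)
print(t, df)
```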
It is commonly used to test:
• Statistical difference between a sample mean and a known or hypothesized
value of the mean in the population.
• Statistical difference between the sample mean and the sample midpoint
of the test variable.
• Statistical difference between the sample mean of the test variable and
chance.
  • This approach involves first calculating the chance level on the test variable. The
    chance level is then used as the test value against which the sample mean of the test
    variable is compared.
• Statistical difference between a change score and zero.
  • This approach involves creating a change score from two variables, and then
    comparing the mean change score to zero, which will indicate whether any change
    occurred between the two time points for the original measures. If the mean change
    score is not significantly different from zero, no significant change occurred.
Requirement for one sample t test
• Test variable that is continuous (i.e., interval or ratio level)
• Scores on the test variable are independent (i.e., independence of observations)
  • There is no relationship between scores on the test variable
  • Violation of this assumption will yield an inaccurate p value
• Random sample of data from the population
• Normal distribution (approximately) of the sample and population on the test
variable
  • Non-normal population distributions, especially those that are thick-tailed or heavily skewed,
    considerably reduce the power of the test
  • Among moderate or large samples, a violation of normality may still yield accurate p values
• Homogeneity of variances (i.e., variances approximately equal in both the sample
and population)
• No outliers
Paired sample t test
• The Paired Samples t Test compares two means that are from the
same individual, object, or related units. The two means can
represent things like:
✓ A measurement taken at two different times (e.g., pre-test and post-test
with an intervention administered between the two time points)
✓ A measurement taken under two different conditions (e.g.,
completing a test under a "control" condition and an "experimental"
condition)
✓ Measurements taken from two halves or sides of a subject or
experimental unit (e.g., measuring hearing loss in a subject's left and
right ears).
Also known as
• Dependent t Test
• Paired t Test
• Repeated Measures t Test
The variable used in this test is known as:
• Dependent variable, or test variable (continuous), measured at two
different times or for two related conditions or units
Used for observing
• Statistical difference between two time points
• Statistical difference between two conditions
• Statistical difference between two measurements
• Statistical difference between a matched pair
Note: The Paired Samples t Test can only compare the means for two (and only
two) related (paired) units on a continuous outcome that is normally distributed.
The Paired Samples t Test is not appropriate for analyses involving the following: 1)
unpaired data; 2) comparisons between more than two units/groups; 3) a
continuous outcome that is not normally distributed; and 4) an ordinal/ranked
outcome.
Moreover,
• To compare unpaired means between two groups on a continuous
outcome that is normally distributed, choose the Independent
Samples t Test.
• To compare unpaired means between more than two groups on a
continuous outcome that is normally distributed, choose ANOVA.
• To compare paired means for continuous data that are not normally
distributed, choose the nonparametric Wilcoxon Signed-Ranks Test.
• To compare paired means for ranked data, choose the nonparametric
Wilcoxon Signed-Ranks Test.
Requirements for paired sample t test
• Dependent variable that is continuous (i.e., interval or ratio level)
  • Note: The paired measurements must be recorded in two separate variables.
• Related samples/groups (i.e., dependent observations)
  • The subjects in each sample, or group, are the same. This means that the subjects in the first
    group are also in the second group.
• Random sample of data from the population
• Normal distribution (approximately) of the difference between the paired values
• No outliers in the difference between the two related groups
• Note: When testing assumptions related to normality and outliers, you must use a
variable that represents the difference between the paired values - not the original
variables themselves.
• Note: When one or more of the assumptions for the Paired Samples t Test are not met,
you may want to run the nonparametric Wilcoxon Signed-Ranks Test instead.
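Because the paired test works on the differences between the two measurements, it reduces to a one-sample t test of the mean difference against zero. The sketch below illustrates this; `paired_t` is a hypothetical name and the before/after values are invented. The list of differences it builds is also the variable on which normality and outliers would be checked.

```python
import math
import statistics

def paired_t(before, after):
    """Return (t, df, differences) for paired measurements on the same subjects."""
    if len(before) != len(after):
        raise ValueError("paired samples must have the same length")
    differences = [a - b for a, b in zip(after, before)]
    n = len(differences)
    mean_diff = statistics.mean(differences)
    se = statistics.stdev(differences) / math.sqrt(n)
    return mean_diff / se, n - 1, differences

# Hypothetical pre-test and post-test scores for four subjects
t, df, diffs = paired_t([10, 12, 14, 16], [11, 14, 15, 18])
print(t, df, diffs)
```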
Independent sample t test
• The Independent Samples t Test compares the means of two independent groups in
order to determine whether there is statistical evidence that the associated population
means are significantly different. The Independent Samples t Test is a parametric test.
• This test is also known as:
  • Independent t Test
  • Independent Measures t Test
  • Independent Two-sample t Test
  • Student t Test
  • Two-Sample t Test
  • Uncorrelated Scores t Test
  • Unpaired t Test
  • Unrelated t Test
• The variables used in this test are known as:
  • Dependent variable, or test variable
  • Independent variable, or grouping variable
Used for testing the following
• Statistical differences between the means of two groups
• Statistical differences between the means of two interventions
• Statistical differences between the means of two change scores
Note: The Independent Samples t Test can only compare the means for two
(and only two) groups.
It cannot make comparisons among more than two groups.
If you wish to compare the means across more than two groups, you will
likely want to run an ANOVA.
Requirement
• Dependent variable that is continuous (i.e., interval or ratio level)
• Independent variable that is categorical (i.e., two groups)
• Cases that have values on both the dependent and independent variables
• Independent samples/groups (i.e., independence of observations)
  • There is no relationship between the subjects in each sample. This means that:
    • Subjects in the first group cannot also be in the second group
    • No subject in either group can influence subjects in the other group
    • No group can influence the other group
  • Violation of this assumption will yield an inaccurate p value
• Random sample of data from the population
• Normal distribution (approximately) of the dependent variable for each group
  • Non-normal population distributions, especially those that are thick-tailed or heavily skewed, considerably reduce the power of the test
  • Among moderate or large samples, a violation of normality may still yield accurate p values
• Homogeneity of variances (i.e., variances approximately equal across groups)
  • When this assumption is violated and the sample sizes for each group differ, the p value is not trustworthy. However, the Independent
    Samples t Test output also includes an approximate t statistic that is not based on assuming equal population variances. This alternative
    statistic, called the Welch t Test statistic, may be used when equal variances among populations cannot be assumed. The Welch t Test
    is also known as the Unequal Variance t Test or Separate Variances t Test.
• No outliers
• Note: When one or more of the assumptions for the Independent Samples t Test are not met, you may want to run the nonparametric
Mann-Whitney U Test instead.
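The difference between the equal-variance (Student) statistic and the Welch statistic mentioned above can be seen in a short sketch. The function names and data are hypothetical; the Welch degrees of freedom use the Welch-Satterthwaite approximation. Only the statistics and degrees of freedom are computed here, not the p values.

```python
import math
import statistics

def student_t(group1, group2):
    """Equal-variance t statistic and df (pooled variance)."""
    n1, n2 = len(group1), len(group2)
    v1, v2 = statistics.variance(group1), statistics.variance(group2)
    pooled = ((n1 - 1) * v1 + (n2 - 1) * v2) / (n1 + n2 - 2)
    se = math.sqrt(pooled * (1 / n1 + 1 / n2))
    diff = statistics.mean(group1) - statistics.mean(group2)
    return diff / se, n1 + n2 - 2

def welch_t(group1, group2):
    """Unequal-variance (Welch) t statistic and approximate df."""
    n1, n2 = len(group1), len(group2)
    a = statistics.variance(group1) / n1
    b = statistics.variance(group2) / n2
    diff = statistics.mean(group1) - statistics.mean(group2)
    df = (a + b) ** 2 / (a ** 2 / (n1 - 1) + b ** 2 / (n2 - 1))
    return diff / math.sqrt(a + b), df

control = [1, 2, 3, 4, 5]      # hypothetical group 1
treated = [3, 4, 5, 6, 7]      # hypothetical group 2
print(student_t(control, treated), welch_t(control, treated))
```

With equal group sizes and equal variances, as in this made-up example, the two versions agree; they diverge when variances and group sizes differ.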
One way ANOVA
• One-Way ANOVA ("analysis of variance") compares the means of two or
more independent groups in order to determine whether there is statistical
evidence that the associated population means are significantly different.
One-Way ANOVA is a parametric test.
• This test is also known as:
  • One-Factor ANOVA
  • One-Way Analysis of Variance
  • Between Subjects ANOVA
• The variables used in this test are known as:
  • Dependent variable
  • Independent variable (also known as the grouping variable, or factor)
    • This variable divides cases into two or more mutually exclusive levels, or groups
Used for
• Field studies
• Experiments
• Quasi-experiments
• The One-Way ANOVA is commonly used to test the following:
  • Statistical differences among the means of two or more groups
  • Statistical differences among the means of two or more interventions
  • Statistical differences among the means of two or more change scores
• Note: Both the One-Way ANOVA and the Independent Samples t Test can compare the
means for two groups. However, only the One-Way ANOVA can compare the means
across three or more groups.
• Note: If the grouping variable has only two groups, then the results of a one-way ANOVA
and the independent samples t test will be equivalent. In fact, if you run both an
independent samples t test and a one-way ANOVA in this situation, you should be able to
confirm that t² = F.
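The equivalence between the two tests for two groups can be verified numerically. The sketch below computes the one-way ANOVA F ratio and the pooled-variance t statistic by hand; the function names and data are hypothetical.

```python
import math

def mean(values):
    return sum(values) / len(values)

def one_way_f(groups):
    """Return (F, df_between, df_within) for a list of groups."""
    all_values = [x for g in groups for x in g]
    grand_mean = mean(all_values)
    k, n = len(groups), len(all_values)
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    return (ss_between / (k - 1)) / (ss_within / (n - k)), k - 1, n - k

def pooled_t(g1, g2):
    """Equal-variance independent samples t statistic."""
    n1, n2 = len(g1), len(g2)
    ss = sum((x - mean(g1)) ** 2 for x in g1) + sum((x - mean(g2)) ** 2 for x in g2)
    pooled_var = ss / (n1 + n2 - 2)
    return (mean(g1) - mean(g2)) / math.sqrt(pooled_var * (1 / n1 + 1 / n2))

a, b = [1, 2, 3, 4, 5], [3, 4, 5, 6, 7]   # hypothetical data
f, df_between, df_within = one_way_f([a, b])
t = pooled_t(a, b)
print(f, t ** 2)  # the two values should match
```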
Requirement
• Dependent variable that is continuous (i.e., interval or ratio level)
• Independent variable that is categorical (i.e., two or more groups)
• Cases that have values on both the dependent and independent variables
• Independent samples/groups (i.e., independence of observations)
  • There is no relationship between the subjects in each sample. This means that:
    • subjects in the first group cannot also be in the second group
    • no subject in either group can influence subjects in the other group
    • no group can influence the other group
• Random sample of data from the population
• Normal distribution (approximately) of the dependent variable for each group (i.e., for
each level of the factor)
  • Non-normal population distributions, especially those that are thick-tailed or heavily skewed,
    considerably reduce the power of the test
  • Among moderate or large samples, a violation of normality may yield fairly accurate p values
Continued...
• Homogeneity of variances (i.e., variances approximately equal across
groups)
  • When this assumption is violated and the sample sizes differ among groups,
    the p value for the overall F test is not trustworthy. These conditions warrant
    using alternative statistics that do not assume equal variances among
    populations, such as the Brown-Forsythe or Welch statistics (available
    via Options in the One-Way ANOVA dialog box).
  • When this assumption is violated, regardless of whether the group sample
    sizes are fairly equal, the results may not be trustworthy for post hoc tests.
    When variances are unequal, post hoc tests that do not assume equal
    variances should be used.
• No outliers
Correlation
• Pearson Correlation produces a sample correlation coefficient, r,
which measures the strength and direction of linear relationships
between pairs of continuous variables. By extension, the Pearson
Correlation evaluates whether there is statistical evidence for a linear
relationship among the same pairs of variables in the population,
represented by a population correlation coefficient, ρ ("rho"). The
Pearson Correlation is a parametric measure.
• This measure is also known as:
  • Pearson's correlation
  • Pearson product-moment correlation (PPMC)
Used for
• Correlations among pairs of variables
• Correlations within and between sets of variables
• The bivariate Pearson correlation indicates the following:
  • Whether a statistically significant linear relationship exists between two continuous
    variables
  • The strength of a linear relationship (i.e., how close the relationship is to being a
    perfectly straight line)
  • The direction of a linear relationship (increasing or decreasing)
• Note: The bivariate Pearson Correlation cannot address non-linear relationships or
relationships among categorical variables. If you wish to understand relationships that
involve categorical variables and/or non-linear relationships, you will need to
choose another measure of association.
• Note: The bivariate Pearson Correlation only reveals associations among continuous
variables. The bivariate Pearson Correlation does not provide any inferences about
causation, no matter how large the correlation coefficient is.
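The strength and direction described above both come from a single number, r. As a rough sketch (the function names and data are hypothetical), r can be computed from the sums of squares and cross-products, and the usual significance test converts it to a t statistic with n - 2 degrees of freedom.

```python
import math

def pearson_r(x, y):
    """Sample Pearson correlation coefficient between two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def r_to_t(r, n):
    """t statistic and df for testing H0: rho == 0 (requires |r| < 1)."""
    return r * math.sqrt(n - 2) / math.sqrt(1 - r * r), n - 2

dose = [1, 2, 3, 4, 5]       # hypothetical predictor values
yield_ = [2, 4, 5, 4, 5]     # hypothetical outcome values
r = pearson_r(dose, yield_)
print(r, r_to_t(r, len(dose)))
```

The sign of r gives the direction and its absolute value gives the strength; neither says anything about causation.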
Requirement
• Two or more continuous variables (i.e., interval or ratio level)
• Cases that have values on both variables
• Linear relationship between the variables
• Independent cases (i.e., independence of observations)
  • There is no relationship between the values of variables between cases. This means that:
    • the values for all variables across cases are unrelated
    • for any case, the value for any variable cannot influence the value of any variable for other cases
    • no case can influence another case on any variable
  • The bivariate Pearson correlation coefficient and corresponding significance test are not robust when
    independence is violated.
• Bivariate normality
  • Each pair of variables is bivariately normally distributed
  • Each pair of variables is bivariately normally distributed at all levels of the other variable(s)
  • This assumption ensures that the variables are linearly related; violations of this assumption may indicate that
    non-linear relationships among variables exist. Linearity can be assessed visually using a scatterplot of the
    data.
• Random sample of data from the population
• No outliers
Linear regression analysis
• Linear regression is the next step up after correlation.
• It is used when we want to predict the value of a variable based on the
value of another variable.
• The variable we want to predict is called the dependent variable (or
sometimes, the outcome variable).
• The variable we are using to predict the other variable's value is called the
independent variable (or sometimes, the predictor variable).
• For example, you could use linear regression to understand whether yield
performance can be predicted based on dose and practices of manure
application; whether cigarette consumption can be predicted based on
smoking duration; and so forth.
• If you have two or more independent variables, rather than just one, you
need to use multiple regression.
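For a single predictor, the line of best fit has a closed-form least-squares solution. The following sketch is illustrative only: `fit_line` and `predict` are hypothetical names and the dose/yield numbers are invented, not taken from any dataset in these slides.

```python
def fit_line(x, y):
    """Ordinary least squares for y = intercept + slope * x (one predictor)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope

def predict(intercept, slope, new_x):
    """Predicted value of the dependent variable for a new predictor value."""
    return intercept + slope * new_x

manure_dose = [1, 2, 3, 4, 5]    # hypothetical independent variable
crop_yield = [2, 4, 5, 4, 5]     # hypothetical dependent variable
intercept, slope = fit_line(manure_dose, crop_yield)
print(intercept, slope, predict(intercept, slope, 6))
```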
Used for
• Model multiple independent variables
• Include continuous and categorical variables
• Use polynomial terms to model curvature
• Assess interaction terms to determine whether the effect of one
independent variable depends on the value of another variable
Requirements
• Your two variables should be measured at the continuous level (interval or ratio scales)
• There needs to be a linear relationship between the two variables (check with a scatter plot)
  • If the relationship displayed in your scatterplot is not linear, you will have to either run a
    non-linear regression analysis, perform a polynomial regression or "transform" your data, which you
    can do using SPSS Statistics.
• There should be no significant outliers.
• You should have independence of observations, which you can easily check using the
Durbin-Watson statistic, which is a simple test to run using SPSS Statistics.
• Your data needs to show homoscedasticity, which is where the variances along the line of best fit
remain similar as you move along the line
• Finally, you need to check that the residuals (errors) of the regression line are approximately
normally distributed. Two
common methods to check this assumption include using either a histogram (with a
superimposed normal curve) or a Normal P-P Plot.
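The Durbin-Watson statistic mentioned above is simple to compute once the residuals are available. This is a sketch under stated assumptions: the function names are hypothetical, the fitted intercept and slope are supplied by the caller, and the residuals are assumed to be in a meaningful order (for example, time order). Values near 2 suggest little first-order autocorrelation; values towards 0 or 4 suggest positive or negative autocorrelation.

```python
def residuals(x, y, intercept, slope):
    """Observed minus fitted values for a simple linear regression."""
    return [b - (intercept + slope * a) for a, b in zip(x, y)]

def durbin_watson(errors):
    """Sum of squared successive differences divided by sum of squared residuals."""
    numerator = sum((errors[i] - errors[i - 1]) ** 2 for i in range(1, len(errors)))
    denominator = sum(e ** 2 for e in errors)
    return numerator / denominator

# Hypothetical data with an assumed fitted line y = 2.2 + 0.6 * x
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
print(durbin_watson(residuals(x, y, 2.2, 0.6)))
```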
Binary logistic regression
• Binary logistic regression models the relationship between a set of predictors and
a binary response variable. A binary response has only two possible values, such
as win and lose.
• Use a binary regression model to understand how changes in the predictor values
are associated with changes in the probability of an event occurring.
• Where the dependent variable is dichotomous or binary in nature, we cannot use
simple linear regression. Logistic regression is the statistical technique used to
model the relationship between predictors (our independent variables) and a
predicted variable (the dependent variable) where the dependent variable is
binary (e.g., sex [male vs. female], response [yes vs. no], score [high vs. low],
etc.).
• There must be one or more independent variables, or predictors, for a logistic
regression. The IVs, or predictors, can be continuous (interval/ratio) or
categorical (ordinal/nominal).
• All predictor variables are tested in one block to assess their predictive ability
while controlling for the effects of other predictors in the model.
Uses
• Logistic regression is a powerful statistical way of modeling a binomial
outcome (takes the value 0 or 1, like having or not having a disease)
with one or more explanatory variables.
• Logistic regression provides a quantified value for the strength of the
association, adjusting for other variables (removes confounding
effects).
• The exponentials of the coefficients correspond to odds ratios for the given
factor.
Requirements
• The dependent variable must be categorical and dichotomous.
• The error terms need to be independent.
• Linearity between continuous independent variables and the log of the odds
If odds ratio is
• >1: subtract 1 from the value. E.g. if the odds ratio is 4.5, the odds are
4.5 - 1 = 3.5 times (350%) higher than the odds for the other option.
• <1: subtract the value from 1. E.g. if the odds ratio is 0.07, then
1 - 0.07 = 0.93, i.e. a 93% decrease in the odds.
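The conversion from a logistic regression coefficient to an odds ratio, and from an odds ratio to a percentage change in the odds, can be sketched as follows. The function names are hypothetical; the 4.5 and 0.07 values are the examples from this slide.

```python
import math

def odds_ratio(coefficient):
    """Odds ratio is the exponential of the logistic regression coefficient."""
    return math.exp(coefficient)

def percent_change_in_odds(ratio):
    """Positive result = increase in the odds, negative result = decrease."""
    return (ratio - 1) * 100

print(percent_change_in_odds(4.5))    # odds ratio above 1: increase
print(percent_change_in_odds(0.07))   # odds ratio below 1: decrease
print(odds_ratio(0.0))                # coefficient of 0 means no change in the odds
```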
Thank you

Botanicals ....a safe solution
Botanicals ....a safe solutionBotanicals ....a safe solution
Botanicals ....a safe solution
Ā 
Things to consider while writing scientific article
Things to consider while writing scientific articleThings to consider while writing scientific article
Things to consider while writing scientific article
Ā 
Dream for a better world from agroecological prespective
Dream for a better world from agroecological prespectiveDream for a better world from agroecological prespective
Dream for a better world from agroecological prespective
Ā 
Sustainable Intensification of biodiversity in agroecosystem through conserva...
Sustainable Intensification of biodiversity in agroecosystem through conserva...Sustainable Intensification of biodiversity in agroecosystem through conserva...
Sustainable Intensification of biodiversity in agroecosystem through conserva...
Ā 
Climate smart agriculture
Climate smart agricultureClimate smart agriculture
Climate smart agriculture
Ā 
Introduction to spss
Introduction to spssIntroduction to spss
Introduction to spss
Ā 
Class note of Introductory crop physiology
Class note of Introductory crop physiologyClass note of Introductory crop physiology
Class note of Introductory crop physiology
Ā 
Level of measurement
Level of measurementLevel of measurement
Level of measurement
Ā 
Introduction to spss
Introduction to spssIntroduction to spss
Introduction to spss
Ā 
Medicinal plants of nepal
Medicinal plants of nepalMedicinal plants of nepal
Medicinal plants of nepal
Ā 

Recently uploaded

21st_Century_Skills_Framework_Final_Presentation_2.pptx
21st_Century_Skills_Framework_Final_Presentation_2.pptx21st_Century_Skills_Framework_Final_Presentation_2.pptx
21st_Century_Skills_Framework_Final_Presentation_2.pptxJoelynRubio1
Ā 
Single or Multiple melodic lines structure
Single or Multiple melodic lines structureSingle or Multiple melodic lines structure
Single or Multiple melodic lines structuredhanjurrannsibayan2
Ā 
dusjagr & nano talk on open tools for agriculture research and learning
dusjagr & nano talk on open tools for agriculture research and learningdusjagr & nano talk on open tools for agriculture research and learning
dusjagr & nano talk on open tools for agriculture research and learningMarc Dusseiller Dusjagr
Ā 
The basics of sentences session 3pptx.pptx
The basics of sentences session 3pptx.pptxThe basics of sentences session 3pptx.pptx
The basics of sentences session 3pptx.pptxheathfieldcps1
Ā 
Understanding Accommodations and Modifications
Understanding  Accommodations and ModificationsUnderstanding  Accommodations and Modifications
Understanding Accommodations and ModificationsMJDuyan
Ā 
Simple, Complex, and Compound Sentences Exercises.pdf
Simple, Complex, and Compound Sentences Exercises.pdfSimple, Complex, and Compound Sentences Exercises.pdf
Simple, Complex, and Compound Sentences Exercises.pdfstareducators107
Ā 
Exploring_the_Narrative_Style_of_Amitav_Ghoshs_Gun_Island.pptx
Exploring_the_Narrative_Style_of_Amitav_Ghoshs_Gun_Island.pptxExploring_the_Narrative_Style_of_Amitav_Ghoshs_Gun_Island.pptx
Exploring_the_Narrative_Style_of_Amitav_Ghoshs_Gun_Island.pptxPooja Bhuva
Ā 
How to Add a Tool Tip to a Field in Odoo 17
How to Add a Tool Tip to a Field in Odoo 17How to Add a Tool Tip to a Field in Odoo 17
How to Add a Tool Tip to a Field in Odoo 17Celine George
Ā 
NO1 Top Black Magic Specialist In Lahore Black magic In Pakistan Kala Ilam Ex...
NO1 Top Black Magic Specialist In Lahore Black magic In Pakistan Kala Ilam Ex...NO1 Top Black Magic Specialist In Lahore Black magic In Pakistan Kala Ilam Ex...
NO1 Top Black Magic Specialist In Lahore Black magic In Pakistan Kala Ilam Ex...Amil baba
Ā 
Spellings Wk 4 and Wk 5 for Grade 4 at CAPS
Spellings Wk 4 and Wk 5 for Grade 4 at CAPSSpellings Wk 4 and Wk 5 for Grade 4 at CAPS
Spellings Wk 4 and Wk 5 for Grade 4 at CAPSAnaAcapella
Ā 
REMIFENTANIL: An Ultra short acting opioid.pptx
REMIFENTANIL: An Ultra short acting opioid.pptxREMIFENTANIL: An Ultra short acting opioid.pptx
REMIFENTANIL: An Ultra short acting opioid.pptxDr. Ravikiran H M Gowda
Ā 
Interdisciplinary_Insights_Data_Collection_Methods.pptx
Interdisciplinary_Insights_Data_Collection_Methods.pptxInterdisciplinary_Insights_Data_Collection_Methods.pptx
Interdisciplinary_Insights_Data_Collection_Methods.pptxPooja Bhuva
Ā 
Tatlong Kwento ni Lola basyang-1.pdf arts
Tatlong Kwento ni Lola basyang-1.pdf artsTatlong Kwento ni Lola basyang-1.pdf arts
Tatlong Kwento ni Lola basyang-1.pdf artsNbelano25
Ā 
How to Manage Call for Tendor in Odoo 17
How to Manage Call for Tendor in Odoo 17How to Manage Call for Tendor in Odoo 17
How to Manage Call for Tendor in Odoo 17Celine George
Ā 
Basic Intentional Injuries Health Education
Basic Intentional Injuries Health EducationBasic Intentional Injuries Health Education
Basic Intentional Injuries Health EducationNeilDeclaro1
Ā 
How to setup Pycharm environment for Odoo 17.pptx
How to setup Pycharm environment for Odoo 17.pptxHow to setup Pycharm environment for Odoo 17.pptx
How to setup Pycharm environment for Odoo 17.pptxCeline George
Ā 
AIM of Education-Teachers Training-2024.ppt
AIM of Education-Teachers Training-2024.pptAIM of Education-Teachers Training-2024.ppt
AIM of Education-Teachers Training-2024.pptNishitharanjan Rout
Ā 
General Principles of Intellectual Property: Concepts of Intellectual Proper...
General Principles of Intellectual Property: Concepts of Intellectual  Proper...General Principles of Intellectual Property: Concepts of Intellectual  Proper...
General Principles of Intellectual Property: Concepts of Intellectual Proper...Poonam Aher Patil
Ā 
Jamworks pilot and AI at Jisc (20/03/2024)
Jamworks pilot and AI at Jisc (20/03/2024)Jamworks pilot and AI at Jisc (20/03/2024)
Jamworks pilot and AI at Jisc (20/03/2024)Jisc
Ā 
On_Translating_a_Tamil_Poem_by_A_K_Ramanujan.pptx
On_Translating_a_Tamil_Poem_by_A_K_Ramanujan.pptxOn_Translating_a_Tamil_Poem_by_A_K_Ramanujan.pptx
On_Translating_a_Tamil_Poem_by_A_K_Ramanujan.pptxPooja Bhuva
Ā 

Recently uploaded (20)

21st_Century_Skills_Framework_Final_Presentation_2.pptx
21st_Century_Skills_Framework_Final_Presentation_2.pptx21st_Century_Skills_Framework_Final_Presentation_2.pptx
21st_Century_Skills_Framework_Final_Presentation_2.pptx
Ā 
Single or Multiple melodic lines structure
Single or Multiple melodic lines structureSingle or Multiple melodic lines structure
Single or Multiple melodic lines structure
Ā 
dusjagr & nano talk on open tools for agriculture research and learning
dusjagr & nano talk on open tools for agriculture research and learningdusjagr & nano talk on open tools for agriculture research and learning
dusjagr & nano talk on open tools for agriculture research and learning
Ā 
The basics of sentences session 3pptx.pptx
The basics of sentences session 3pptx.pptxThe basics of sentences session 3pptx.pptx
The basics of sentences session 3pptx.pptx
Ā 
Understanding Accommodations and Modifications
Understanding  Accommodations and ModificationsUnderstanding  Accommodations and Modifications
Understanding Accommodations and Modifications
Ā 
Simple, Complex, and Compound Sentences Exercises.pdf
Simple, Complex, and Compound Sentences Exercises.pdfSimple, Complex, and Compound Sentences Exercises.pdf
Simple, Complex, and Compound Sentences Exercises.pdf
Ā 
Exploring_the_Narrative_Style_of_Amitav_Ghoshs_Gun_Island.pptx
Exploring_the_Narrative_Style_of_Amitav_Ghoshs_Gun_Island.pptxExploring_the_Narrative_Style_of_Amitav_Ghoshs_Gun_Island.pptx
Exploring_the_Narrative_Style_of_Amitav_Ghoshs_Gun_Island.pptx
Ā 
How to Add a Tool Tip to a Field in Odoo 17
How to Add a Tool Tip to a Field in Odoo 17How to Add a Tool Tip to a Field in Odoo 17
How to Add a Tool Tip to a Field in Odoo 17
Ā 
NO1 Top Black Magic Specialist In Lahore Black magic In Pakistan Kala Ilam Ex...
NO1 Top Black Magic Specialist In Lahore Black magic In Pakistan Kala Ilam Ex...NO1 Top Black Magic Specialist In Lahore Black magic In Pakistan Kala Ilam Ex...
NO1 Top Black Magic Specialist In Lahore Black magic In Pakistan Kala Ilam Ex...
Ā 
Spellings Wk 4 and Wk 5 for Grade 4 at CAPS
Spellings Wk 4 and Wk 5 for Grade 4 at CAPSSpellings Wk 4 and Wk 5 for Grade 4 at CAPS
Spellings Wk 4 and Wk 5 for Grade 4 at CAPS
Ā 
REMIFENTANIL: An Ultra short acting opioid.pptx
REMIFENTANIL: An Ultra short acting opioid.pptxREMIFENTANIL: An Ultra short acting opioid.pptx
REMIFENTANIL: An Ultra short acting opioid.pptx
Ā 
Interdisciplinary_Insights_Data_Collection_Methods.pptx
Interdisciplinary_Insights_Data_Collection_Methods.pptxInterdisciplinary_Insights_Data_Collection_Methods.pptx
Interdisciplinary_Insights_Data_Collection_Methods.pptx
Ā 
Tatlong Kwento ni Lola basyang-1.pdf arts
Tatlong Kwento ni Lola basyang-1.pdf artsTatlong Kwento ni Lola basyang-1.pdf arts
Tatlong Kwento ni Lola basyang-1.pdf arts
Ā 
How to Manage Call for Tendor in Odoo 17
How to Manage Call for Tendor in Odoo 17How to Manage Call for Tendor in Odoo 17
How to Manage Call for Tendor in Odoo 17
Ā 
Basic Intentional Injuries Health Education
Basic Intentional Injuries Health EducationBasic Intentional Injuries Health Education
Basic Intentional Injuries Health Education
Ā 
How to setup Pycharm environment for Odoo 17.pptx
How to setup Pycharm environment for Odoo 17.pptxHow to setup Pycharm environment for Odoo 17.pptx
How to setup Pycharm environment for Odoo 17.pptx
Ā 
AIM of Education-Teachers Training-2024.ppt
AIM of Education-Teachers Training-2024.pptAIM of Education-Teachers Training-2024.ppt
AIM of Education-Teachers Training-2024.ppt
Ā 
General Principles of Intellectual Property: Concepts of Intellectual Proper...
General Principles of Intellectual Property: Concepts of Intellectual  Proper...General Principles of Intellectual Property: Concepts of Intellectual  Proper...
General Principles of Intellectual Property: Concepts of Intellectual Proper...
Ā 
Jamworks pilot and AI at Jisc (20/03/2024)
Jamworks pilot and AI at Jisc (20/03/2024)Jamworks pilot and AI at Jisc (20/03/2024)
Jamworks pilot and AI at Jisc (20/03/2024)
Ā 
On_Translating_a_Tamil_Poem_by_A_K_Ramanujan.pptx
On_Translating_a_Tamil_Poem_by_A_K_Ramanujan.pptxOn_Translating_a_Tamil_Poem_by_A_K_Ramanujan.pptx
On_Translating_a_Tamil_Poem_by_A_K_Ramanujan.pptx
Ā 

Summary of statistical tools used in spss

  • 1. Summary of tools used for analysis Compiled by: Subodh Khanal Asst. Professor Paklihawa Campus Institute of Agriculture and Animal Science
  • 2. Steps • Check your cases (individuals) and variables (characters). • You may have entered cases as numeric (numbers) or string (characters). • Make sure to mark a variable as numeric if you have entered numbers. • Check whether your variable is categorical or continuous. • Categorical means it is measured on a nominal or ordinal scale. • Continuous means it is measured on an interval or ratio scale.
  • 3. Some examples of nominal scales (refer to the level of measurement slide for detail) • Treatments: 1 (control), 2 (normal diet), 3 (improved diet) • Gender • Yes/no • Ethnicity • Name of country • Infection: complicated • Color
  • 4. Some examples of ordinal scales • Fertility: high, medium, low • Education level • Strongly agree, disagree • Ratings of movie (5 star, 1 star) • Feelings • Satisfaction
  • 5. Some examples of interval scale • Temperature (degree Celsius) • Marks (percentage) • Time
  • 6. Ratio • Yield • Temperature (kelvin) • Income
  • 7. When to use frequencies? • Nominal or ordinal variables • Categorical options • See the valid percent in the output
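
The difference between percent and valid percent in a frequency table can be sketched in a few lines of Python. This is an illustration only, not SPSS output: the function name `frequency_table`, the sample data, and the choice to treat `None` and empty strings as missing are all assumptions made for the example.

```python
from collections import Counter

def frequency_table(values, missing=(None, "")):
    """Count each category; 'valid_percent' excludes missing values from the base."""
    valid = [v for v in values if v not in missing]
    if not valid:
        return {}
    counts = Counter(valid)
    total, n_valid = len(values), len(valid)
    return {
        category: {
            "count": count,
            "percent": 100 * count / total,          # base: all cases
            "valid_percent": 100 * count / n_valid,  # base: non-missing cases
        }
        for category, count in counts.items()
    }

# Hypothetical ordinal variable (soil fertility) with two missing entries.
fertility = ["high", "low", "medium", "high", None, "low", "high", ""]
table = frequency_table(fertility)
for category, row in table.items():
    print(category, row)
```

With missing cases present, the valid percent is always at least as large as the plain percent, which is why the slide points to the valid percent column.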
  • 8. When to use descriptives? • Open-ended continuous variable • See mean, standard deviation, standard error, skewness and kurtosis • Calculate the z score for skewness and for kurtosis (skewness / its s.e. and kurtosis / its s.e.) • Each should be within -1.96 to 1.96 • Also calculate the standardized value (z) of each observation • A standardized value beyond ±2.5 is an outlier and beyond ±3.2 is an extreme outlier. • Also see the histogram and the normality of the curve. • Remember to use Explore and see the box plot and the Q-Q plot as well.
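
The standardization and outlier screening described above can be sketched as follows. The cut-offs 2.5 and 3.2 are the ones given on the slide; applying them to the absolute z value, and the function names and yield data, are assumptions of this sketch.

```python
from statistics import mean, stdev

def z_scores(values):
    """Standardize each value: z = (x - mean) / sample standard deviation."""
    m = mean(values)
    s = stdev(values)
    return [(x - m) / s for x in values]

def classify(z):
    """Apply the slide's cut-offs to the absolute standardized value."""
    if abs(z) > 3.2:
        return "extreme outlier"
    if abs(z) > 2.5:
        return "outlier"
    return "ok"

def shape_z(statistic, standard_error):
    """z for skewness or kurtosis: the statistic divided by its standard error."""
    return statistic / standard_error

# Hypothetical yields (t/ha); the last value is deliberately far from the rest.
yields = [2.1, 2.4, 2.2, 2.3, 2.5, 2.2, 2.4, 2.3, 2.1, 2.6, 2.3, 2.2, 2.4, 2.5, 6.0]
flags = [classify(z) for z in z_scores(yields)]
print(flags[-1])

# A skewness of 0.5 with standard error 0.25 gives z = 2.0, outside -1.96 to 1.96.
print(shape_z(0.5, 0.25))
```

In a very small sample a single extreme value inflates the standard deviation so much that its own z score cannot get large, so this screening is best combined with the box plot mentioned on the slide.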
  • 9. When to use chi-square? • Non-parametric • Both dependent and independent variables are categorical. • Put the independent variable in columns. • Remember: the variable whose effect you are studying is the independent variable, and the variable on which the effect is imposed is the dependent variable. • Select kappa • Select phi and Cramér's V • Phi for a 2×2 table, Cramér's V for all others. • See values of kappa, phi and Cramér's V in the next slide. • In the output see Pearson's chi-square, Fisher's exact test and the likelihood ratio
  • 10. Data requirements for chi-square test • Two categorical variables. • Two or more categories (groups) for each variable. • Independence of observations. • There is no relationship between the subjects in each group. • The categorical variables are not "paired" in any way (e.g. pre-test/post-test observations). • Relatively large sample size. • Expected frequencies for each cell are at least 1. • Expected frequencies should be at least 5 for the majority (80%) of the cells.
  • 11. When we try to compare proportions of a categorical outcome according to different independent groups, we can consider several statistical tests such as the chi-squared test, Fisher's exact test, or the z-test. • Fisher's exact test is in practice applied only to small samples, but it is valid for all sample sizes. • While the chi-squared test relies on an approximation, Fisher's exact test is an exact test. • Especially when more than 20% of cells have expected frequencies < 5, we need to use Fisher's exact test because the approximation is inadequate. • In SPSS, unless you have the SPSS Exact Tests module, you can only perform Fisher's exact test on a 2×2 table, and these results are presented by default. • https://www.socscistatistics.com/tests/chisquare2/default2.aspx performs Fisher's exact statistics for 5×5 tables • If the expected counts are more than 5, see Pearson's chi-square.
  • 12. So here is how you do it • Go to Analyze • Descriptive Statistics • Click Crosstabs • Select variables for rows (dependent) and columns (independent) • Click Exact • Select Exact (time limit per test) • Click Continue • Click OK
  • 13. • As 100% of cells have an expected count less than 5, see Fisher's exact test. • To use the chi-square test, at least 80% of cells must have an expected count of 5 or more (at most 20% of cells may have an expected count less than 5). • The likelihood ratio (G-test) is also an option in this case, but Fisher's exact test is more common.
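
The expected-count rule that decides between Pearson's chi-square and Fisher's exact test can be sketched as below. The sketch computes expected frequencies and the chi-square statistic only (no p value); the function names and the two small tables are hypothetical.

```python
def expected_counts(table):
    """Expected frequency of each cell: row total * column total / grand total."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]

def chi_square(table):
    """Pearson chi-square statistic: sum of (observed - expected)^2 / expected."""
    exp = expected_counts(table)
    return sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(table, exp)
        for o, e in zip(obs_row, exp_row)
    )

def meets_expected_count_rule(table):
    """True if every expected count is >= 1 and at least 80% of them are >= 5."""
    cells = [e for row in expected_counts(table) for e in row]
    share_at_least_5 = sum(e >= 5 for e in cells) / len(cells)
    return min(cells) >= 1 and share_at_least_5 >= 0.8

large = [[20, 30], [30, 20]]  # every expected count is 25
small = [[3, 1], [1, 3]]      # every expected count is 2

print(chi_square(large), meets_expected_count_rule(large))
print(meets_expected_count_rule(small))  # rule fails: prefer Fisher's exact test
```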
  • 14. One sample t test • The One Sample t Test determines whether the sample mean is statistically different from a known or hypothesized population mean. The One Sample t Test is a parametric test. • Also known as the single sample t test. • The variable used is called the test variable. • In a One Sample t Test, the test variable is compared against a "test value", which is a known or hypothesized value of the mean in the population.
  • 15. It is commonly used to test: • Statistical difference between a sample mean and a known or hypothesized value of the mean in the population. • Statistical difference between the sample mean and the sample midpoint of the test variable. • Statistical difference between the sample mean of the test variable and chance. • This approach involves first calculating the chance level on the test variable. The chance level is then used as the test value against which the sample mean of the test variable is compared. • Statistical difference between a change score and zero. • This approach involves creating a change score from two variables, and then comparing the mean change score to zero, which will indicate whether any change occurred between the two time points for the original measures. If the mean change score is not significantly different from zero, no significant change occurred.
  • 16. Requirements for one sample t test • Test variable that is continuous (i.e., interval or ratio level) • Scores on the test variable are independent (i.e., independence of observations) • There is no relationship between scores on the test variable • Violation of this assumption will yield an inaccurate p value • Random sample of data from the population • Normal distribution (approximately) of the sample and population on the test variable • Non-normal population distributions, especially those that are thick-tailed or heavily skewed, considerably reduce the power of the test • Among moderate or large samples, a violation of normality may still yield accurate p values • Homogeneity of variances (i.e., variances approximately equal in both the sample and population) • No outliers
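
As a rough illustration of what the One Sample t Test computes, the statistic is the difference between the sample mean and the test value divided by the standard error of the mean. The sketch below returns the t statistic and degrees of freedom only; the p value would still be read from SPSS or a t table. The function name and the data are hypothetical.

```python
from math import sqrt
from statistics import mean, stdev

def one_sample_t(sample, test_value):
    """t = (sample mean - test value) / (s / sqrt(n)), with n - 1 degrees of freedom."""
    n = len(sample)
    standard_error = stdev(sample) / sqrt(n)
    t = (mean(sample) - test_value) / standard_error
    return t, n - 1

# Hypothetical plot yields compared against a hypothesized population mean of 5.0.
sample = [4.8, 5.1, 5.3, 4.9, 5.4, 5.2, 5.0, 5.5]
t_stat, df = one_sample_t(sample, test_value=5.0)
print(round(t_stat, 3), df)
```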
  • 17. Paired sample t test • The Paired Samples t Test compares two means that are from the same individual, object, or related units. The two means can represent things like: ✓ A measurement taken at two different times (e.g., pre-test and post-test with an intervention administered between the two time points) ✓ A measurement taken under two different conditions (e.g., completing a test under a "control" condition and an "experimental" condition) ✓ Measurements taken from two halves or sides of a subject or experimental unit (e.g., measuring hearing loss in a subject's left and right ears).
  • 18. Also known as • Dependent t Test • Paired t Test • Repeated Measures t Test The variable used in this test is known as: • Dependent variable, or test variable (continuous), measured at two different times or for two related conditions or units
  • 19. Used for observing • Statistical difference between two time points • Statistical difference between two conditions • Statistical difference between two measurements • Statistical difference between a matched pair Note: The Paired Samples t Test can only compare the means for two (and only two) related (paired) units on a continuous outcome that is normally distributed. The Paired Samples t Test is not appropriate for analyses involving the following: 1) unpaired data; 2) comparisons between more than two units/groups; 3) a continuous outcome that is not normally distributed; and 4) an ordinal/ranked outcome.
  • 20. Moreover, • To compare unpaired means between two groups on a continuous outcome that is normally distributed, choose the Independent Samples t Test. • To compare unpaired means between more than two groups on a continuous outcome that is normally distributed, choose ANOVA. • To compare paired means for continuous data that are not normally distributed, choose the nonparametric Wilcoxon Signed-Ranks Test. • To compare paired means for ranked data, choose the nonparametric Wilcoxon Signed-Ranks Test.
  • 21. Requirements for paired sample t test • Dependent variable that is continuous (i.e., interval or ratio level) • Note: The paired measurements must be recorded in two separate variables. • Related samples/groups (i.e., dependent observations) • The subjects in each sample, or group, are the same. This means that the subjects in the first group are also in the second group. • Random sample of data from the population • Normal distribution (approximately) of the difference between the paired values • No outliers in the difference between the two related groups • Note: When testing assumptions related to normality and outliers, you must use a variable that represents the difference between the paired values, not the original variables themselves. • Note: When one or more of the assumptions for the Paired Samples t Test are not met, you may want to run the nonparametric Wilcoxon Signed-Ranks Test instead.
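
The note about working with the difference variable can be made concrete with a small sketch: the paired t statistic is simply a one-sample t statistic on the differences, tested against zero. The function name and the before/after scores below are made up for illustration, and no p value is computed.

```python
from math import sqrt
from statistics import mean, stdev

def paired_t(before, after):
    """Paired t statistic computed on the differences (after - before)."""
    if len(before) != len(after):
        raise ValueError("paired samples must have the same length")
    diffs = [a - b for b, a in zip(before, after)]
    n = len(diffs)
    t = mean(diffs) / (stdev(diffs) / sqrt(n))
    return t, n - 1, diffs

# Hypothetical pre-test and post-test scores for the same five subjects.
before = [10, 12, 9, 11, 13]
after = [12, 14, 10, 14, 14]
t_stat, df, diffs = paired_t(before, after)
print(diffs)               # the variable to check for normality and outliers
print(round(t_stat, 3), df)
```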
  • 22. Independent sample t test • The Independent Samples t Test compares the means of two independent groups in order to determine whether there is statistical evidence that the associated population means are significantly different. The Independent Samples t Test is a parametric test. • This test is also known as: ❑ Independent t Test ❑ Independent Measures t Test ❑ Independent Two-sample t Test ❑ Student t Test ❑ Two-Sample t Test ❑ Uncorrelated Scores t Test ❑ Unpaired t Test ❑ Unrelated t Test • The variables used in this test are known as: ❑ Dependent variable, or test variable ❑ Independent variable, or grouping variable
  • 23. Used for testing the following • Statistical differences between the means of two groups • Statistical differences between the means of two interventions • Statistical differences between the means of two change scores Note: The Independent Samples t Test can only compare the means for two (and only two) groups. It cannot make comparisons among more than two groups. If you wish to compare the means across more than two groups, you will likely want to run an ANOVA.
  • 24. Requirements • Dependent variable that is continuous (i.e., interval or ratio level) • Independent variable that is categorical (i.e., two groups) • Cases that have values on both the dependent and independent variables • Independent samples/groups (i.e., independence of observations) • There is no relationship between the subjects in each sample. This means that: • Subjects in the first group cannot also be in the second group • No subject in either group can influence subjects in the other group • No group can influence the other group • Violation of this assumption will yield an inaccurate p value • Random sample of data from the population • Normal distribution (approximately) of the dependent variable for each group • Non-normal population distributions, especially those that are thick-tailed or heavily skewed, considerably reduce the power of the test • Among moderate or large samples, a violation of normality may still yield accurate p values • Homogeneity of variances (i.e., variances approximately equal across groups) • When this assumption is violated and the sample sizes for each group differ, the p value is not trustworthy. However, the Independent Samples t Test output also includes an approximate t statistic that is not based on assuming equal population variances. This alternative statistic, called the Welch t Test statistic, may be used when equal variances among populations cannot be assumed. The Welch t Test is also known as the Unequal Variance t Test or Separate Variances t Test. • No outliers • Note: When one or more of the assumptions for the Independent Samples t Test are not met, you may want to run the nonparametric Mann-Whitney U Test instead.
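
To show how the equal-variance statistic and the Welch statistic differ, here is a minimal sketch of both. It returns the t statistic and degrees of freedom only (the Welch degrees of freedom use the Welch-Satterthwaite approximation); the function names and the two groups are hypothetical.

```python
from math import sqrt
from statistics import mean, variance

def pooled_t(a, b):
    """Student's t with a pooled variance (assumes equal population variances)."""
    na, nb = len(a), len(b)
    pooled_var = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)
    t = (mean(a) - mean(b)) / sqrt(pooled_var * (1 / na + 1 / nb))
    return t, na + nb - 2

def welch_t(a, b):
    """Welch's t (equal variances not assumed) with Welch-Satterthwaite df."""
    na, nb = len(a), len(b)
    va, vb = variance(a) / na, variance(b) / nb
    t = (mean(a) - mean(b)) / sqrt(va + vb)
    df = (va + vb) ** 2 / (va ** 2 / (na - 1) + vb ** 2 / (nb - 1))
    return t, df

# Hypothetical yields under a control diet and an improved diet.
control = [20, 22, 19, 24, 25]
improved = [28, 27, 30, 26, 29]
t_pooled, df_pooled = pooled_t(control, improved)
t_welch, df_welch = welch_t(control, improved)
print(round(t_pooled, 3), df_pooled)
print(round(t_welch, 3), round(df_welch, 2))
```

With equal group sizes, as here, the two t statistics coincide and only the degrees of freedom differ; with unequal group sizes and unequal variances the statistics themselves diverge.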
  • 25. One way ANOVA • One-Way ANOVA ("analysis of variance") compares the means of two or more independent groups in order to determine whether there is statistical evidence that the associated population means are significantly different. One-Way ANOVA is a parametric test. • This test is also known as: ❑ One-Factor ANOVA ❑ One-Way Analysis of Variance ❑ Between Subjects ANOVA • The variables used in this test are known as: • Dependent variable • Independent variable (also known as the grouping variable, or factor) • This variable divides cases into two or more mutually exclusive levels, or groups
  • 26. Used for • Field studies • Experiments • Quasi-experiments • The One-Way ANOVA is commonly used to test the following: ❑ Statistical differences among the means of two or more groups ❑ Statistical differences among the means of two or more interventions ❑ Statistical differences among the means of two or more change scores • Note: Both the One-Way ANOVA and the Independent Samples t Test can compare the means for two groups. However, only the One-Way ANOVA can compare the means across three or more groups. • Note: If the grouping variable has only two groups, then the results of a one-way ANOVA and the independent samples t test will be equivalent. In fact, if you run both an independent samples t test and a one-way ANOVA in this situation, you should be able to confirm that t² = F.
  • 27. Requirements • Dependent variable that is continuous (i.e., interval or ratio level) • Independent variable that is categorical (i.e., two or more groups) • Cases that have values on both the dependent and independent variables • Independent samples/groups (i.e., independence of observations) • There is no relationship between the subjects in each sample. This means that: • subjects in the first group cannot also be in the second group • no subject in either group can influence subjects in the other group • no group can influence the other group • Random sample of data from the population • Normal distribution (approximately) of the dependent variable for each group (i.e., for each level of the factor) • Non-normal population distributions, especially those that are thick-tailed or heavily skewed, considerably reduce the power of the test • Among moderate or large samples, a violation of normality may yield fairly accurate p values
  • 28. Continued... • Homogeneity of variances (i.e., variances approximately equal across groups) • When this assumption is violated and the sample sizes differ among groups, the p value for the overall F test is not trustworthy. These conditions warrant using alternative statistics that do not assume equal variances among populations, such as the Brown-Forsythe or Welch statistics (available via Options in the One-Way ANOVA dialog box). • When this assumption is violated, regardless of whether the group sample sizes are fairly equal, the results may not be trustworthy for post hoc tests. When variances are unequal, post hoc tests that do not assume equal variances should be used. • No outliers
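
The one-way ANOVA F ratio, and the note that t² equals F when there are only two groups, can be checked with a short sketch. It computes the F statistic from the between-group and within-group sums of squares and compares it with the squared equal-variance t statistic; function names and data are hypothetical, and no p value is computed.

```python
from math import sqrt
from statistics import mean, variance

def one_way_f(groups):
    """F = (SS_between / (k - 1)) / (SS_within / (N - k))."""
    all_values = [x for g in groups for x in g]
    grand_mean = mean(all_values)
    k, n = len(groups), len(all_values)
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    return (ss_between / (k - 1)) / (ss_within / (n - k))

def pooled_t(a, b):
    """Equal-variance independent samples t statistic."""
    na, nb = len(a), len(b)
    pooled_var = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)
    return (mean(a) - mean(b)) / sqrt(pooled_var * (1 / na + 1 / nb))

# Hypothetical data: two groups for the t² = F check, plus a third group.
control = [20, 22, 19, 24, 25]
improved = [28, 27, 30, 26, 29]
normal = [23, 25, 24, 22, 26]

f_two = one_way_f([control, improved])
t_two = pooled_t(control, improved)
f_three = one_way_f([control, normal, improved])
print(round(f_two, 4), round(t_two ** 2, 4))  # the two values agree
print(round(f_three, 4))
```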
  • 29. Correlation • Pearson Correlation produces a sample correlation coefficient, r, which measures the strength and direction of linear relationships between pairs of continuous variables. By extension, the Pearson Correlation evaluates whether there is statistical evidence for a linear relationship among the same pairs of variables in the population, represented by a population correlation coefficient, ρ ("rho"). The Pearson Correlation is a parametric measure. • This measure is also known as: ❑ Pearson's correlation ❑ Pearson product-moment correlation (PPMC)
  • 30. Used for • Correlations among pairs of variables • Correlations within and between sets of variables • The bivariate Pearson correlation indicates the following: ❑ Whether a statistically significant linear relationship exists between two continuous variables ❑ The strength of a linear relationship (i.e., how close the relationship is to being a perfectly straight line) ❑ The direction of a linear relationship (increasing or decreasing) • Note: The bivariate Pearson Correlation cannot address non-linear relationships or relationships among categorical variables. If you wish to understand relationships that involve categorical variables and/or non-linear relationships, you will need to choose another measure of association. • Note: The bivariate Pearson Correlation only reveals associations among continuous variables. The bivariate Pearson Correlation does not provide any inferences about causation, no matter how large the correlation coefficient is.
  • 31. Requirements • Two or more continuous variables (i.e., interval or ratio level) • Cases that have values on both variables • Linear relationship between the variables • Independent cases (i.e., independence of observations) • There is no relationship between the values of variables between cases. This means that: • the values for all variables across cases are unrelated • for any case, the value for any variable cannot influence the value of any variable for other cases • no case can influence another case on any variable • The bivariate Pearson correlation coefficient and corresponding significance test are not robust when independence is violated. • Bivariate normality • Each pair of variables is bivariately normally distributed • Each pair of variables is bivariately normally distributed at all levels of the other variable(s) • This assumption ensures that the variables are linearly related; violations of this assumption may indicate that non-linear relationships among variables exist. Linearity can be assessed visually using a scatterplot of the data. • Random sample of data from the population • No outliers
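
For readers who want to see how r is built, the sketch below computes the Pearson coefficient from the sums of cross-products and squares, and the t statistic commonly used to test it (with n - 2 degrees of freedom). The function names and the dose/yield figures are assumptions for the example; no p value is computed.

```python
from math import sqrt
from statistics import mean

def pearson_r(x, y):
    """r = S_xy / sqrt(S_xx * S_yy), using deviations from the means."""
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    mx, my = mean(x), mean(y)
    s_xy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    s_xx = sum((a - mx) ** 2 for a in x)
    s_yy = sum((b - my) ** 2 for b in y)
    return s_xy / sqrt(s_xx * s_yy)

def r_to_t(r, n):
    """t statistic for testing r against zero (undefined when |r| = 1)."""
    return r * sqrt((n - 2) / (1 - r ** 2))

# Hypothetical manure dose and yield for five plots.
dose = [1, 2, 3, 4, 5]
crop_yield = [2, 4, 5, 4, 5]
r = pearson_r(dose, crop_yield)
t_stat = r_to_t(r, len(dose))
print(round(r, 4), round(t_stat, 4))
```

The sign of r gives the direction of the relationship and its magnitude the strength; neither says anything about causation.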
  • 32. Linear regression analysis • Linear regression is the next step up after correlation. • It is used when we want to predict the value of a variable based on the value of another variable. • The variable we want to predict is called the dependent variable (or sometimes, the outcome variable). • The variable we are using to predict the other variable's value is called the independent variable (or sometimes, the predictor variable). • For example, you could use linear regression to understand whether yield performance can be predicted based on dose and practices of manure application; whether cigarette consumption can be predicted based on smoking duration; and so forth. • If you have two or more independent variables, rather than just one, you need to use multiple regression.
  • 33.
  • 34. Used to • Model multiple independent variables • Include continuous and categorical variables • Use polynomial terms to model curvature • Assess interaction terms to determine whether the effect of one independent variable depends on the value of another variable
  • 35. Requirements • Your two variables should be measured at the continuous level (interval or ratio scales) • There needs to be a linear relationship between the two variables (check with scatter plots). ❑ If the relationship displayed in your scatterplot is not linear, you will have to either run a non-linear regression analysis, perform a polynomial regression or "transform" your data, which you can do using SPSS Statistics. • There should be no significant outliers. • You should have independence of observations, which you can check using the Durbin-Watson statistic, which is a simple test to run using SPSS Statistics. • Your data needs to show homoscedasticity, which is where the variances along the line of best fit remain similar as you move along the line. • Finally, you need to check that the residuals (errors) of the regression line are approximately normally distributed. Two common methods to check this assumption are a histogram (with a superimposed normal curve) and a Normal P-P Plot.
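
A simple linear regression and two of the checks listed above (residuals and the Durbin-Watson statistic) can be sketched as follows. This is an ordinary least squares fit with one predictor, written out by hand; the function names and data are hypothetical, and the sketch does not replace the SPSS diagnostics.

```python
from statistics import mean

def fit_line(x, y):
    """Least squares slope and intercept for y = intercept + slope * x."""
    mx, my = mean(x), mean(y)
    s_xy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    s_xx = sum((a - mx) ** 2 for a in x)
    slope = s_xy / s_xx
    return slope, my - slope * mx

def residuals(x, y, slope, intercept):
    """Observed value minus predicted value for each case."""
    return [b - (intercept + slope * a) for a, b in zip(x, y)]

def durbin_watson(res):
    """Sum of squared successive differences of residuals / sum of squared residuals."""
    numerator = sum((res[i] - res[i - 1]) ** 2 for i in range(1, len(res)))
    return numerator / sum(e ** 2 for e in res)

# Hypothetical manure dose (predictor) and yield (outcome) for five plots.
dose = [1, 2, 3, 4, 5]
crop_yield = [2, 4, 5, 4, 5]
slope, intercept = fit_line(dose, crop_yield)
res = residuals(dose, crop_yield, slope, intercept)
dw = durbin_watson(res)
print(round(slope, 3), round(intercept, 3))
print([round(e, 2) for e in res])
print(round(dw, 3))
```

A Durbin-Watson value near 2 suggests no first-order autocorrelation in the residuals; values toward 0 or 4 suggest positive or negative autocorrelation respectively.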
  • 37. Binary logistic regression
    • Binary logistic regression models the relationship between a set of predictors and a binary response variable. A binary response has only two possible values, such as win and lose.
    • Use a binary logistic regression model to understand how changes in the predictor values are associated with changes in the probability of an event occurring.
    • When the dependent variable is dichotomous (binary), simple linear regression cannot be used. Logistic regression models the relationship between the predictors (independent variables) and a binary dependent variable (e.g., sex [male vs. female], response [yes vs. no], score [high vs. low], etc.).
    • A logistic regression needs one or more independent variables (predictors). The predictors can be continuous (interval/ratio) or categorical (ordinal/nominal).
    • When all predictor variables are entered in one block, each one's predictive ability is assessed while controlling for the other predictors in the model.
  • 38. Uses
    • Logistic regression is a powerful way of modelling a binary outcome (one that takes the value 0 or 1, such as not having or having a disease) with one or more explanatory variables.
    • Logistic regression gives a quantified value for the strength of each association, adjusted for the other variables in the model (this controls for confounders that are included in the model).
    • The exponential of each coefficient corresponds to the odds ratio for that factor.
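To illustrate the link between coefficients and odds ratios, the sketch below fits a one-predictor logistic model by Newton-Raphson and exponentiates the slope. The counts are hypothetical (40 exposed cases with 30 events, 40 unexposed cases with 15 events) and the function name is an assumption. SPSS uses its own estimation routine, so treat this only as a demonstration of the idea.

```python
import math

def fit_logistic(x, y, iterations=25):
    """Fit logit(p) = b0 + b1 * x by Newton-Raphson; returns (b0, b1)."""
    b0 = b1 = 0.0
    for _ in range(iterations):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for xi, yi in zip(x, y):
            p = 1.0 / (1.0 + math.exp(-(b0 + b1 * xi)))  # predicted probability
            w = p * (1.0 - p)
            # Gradient of the log-likelihood.
            g0 += yi - p
            g1 += (yi - p) * xi
            # Information matrix (negative Hessian).
            h00 += w
            h01 += w * xi
            h11 += w * xi * xi
        det = h00 * h11 - h01 * h01
        # Newton step: beta += H^-1 g, using the closed-form 2x2 inverse.
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    return b0, b1

# Hypothetical data: x = 1 exposed, x = 0 unexposed; y = 1 event, y = 0 no event.
x = [1] * 40 + [0] * 40
y = [1] * 30 + [0] * 10 + [1] * 15 + [0] * 25

b0, b1 = fit_logistic(x, y)
odds_ratio = math.exp(b1)

# Cross-check against the odds ratio computed directly from the 2x2 table.
table_odds_ratio = (30 / 10) / (15 / 25)
print(odds_ratio, table_odds_ratio)
```

With a single binary predictor the exponentiated coefficient matches the odds ratio from the cross-tabulation, which is a handy sanity check on any software output.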
  • 40. Requirements
    • The dependent variable must be categorical and dichotomous.
    • The error terms need to be independent (independent observations).
    • There should be a linear relationship between each continuous independent variable and the log odds (logit) of the outcome.
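  • 41. Interpreting the odds ratio
    • If the odds ratio is greater than 1, subtract 1 from it. E.g., an odds ratio of 4.5 means the odds are 4.5 - 1 = 3.5 times (350%) higher than the odds for the reference category.
    • If the odds ratio is less than 1, subtract it from 1. E.g., an odds ratio of 0.07 gives 1 - 0.07 = 0.93, i.e. a 93% decrease in the odds.
    • An odds ratio of exactly 1 means the odds are the same in both groups.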
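The arithmetic for reading an odds ratio can be written as a small helper. The function name and output wording are hypothetical; the calculation is simply (odds ratio - 1) x 100, where a positive result means higher odds and a negative result means lower odds relative to the reference category.

```python
def describe_odds_ratio(odds_ratio):
    """Return the percent change in odds implied by an odds ratio."""
    percent_change = (odds_ratio - 1.0) * 100.0
    if percent_change > 0:
        direction = "higher"
    elif percent_change < 0:
        direction = "lower"
    else:
        return "no change in odds"
    return f"{abs(percent_change):.0f}% {direction} odds"

print(describe_odds_ratio(4.5))   # odds ratio above 1
print(describe_odds_ratio(0.07))  # odds ratio below 1
```

An odds ratio describes a change in odds, not in probability, so a "350% higher odds" statement should not be reported as a 350% higher risk.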