Flow of Statistical Analysis
Selection of samples/cases/observations/subjects
Modes of statistical tests

Parametric tests
*n ≥ 30
*Data should be normally distributed
Normal distribution
Linearity: homogeneous mode of variation
Probability and non-probability modes of sampling and distributions
Continuous variables (ratio and interval)

Non-Parametric tests
*n ≤ 30 or n > 30
*Data are not assumed to be normally distributed
Non-normal/free distribution
Non-linearity: heterogeneous mode of variation
Probability and non-probability modes of sampling and distributions
Continuous variables (ratio and interval) that are not normally distributed, and ordinal and nominal (categorical) variables
Regression analysis (Y on X: continuous dependent variable; categorical and continuous independent variables)
A) Linear regression
(i) DV = continuous variable; IV = continuous variables, categorical variables, or a mixed structure
[Parametric assumptions]
-Simple linear regression (two-variable case)
*May fit real data poorly: residuals tend to be larger, which reduces efficiency
-Multiple linear regression (multi-variable case)
*More applicable to real data: residuals tend to be smaller, which improves efficiency; should be tied to diagnostic testing
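The claim that adding predictors shrinks the residuals can be checked on a small example. The sketch below assumes NumPy is available and uses synthetic data; the helper name `fit_ols` and the coefficients are hypothetical, chosen only for illustration.

```python
import numpy as np

def fit_ols(X, y):
    """Ordinary least squares with an intercept.

    Returns the coefficient vector and the residual sum of squares (RSS).
    """
    design = np.column_stack([np.ones(len(y)), X])  # prepend intercept column
    beta, *_ = np.linalg.lstsq(design, y, rcond=None)
    residuals = y - design @ beta
    return beta, float(residuals @ residuals)

# Synthetic data (hypothetical): y depends on two predictors plus noise.
rng = np.random.default_rng(0)
x1 = rng.normal(size=200)
x2 = rng.normal(size=200)
y = 1.0 + 2.0 * x1 - 1.5 * x2 + rng.normal(scale=0.5, size=200)

_, rss_simple = fit_ols(x1.reshape(-1, 1), y)            # simple: y ~ x1
_, rss_multiple = fit_ols(np.column_stack([x1, x2]), y)  # multiple: y ~ x1 + x2
print(rss_simple, rss_multiple)
```

Because the simple model is nested inside the multiple one, its RSS can never be smaller on the same data. A lower RSS alone does not validate the model, which is why the diagnostic testing step still matters.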
Regression Analysis
B) Non-linear regression
(i) DV = continuous variable; IV = continuous variables, categorical variables, or a mixed structure
Simple version of non-linear regression
(i) Scatterplot smoothing
(ii) Smoothing splines
(iii) Non-linear regression
Multiple version of non-linear regression
(i) Additive/polynomial regression
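As a minimal illustration of polynomial regression, the sketch below fits a straight line and a second-degree polynomial to the same curved data and compares their residuals. It assumes NumPy; the data are synthetic and noise-free so the difference is easy to see.

```python
import numpy as np

# Synthetic data (hypothetical): an exact quadratic relationship.
x = np.linspace(-3.0, 3.0, 50)
y = 0.5 * x**2 - x + 2.0

line = np.polyfit(x, y, deg=1)       # straight-line fit
quadratic = np.polyfit(x, y, deg=2)  # second-degree polynomial fit

def rss(coeffs):
    """Residual sum of squares of a polynomial fit on (x, y)."""
    return float(np.sum((y - np.polyval(coeffs, x)) ** 2))

print(rss(line), rss(quadratic))
```

The straight line leaves a large systematic residual, while the quadratic recovers the generating coefficients almost exactly.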
Zero/low correlation between independent variables (low multicollinearity)
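One common way to check the low-multicollinearity condition is the variance inflation factor (VIF). The slides do not name a specific diagnostic, so this is an assumed choice; the sketch uses NumPy and synthetic predictors.

```python
import numpy as np

def vif(X):
    """Variance inflation factor for each column of X (rows = observations).

    Each predictor is regressed on the others; VIF = 1 / (1 - R^2).
    """
    X = np.asarray(X, dtype=float)
    factors = []
    for j in range(X.shape[1]):
        target = X[:, j]
        others = np.column_stack([np.ones(len(X)), np.delete(X, j, axis=1)])
        beta, *_ = np.linalg.lstsq(others, target, rcond=None)
        resid = target - others @ beta
        r2 = 1.0 - (resid @ resid) / np.sum((target - target.mean()) ** 2)
        factors.append(1.0 / (1.0 - r2))
    return factors

rng = np.random.default_rng(1)
a = rng.normal(size=300)
b = rng.normal(size=300)                 # independent of a
c = a + rng.normal(scale=0.1, size=300)  # nearly a copy of a

vif_independent = vif(np.column_stack([a, b]))  # values close to 1
vif_collinear = vif(np.column_stack([a, c]))    # large values
print(vif_independent, vif_collinear)
```

A VIF near 1 means a predictor is almost uncorrelated with the others; rules of thumb for "too high" vary by field, so no fixed cut-off is assumed here.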
Ordinary Statistical Tests
Probit regression (dichotomous and ordinal dependent variables; categorical and continuous independent variables)
Binary probit regression (dichotomous dependent variable; categorical and continuous independent variables)
-Latent error term is assumed to be normally distributed; linearity mode; homogeneity
Multinomial probit regression (multiple-choice/ordinal dependent variable; categorical and continuous independent variables)
-Latent error terms are assumed to be normally distributed; linearity mode; homogeneity
Logit regression (dichotomous and ordinal dependent variables; categorical and continuous independent variables)
Binary logit regression (dichotomous dependent variable; categorical and continuous independent variables)
-Latent error term is assumed to follow a logistic, not a normal, distribution; non-linearity mode; heterogeneity
Multinomial logit regression (multiple-choice/ordinal dependent variable; categorical and continuous independent variables)
-Latent error terms are not assumed to be normally distributed; non-linearity mode; heterogeneity
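Binary probit and logit models differ mainly in the function that maps the linear predictor to a probability: the standard normal CDF for probit and the logistic CDF for logit. The sketch below uses only the standard library; the intercept and slope are hypothetical values, not estimates from any data.

```python
import math

def logistic_cdf(z):
    """Logit link: CDF of the standard logistic distribution."""
    return 1.0 / (1.0 + math.exp(-z))

def normal_cdf(z):
    """Probit link: CDF of the standard normal distribution."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

# Hypothetical coefficients for a single predictor x.
intercept, slope = -1.0, 0.8

for x in (0.0, 1.25, 3.0):
    z = intercept + slope * x  # linear predictor
    print(x, round(logistic_cdf(z), 3), round(normal_cdf(z), 3))
```

Both functions give 0.5 when the linear predictor is zero; they diverge in the tails, where the logistic distribution is heavier.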
Parametric mode of paired samples (dependent samples) test
• Involves related samples
• n ≥ 30
• Measures significant differences between the means of the same persons/objects/subjects at different time points
• Indicates improvements and decrements
• Similar to repeated-measures testing
• Independent variable involves two levels (nominal)
• Normal distribution and linearity mode
• Paired (matched) samples t-test (2 levels)
• Repeated-measures ANOVA (more than 2 levels)
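The paired-samples t statistic is the mean of the within-subject differences divided by its standard error. A minimal standard-library sketch follows; the scores are made-up toy data, kept far shorter than the n ≥ 30 the slide asks for so the arithmetic can be followed by hand.

```python
import math
import statistics

def paired_t(before, after):
    """Paired-samples t statistic and its degrees of freedom."""
    diffs = [a - b for a, b in zip(after, before)]
    n = len(diffs)
    mean_diff = statistics.mean(diffs)
    sd_diff = statistics.stdev(diffs)  # sample SD (n - 1 denominator)
    t = mean_diff / (sd_diff / math.sqrt(n))
    return t, n - 1

# Toy data (hypothetical): the same six subjects measured twice.
before = [72, 65, 80, 70, 68, 75]
after = [75, 70, 82, 74, 69, 80]

t_stat, df = paired_t(before, after)
print(t_stat, df)
```

A positive statistic indicates improvement from the first to the second measurement; the p-value would come from a t distribution with the returned degrees of freedom.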
Non-Parametric mode of paired samples test
• Involves related samples
• n < 30
• Measures significant differences between the ranks (or signs) of paired measurements on the same persons/objects/subjects at different time points
• Indicates improvements and decrements
• Similar to repeated-measures testing
• Independent variable involves two or more levels
• Non-normal distribution and non-linearity mode
• Wilcoxon signed-rank test, sign test, McNemar test, and marginal homogeneity test (2 levels)
• Friedman test, Kendall's W test, and Cochran's Q test (more than 2 levels)
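Of the two-level tests listed, the sign test is the simplest to write out: it counts how many pairs moved up versus down and compares that count with a fair-coin binomial distribution. The sketch below is a standard-library illustration on made-up ratings.

```python
import math

def sign_test(before, after):
    """Two-sided exact sign test on paired data; tied pairs are dropped.

    Returns (number of positive differences, pairs used, p-value).
    """
    diffs = [a - b for a, b in zip(after, before) if a != b]
    n = len(diffs)
    positives = sum(d > 0 for d in diffs)
    k = min(positives, n - positives)
    # P(X <= k) for X ~ Binomial(n, 0.5), doubled for a two-sided test.
    tail = sum(math.comb(n, i) for i in range(k + 1)) / 2 ** n
    return positives, n, min(1.0, 2.0 * tail)

# Toy ordinal ratings (hypothetical) for eight subjects.
before = [3, 5, 2, 4, 4, 1, 3, 2]
after = [4, 6, 2, 5, 5, 3, 2, 4]

positives, n_used, p_value = sign_test(before, after)
print(positives, n_used, p_value)
```

The sign test ignores the size of each change; the Wilcoxon signed-rank test uses the ranks of the differences as well and is usually more powerful when those magnitudes are meaningful.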
Parametric mode of independent samples test
• Involves unrelated samples
• n ≥ 30
• Measures significant differences between the means of two or more independent groups of persons/objects/subjects
• Independent variable involves two groups or more than two groups (categorical variable)
• Normal distribution and linearity mode
• Dependent variable is continuous
• Independent samples t-test, with Levene's test for equality of variances (2 groups)
• One-way ANOVA (more than 2 groups)
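For two groups, the classic independent-samples t statistic pools the two sample variances, which is only appropriate when the variances are similar (the question Levene's test addresses). A minimal standard-library sketch on toy data:

```python
import math
import statistics

def pooled_t(group_a, group_b):
    """Equal-variance (pooled) two-sample t statistic and degrees of freedom."""
    na, nb = len(group_a), len(group_b)
    va = statistics.variance(group_a)  # sample variance (n - 1 denominator)
    vb = statistics.variance(group_b)
    pooled_var = ((na - 1) * va + (nb - 1) * vb) / (na + nb - 2)
    se = math.sqrt(pooled_var * (1 / na + 1 / nb))
    t = (statistics.mean(group_a) - statistics.mean(group_b)) / se
    return t, na + nb - 2

# Toy data (hypothetical): two unrelated groups of four.
group_a = [2, 4, 6, 8]
group_b = [1, 3, 5, 7]

t_ind, df_ind = pooled_t(group_a, group_b)
print(t_ind, df_ind)
```

When the group variances clearly differ, Welch's unequal-variance version of the t-test is the usual alternative.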
Non-Parametric mode of independent samples test
• Involves unrelated samples
• n < 30 or n ≥ 30
• Measures significant differences between the ranks of two or more independent groups of persons/objects/subjects
• Independent variable involves two or more groups (categorical variable)
• Non-normal distribution and non-linearity mode
• Dependent variable is ordinal, or continuous but not normally distributed
• Mann-Whitney U test (2 groups)
• Kruskal-Wallis rank test (more than 2 groups)
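The Mann-Whitney U statistic can be computed by direct pair counting, which makes its rank-based nature explicit. This is a minimal sketch on made-up scores; it returns only the statistic, not a p-value.

```python
def mann_whitney_u(group_a, group_b):
    """U statistic for group_a: pairs with a > b count 1, ties count 0.5."""
    u = 0.0
    for a in group_a:
        for b in group_b:
            if a > b:
                u += 1.0
            elif a == b:
                u += 0.5
    return u

# Toy ordinal scores (hypothetical) for two unrelated groups.
scores_a = [3, 4, 2, 6]
scores_b = [1, 5, 2]

u_a = mann_whitney_u(scores_a, scores_b)
u_b = mann_whitney_u(scores_b, scores_a)
print(u_a, u_b)
```

The two U values always sum to the number of pairs (here 4 × 3 = 12); a value far from half that total suggests one group tends to rank higher than the other.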
Parametric mode of correlation
Measures the strength of association between continuous variables that are normally distributed.
Methods: Pearson correlation, Pearson distance, stepwise linear regression, auxiliary linear regression
Strength of correlation:
• No/zero correlation (r = 0.00)
• Weak correlation (0.01 ≤ r ≤ 0.39)
• Moderate correlation (0.4 ≤ r ≤ 0.69)
• High correlation (0.7 ≤ r ≤ 0.79)
• Very high correlation (0.8 ≤ r ≤ 0.99)
• Perfect correlation (r = 1)
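The Pearson coefficient and the strength bands listed above can be combined in a short helper. The sketch below is standard-library only; applying the bands to the absolute value of r, and treating each lower bound as the cut-off, are assumptions made here to handle negative correlations and the small gaps between bands.

```python
import math

def pearson_r(x, y):
    """Pearson product-moment correlation coefficient."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def strength(r):
    """Label |r| using the slide's bands (lower bounds used as cut-offs)."""
    size = abs(r)
    if math.isclose(size, 1.0):
        return "perfect"
    if size >= 0.8:
        return "very high"
    if size >= 0.7:
        return "high"
    if size >= 0.4:
        return "moderate"
    if size > 0.0:
        return "weak"
    return "none"

# Toy data (hypothetical).
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

r = pearson_r(x, y)
print(round(r, 4), strength(r))
```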
Non-Parametric mode of correlation
Measures the strength of association between variables that are not normally distributed (ordinal/categorical variables).
Methods: Spearman and Kendall's tau-b rank correlation tests, stepwise non-linear regression, auxiliary non-linear regression
Strength of correlation:
• No/zero correlation (r = 0.00)
• Weak correlation (0.01 ≤ r ≤ 0.39)
• Moderate correlation (0.4 ≤ r ≤ 0.69)
• High correlation (0.7 ≤ r ≤ 0.79)
• Very high correlation (0.8 ≤ r ≤ 0.99)
• Perfect correlation (r = 1)
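Spearman's coefficient is the Pearson correlation computed on ranks rather than raw values, so it responds to any monotonic relationship. The sketch below is a standard-library illustration with made-up data; tied values receive the average of their rank positions.

```python
import math

def average_ranks(values):
    """Rank values from 1; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def spearman_rho(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sxx = sum((a - mx) ** 2 for a in rx)
    syy = sum((b - my) ** 2 for b in ry)
    return sxy / math.sqrt(sxx * syy)

# Toy data (hypothetical).
dose = [10, 20, 30, 40, 50]
response_monotone = [1, 4, 9, 16, 100]  # non-linear but strictly increasing
response_mixed = [5, 3, 4, 1, 2]

rho_monotone = spearman_rho(dose, response_monotone)
rho_mixed = spearman_rho(dose, response_mixed)
print(rho_monotone, rho_mixed)
```

The strictly increasing but non-linear series still gets a coefficient of 1, which is the practical difference from Pearson's r.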
