Identification and installation of user-written commands in Stata
Flow of statistical analysis (full version)
Flow of Statistical Analysis
Selection of samples/cases/observations/subjects
Modes of statistical tests
Parametric tests
*n ≥ 30
*Data should be normally distributed
Normal distribution
Linearity; homogeneous mode of variation
Probability and non-probability modes of sampling and distribution
Continuous variables (ratio and interval)
Non-Parametric tests
*n ≤ 30 or n > 30
*Data need not be normally distributed
Non-normal/free distribution
Non-linearity; heterogeneous mode of variation
Probability and non-probability modes of sampling and distribution
Continuous variables (ratio and interval) that are not normally distributed, plus ordinal and nominal (categorical) variables
Regression analysis (Y on X: continuous DV; categorical and continuous IVs)
A) Linear regression
(i) DV = continuous variable; IVs = continuous variables, categorical variables, or a mix of both
[Parametric assumptions]
-Simple linear regression (two-variable case)
*may not be applicable in reality: with a single predictor the residuals tend to be larger, which reduces efficiency
-Multiple linear regression (multi-variable case)
*applicable in reality: residuals tend to be smaller, which improves efficiency; should be accompanied by diagnostic testing
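As a minimal sketch of the two-variable case above, the following fits a least-squares line and returns its residuals, using only the Python standard library. The data values and the function names `fit_simple_ols` and `residuals` are hypothetical, chosen only for illustration; they are not part of the original slides.

```python
from statistics import mean


def fit_simple_ols(x, y):
    """Return (intercept, slope) of the least-squares line y = a + b*x."""
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope


def residuals(x, y, intercept, slope):
    """Observed value minus fitted value for every observation."""
    return [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]


# Hypothetical data: one continuous IV (x) and one continuous DV (y).
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

a, b = fit_simple_ols(x, y)
res = residuals(x, y, a, b)
print(f"intercept={a:.3f} slope={b:.3f}")
print("residuals:", [round(r, 3) for r in res])
```

With an intercept in the model the residuals sum to zero, so their size (not their sum) is what the diagnostic checks mentioned above would examine.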
Regression Analysis
B) Non-linear regression
(i) DV = continuous variable; IVs = continuous variables, categorical variables, or a mix of both
Simple (single-predictor) non-linear regression
(i) Scatterplot smoothing
(ii) Smoothing splines
(iii) Non-linear regression
Multiple (multi-predictor) non-linear regression
(i) Additive/polynomial regression
Zero or low correlation between independent variables (low multicollinearity)
Ordinary Statistical Tests
Regression analysis (dichotomous, multiple-choice and ordinal DV; categorical and continuous IVs)
Binary probit regression (dichotomous DV; categorical and continuous IVs)
-latent error term is assumed to be normally distributed; linearity mode; homogeneity
Multinomial probit regression (multiple-choice DV, with ordered probit for an ordinal DV; categorical and continuous IVs)
-latent error term is assumed to be normally distributed; linearity mode; homogeneity
Regression analysis (dichotomous, multiple-choice and ordinal DV; categorical and continuous IVs)
Binary logit regression (dichotomous DV; categorical and continuous IVs)
-latent error term is assumed to follow a logistic distribution, not a normal one; non-linearity mode; heterogeneity
Multinomial logit regression (multiple-choice DV, with ordered logit for an ordinal DV; categorical and continuous IVs)
-latent error term is assumed to follow a logistic distribution, not a normal one; non-linearity mode; heterogeneity
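The practical difference between the binary probit and binary logit models above lies in the function that maps the linear predictor to a probability. The sketch below compares the two, assuming the standard forms: the standard normal CDF for probit and the logistic function for logit. The names `probit_prob` and `logit_prob` and the sample predictor values are illustrative only.

```python
import math
from statistics import NormalDist


def probit_prob(z):
    """P(Y = 1) under a probit model: standard normal CDF of the linear predictor z."""
    return NormalDist().cdf(z)


def logit_prob(z):
    """P(Y = 1) under a logit model: logistic function of the linear predictor z."""
    return 1.0 / (1.0 + math.exp(-z))


# Hypothetical linear-predictor values (z = b0 + b1*x1 + ...).
for z in (-2.0, -1.0, 0.0, 1.0, 2.0):
    print(f"z={z:+.1f}  probit={probit_prob(z):.4f}  logit={logit_prob(z):.4f}")
```

Both curves pass through 0.5 at z = 0; the logistic curve has heavier tails, which is why the coefficients of the two models are on different scales even when their fitted probabilities are close.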
Parametric mode of paired samples (dependent samples) test
Involves related samples
n ≥ 30
Measures significant differences between the means of the same persons/objects/subjects at different time points
Indicates improvements and decrements
Similar to repeated-measures testing
Paired samples t-test, matched and unmatched (2 levels)
Independent variable has two levels (nominal)
Normal distribution and linearity mode
Repeated-measures ANOVA (more than 2 levels)
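For the two-level case, the paired samples t statistic is the mean of the within-subject differences divided by its standard error. The following is a minimal sketch under that textbook definition; the before/after scores and the name `paired_t` are hypothetical. It returns the statistic and its degrees of freedom only, since the standard library has no t distribution from which to compute a p-value.

```python
import math
from statistics import mean, stdev


def paired_t(before, after):
    """Return (t, df) for a paired samples t-test on after - before."""
    diffs = [a - b for a, b in zip(after, before)]
    n = len(diffs)
    standard_error = stdev(diffs) / math.sqrt(n)  # stdev uses the n - 1 denominator
    return mean(diffs) / standard_error, n - 1


# Hypothetical scores for the same four subjects at two time points.
before = [10, 12, 14, 16]
after = [11, 14, 15, 18]

t, df = paired_t(before, after)
print(f"t = {t:.3f}, df = {df}")
```

A positive t indicates improvement from the first to the second time point, a negative t a decrement.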
Non-Parametric mode of paired samples test
Involves related samples
n < 30
Measures significant differences between the ranked scores of the same persons/objects/subjects at different time points
Indicates improvements and decrements
Similar to repeated-measures testing
Independent variable has two or more than two levels
Non-normal distribution and non-linearity mode
Wilcoxon signed-rank test, sign test, McNemar test, and marginal homogeneity test (2 levels)
Friedman test, Kendall's W test, and Cochran's Q test (more than two levels)
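Of the two-level tests listed above, the sign test is the simplest to compute by hand: it counts how many subjects improved and compares that count with a Binomial(n, 0.5) distribution. The sketch below assumes the usual exact two-sided version with tied pairs dropped; the data and the name `sign_test` are hypothetical.

```python
from math import comb


def sign_test(before, after):
    """Exact two-sided sign test; returns (positives, n_nonzero, p_value)."""
    diffs = [a - b for a, b in zip(after, before) if a != b]  # tied pairs are dropped
    n = len(diffs)
    positives = sum(d > 0 for d in diffs)
    k = min(positives, n - positives)
    # Probability of a count at least this extreme under Binomial(n, 0.5), doubled.
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return positives, n, min(1.0, 2 * tail)


# Hypothetical scores for the same six subjects at two time points.
before = [5, 6, 7, 8, 9, 4]
after = [6, 8, 7, 9, 11, 5]

positives, n, p = sign_test(before, after)
print(f"{positives} of {n} non-tied subjects improved, p = {p}")
```

Because it uses only the direction of each change, the sign test is less powerful than the Wilcoxon signed-rank test, which also uses the ranks of the differences.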
Parametric mode of independent samples test
Involves unrelated samples
n ≥ 30
Measures significant differences between the means of two or more independent groups of persons/objects/subjects
Independent variable has two groups or more than two groups (categorical variable)
Normal distribution and linearity mode
Dependent variables are continuous
Independent samples t-test, with Levene's test for equality of variances (2 groups)
One-way ANOVA (>2 groups)
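The one-way ANOVA named above compares variation between group means with variation within groups. As a minimal sketch of that F ratio (the group values and the name `one_way_anova_f` are hypothetical, and no p-value is computed because the standard library has no F distribution):

```python
from statistics import mean


def one_way_anova_f(*groups):
    """Return (F, df_between, df_within) for a one-way ANOVA."""
    all_values = [v for g in groups for v in g]
    grand_mean = mean(all_values)
    k, n = len(groups), len(all_values)
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    ss_within = sum((v - mean(g)) ** 2 for g in groups for v in g)
    df_between, df_within = k - 1, n - k
    f = (ss_between / df_between) / (ss_within / df_within)
    return f, df_between, df_within


# Hypothetical continuous outcome measured in three unrelated groups.
f, df_b, df_w = one_way_anova_f([1, 2, 3], [2, 3, 4], [3, 4, 5])
print(f"F({df_b}, {df_w}) = {f:.3f}")
```

With exactly two groups, this F equals the square of the pooled-variance independent samples t statistic, so the same function covers both rows of the list above.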
Non-Parametric mode of independent samples test
Involves unrelated samples
n < 30 or n ≥ 30
Measures significant differences between the ranked scores of two or more independent groups of persons/objects/subjects
Independent variable has two or more than two groups (categorical variable)
Non-normal distribution and non-linearity mode
Dependent variables are ordinal, or continuous but not normally distributed
Mann-Whitney test (2 groups)
Kruskal-Wallis rank test (>2 groups)
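The Mann-Whitney test for two groups rests on the U statistic, which counts how often a value from one group exceeds a value from the other. The sketch below uses that pairwise-count definition with ties counted as one half; the data and the name `mann_whitney_u` are hypothetical, and the step of comparing U with a critical value or normal approximation is left out.

```python
def mann_whitney_u(x, y):
    """Return (U, U_x, U_y), where U is the smaller of the two group statistics."""
    # U_x counts the (xi, yj) pairs with xi > yj; a tie contributes one half.
    u_x = sum(
        1.0 if xi > yj else 0.5 if xi == yj else 0.0
        for xi in x
        for yj in y
    )
    u_y = len(x) * len(y) - u_x
    return min(u_x, u_y), u_x, u_y


# Hypothetical ordinal scores from two unrelated groups.
group_a = [3, 5, 7]
group_b = [4, 5, 6]

u, u_a, u_b = mann_whitney_u(group_a, group_b)
print(f"U = {u}, U_a = {u_a}, U_b = {u_b}")
```

U_x and U_y always sum to the product of the two group sizes, so a U near zero means the two groups barely overlap.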
Parametric mode of correlation
Measures the strength of association between continuous variables that are normally distributed
Methods: Pearson correlation, Pearson distance, stepwise linear regression, auxiliary linear regression
Strength of correlation, judged on the absolute value of r:
No/zero correlation: r = 0.00
Weak correlation: 0.01 ≤ |r| ≤ 0.39
Moderate correlation: 0.40 ≤ |r| ≤ 0.69
High correlation: 0.70 ≤ |r| ≤ 0.79
Very high correlation: 0.80 ≤ |r| ≤ 0.99
Perfect correlation: |r| = 1
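The strength scale above can be applied mechanically once r is computed. The following sketch computes Pearson's r from its definition and labels it with the listed bands. Two choices here are assumptions, not part of the source: the coefficient is rounded to two decimals before labelling (matching the precision of the scale), and the sign is ignored so that negative correlations are labelled by magnitude. The function names and data are hypothetical.

```python
import math
from statistics import mean


def pearson_r(x, y):
    """Pearson product-moment correlation coefficient."""
    x_bar, y_bar = mean(x), mean(y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    syy = sum((yi - y_bar) ** 2 for yi in y)
    return sxy / math.sqrt(sxx * syy)


def strength_label(r):
    """Label |r| using the bands from the scale; rounding to 2 decimals is an assumption."""
    a = round(abs(r), 2)
    if a == 0:
        return "no correlation"
    if a <= 0.39:
        return "weak correlation"
    if a <= 0.69:
        return "moderate correlation"
    if a <= 0.79:
        return "high correlation"
    if a <= 0.99:
        return "very high correlation"
    return "perfect correlation"


# Hypothetical continuous measurements.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 1.0, 4.0, 3.0, 5.0]

r = pearson_r(x, y)
print(f"r = {r:.2f}: {strength_label(r)}")
```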
Non-Parametric mode of correlation
Measures the strength of association between variables that are not normally distributed (ordinal/categorical variables)
Methods: Spearman and Kendall's tau-b rank correlation tests, stepwise non-linear regression, auxiliary non-linear regression
Strength of correlation, judged on the absolute value of r:
No/zero correlation: r = 0.00
Weak correlation: 0.01 ≤ |r| ≤ 0.39
Moderate correlation: 0.40 ≤ |r| ≤ 0.69
High correlation: 0.70 ≤ |r| ≤ 0.79
Very high correlation: 0.80 ≤ |r| ≤ 0.99
Perfect correlation: |r| = 1
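Spearman's rank correlation, named above, is Pearson's correlation applied to the ranks of the data rather than to the raw values. The sketch below follows that definition, giving tied values their average rank; the function names and data are hypothetical.

```python
import math
from statistics import mean


def average_ranks(values):
    """Rank values from 1 upward; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman's rho: Pearson correlation computed on the ranks of x and y."""
    rx, ry = average_ranks(x), average_ranks(y)
    rx_bar, ry_bar = mean(rx), mean(ry)
    sxy = sum((a - rx_bar) * (b - ry_bar) for a, b in zip(rx, ry))
    sxx = sum((a - rx_bar) ** 2 for a in rx)
    syy = sum((b - ry_bar) ** 2 for b in ry)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical data: y rises with x, but not along a straight line.
x = [1, 2, 3, 4]
y = [1, 4, 9, 16]
print("rho =", spearman_rho(x, y))
```

Because only the ordering matters, any strictly increasing relationship gives rho = 1 even when Pearson's r on the raw values would be below 1.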