2. Basic Terms and Definitions
⚫Descriptive statistics are the methods used to
collect, summarise, present and analyse a set of data.
⚫Inferential statistics are the methods that use data
collected from a small group (a sample) to draw conclusions
about a larger group (the population).
Jayanth Jacob, DoMS, Anna University 2
3. Basic Terms - Contd
⚫A variable is a characteristic of an item or individual
⚫Data are different values associated with a variable.
⚫Variable values are meaningless unless their
corresponding variables have operational definitions.
These definitions are universally accepted meanings
that are clear for the analysis and the objective.
4. Basic Terms - Contd
⚫A population consists of all the items or individuals
about which we want to draw a conclusion.
⚫A parameter is a measure that describes a
characteristic of a population.
5. Basic Terms and Definitions
⚫A sample is the portion of a population selected for
analysis
⚫ A statistic is a measure that describes a characteristic
of a sample.
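The difference between a parameter and a statistic can be sketched in a few lines of Python. The population below is simulated (hypothetical heights in cm, not data from these slides), and the names `population_mean` and `sample_mean` are illustrative only.

```python
import random
import statistics

random.seed(0)  # fixed seed so the sketch is repeatable

# Hypothetical population: 10,000 simulated heights in cm (illustrative data only)
population = [random.gauss(170, 10) for _ in range(10_000)]

# Parameter: a measure computed from every item in the population
population_mean = statistics.fmean(population)

# Statistic: the same measure computed from a sample of that population
sample = random.sample(population, 100)
sample_mean = statistics.fmean(sample)

print(f"parameter (population mean): {population_mean:.2f}")
print(f"statistic (sample mean):     {sample_mean:.2f}")
```

The two numbers are close but not equal; the statistic varies from sample to sample, while the parameter is fixed.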
6. Basic Terms - Contd
⚫Degrees of Freedom (df)
◦ df = Total number of observations – Number of
estimated parameters
◦ The lower the df, the less reliable the resulting estimates
⚫Level of Significance (α)
◦ Probability of rejecting the null hypothesis H0 when it is actually true (Type I error)
◦ Generally chosen to be 0.05 (5%)
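As a small worked example of degrees of freedom, the sample variance divides by n – 1 rather than n, because one parameter (the mean) has already been estimated from the same data. The values below are made up for illustration.

```python
import statistics

data = [4, 8, 6, 5, 7]  # illustrative values, not from the slides

n = len(data)
mean = sum(data) / n          # one parameter estimated from the data
df = n - 1                    # observations minus estimated parameters

# Sum of squared deviations from the estimated mean, divided by df
sum_of_squares = sum((x - mean) ** 2 for x in data)
sample_variance = sum_of_squares / df

print(df, sample_variance)    # 4 2.5

# The standard library uses the same n - 1 divisor
assert abs(statistics.variance(data) - sample_variance) < 1e-12
```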
7. Steps in Research (DCOVA)
⚫Define the variables for the study
⚫Collect the data from appropriate sources
⚫Organize the data collected by developing tables
⚫Visualize the data through charts
⚫Analyze the data by examining the relevant charts and
tables to draw inferences.
8. Types of Variables
⚫Categorical/Qualitative variable
◦ Nominal variable
◦ Dichotomous variable
◦ Ordinal variable
⚫Continuous/Quantitative variable
◦ Interval variable
◦ Ratio variable
9. Types of Analysis
⚫Univariate analysis
◦ Analysis of a single variable, described in terms of
the applicable unit of analysis
⚫Bivariate analysis
◦ Analysis of two variables
◦ Covariance measures whether, and in which direction,
the two variables vary together
◦ Helps in testing simple hypotheses of
association
⚫Multivariate analysis
◦ Analysis of more than one statistical outcome
variable at a time
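For the bivariate case, a minimal sketch of the sample covariance is given below. The helper name `sample_covariance` and the study-hours data are hypothetical, chosen only to show the sign convention.

```python
def sample_covariance(x, y):
    """Sample covariance of two equal-length sequences (divides by n - 1)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    return sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / (n - 1)

# Illustrative data: hours studied and exam score for five students
hours = [1, 2, 3, 4, 5]
scores = [52, 55, 61, 64, 68]

cov = sample_covariance(hours, scores)
print(cov)  # 10.25
```

A positive value means the two variables tend to rise together, a negative value means one tends to fall as the other rises, and a value near zero suggests no linear relationship. The size of the covariance depends on the units of measurement, so it is not directly comparable across variable pairs.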
10. Variance
⚫Variance measures how far a set of numbers is spread
out. A variance of zero indicates that all the values are
identical. Variance is always non-negative: a small
variance indicates that the data points tend to be very
close to the mean (expected value) and hence to each
other, while a high variance indicates that the data
points are very spread out around the mean and from
each other.
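The three situations described above (zero, small and high variance) can be checked with a short sketch. It assumes the population variance (divisor n), and the data sets are invented for illustration.

```python
import statistics

identical = [5, 5, 5, 5]   # all values the same
tight = [4, 5, 5, 6]       # values close to the mean of 5
spread = [1, 3, 7, 9]      # values far from the mean of 5

# Population variance: mean of squared deviations from the mean
var_identical = statistics.pvariance(identical)
var_tight = statistics.pvariance(tight)
var_spread = statistics.pvariance(spread)

print(var_identical, var_tight, var_spread)
```

All three data sets have the same mean, so the variance alone separates them: it is zero for identical values and grows as the points move away from the mean.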
11. Homoscedasticity and Heteroscedasticity
⚫In statistics, a sequence of random variables is
homoscedastic if all random variables in the
sequence have the same finite variance. This is
also known as homogeneity of variance. The lack
of homogeneity of variance results in
heteroscedasticity.
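A crude descriptive check of homogeneity of variance is to compare the sample variances of the groups directly. The sketch below is only that, not a formal test; the helper `variance_ratio` and the three groups are hypothetical.

```python
import statistics

def variance_ratio(*groups):
    """Largest sample variance divided by the smallest (1.0 means equal)."""
    variances = [statistics.variance(g) for g in groups]
    return max(variances) / min(variances)

group_a = [10, 12, 11, 13, 9]
group_b = [20, 22, 21, 23, 19]   # shifted copy of group_a: same spread
group_c = [5, 15, 10, 25, 0]     # much more spread out

ratio_ab = variance_ratio(group_a, group_b)
ratio_ac = variance_ratio(group_a, group_c)
print(ratio_ab, ratio_ac)
```

A ratio close to 1 is consistent with homoscedasticity; a large ratio points towards heteroscedasticity and is a cue to run a formal test.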
12. p-value
◦ Probability, assuming H0 is true, of obtaining a test statistic at least
as extreme as the one that was actually observed
◦ p ≤ 0.01 very strong presumption against H0
◦ 0.01 < p ≤ 0.05 strong presumption against H0
◦ 0.05 < p ≤ 0.1 low presumption against H0
◦ p > 0.1 no presumption against H0
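As an illustration, the sketch below computes a two-sided p-value for a z statistic and maps it onto the four bands listed above. It assumes the test statistic is standard normal under H0; the function names are hypothetical.

```python
from statistics import NormalDist

def two_sided_p_value(z):
    """P(|Z| >= |z|) for a standard normal Z, i.e. computed assuming H0 is true."""
    return 2 * (1 - NormalDist().cdf(abs(z)))

def presumption_against_h0(p):
    """Map a p-value onto the bands used on this slide."""
    if p <= 0.01:
        return "very strong"
    if p <= 0.05:
        return "strong"
    if p <= 0.1:
        return "low"
    return "none"

for z in (0.5, 1.7, 2.2, 3.0):
    p = two_sided_p_value(z)
    print(f"z = {z}: p = {p:.4f}, presumption against H0: {presumption_against_h0(p)}")
```

Comparing p with the chosen level of significance α gives the decision rule: reject H0 when p ≤ α.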
13. Hypothesis
◦ A proposed explanation for a phenomenon
◦ Null Hypothesis H0
◦ Alternate Hypothesis H1
●One-tailed (directional)
●Two-tailed (non-directional)
ERROR TYPE    H0                    H1
I             True but rejected     False but accepted
II            False but accepted    True but rejected
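A Type I error can be made concrete with a small simulation: when H0 is true, a test at α = 0.05 should wrongly reject it in roughly 5% of repeated samples. The sketch assumes a two-sided z-test with known standard deviation 1; the sample size and number of trials are arbitrary choices.

```python
import random
from statistics import NormalDist, fmean

random.seed(1)  # fixed seed so the sketch is repeatable

alpha = 0.05
z_critical = NormalDist().inv_cdf(1 - alpha / 2)  # about 1.96
n, trials = 30, 2000

false_rejections = 0
for _ in range(trials):
    # H0 is true here: every sample really comes from a mean-0 population
    sample = [random.gauss(0, 1) for _ in range(n)]
    z = fmean(sample) * n ** 0.5  # z statistic with known sigma = 1
    if abs(z) > z_critical:
        false_rejections += 1     # H0 true but rejected: Type I error

type_i_rate = false_rejections / trials
print(f"observed Type I error rate: {type_i_rate:.3f}")
```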
14. Chi-Square Test
⚫Used to assess two types of comparisons
◦ Test of goodness of fit
◦ Test of independence
⚫Assumptions
◦ Simple random sample
◦ Sufficiently large sample size
◦ Adequate expected cell counts (commonly at least 5 per cell)
◦ Independence of observations
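The test of independence can be sketched by hand for a small contingency table. The 2×2 counts below are invented; the code computes the expected counts under independence, the chi-square statistic and its degrees of freedom, but not the p-value.

```python
# Illustrative 2x2 table of observed counts (rows: group, columns: outcome)
observed = [[30, 20],
            [20, 30]]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
grand_total = sum(row_totals)

# Expected count under independence: row total * column total / grand total
expected = [[r * c / grand_total for c in col_totals] for r in row_totals]

# Chi-square statistic: sum of (observed - expected)^2 / expected over all cells
chi_square = sum(
    (o - e) ** 2 / e
    for obs_row, exp_row in zip(observed, expected)
    for o, e in zip(obs_row, exp_row)
)
df = (len(row_totals) - 1) * (len(col_totals) - 1)
min_expected = min(min(row) for row in expected)

print(chi_square, df, min_expected)  # 4.0 1 25.0
```

The smallest expected count (25 here) is what the expected cell count assumption refers to.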
15. Analysis of Variance (ANOVA)
⚫Tests whether there are any significant differences
between the means of two or more independent groups
⚫Assumptions
◦ DV (dependent variable) – interval or ratio scale
◦ IV (independent variable) – categorical, with two or more independent groups
◦ Independence of observations
◦ No significant outliers
◦ DV approximately normally distributed within each group (Shapiro-Wilk test)
◦ Homogeneity of variance (Levene's test)
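One-way ANOVA compares the variation between group means with the variation within groups through an F statistic. A minimal hand computation is sketched below on made-up data; it stops at the F statistic and its degrees of freedom, and does not check the assumptions listed above.

```python
groups = [[1, 2, 3], [2, 3, 4], [5, 6, 7]]  # illustrative data, three groups

all_values = [x for g in groups for x in g]
grand_mean = sum(all_values) / len(all_values)
group_means = [sum(g) / len(g) for g in groups]

# Between-group sum of squares: group means around the grand mean
ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, group_means))
# Within-group sum of squares: observations around their own group mean
ss_within = sum((x - m) ** 2 for g, m in zip(groups, group_means) for x in g)

df_between = len(groups) - 1
df_within = len(all_values) - len(groups)

f_statistic = (ss_between / df_between) / (ss_within / df_within)
print(f_statistic, df_between, df_within)
```

A large F means the group means differ by more than the within-group scatter would explain; the p-value then comes from the F distribution with (df_between, df_within) degrees of freedom.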
16. Independent t-test
⚫Compares the means between two unrelated
groups on the same continuous, dependent
variable
⚫Assumptions
◦ DV (dependent variable) – interval or ratio scale
◦ IV (independent variable) – categorical, with exactly two independent groups
◦ Independence of observations
◦ No significant outliers
◦ DV approximately normally distributed within each group
◦ Homogeneity of variance (Levene's test)
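The test statistic for the independent t-test can be computed by hand when equal variances are assumed (the pooled form). The helper name `pooled_t_statistic` and the two groups are hypothetical.

```python
import statistics

def pooled_t_statistic(a, b):
    """t statistic and df for two independent groups, assuming equal variances."""
    n_a, n_b = len(a), len(b)
    df = n_a + n_b - 2
    # Pooled variance: weighted average of the two sample variances
    pooled_var = (
        (n_a - 1) * statistics.variance(a) + (n_b - 1) * statistics.variance(b)
    ) / df
    standard_error = (pooled_var * (1 / n_a + 1 / n_b)) ** 0.5
    t = (statistics.fmean(a) - statistics.fmean(b)) / standard_error
    return t, df

group_1 = [1, 2, 3, 4, 5]   # illustrative data
group_2 = [3, 4, 5, 6, 7]

t, df = pooled_t_statistic(group_1, group_2)
print(t, df)
```

Here the means differ by 2 and the standard error is 1, giving t = -2 with 8 degrees of freedom. If Levene's test suggests unequal variances, the pooled form is not appropriate and an unequal-variance version (Welch's t-test) is the usual alternative.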