Choosing the Right Statistical Techniques

19-Oct-18
Bodhiya Wijaya Mulya, S.Sos., M.M.
Research aims to discover phenomena that we assume actually exist
Whatever phenomenon we want to explain, we collect data from the real world
to test our hypotheses about it
Hypothesis testing must involve statistical techniques

Statistical Techniques
Descriptive Statistics
• collecting
• organizing
• summarizing
• presenting data
Inferential Statistics
• making inferences
• hypothesis testing
• determining relationships
• making predictions

Statistics that describe numerical data:
Frequency Distribution
Central Tendency
Variation
Data Shape
Frequency Distribution
• Easiest way to describe numerical data
• Can be presented in a table or graph
Rating          Frequency
Poor            2
Below Average   3
Average         5
Above Average   9
Excellent       1
Total           20

[Figure: bar chart "Toyota Quality Ratings"; x-axis: Rating, y-axis: Frequency]
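
The frequency table above can be rebuilt from raw responses with a few lines of Python. This is a minimal sketch: the `ratings` list is reconstructed from the table totals, and the function name `frequency_table` is only illustrative.

```python
from collections import Counter

# Rating categories in their natural (ordinal) order.
ORDER = ["Poor", "Below Average", "Average", "Above Average", "Excellent"]

def frequency_table(ratings):
    """Return (rating, frequency) pairs in ordinal order."""
    counts = Counter(ratings)
    return [(label, counts.get(label, 0)) for label in ORDER]

# Raw responses rebuilt from the table above (20 ratings in total).
ratings = (["Poor"] * 2 + ["Below Average"] * 3 + ["Average"] * 5
           + ["Above Average"] * 9 + ["Excellent"] * 1)

table = frequency_table(ratings)
total = sum(freq for _, freq in table)

# Print the table with a simple text bar next to each frequency.
for label, freq in table:
    print(f"{label:<14}{freq:>3}  {'#' * freq}")
print(f"{'Total':<14}{total:>3}")
```

The text bars play the role of the graph: the same distribution can be presented as a table or as a chart.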

Mode:
• The most common or most frequently occurring value
• Can be used with nominal, ordinal, interval, and ratio data
Median:
• The middle point of the data, also known as the 50th percentile or second quartile
• Can be used with ordinal, interval, and ratio data
Mean:
• The arithmetic average
• The most widely used measure of central tendency
• Can be used with interval and ratio data
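
The three measures can be computed with Python's standard `statistics` module. The data below are hypothetical values invented for illustration.

```python
import statistics

# Hypothetical interval-level scores, invented for illustration.
scores = [2, 3, 3, 5, 7, 8, 3, 9, 5]

mode_value = statistics.mode(scores)      # most frequent value
median_value = statistics.median(scores)  # middle value of the sorted data
mean_value = statistics.mean(scores)      # arithmetic average

# The mode also works for nominal data, where median and mean are undefined.
colours = ["red", "blue", "red", "green"]
nominal_mode = statistics.mode(colours)

print(mode_value, median_value, mean_value, nominal_mode)
```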

Measuring how data spread and vary
Range
• The difference between the maximum value and the minimum value
• Easiest way to measure variability
• Very sensitive to outliers
Percentile
• Divides the data into 100 equal parts
• Tells us the score at a specific position within the distribution
• Calculated in a similar way to the median
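
A short sketch of both measures, using hypothetical scores. It assumes Python 3.8 or later for `statistics.quantiles`; other software may interpolate percentiles slightly differently.

```python
import statistics

# Hypothetical scores, invented for illustration.
scores = [4, 8, 15, 16, 23, 42, 50, 61, 77, 95]

# Range: maximum minus minimum.
value_range = max(scores) - min(scores)

# A single outlier changes the range drastically.
with_outlier = scores + [500]
range_with_outlier = max(with_outlier) - min(with_outlier)

# Percentiles: 99 cut points that split the data into 100 parts.
percentiles = statistics.quantiles(scores, n=100)
p50 = percentiles[49]  # 50th percentile, the same value as the median
p90 = percentiles[89]  # 90th percentile

print(value_range, range_with_outlier, p50, p90)
```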

Standard Deviation
• Gives the “average distance” between all data points and the mean
• The most comprehensive and widely used measure of variation
• Can be used only with interval/ratio data
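
To make the “average distance” idea concrete, the sketch below computes the standard deviation by hand and compares it with the standard library. The scores are hypothetical. Whether the population formula (divide by n) or the sample formula (divide by n - 1) applies depends on the study; both are shown.

```python
import math
import statistics

# Hypothetical scores, invented for illustration.
scores = [2, 4, 4, 4, 5, 5, 7, 9]

mean = statistics.mean(scores)

# Population standard deviation: square root of the mean squared deviation.
squared_deviations = [(x - mean) ** 2 for x in scores]
population_sd = math.sqrt(sum(squared_deviations) / len(scores))

# Sample standard deviation divides by n - 1 instead of n.
sample_sd = statistics.stdev(scores)

print(mean, population_sd, round(sample_sd, 3))
```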
Skewness
Kurtosis

Skewness is an important measure of the shape of a distribution
Pearson coefficient of skewness (sk):
$$sk = \frac{3(\text{Mean} - \text{Median})}{\text{Standard Deviation}}$$
Symmetric:
• The mean and median are equal and the data values are evenly spread around these values
• Value of Pearson skewness is zero
• Skewness value reported by software is near zero
[Figure: three distribution shapes. Left-skewed: Mean < Median < Mode. Symmetric: Mean = Median = Mode. Right-skewed: Mode < Median < Mean.]

Positive/Right-Skewed:
• Mean will usually be greater than the median
• Value of Pearson skewness: 0 < sk < 3
• Skewness value reported by software is positive
Negative/Left-Skewed:
• Median will usually be greater than the mean
• Value of Pearson skewness: -3 < sk < 0
• Skewness value reported by software is negative
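
The Pearson coefficient can be checked on small data sets. This is a sketch under one assumption: the slide does not say which standard deviation is meant, so the sample standard deviation is used here. All data are hypothetical.

```python
import statistics

def pearson_skewness(data):
    """Pearson coefficient of skewness: 3 * (mean - median) / standard deviation.

    Assumption: the sample standard deviation (n - 1) is used.
    """
    mean = statistics.mean(data)
    median = statistics.median(data)
    sd = statistics.stdev(data)
    return 3 * (mean - median) / sd

# Hypothetical data sets, invented for illustration.
symmetric = [1, 2, 3, 4, 5]
right_skewed = [1, 2, 2, 3, 3, 3, 4, 20]      # long tail to the right
left_skewed = [-x for x in right_skewed]      # mirror image, tail to the left

sk_symmetric = pearson_skewness(symmetric)
sk_right = pearson_skewness(right_skewed)
sk_left = pearson_skewness(left_skewed)

print(sk_symmetric, round(sk_right, 3), round(sk_left, 3))
```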

Kurtosis provides information about the peakedness of the distribution
Zero or near-zero kurtosis values in software output indicate that the
peakedness is close to that of a normal distribution
Such a distribution is sometimes called mesokurtic

Positive kurtosis values indicate that the distribution is rather peaked
(many cases clustered in the centre)
Such a distribution is sometimes called leptokurtic
Negative kurtosis values indicate that the distribution is relatively flat
(too many cases in the extremes)
Such a distribution is sometimes called platykurtic
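
The sign of kurtosis can be illustrated with a simple moment-based calculation. This sketch uses the plain population formula for excess kurtosis (fourth moment divided by squared variance, minus 3). Statistical packages usually apply a small-sample correction, so their values will differ somewhat. The data are hypothetical.

```python
def excess_kurtosis(data):
    """Population excess kurtosis: m4 / m2**2 - 3 (zero for a normal distribution)."""
    n = len(data)
    mean = sum(data) / n
    m2 = sum((x - mean) ** 2 for x in data) / n  # second central moment (variance)
    m4 = sum((x - mean) ** 4 for x in data) / n  # fourth central moment
    return m4 / m2 ** 2 - 3

# Hypothetical data sets, invented for illustration.
peaked = [0, 5, 5, 5, 5, 5, 5, 5, 5, 10]  # most cases clustered in the centre
flat = list(range(1, 11))                 # cases spread evenly from 1 to 10

kurt_peaked = excess_kurtosis(peaked)  # positive: leptokurtic
kurt_flat = excess_kurtosis(flat)      # negative: platykurtic

print(round(kurt_peaked, 3), round(kurt_flat, 3))
```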

In large samples, skewness and kurtosis values could be inaccurate
It is therefore recommended to inspect the shape of the distribution using a
histogram
Introduction to Social Science Statistics
Normality tests are used to determine whether a data set is
well modeled by a normal distribution
Normality tests help us determine which statistical
technique we should use
Most inferential techniques require a normal
distribution
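
The slides do not name a specific normality test. As one possible illustration, the sketch below implements the Jarque-Bera test, which combines skewness and kurtosis into a single statistic; this choice is an assumption, and tests such as Shapiro-Wilk or Kolmogorov-Smirnov are common alternatives in statistical packages. The two data sets are generated deterministically for illustration only.

```python
import math
from statistics import NormalDist

def jarque_bera(data):
    """Jarque-Bera statistic and p-value (chi-square with 2 degrees of freedom)."""
    n = len(data)
    mean = sum(data) / n
    m2 = sum((x - mean) ** 2 for x in data) / n
    m3 = sum((x - mean) ** 3 for x in data) / n
    m4 = sum((x - mean) ** 4 for x in data) / n
    skew = m3 / m2 ** 1.5
    excess_kurt = m4 / m2 ** 2 - 3
    jb = n / 6 * (skew ** 2 + excess_kurt ** 2 / 4)
    # For 2 degrees of freedom the chi-square upper tail is exp(-x / 2).
    p_value = math.exp(-jb / 2)
    return jb, p_value

n = 200
# Evenly spaced quantiles of a normal distribution: close to normal by construction.
normal_like = [NormalDist().inv_cdf((i + 0.5) / n) for i in range(n)]
# Evenly spaced quantiles of an exponential distribution: strongly right-skewed.
skewed = [-math.log(1 - (i + 0.5) / n) for i in range(n)]

jb_normal, p_normal = jarque_bera(normal_like)
jb_skewed, p_skewed = jarque_bera(skewed)

print(f"normal-like: p = {p_normal:.3f}")
print(f"skewed:      p = {p_skewed:.3g}")
```

A p-value above the chosen significance level (commonly 0.05) gives no evidence against normality, which points toward a parametric technique; a small p-value points toward a non-parametric one.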

Test of Relationship
Test of Difference

Correlation
• Pearson or Spearman correlation can be used to explore the
strength of the relationship between two continuous variables
• Use Pearson when the data are normally distributed and Spearman when they are not
• This test gives us the direction and strength of the relationship
• Positive correlation: if one variable increases, the other
also increases
• Negative correlation: if one variable increases, the other
decreases
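
Both coefficients can be sketched in plain Python. Spearman's coefficient is computed here as the Pearson correlation of the ranks, with tied values given their average rank. The function names and data are illustrative only.

```python
def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5

def ranks(values):
    """Rank values from 1 upward; ties receive the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the whole run of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result

def spearman_rho(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson_r(ranks(x), ranks(y))

# Hypothetical data: y rises with x, but not in a straight line.
x = [1, 2, 3, 4, 5]
y = [1, 4, 9, 16, 25]

print(round(pearson_r(x, y), 3), round(spearman_rho(x, y), 3))
print(round(pearson_r(x, y[::-1]), 3))  # reversed y gives a negative correlation
```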
Correlation
• Lind, Marchal, and Wathen (2012, p. 465)
0 = No Correlation
0.10 – 0.49 = Weak Correlation
0.50 = Medium Correlation
0.51 – 0.99 = Strong Correlation
1 = Perfect Correlation

Correlation
• Cohen (1988, pp. 79-81) in Pallant (2007, p. 132)
0.10 – 0.29 = Weak Correlation
0.30 – 0.49 = Medium Correlation
0.50 – 1.00 = Strong Correlation
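
Cohen's guideline can be written as a small lookup. Two assumptions are made in this sketch: the sign is ignored, so -0.40 and 0.40 are treated as equally strong, and values below 0.10, which the guideline does not label, are reported as "negligible". The function name is illustrative.

```python
def cohen_strength(r):
    """Label a correlation coefficient using Cohen's (1988) guideline."""
    size = abs(r)  # assumption: strength depends on magnitude, not direction
    if size > 1:
        raise ValueError("a correlation coefficient must lie between -1 and 1")
    if size >= 0.50:
        return "strong"
    if size >= 0.30:
        return "medium"
    if size >= 0.10:
        return "weak"
    return "negligible"  # assumption: the guideline gives no label below 0.10

for r in (0.05, 0.25, -0.40, 0.75):
    print(r, cohen_strength(r))
```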
Partial Correlation
• An extension of correlation that allows us to control for another
variable
• This test gives us a more accurate picture of the
relationship between two variables
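
For a single control variable, the partial correlation can be computed directly from the three pairwise correlations. The sketch below uses the standard first-order formula; the input values are hypothetical.

```python
import math

def partial_correlation(r_xy, r_xz, r_yz):
    """Correlation between x and y after controlling for z (first-order formula)."""
    numerator = r_xy - r_xz * r_yz
    denominator = math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))
    return numerator / denominator

# Hypothetical case 1: z is unrelated to x and y, so controlling for it changes nothing.
unchanged = partial_correlation(0.50, 0.0, 0.0)

# Hypothetical case 2: the x-y correlation is fully explained by z.
explained = partial_correlation(0.60, 0.80, 0.75)

print(unchanged, round(explained, 6))
```

In the second case the raw correlation of 0.60 drops to about zero once z is controlled, which is the "more accurate picture" the slide refers to.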

Simple/Multiple Regression
• A sophisticated extension of correlation
• Used when we want to explore the ability of a set of
independent variables to predict one continuous dependent
variable
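
A minimal sketch of simple regression (one independent variable) using ordinary least squares. The data are hypothetical and chosen to lie exactly on a line so the result is easy to check; multiple regression would normally be run in a statistical package.

```python
def simple_regression(x, y):
    """Least-squares slope and intercept for predicting y from x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    slope = sxy / sxx
    intercept = my - slope * mx
    return slope, intercept

# Hypothetical data: hours studied (x) and exam score (y).
hours = [1, 2, 3, 4]
score = [3, 5, 7, 9]

slope, intercept = simple_regression(hours, score)
predicted = intercept + slope * 5  # predicted score for 5 hours

print(slope, intercept, predicted)
```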
Logistic Regression
• Similar to multiple regression but uses a categorical dependent
variable
• The dependent variable is preferably dichotomous
• It does not require the independent variables to be normally
distributed, linearly related, or of equal variance within each group
• Requires a minimum of 50 cases per independent variable

Factor Analysis
• Not designed to test hypotheses
• This technique is used extensively by researchers to develop
and evaluate tests or scales
• Reduces and refines a large number of question items into a
smaller, more manageable set
• Example: What are the factors that make people religious?
Discriminant Analysis
• Explores the predictive ability of a set of independent variables
on one categorical dependent variable
• Used to differentiate groups based on several
predictors (independent variables)
• The dependent variable can have more than two groups
• Example: Differentiating students who fail, pass, or stand out
based on mid-exam and paper scores

Canonical Correlation
• This test is used when we want to analyze the relationship between two sets of
variables
• Usually used for exploratory research
• Example: a set of variables measuring medical compliance
(willingness to buy drugs, to make return office visits, to use
drugs, to restrict activity) and a set of demographic
characteristics (educational level, religious affiliation,
income, medical insurance)
Structural Equation Modeling
• A very sophisticated technique
• Combines multiple regression with factor analysis
• Evaluates how well a model fits our data

T-Test
• Compares the mean scores of two groups
• Comes in independent-samples, paired-samples, and one-sample forms
for different situations
Mann-Whitney U Test and Wilcoxon Signed Rank Test
• Non-parametric versions of the independent-samples and paired-samples T-Test, respectively
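
The independent-samples and paired-samples t statistics can be sketched as follows. Assumptions: the independent version uses the pooled-variance (equal variances) form, and only the t value and degrees of freedom are computed; the p-value would come from a t table or a statistical package. All data are hypothetical.

```python
import math
import statistics

def independent_t(a, b):
    """Pooled-variance t statistic and degrees of freedom for two separate groups."""
    n1, n2 = len(a), len(b)
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * statistics.variance(a)
                  + (n2 - 1) * statistics.variance(b)) / df
    standard_error = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    t = (statistics.mean(a) - statistics.mean(b)) / standard_error
    return t, df

def paired_t(before, after):
    """t statistic and degrees of freedom for two measurements of the same group."""
    differences = [y - x for x, y in zip(before, after)]
    n = len(differences)
    standard_error = statistics.stdev(differences) / math.sqrt(n)
    return statistics.mean(differences) / standard_error, n - 1

# Hypothetical scores for two different groups.
t_ind, df_ind = independent_t([1, 2, 3, 4, 5], [3, 4, 5, 6, 7])

# Hypothetical scores for the same group measured twice.
t_pair, df_pair = paired_t([10, 12, 9, 11], [12, 14, 10, 14])

print(round(t_ind, 3), df_ind, round(t_pair, 3), df_pair)
```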
One Way Analysis of Variance (One Way ANOVA)
• Compares the mean scores of two or more groups
• In ANOVA, the independent variable is the “group” and the dependent
variable is the score
• Can be used with the same group (repeated measures) or with different groups
Kruskal-Wallis Test and Friedman Test
• Non-parametric versions of One Way ANOVA for different groups and for the same group, respectively
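
For different (between) groups, the F statistic of a one-way ANOVA is the ratio of between-group to within-group variance. The sketch below computes it by hand on hypothetical data; the repeated-measures version and the p-value are left to a statistical package.

```python
def one_way_anova_f(groups):
    """F statistic with its degrees of freedom for a between-groups one-way ANOVA."""
    all_scores = [x for group in groups for x in group]
    grand_mean = sum(all_scores) / len(all_scores)
    group_means = [sum(group) / len(group) for group in groups]

    # Variation of the group means around the grand mean.
    ss_between = sum(len(group) * (mean - grand_mean) ** 2
                     for group, mean in zip(groups, group_means))
    # Variation of the scores around their own group mean.
    ss_within = sum((x - mean) ** 2
                    for group, mean in zip(groups, group_means)
                    for x in group)

    df_between = len(groups) - 1
    df_within = len(all_scores) - len(groups)
    f = (ss_between / df_between) / (ss_within / df_within)
    return f, df_between, df_within

# Hypothetical scores for three different groups.
f_value, df_b, df_w = one_way_anova_f([[1, 2, 3], [2, 3, 4], [5, 6, 7]])
print(round(f_value, 3), df_b, df_w)
```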

Two Way Analysis of Variance (Two Way ANOVA)
• Compares mean scores across two independent variables
(grouping variables)
• The independent variables are categorical, for example comparing
GPA by student enrollment group and gender
• Can be used with the same group or with different groups
Multivariate Analysis of Variance (MANOVA)
• Compares groups on several dependent variables at once
• Can be one way or two way
• Example: Comparing leadership skills and GPA based on
students' religion and gender

Analysis of Covariance (ANCOVA)
• Compares mean scores while statistically controlling for a
covariate to reduce error
• Similar to partial correlation
• Can be one way, two way, or multivariate
• Example: Comparing leadership skills by gender
while controlling for GPA
Identify the research questions that you want to address
Find the questionnaire items and scales that you will
use to address these questions
Identify the nature of each of your variables
Draw a diagram for each of your research questions (if
possible)
Decide whether a parametric or a non-parametric
statistical technique is appropriate
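
The last step can be summarised as a small lookup that pairs each parametric technique from these slides with its non-parametric counterpart. This is a deliberately simplified sketch: the function name and parameters are hypothetical, it covers only correlation, t-tests, and one-way ANOVA, and a real decision also depends on the measurement level and other assumptions.

```python
def suggest_technique(goal, normal, groups=2, same_group=False):
    """Suggest a technique from the slides.

    goal:       "relationship" or "difference"
    normal:     True if the data are approximately normally distributed
    groups:     number of groups or conditions being compared
    same_group: True if the same participants are measured repeatedly
    """
    if goal == "relationship":
        return "Pearson correlation" if normal else "Spearman correlation"
    if goal != "difference":
        raise ValueError("goal must be 'relationship' or 'difference'")
    if groups == 2:
        if same_group:
            return "Paired-samples T-Test" if normal else "Wilcoxon Signed Rank Test"
        return "Independent-samples T-Test" if normal else "Mann-Whitney U Test"
    if same_group:
        return "One Way ANOVA (repeated measures)" if normal else "Friedman Test"
    return "One Way ANOVA" if normal else "Kruskal-Wallis Test"

print(suggest_technique("relationship", normal=False))
print(suggest_technique("difference", normal=True, groups=2))
print(suggest_technique("difference", normal=False, groups=3))
```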

• Field, A. (2009). Discovering statistics using SPSS (3rd ed.). London: Sage Publications Ltd.
• Lind, D.A., Marchal, W.G., and Wathen, S.A. (2012). Statistical techniques in business and economics. New York: McGraw-Hill.
• Pallant, J. (2007). SPSS survival manual. Berkshire: McGraw-Hill Open University Press.
• Tabachnick, B.G. and Fidell, L.S. (2013). Using multivariate statistics (6th ed.). Boston: Pearson Education.
