1) The document discusses key concepts in understanding research in second language learning, including formulating research questions and hypotheses, different types of sampling methods, and determining statistical significance.
2) It explains how to identify problems, operationalize variables, and develop testable hypotheses. The sampling methods covered are random sampling, systematic sampling, and stratified random sampling.
3) The passage also discusses statistical decisions like choosing appropriate tests, formulating statistical hypotheses, setting the alpha level, and determining whether to reject the null hypothesis based on p-values and significance levels. Key considerations like observed statistics, assumptions, and degrees of freedom are also addressed.
2. Chapter 9: Statistical Logic
Use a sample study to show how logic is applied in the three most commonly
reported families of statistical studies:
1. Explore the strength of relationships between variables;
2. Compare group means;
3. Compare frequencies.
3. Stage 1: Focusing the Study
– Noticing a problem
– Identifying and operationally defining the constructs to be
examined
– Formulating research questions and hypotheses.
4. Identifying a Problem
– These are often derived from classroom teaching experiences or
from reading the literature in the field.
– The researcher must first notice a problem that is worthy of a
solution: one whose answer will be useful to teachers, indirectly
in theory building or more directly in actual language teaching.
5. Operationalizing constructs as
variables
– The researcher must attempt to identify all the constructs that are pertinent
to solving the problem at hand.
– It often requires a great deal of hard thought because studies of language
learning and teaching are highly complex and have many variables.
– It can be frustrating because many constructs that are important to
language learning (e.g., students’ motivation and students’ ambition) may
be difficult to measure or operationalize as variables.
– The failure to identify or operationalize variables in the beginning could
threaten the entire logic and framework of a study.
6. Research Hypotheses
• A hypothesis is a precise, testable statement of what the researcher(s)
predict will be the outcome of the study.
• This usually involves proposing a possible relationship between two
variables: the independent variable (what the researcher changes)
and the dependent variable (what the researcher measures).
• In research, there is a convention that the hypothesis is written in
two forms, the null hypothesis, and the alternative hypothesis (called
the experimental hypothesis when the method of investigation is
an experiment).
7. Directional Hypothesis
• A one-tailed directional hypothesis predicts the nature of the effect of the
independent variable on the dependent variable. E.g., adults will correctly recall
more words than children.
• It can be formulated when there is a good theoretical reason, usually based on
previous research, to hypothesize that the relationship, if there is any, will be in
one direction or the other.
• If we had a correlational study, the directional hypothesis would state whether
we expect a positive or a negative correlation; that is, we state how the two
variables will be related to each other. The directional hypothesis can also state
a negative correlation, e.g., the higher the number of Facebook friends, the
lower the life satisfaction score.
8. Nondirectional Hypotheses
• A non-directional (or two-tailed) hypothesis simply states that there will be a
difference between the two groups/conditions but does not say which will be
greater/smaller, quicker/slower, etc. For example:
“There will be a difference between the number of cold symptoms experienced
in the following week after exposure to a virus for those participants who have
been sleep deprived for 24 hours compared with those who have not been sleep
deprived for 24 hours.”
• When the study is correlational, we simply state that variables will be correlated
but do not state whether the relationship will be positive or negative, e.g. there
will be a significant correlation between variable A and variable B.
• A two-tailed non-directional hypothesis predicts that the independent variable
will have an effect on the dependent variable, but the direction of the effect is not
specified.
9. Null Hypothesis
The null hypothesis states that there is no relationship between the two
variables being studied (one variable does not affect the other).
It states that results are due to chance and are not significant in terms of
supporting the idea being investigated.
A null hypothesis is a type of conjecture used in statistics that proposes that
there is no difference between certain characteristics of a population or data-
generating process.
Hypothesis testing provides a method to reject a null hypothesis within a
certain confidence level. (Null hypotheses cannot be proven, though.)
10. Null Hypothesis Example:
Participants who have been deprived of sleep for 24 hours will NOT have more cold
symptoms in the following week after exposure to a virus than participants who have
not been sleep deprived and any difference that does arise will be due to chance alone.
With a directional correlational hypothesis: There will NOT be a positive correlation
between the number of stressful life events experienced in the last year and the number
of coughs and colds suffered, whereby the more life events you have suffered, the more
coughs and colds you will have had.
With a non-directional or two-tailed hypothesis: There will be NO difference between
the number of cold symptoms experienced in the following week after exposure to a
virus for those participants who have been sleep deprived for 24 hours compared with
those who have not been sleep deprived for 24 hours.
For a correlational: there will be NO correlation between variable A and variable B.
11. Alternative Hypotheses
The alternative hypothesis describes the population parameters that the
sample data represent if the predicted relationship exists.
The alternative hypothesis (H₁) is the statement that the scores came
from different populations, that is, that the independent variable significantly
affected the dependent variable.
Null hypothesis (H₀): “There are no differences between the groups.” This is the
hypothesis that you are testing! Alternative hypothesis (Hₐ): “There are
effects/differences between the groups.” This is what you expect to find!
12. Alternative Hypotheses Example
In a two-tailed test, the null hypothesis states that the population mean equals a
given value, for example H₀: μ = 100. The alternative hypothesis states that the
population mean does not equal that same value: Hₐ: μ ≠ 100.
Two-Tailed Hypotheses
The Null Hypothesis (H₀) states that there is no difference, effect, or correlation
in the population. H₀ is assumed to be true unless there is enough evidence to
reject it; the burden of proof is on the researcher. The researcher’s hypothesis
(the Alternative Hypothesis, Hₐ) is only tested indirectly.
13. Stage 2: Sampling
Sampling is a process used in statistical analysis in which a
predetermined number of observations are taken from a
larger population. The methodology used to sample from a larger
population depends on the type of analysis being performed, but it
may include simple random sampling or systematic sampling.
1. Random Sampling
2. Systematic Sampling
3. Stratified Random Sampling
14. Random Sampling
With random sampling, every item within a population has an equal
probability of being chosen. It is the furthest removed from any
potential bias because there is no human judgement involved in
selecting the sample.
E.g., a random sample may include choosing the names of 25
employees out of a hat in a company of 250 employees.
The population is all 250 employees, and the sample is random
because each employee has an equal chance of being chosen.
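As a minimal sketch in Python (the 250-employee population and the labels are invented placeholders), this drawing-names-from-a-hat procedure is a one-liner:

    import random

    # Hypothetical population: 250 employees, labeled by number.
    population = [f"employee_{i}" for i in range(1, 251)]

    # Every employee has an equal chance of being among the 25 chosen,
    # just like drawing names out of a hat.
    sample = random.sample(population, k=25)
    print(sample)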
15. Systematic Sampling
– Systematic sampling begins at a random starting point within the population
and uses a fixed, periodic interval to select items for a sample. The sampling
interval is calculated as the population size divided by the sample size.
Despite the sample population being selected in advance, systematic
sampling is still considered random if the periodic interval is determined
beforehand and the starting point is random.
– Systematic sampling is simpler and more straightforward than random
sampling. It can also be more conducive to covering a wide study area. On
the other hand, systematic sampling introduces certain arbitrary parameters
in the data. This can cause over- or under-representation of particular
patterns.
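A sketch of the interval-based selection just described, with an invented numbered population; the interval is the population size divided by the sample size, and only the starting point is random:

    import random

    population = list(range(1, 251))           # e.g., 250 numbered items
    sample_size = 25
    interval = len(population) // sample_size  # sampling interval = N / n

    # A random start within the first interval keeps the procedure random
    # even though every later pick is fixed.
    start = random.randrange(interval)
    sample = population[start::interval][:sample_size]
    print(sample)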
16. Systematic Sampling
Because of its simplicity, systematic sampling is popular with
researchers.
Other advantages of this methodology include eliminating the
phenomenon of clustered selection and a low probability of
contaminating data.
Disadvantages include over- or under-representation of
particular patterns and a greater risk of data manipulation.
17. Stratified Random Sampling
Stratified random sampling allows researchers to obtain a sample
population that best represents the entire population being studied
by dividing it into subgroups called strata.
This method of statistical sampling, however, cannot be used in
every study design or with every data set.
Stratified random sampling differs from simple random sampling,
which involves the random selection of data from an entire
population, so each possible sample is equally likely to occur.
18. Stratified Random Sampling
– Stratified random sampling involves first dividing a population into
subpopulations and then applying random sampling methods to each
subpopulation to form a test group. A disadvantage is when researchers can't
classify every member of the population into a subgroup.
– This is different from simple random sampling, which involves the random
selection of data from the entire population so that each possible sample is
equally likely to occur. In contrast, stratified random sampling divides the
population into smaller groups, or strata, based on shared characteristics. A
random sample is taken from each stratum in direct proportion to the size of
the stratum compared to the population.
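A minimal sketch of that proportional allocation, assuming made-up strata based on a shared characteristic (the proficiency levels here are purely illustrative):

    import random

    # Hypothetical strata: subgroups that share a characteristic.
    strata = {
        "beginner":     [f"b{i}" for i in range(120)],
        "intermediate": [f"i{i}" for i in range(80)],
        "advanced":     [f"a{i}" for i in range(50)],
    }
    total = sum(len(members) for members in strata.values())
    sample_size = 25

    sample = []
    for name, members in strata.items():
        # Draw from each stratum in direct proportion to its size.
        n = round(sample_size * len(members) / total)
        sample.extend(random.sample(members, n))
    print(len(sample))  # 25: 12 beginner + 8 intermediate + 5 advanced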
19. Sampling Distribution
A sampling distribution is a statistic that is arrived at through
repeated sampling from a larger population.
It describes the range of possible outcomes of a statistic, such
as the mean or mode of some variable, as it truly exists in a
population.
The majority of data analyzed by researchers are actually drawn
from samples, and not populations.
20. Sampling Distribution
– A sampling distribution is a probability distribution of a statistic obtained
from a larger number of samples drawn from a specific population. The
sampling distribution of a given population is the distribution of
frequencies of a range of different outcomes that could possibly occur for
a statistic of a population.
– In statistics, a population is the entire pool from which a
statistical sample is drawn. A population may refer to an entire group of
people, objects, events, hospital visits, or measurements. A population
can thus be said to be an aggregate observation of subjects grouped
together by a common feature.
21. Stage 3: Setting up Statistical Decisions
• On the basis of research hypotheses, the researcher must:
1: Select the correct statistical procedures
2: Formulate statistical hypotheses
3: Select an alpha decision level
22. 1: Choosing the correct statistics
The choice will be based on clear thinking about:
1: How many variables there are;
2: Which variables are dependent, independent, moderator or control
variables;
3: Which scales (nominal, ordinal, or interval) are used for each.
Then the researcher will have to decide on the appropriateness of the
statistics that he/she uses.
23. Statistical Hypotheses
– We can formulate the following shorthand versions:
H₀: r = 0 (r equals zero)
H₁: r > 0 (r is greater than zero)
H₂: r < 0 (r is less than zero)
H₃: r ≠ 0 (r does not equal zero)
24. Statistical Hypotheses
• A population is the entire group that is of interest in a study
• A sample is a subgroup taken from that population to represent it.
• When calculations are made to describe a sample, they are called
statistics (Brown, p. 114).
• If the same calculations were actually done for the entire
population, they would be called parameters. These parameters
would give the best picture of what is going on in a given
population.
25. The Conceptual Differences
Between Statistics and Parameters
A statistic and a parameter are very similar. The difference between
a statistic and a parameter is that statistics describe a sample.
A parameter describes an entire population.
We use different notation for parameters and statistics:
The statistical symbols are usually Roman letters (e.g., X̄ and SD
for the sample mean and the standard deviation).
The parameters are symbolized by Greek letters (μ and σ for the
population mean and the standard deviation).
26. Alpha Decision Level
– Typically the researcher sets alpha at 0.05; however, there are
instances when the researcher may decide to use a more stringent
alpha level. An alpha of 0.05 indicates the researcher is willing to
take up to a 5% risk of making an error (a Type I error) when
deciding statistical significance; an alpha of 0.01 indicates a
willingness to take up to a 1% risk of a Type I error. A Type I error
occurs when a researcher rejects the null hypothesis when in fact it
is true in the population.
27. Do you reject or fail to reject the
null hypothesis?
The decision is made by examining the p level furnished by
the computer. Example: if the alpha level is set at .05,
inferential statistics with p levels of .05 or less are
statistically significant. When this is the case, H₀ is
rejected and Hₐ is supported.
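The decision rule itself is mechanical once alpha is set; a sketch with a hypothetical p level:

    alpha = 0.05   # criterion chosen before the analysis
    p = 0.032      # hypothetical p level furnished by the computer

    if p <= alpha:
        print("Reject H0: the result is statistically significant.")
    else:
        print("Fail to reject H0: the result is not significant.")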
28. How strong does the evidence
have to be to reject the Null?
–The researcher must set a criterion. This is the significance level, or alpha (α).
The conventional alpha level is .05; we are conservative about rejecting H₀.
When testing for significance, we calculate a test statistic. The test statistic
allows us to determine the probability of obtaining our results under the
assumption that H₀ is true. If this probability is small enough, then H₀ is
probably not true, so we should reject it.
29. Determining Significance
• If the probability is lower than our significance level, we reject H₀
(p < .05). If the probability is not lower than our significance level,
we fail to reject H₀ (p > .05). H₀ is never “accepted” or “proven.”
• Decide what p-value would be “too unlikely.” This threshold is called
the alpha level. When a sample statistic surpasses this level, the result
is said to be significant. Typical alpha levels are .05 and .01.
30. Determining Significance
Significance as a Probability Game
There are four possible outcomes in a significance test, based on two
dimensions: the researcher’s decision about H₀, and whether H₀ is really
true or false. The probability of each outcome can be determined.
31. Stage 4: Necessary Consideration
Four types of information must be found:
1: The observed statistics (those that were actually calculated)
2: Whether the assumptions underlying those statistics were met
3: The degrees of freedom involved for each statistic
4: The critical values for each statistic
32. Observed Statistics
Whether the results are a straightforward Pearson r or a
complicated-looking analysis of variance table based on F
ratios (Chapter 11, Brown), the researcher does a lot of
adding, subtracting, dividing, and multiplying to get there.
Often, he/she does so with a mainframe computer, using
statistical software such as the Statistical Package for the
Social Sciences (SPSS, 1975).
The result of the calculations will be observed statistics.
33. Assumptions
An assumption is a precondition that must be met for the particular
statistical analysis to be accurately applied.
E.g., one of the assumptions that underlies the proper application of the
Pearson product-moment correlation coefficient (r) is that each set of scores
is on an interval scale. The scales involved must not be nominal or ordinal.
If they are other than interval scales, other statistics may be applied.
If the data don’t meet the assumptions of the procedure perfectly, we will
have only a negligible amount of error in the inferences we draw.
34. Degrees of Freedom
Degrees of freedom is a mathematical concept used primarily in
statistics; it can be used to determine whether results are
significant.
The degrees of freedom (df) are simply n − 1.
– The degrees of freedom can be calculated to help ensure the
statistical validity of chi-square tests, t-tests and even the more
advanced F-tests. These tests are commonly used to compare
observed data with data that would be expected to be obtained
according to a specific hypothesis.
35. Degrees of Freedom
Because degrees of freedom calculations identify how many values
in the final calculation are allowed to vary, they can contribute to
the validity of an outcome. These calculations are dependent upon
the sample size, or observations, and the parameters to be
estimated, but generally, in statistics, degrees of freedom equal the
number of observations minus the number of parameters. This
means there are more degrees of freedom with a larger sample size.
36. Formula for Degrees of Freedom
df = N − 1, where N is the number of values in the data set (sample size). Take a
look at the sample computation below.
If there is a data set of 4 values (N = 4):
Call the data set X and create a list of its values. For this example, data set X
includes: 15, 30, 25, 10.
This data set has a mean, or average, of 20. Calculate the mean by adding the
values and dividing by N: (15 + 30 + 25 + 10) / 4 = 20.
Using the formula, the degrees of freedom are df = N − 1 = 4 − 1 = 3.
This indicates that, in this data set, three numbers have the freedom to vary as
long as the mean remains 20.
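The same arithmetic in Python; the last line shows why only three values are free to vary: once the mean and any three values are fixed, the fourth is determined.

    X = [15, 30, 25, 10]
    N = len(X)                # 4
    mean = sum(X) / N         # (15 + 30 + 25 + 10) / 4 = 20
    df = N - 1                # 3

    # With the mean fixed at 20, any three values force the fourth:
    fourth = mean * N - (15 + 30 + 25)   # = 10
    print(mean, df, fourth)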
37. Critical Value
The critical value is the value that the researcher might expect to
observe in the sample simply because of chance. In most cases, an
observed statistic must exceed the critical value to reject the null
hypothesis and thereby accept one of the alternative hypotheses.
This critical value will vary from study to study even for the same
statistic because the degrees of freedom will usually vary, largely
owing to differences in the size of samples.
38. Stage 5: Statistical Decisions
1: Hypothesis testing (not to be confused with the common
meaning of “testing”);
2: The careful interpretation of the results;
3: An awareness of the potential pitfalls for a particular statistical
test.
39. Hypothesis testing
o The formal procedure statisticians follow to determine whether a
certain hypothesis is valid or not is referred to as hypothesis testing.
o By using hypothesis testing, statisticians can validate statements such
as, 'This washer only needs one gallon of water to wash a large load of
clothes.'
Hypothesis testing is a 4-step process:
Step 1: Write the hypothesis.
Step 2: Create an analysis plan.
Step 3: Analyze the data.
Step 4: Interpret the results.
40. Interpretation of the Results
Whenever we encounter a research finding based on the interpretation of a p value
from a statistical test, whether we realize it or not, we are discussing the result of a
formal hypothesis test. This is true irrespective of whether the test involves
comparisons of means, regression results or other types of statistical tests. As
readers of research, it is important to understand the underlying principles of
hypothesis testing, so that when faced with statistical results, we reach the right
conclusions and make good decisions about which findings are robust enough to
be translated into clinical practice.
A result is statistically significant when the p-value is less than alpha. This
signifies a change was detected: that the default hypothesis can be rejected. If p-
value > alpha: Fail to reject the null hypothesis (i.e. not significant result). If p-
value <= alpha: Reject the null hypothesis (i.e. significant result).
44. Stage 4: Necessary Calculations
Observed statistics
Assumptions:
1: Independence
2: Normal Distribution
3: Interval Scales
4: Linear Relationship
Degrees of Freedom
Critical Values
45. Statistical Decisions
o Hypothesis Testing
o Interpretation of Results
o Potential Pitfalls
1: Restriction of Range
2: Skewness
3: Causality
46. Biserial Correlation
The biserial correlation is a correlation between, on the one hand, one
or more quantitative variables and, on the other hand, one or more
binary variables. It was introduced by Pearson (1909).
The biserial correlation coefficient varies between -1 and 1. A value
of 0 corresponds to no association (the means of the quantitative
variable for the two categories of the qualitative variable are
identical).
47. Biserial Correlation
– For the two-tailed test, the null H0 and alternative Ha hypotheses are as follows:
H0 : r = 0
Ha : r ≠ 0
– In the left one-tailed test, the following hypotheses are used:
H0 : r = 0
Ha : r < 0
– In the right one-tailed test, the following hypotheses are used:
H0 : r = 0
Ha : r > 0
48. Correlation Coefficient
Correlation coefficients are used to measure how strong a relationship is
between two variables.
There are several types of correlation coefficient, but the most popular
is Pearson’s. Pearson’s correlation (called Pearson’s R) is a correlation
coefficient commonly used in linear regression.
If you’re starting out in statistics, you’ll probably learn about
Pearson’s R first.
In fact, when anyone refers to the correlation coefficient, they are usually
talking about Pearson’s.
49. Correlation Coefficient
Correlation coefficient formulas are used to find how strong a relationship is
between data. The formulas return a value between -1 and 1, where:
1 indicates a strong positive relationship.
-1 indicates a strong negative relationship.
A result of zero indicates no relationship at all.
50. Correlation Coefficient
A correlation coefficient of 1 means that for every positive increase in one variable,
there is a positive increase of a fixed proportion in the other. For example, shoe sizes
go up in (almost) perfect correlation with foot length.
A correlation coefficient of -1 means that for every positive increase in one variable,
there is a decrease of a fixed proportion in the other. For example, the amount of gas
in a tank decreases in (almost) perfect correlation with speed.
Zero means that for every increase, there isn’t a positive or negative increase; the
two just aren’t related.
The absolute value of the correlation coefficient gives us the relationship strength.
The larger the number, the stronger the relationship. For example, |-.75| = .75, which
is a stronger relationship than .65.
52. Kendall Tau
Kendall’s Tau is a non-parametric measure of relationships between columns
of ranked data. The Tau correlation coefficient returns a value of 0 to 1, where:
0 is no relationship,
1 is a perfect relationship.
A quirk of this test is that it can also produce negative values (i.e. from -1 to 0).
Unlike a linear graph, a negative relationship doesn’t mean much with ranked
columns (other than you perhaps switched the columns around), so just remove
the negative sign when you’re interpreting Tau.
53. Kendall Tau
Several version’s of Tau exist.
Tau-A and Tau-B are usually used for square tables (with equal columns
and rows). Tau-B will adjust for tied ranks. Tau-C is usually used for
rectangular tables. For square tables, Tau-B and Tau-C are essentially the
same.
Most statistical packages have Tau-B built in, but you can use the following
formula to calculate it by hand:
Kendall’s Tau = (C – D / C + D)
Where C is the number of concordant pairs and D is the number
of discordant pairs.
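A sketch of that hand calculation for a small invented data set, checked against scipy.stats.kendalltau (scipy computes Tau-B, so the two agree here only because there are no tied ranks):

    from itertools import combinations
    from scipy.stats import kendalltau

    x = [1, 2, 3, 4, 5]
    y = [2, 1, 4, 3, 5]

    # Count concordant (C) and discordant (D) pairs.
    C = D = 0
    for (xi, yi), (xj, yj) in combinations(zip(x, y), 2):
        if (xi - xj) * (yi - yj) > 0:
            C += 1
        elif (xi - xj) * (yi - yj) < 0:
            D += 1

    print((C - D) / (C + D))    # 0.6 by hand
    print(kendalltau(x, y)[0])  # 0.6 from the library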
54. Kendall W
– Kendall's W (known as Kendall's coefficient of concordance) is a non-
parametric statistic. It is a normalization of the statistic of the Friedman test, and
can be used for assessing agreement among raters. Kendall's W ranges from 0 (no
agreement) to 1 (complete agreement).
E.g., a number of people have been asked to rank a list of political concerns, from
most important to least important. Kendall's W can be calculated from these data.
If the test statistic W is 1, then all the survey respondents have been unanimous,
and each respondent has assigned the same order to the list of concerns. If W is 0,
then there is no overall trend of agreement among the respondents, and their
responses may be regarded as essentially random. Intermediate values
of W indicate a greater or lesser degree of unanimity among the various responses.
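For data without tied ranks, W can be computed as 12S / (m²(n³ − n)), where m raters rank n items, Rⱼ is the rank sum for item j, and S is the sum of squared deviations of the Rⱼ from their mean. A sketch with invented rankings:

    # Rows = raters (m); columns = items being ranked (n).
    ranks = [
        [1, 2, 3, 4],   # rater 1's ordering of four concerns
        [1, 3, 2, 4],   # rater 2
        [2, 1, 3, 4],   # rater 3
    ]
    m, n = len(ranks), len(ranks[0])

    # Rank sum R_j for each item, then deviations from the mean sum.
    R = [sum(col) for col in zip(*ranks)]
    mean_R = sum(R) / n
    S = sum((r - mean_R) ** 2 for r in R)

    W = 12 * S / (m ** 2 * (n ** 3 - n))
    print(W)   # about 0.78: fairly strong agreement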
55. Multiple Regression
– Multiple regression is an extension of simple linear regression. It is used
when we want to predict the value of a variable based on the value of two or
more other variables. The variable we want to predict is called the dependent
variable (or sometimes, the outcome, target or criterion variable).
– Multiple regression also allows you to determine the overall fit (variance
explained) of the model and the relative contribution of each of the
predictors to the total variance explained. For example, you might want to
know how much of the variation in exam performance can be explained by
revision time, test anxiety, lecture attendance and gender "as a whole", but
also the "relative contribution" of each independent variable in explaining
the variance.
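A minimal least-squares sketch with numpy; the exam-performance numbers and the two predictors are fabricated purely to show the mechanics:

    import numpy as np

    # Hypothetical predictors: revision time (hours), lecture attendance (%).
    X = np.array([[10, 60], [15, 80], [8, 50], [20, 90], [12, 70]], float)
    y = np.array([55, 68, 48, 80, 62], float)   # exam scores

    # Add an intercept column and solve for the coefficients.
    A = np.column_stack([np.ones(len(X)), X])
    coeffs, *_ = np.linalg.lstsq(A, y, rcond=None)
    print(coeffs)   # [intercept, slope per predictor]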
56. Standard Partial Regression
– The standard partial regression coefficient is the number of standard
deviations that Y would change for every one standard deviation change
in X₁, if all the other X variables could be kept constant.
– When the purpose of multiple regression is prediction, the
important result is an equation
containing partial regression coefficients (slopes).
– The magnitude of the partial regression coefficient depends on
the unit used for each variable.
57. Standard Partial Regression
– When the purpose of multiple regression is understanding functional
relationships, the important result is an equation
containing standard partial regression coefficients, like this:
ŷ = a + b′₁x₁ + b′₂x₂ + …
where b′₁ is the standard partial regression coefficient of Y on X₁.
The magnitude of the standard partial regression coefficients tells you
something about the relative importance of different variables; X variables
with bigger standard partial regression coefficients have a stronger
relationship with the Y variable.
58. Linear Regression
– Linear regression, while a useful tool, has significant limits. As its name implies,
it can’t easily match any data set that is non-linear. It can only be used to make
predictions that fit within the range of the training data set. And, most
importantly here, it can only be fit to data sets with a single dependent
variable and a single independent variable.
– The general form of the equation for linear regression is: y = B * x + A
– where y is the dependent variable, x is the independent variable, and A and B are
coefficients dictating the equation. The difference between the equation for linear
regression and the equation for multiple regression is that the equation for
multiple regression must be able to handle multiple inputs, instead of only the
one input of linear regression.
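Fitting y = B * x + A by least squares, as a sketch with invented points:

    import numpy as np

    x = np.array([1, 2, 3, 4, 5], float)
    y = np.array([2.1, 3.9, 6.2, 8.0, 9.8], float)

    # polyfit with degree 1 returns [B, A] for y = B * x + A.
    B, A = np.polyfit(x, y, 1)
    print(B, A)

    # Predictions should stay within the training range (1 to 5 here).
    print(B * 3.5 + A)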
59. Heteroscedasticity
– Heteroscedasticity is a hard word to pronounce, but it doesn't need to be a
difficult concept to understand. Put simply, heteroscedasticity refers to the
circumstance in which the variability of a variable is unequal across the
range of values of a second variable that predicts it.
– A scatterplot of these variables will often create a cone-like shape, as the
scatter (or variability) of the dependent variable (DV) widens or narrows as
the value of the independent variable (IV) increases. The inverse of
heteroscedasticity is homoscedasticity, which indicates that a DV's
variability is equal across values of an IV.
60. Heteroscedasticity
[Figure: plot of random data showing heteroscedasticity]
In statistics, a vector of random variables is heteroscedastic (or
heteroskedastic) if the variability of the random disturbance is
different across elements of the vector. Variability could be
quantified by the variance or any other measure of statistical
dispersion. Heteroscedasticity is the absence of homoscedasticity. A
typical example is the set of observations of income in different
cities.
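A sketch that generates the cone-shaped pattern described above, with invented numbers: the noise standard deviation grows with x, so the spread of y widens across the range.

    import numpy as np

    rng = np.random.default_rng(0)
    x = np.linspace(1, 10, 200)
    y = 2 * x + rng.normal(0, x)   # error spread proportional to x

    # The spread at the high end is far larger than at the low end.
    print(y[:50].std(), y[-50:].std())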
61. Heteroscedasticity
• The existence of heteroscedasticity is a major concern in regression
analysis and the analysis of variance, as it invalidates statistical tests of
significance that assume that the modelling errors all have the same
variance. While the ordinary least squares estimator is still unbiased in the
presence of heteroscedasticity, it is inefficient and generalized least
squares should be used instead.
• Because heteroscedasticity concerns expectations of the second moment of
the errors, its presence is referred to as misspecification of the second order.
62. Multicollinearity
– Multicollinearity is the occurrence of high intercorrelations among two or
more independent variables in a multiple regression model. Multicollinearity
can lead to skewed or misleading results when a researcher or analyst
attempts to determine how well each independent variable can be used most
effectively to predict or understand the dependent variable in a statistical
model.
– Multicollinearity can lead to wider confidence intervals that produce less
reliable probabilities in terms of the effect of independent variables in a
model. That is, the statistical inferences from a model with multicollinearity
may not be dependable.
63. Multicollinearity: Key Takeaways
Multicollinearity is a statistical concept where independent
variables in a model are correlated.
Multicollinearity among independent variables will result in less
reliable statistical inferences.
It is better to use independent variables that are not correlated or
repetitive when building multiple regression models that use two
or more variables.
64. Data Transformation
In statistics:
Data transformation is the application of
a deterministic mathematical function to each point in a data set; that is, each
data point zᵢ is replaced with the transformed value yᵢ = f(zᵢ), where f is a
function.
Transforms are usually applied so that the data appear to more closely meet the
assumptions of a statistical inference procedure that is to be applied, or to
improve the interpretability or appearance of graphs.
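A sketch of one common choice of f, the log transform, often used to pull in a long right tail before applying procedures that assume roughly symmetric data (the values are invented):

    import numpy as np

    z = np.array([1, 3, 10, 30, 100, 300], float)   # strongly right-skewed
    y = np.log(z)                                   # y_i = f(z_i), f = log

    print(y)   # the spacing between points is now far more even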
65. Phi Coefficient
– The Phi Coefficient is a measure of association between two binary
variables (i.e. living/dead, black/white, success/failure). It is also called the Yule
phi or Mean Square Contingency Coefficient and is used for contingency
tables when:
– At least one variable is a nominal variable.
– Both variables are dichotomous variables.
[Image: a simple contingency table]
66. Phi Coefficient
– The phi coefficient is a symmetrical statistic, which means
the independent variable and dependent variables are interchangeable.
The interpretation for the phi coefficient is similar to the Pearson
Correlation Coefficient. The range is from -1 to 1, where:
0 is no relationship.
1 is a perfect positive relationship: most of your data falls along the
diagonal cells.
-1 is a perfect negative relationship: most of your data is not on the
diagonal.
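From the four cells of a 2×2 contingency table, phi = (ad − bc) / √((a+b)(c+d)(a+c)(b+d)); a sketch with made-up counts:

    import math

    # 2x2 table (hypothetical counts):  success  failure
    a, b = 20, 5    # group 1
    c, d = 8, 17    # group 2

    phi = (a * d - b * c) / math.sqrt((a + b) * (c + d) * (a + c) * (b + d))
    print(phi)   # about 0.48: a moderate positive association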
67. Point-Biserial Correlation
– A point-biserial correlation is used to measure the strength and direction of
the association that exists between one continuous variable and one
dichotomous variable. It is a special case of the Pearson’s product-moment
correlation, which is applied when you have two continuous variables,
whereas in this case one of the variables is measured on a dichotomous
scale.
– E.g., you could use a point-biserial correlation to determine whether there
is an association between salaries, measured in dollars, and gender (i.e.,
your continuous variable would be "salary" and your dichotomous
variable would be "gender", which has two categories: "males" and
"females").
68. Spearman rho
Spearman Rank Correlation
The Spearman rank correlation coefficient, rs, is
the nonparametric version of the Pearson correlation coefficient.
Your data must be ordinal, interval, or ratio. Spearman’s rho returns a
value from -1 to 1, where:
+1 = a perfect positive correlation between ranks
-1 = a perfect negative correlation between ranks
0 = no correlation between ranks.
69. Spearman rho: Spearman Rank Correlation
– The formula for the Spearman rank correlation coefficient when
there are no tied ranks is:
rₛ = 1 − (6 Σ dᵢ²) / (n(n² − 1))
where dᵢ is the difference between the two ranks of each observation
and n is the number of observations.
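A sketch computing rₛ from the no-ties formula and checking it against scipy (the data are invented and happen to have no tied ranks):

    from scipy.stats import spearmanr

    x = [86, 97, 99, 100, 101, 103]   # already in rank order: 1..6
    y = [2, 20, 28, 27, 50, 29]       # ranks: 1, 2, 4, 3, 6, 5

    # By hand: d_i are the rank differences; sum of d_i^2 is 4 here.
    rank_y = [1, 2, 4, 3, 6, 5]
    d2 = sum((i + 1 - r) ** 2 for i, r in enumerate(rank_y))
    n = len(x)
    print(1 - 6 * d2 / (n * (n * n - 1)))   # 0.8857...

    print(spearmanr(x, y)[0])               # same value from the library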
70. Tetrachoric Correlation
– Tetrachoric correlation is used to measure rater agreement for binary
data; Binary data is data with two possible answers—usually right or
wrong. The tetrachoric correlation estimates what the correlation
would be if measured on a continuous scale. It is used for a variety
of reasons including analysis of scores in Item Response Theory
(IRT) and converting comorbidity statistics to correlation coefficients.
This type of correlation has the advantage that it’s not affected by
the number of rating levels, or the marginal proportions for rating
levels.
71. Tetrachoric Correlation
– The term “tetrachoric correlation” comes from the tetrachoric
series, a numerical method used before the advent of
computers. While it’s more common to estimate correlations
with methods like maximum likelihood estimation, there is a
basic formula you can use.
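One such basic formula is the cosine-pi approximation, r ≈ cos(π / (1 + √(ad/bc))), where a, b, c, d are the cells of the 2×2 table; it only approximates the maximum-likelihood estimate. A sketch with invented rater-agreement counts:

    import math

    # 2x2 table of two raters' binary judgments (hypothetical counts):
    #                rater B: yes   rater B: no
    a, b = 30, 10    # rater A: yes
    c, d = 8, 32     # rater A: no

    # Cosine-pi approximation to the tetrachoric correlation.
    r_tet = math.cos(math.pi / (1 + math.sqrt((a * d) / (b * c))))
    print(r_tet)   # about 0.76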
72. The two main assumptions are:
The underlying variables come from a normal distribution. With only two
variables, this is impossible to test. You should, therefore, have a good
theoretical reason for using this particular type of correlation; in other words,
you might know that the type of data you are dealing with tends to follow a
normal distribution most of the time. Rating errors should also follow a
normal distribution.
There is a latent continuous scale underneath your binary data. In other
words, the trait you are measuring should be continuous and not discrete.
In addition, you may want to make sure that errors are independent between
raters and cases and the variance for errors is homogeneous across levels of
the independent variable.
73. Curvilinear
– Curvilinear regression analysis fits curves to data instead of the straight
lines you see in linear regression. Technically, it’s a catch-all term for any
regression that involves a curve: for example, quadratic regression and cubic
regression. About the only type that isn’t included in this catch-all definition
is simple linear regression.
74. Standard Error of Estimate (SEE)
– A linear regression gives us a best-fit line for a scatterplot of data. The standard
error of estimate (SEE) is one of the metrics that tells us about the fit of the line to
the data. The SEE is the standard deviation of the errors (or residuals).
– The standard error of estimate tells you approximately how large the prediction
errors (residuals) are for your data set, in the same units as Y. How well can you
predict Y? The answer is: to within about Se above or below.
– Since you usually want your forecasts and predictions to be as accurate as
possible, you would be glad to find a small value for Se. You can interpret Se as a
standard deviation in the sense that, if you have a normal distribution for the
prediction errors, then you will expect about two-thirds of the data points to fall
within a distance Se either above or below the regression line.
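A sketch computing Se as √(SSE / (n − 2)) after a simple straight-line fit (the data are invented; n − 2 reflects the two estimated coefficients):

    import numpy as np

    x = np.array([1, 2, 3, 4, 5, 6], float)
    y = np.array([2.2, 4.1, 5.8, 8.3, 9.9, 12.2], float)

    B, A = np.polyfit(x, y, 1)     # best-fit line
    residuals = y - (B * x + A)    # prediction errors

    see = np.sqrt(np.sum(residuals ** 2) / (len(x) - 2))
    print(see)   # typical size of a prediction error, in units of Y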
75. Pearson r
In statistics, the Pearson correlation coefficient (PCC) is also referred
to as Pearson’s r, the Pearson product-moment correlation
coefficient (PPMCC), or the bivariate correlation.
Pearson's correlation coefficient is the covariance of the two variables
divided by the product of their standard deviations. The form of the
definition involves a "product moment", that is, the mean (the first
moment about the origin) of the product of the mean-adjusted random
variables; hence the modifier product-moment in the name.
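That definition, the covariance of the two variables divided by the product of their standard deviations, as a sketch checked against numpy (data invented; the same ddof is used throughout so the two results match):

    import numpy as np

    x = np.array([2, 4, 6, 8, 10], float)
    y = np.array([1, 3, 7, 9, 12], float)

    # r = cov(X, Y) / (sd_X * sd_Y)
    r = np.cov(x, y)[0, 1] / (np.std(x, ddof=1) * np.std(y, ddof=1))
    print(r, np.corrcoef(x, y)[0, 1])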
76. Pearson r
The correlation coefficient ranges from −1 to 1. A value of 1 implies that a linear
equation describes the relationship between X and Y perfectly, with all data points
lying on a line for which Y increases as X increases. A value of −1 implies that all
data points lie on a line for which Y decreases as X increases. A value of 0 implies
that there is no linear correlation between the variables.
More generally, note that (Xᵢ − X̄)(Yᵢ − Ȳ) is positive if and only if Xᵢ and Yᵢ lie on
the same side of their respective means. Thus the correlation coefficient is positive
if Xᵢ and Yᵢ tend to be simultaneously greater than, or simultaneously less than, their
respective means. The correlation coefficient is negative (anti-correlation) if Xᵢ and
Yᵢ tend to lie on opposite sides of their respective means. Moreover, the stronger
either tendency is, the larger the absolute value of the correlation coefficient.
77. One-Tailed Decisions
[Figure: a one-tailed test, shaded on the left tail; its mirror image, shaded on
the right, would also be a one-tailed test.]
78. One-Tailed Decisions
A one-tailed test requires a smaller sample size to detect the same
effect with the same power.
A one-tailed test is a statistical test in which the critical area of a
distribution is one-sided, so that it is either greater than or less than
a certain value, but not both. If the sample being tested falls into
the one-sided critical area, the alternative hypothesis will be
accepted instead of the null hypothesis.
A one-tailed test is also known as a directional hypothesis or
directional test.
79. One-Tailed Decisions
A one-tailed test is a statistical hypothesis test set up to show that the
sample mean would be higher or lower than the population mean, but not
both.
When using a one-tailed test, the analyst is testing for the possibility of
the relationship in one direction of interest, and completely disregarding
the possibility of a relationship in another direction.
Before running a one-tailed test, the analyst must set up a null hypothesis
and an alternative hypothesis and establish a probability value (p-value).
80. Two-Tailed Decisions
– In statistics, a two-tailed test is a method in which the critical area
of a distribution is two-sided and tests whether a sample is greater
than or less than a certain range of values. It is used in null-
hypothesis testing and testing for statistical significance. If the
sample being tested falls into either of the critical areas, the
alternative hypothesis is accepted instead of the null hypothesis.
The two-tailed test gets its name from testing the area under both
tails of a normal distribution, although the test can be used in other
non-normal distributions.
81. Two-Tailed Decisions
As noted above, if the sample being tested falls into either of the critical
areas, the alternative hypothesis is accepted instead of the null hypothesis.
By convention, two-tailed tests are used to determine significance at the
5% level, meaning each side of the distribution is cut at 2.5%.
82. Summary
State your hypotheses: you are not attempting to prove your alternative
hypotheses. You are testing the null hypothesis. If you reject the null hypothesis,
then you are left with support for the alternative(s).
Set your decision criteria: your alpha level will tell you what to decide, either to
reject the null hypothesis or to fail to reject it.
Describe the data you collected from the sample.
Inferential statistics: making inferences about the population from the data
collected from the sample; generalize results from the study to the population.