Kinds of variable
The independent variable:
It is the factor that is measured, manipulated or selected by the experimenter to determine its
relationship to an observed phenomenon. It is a stimulus variable or input that operates within a person
or within his environment to affect behavior. An independent variable may be called a factor, and its
variations are called levels.
The dependent variable:
The dependent variable is a response variable or output. The dependent variable is the factor that is
observed and measured to determine the effect of the independent variable; it is the factor that appears,
disappears, or varies as the researcher introduces, removes, or varies the independent variables.
The moderator variable:
It is the factor that is measured, manipulated or selected by the experimenter to discover whether it
modifies the relationship of the independent variable to an observed phenomenon. The term moderator
variable describes a special type of independent variable, a secondary independent variable selected to
determine if it affects the relationship between the study’s primary independent variable and its
dependent variable.
Control variables are factors controlled by the experimenter to cancel out or neutralize any effect they
might otherwise have on the observed phenomena. A single study cannot examine all of the variables in a
situation (situational variables) or in a person (dispositional variables); some must be neutralized to
guarantee that they will not exert differential or moderating effects on the relationship between the
independent variables and dependent variables.
An intervening variable is the factor that theoretically affects observed phenomena but cannot be seen,
measured, or manipulated; its effects must be inferred from the effects of the independent and
moderator variables on the observed phenomena.
Consider the hypothesis:
‘Among students of the same age and intelligence, skill performance is directly related to the number of
practice trials, the relationship being particularly strong among boys, but also holding, though less
directly, among girls’. This hypothesis, which indicates that practice increases learning, involves several
variables:
Independent variable: number of practice trials
Dependent variable: skill performance
Control variable: age, intelligence
Moderator variable: gender
Intervening variable: learning
[Diagram: causes (moderator variables), relationship (intervening variables), effects (dependent variables)]
Steps in data processing
Raw data: interviews, questionnaires, observation, secondary sources
Editing
Coding: developing a code book, pre-testing the code book, coding the data, verifying
Analysis: developing a frame of analysis, analysis (manual)
There are two types of data: qualitative data and quantitative data.
Qualitative data is further divided into nominal data and ordinal data. Quantitative data is further
divided into interval data and ratio data.
Nominal data:
As is obvious from the name, nominal means “to give names”. In the social sciences, qualitative
characteristics cannot be measured directly. To work with this type of data, it is named and categorized.
This named or categorized data is called nominal data.
Ordinal data:
The term ordinal means “ordered or arranged”. This data is arranged in order, categorizing individuals
as more than or less than one another. After the data has been sorted into nominal categories, it is
ordered or arranged to get the desired result. Although ordinal measurement may require more difficult
processes, it gives more informative and more precise data.
Interval data:
It is data whose scores are expressed in units of equal-appearing magnitude. Interval data can be added
and subtracted but cannot meaningfully be multiplied or divided.
Ratio data:
It has a true zero value, that is, a point that represents the complete absence of the measured
characteristic, so ratios are comparable at different points. It is much more frequently used in the
physical sciences than in the behavioral sciences.
For example, 9 ohms indicates three times the resistance of 3 ohms, while 6 ohms stands in the same
ratio to 2 ohms.
Listed below are the scores of a group of students on a mid semester English test.
How many students received a score of 36?
Did most of the students receive a score above 50?
It is difficult to make any sense of unordered data. We must put it into some sort of order. One of the
most common ways to do this is to prepare a frequency distribution. This is done by listing the scores
in rank order from high to low, with tallies to indicate the number of subjects receiving each score.
Often the scores in a distribution are grouped into intervals. This results in a grouped frequency
distribution.
Example of a frequency distribution
Raw score frequency
N = 20
Table of a Grouped Frequency Distribution
Intervals of five    Frequency
60 – 64              4
55 – 59              3
50 – 54              3
45 – 49              0
40 – 44              0
35 – 39              5
30 – 34              5
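The tallying described above can be sketched in a few lines of Python. This is only an illustration: the scores below are hypothetical (they are not the English test scores), and the function names are made up for this example.

```python
from collections import Counter

# Hypothetical scores for illustration only; not the test scores from the notes.
scores = [64, 61, 58, 58, 52, 50, 38, 36, 36, 33, 31, 31]


def frequency_distribution(values):
    """Return (score, frequency) pairs in rank order from high to low."""
    counts = Counter(values)
    return sorted(counts.items(), reverse=True)


def grouped_frequency_distribution(values, width=5):
    """Count scores per interval of the given width, highest interval first.

    Intervals start at multiples of `width` (30-34, 35-39, ... for width 5).
    Empty intervals between the lowest and highest score are listed with 0.
    """
    counts = Counter((v // width) * width for v in values)
    lowest, highest = min(counts), max(counts)
    return [
        ((start, start + width - 1), counts.get(start, 0))
        for start in range(highest, lowest - 1, -width)
    ]


# Ungrouped distribution with tallies
for score, freq in frequency_distribution(scores):
    print(f"{score:>3}  {'|' * freq:<5} {freq}")

# Grouped distribution in intervals of five
for (low, high), freq in grouped_frequency_distribution(scores):
    print(f"{low} - {high}: {freq}")
```

Grouping loses the individual scores but makes the overall shape of the distribution easier to see, which is the trade-off the grouped table illustrates.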
Measures of central tendency:
A measure of central tendency (an average) enables a researcher to summarize the data in a frequency
distribution with a single number. It is of three kinds: mode, median and mean.
Mode
The mode is the most frequent score in a distribution, that is, the score attained by more students than
any other score. For example, in the distribution 25, 20, 19, 17, 16, 16, 13, 12, the mode is 16.
What about this distribution?
25, 20, 19, 19, 17, 16, 16, 12, and 11.
This distribution has two modes, 19 and 16. Hence it is called a bimodal distribution. The mode does
not tell us very much about a distribution. Therefore, it is not often used in educational research.
Median (the midpoint)
The median is the point below and above which 50% of the scores in a distribution fall. In the
distribution 1, 2, 3, 4, 5 the median is 3. If the number of scores in a distribution is even, then the
median is the point halfway between the two middlemost scores. In a distribution;
The median is 7.
Mean
It is determined by adding up all of the scores and then dividing this sum by the total number of scores:
$\bar{X} = \frac{\sum X}{n}$
where $\sum X$ represents the sum of all raw scores, $n$ represents the total number of scores and
$\bar{X}$ represents the mean.
All the averages summarize the data in a single value. But sometimes the researcher cannot get the
required results from the data by using an average alone. An average describes the overall behavior of
the data with a single number, which can sometimes lead to confusion and ambiguity. The researcher then
needs measures that describe the spread or variability that exists within a distribution. There are
several ways to calculate the position of the data or its deviation.
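As a quick check on the three averages described above, here is a minimal sketch using Python's standard `statistics` module (`multimode` assumes Python 3.8 or later). The two mode examples and the distribution 1, 2, 3, 4, 5 come from the notes; the even-sized list is a hypothetical example added for illustration.

```python
import statistics

# Distributions taken from the mode examples in the notes
unimodal = [25, 20, 19, 17, 16, 16, 13, 12]
bimodal = [25, 20, 19, 19, 17, 16, 16, 12, 11]

# multimode returns every score that shares the highest frequency
modes_unimodal = statistics.multimode(unimodal)
modes_bimodal = statistics.multimode(bimodal)

# Median: the middle score for an odd count, the halfway point for an even count
median_odd = statistics.median([1, 2, 3, 4, 5])
median_even = statistics.median([1, 2, 3, 4])  # hypothetical even-sized example

# Mean: the sum of the scores divided by the number of scores
mean_value = statistics.mean([1, 2, 3, 4, 5])

print("modes:", modes_unimodal, modes_bimodal)
print("medians:", median_odd, median_even)
print("mean:", mean_value)
```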
1. Measures of Data Variability:
Knowing the central tendencies (mean, median, and mode) isn’t enough. We also need a method for
determining how closely the data are clustered around their center point(s).
The most typical measures of data variability:
– Range,
– Variance, and
– Standard Deviation.
Range
• Simplest measure of variability.
• Calculated by subtracting the smallest measurement from the largest measurement.
• It is not a good measure of variability, i.e. if two ranges are the same, it does not mean that the
spread is the same.
Variance
• It is the sum of the squared deviations from the mean divided by (n-1) for a sample, and is
denoted by $s^2$.
• Similarly, for a population it is the sum of the squared deviations from the mean divided by N, and is
denoted by $\sigma^2$.
Note: Deviations are squared to remove the effect of negative differences.
Standard Deviation
• While variance does not provide a useful metric (i.e. “units squared”), taking the positive
square root of the variance provides a metric which is in the same units as the data itself (i.e. “units”).
– Sample Standard Deviation: $s$
– Population Standard Deviation: $\sigma$
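The range, variance and standard deviation described above can be sketched with Python's standard `statistics` module. The data set and the function name `variability` are hypothetical, chosen only so that the arithmetic is easy to follow by hand.

```python
import statistics


def variability(values):
    """Return the common measures of variability for a list of scores."""
    return {
        "range": max(values) - min(values),                    # largest minus smallest
        "sample_variance": statistics.variance(values),         # divides by n - 1
        "population_variance": statistics.pvariance(values),    # divides by N
        "sample_sd": statistics.stdev(values),                  # square root of sample variance
        "population_sd": statistics.pstdev(values),             # square root of population variance
    }


# Hypothetical data for illustration: mean is 5, squared deviations sum to 32
data = [2, 4, 4, 4, 5, 5, 7, 9]

for name, value in variability(data).items():
    print(f"{name}: {value:.4f}")
```

Note that the sample figures are always slightly larger than the population figures for the same data, because the sum of squared deviations is divided by n-1 rather than N.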
Application of mean & standard deviation to observe the behavior of the data
• Data can be standardized using mean & standard deviation. Thus, for a single data set,
variability can be discussed in terms of how many members of the data set fall within one, two,
three, or more standard deviations of the mean.
Standard scores:
A standard score uses a common scale to indicate how an individual compares to other individuals in a
group. These scores are particularly helpful in comparing an individual’s relative position. The two
standard scores most frequently used in educational research are:
1. Z-score
2. T-score
Z – Score
The simplest form of standard score is the Z-score. It expresses how far a raw score is from the mean
in standard deviation units. A big advantage of Z-scores is that they allow raw scores on different
tests to be compared.
Researchers use the following formula to convert a raw score into a Z-score:
$Z = \frac{\text{raw score} - \text{mean}}{\text{standard deviation}}$
For example, a student received raw scores of 60 on a biology test and 80 on a chemistry test. A naïve
observer might be inclined to infer that the student was doing better in chemistry than in biology. But
this might be unwise, for how well the student is doing comparatively cannot be determined until we know
the mean and standard deviation for each distribution of scores. Let us suppose the mean is 50 in
biology and 90 in chemistry. Also assume the standard deviation in biology is 5 and in
chemistry is 10.
What does this tell us?
Comparison of raw scores and Z-scores on the two tests:
Test        Raw score   Mean   SD   Z score   Percentile rank
Biology     60          50     5    2         98
Chemistry   80          90     10   -1        16
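The figures in the table can be reproduced with a short sketch. The Z-score is the raw score minus the mean, divided by the standard deviation. The percentile ranks here are an assumption: they are computed from the standard normal curve (via `statistics.NormalDist`, Python 3.8 or later), which presumes the test scores are normally distributed. The notes do not state how the ranks were obtained.

```python
from statistics import NormalDist


def z_score(raw, mean, sd):
    """Distance of a raw score from the mean, in standard deviation units."""
    return (raw - mean) / sd


def percentile_rank(z):
    """Percentage of scores falling below z, ASSUMING a normal distribution."""
    return NormalDist().cdf(z) * 100


# (raw score, mean, standard deviation) from the biology/chemistry example
tests = {"Biology": (60, 50, 5), "Chemistry": (80, 90, 10)}

for name, (raw, mean, sd) in tests.items():
    z = z_score(raw, mean, sd)
    print(f"{name}: z = {z:+.1f}, percentile rank = {percentile_rank(z):.0f}")
```

So although the chemistry raw score is higher, the student stands two standard deviations above the mean in biology and one standard deviation below the mean in chemistry.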
Probability and Z score.
Probability refers to the likelihood of an event occurring, expressed as a percentage stated in decimal
form. For example, if an event will occur 25 percent of the time, this event can be said to have a
probability of .25.
There are two kinds of hypothesis: one is the predicted outcome of the study, called the research
hypothesis, whereas the null hypothesis is the assumption that there is no relationship between the
variables in the population.
Correlational analysis:
It shows the existing relationship between the variables, with no manipulation of variables. It is also
used to analyze data containing two variables as well as examine the reliability and validity of the data
Types of correlation:
Positive correlation (when one variable increases as the other increases)
Zero correlation (when there is no correlation between the variables)
Negative correlation (when one variable decreases as the other increases)
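The three types of correlation can be illustrated with a small sketch. The notes do not name a specific coefficient, so Pearson's product-moment coefficient is used here as an assumption; all data values are hypothetical and were picked to give clean results.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equally long lists."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two lists of equal length with at least 2 values")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of cross-products of deviations, and sums of squared deviations
    cross = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    if ss_x == 0 or ss_y == 0:
        raise ValueError("correlation is undefined when a variable is constant")
    return cross / math.sqrt(ss_x * ss_y)


# Hypothetical data for illustration
practice_trials = [1, 2, 3, 4, 5]
rises_together = [2, 4, 6, 8, 10]    # positive correlation
falls_as_rises = [10, 8, 6, 4, 2]    # negative correlation
unrelated = [3, 1, 4, 1, 3]          # zero correlation

for label, ys in [("positive", rises_together),
                  ("negative", falls_as_rises),
                  ("zero", unrelated)]:
    print(f"{label}: r = {pearson_r(practice_trials, ys):+.2f}")
```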
Significance of correlation:
When the researcher wants to make inferences to the population, he will have to examine the statistical
significance of the correlations. Statistical significance can be determined if the correlations have been
obtained from a randomly selected sample. The significance of a correlation depends on:
– the size of the correlation
– the size of the sample
The level of significance is very important since it relates directly to whether the null hypothesis is
rejected or not.
It is used to find out the relationship between more than two variables as in correlation analysis.
There are two ways;
Through multiple regression it is possible to examine the relationship and predictive power of one or
more independent variables with the dependent variable. It shows which variables are significant in
their contribution to explaining the variance in the dependent variable and how much they contribute.
Which contribution of variables distinguishes between one or more categories of dependent variables?
Here the independent variables are not related to a dependent variable as in regression; rather, the
analysis operates within a number of independent variables without any need for a dependent variable.
In factor analysis the interrelationships between and among the variables of the data are examined in an
attempt to find out how many independent dimensions can be identified in the data. It thus provides
information on the characteristics of the variables. This type of analysis is based on the assumption
that variables measuring the same factor will be highly related, whereas variables measuring different
factors will have low correlations with one another.
As we know, different designs call for different methods of analysis. A statistical technique
appropriate for quantitative data will generally be inappropriate for categorical data.
Types of Inference Techniques:
There are two types of inference techniques that a researcher uses.
Parametric technique
Non-Parametric technique
Parametric technique:
It is most appropriate for interval data. It makes various kinds of assumptions about the nature of the
population from which the samples involved in the research study are drawn. Parametric techniques are
generally more powerful than non-parametric techniques because they are more likely to reveal a true
difference or relationship if one really exists.
Non-Parametric technique:
It is most appropriate for nominal and ordinal data. It makes few assumptions about the nature of
the population from which the samples are taken.
T-test (Parametric Technique)
It is used to compare the means of two groups. The t-test is used to determine the probability that
the difference between the groups of subjects is a real difference rather than a chance variation in the
data.
T-test for independent means:
It is used to compare the mean scores of two different, independent groups.
T-test for correlated means:
It is used to compare the mean scores of the same group before and after a treatment of some sort is
given, to see if any observed gain is significant, or when the research design involves two matched
groups.
The result of the t-test provides the researcher with a t-value.
A researcher is comparing the performance of two randomly selected groups learning French by
two different methods. The experimental group learns with the aid of a computer while the control group
is taught by the teacher. The researcher investigates the effects of the computer practice on students’
achievement in French. After three months both groups take an achievement test.
The researcher uses a t-test to examine whether there are differences in the achievements of the two
groups.
To gain insight into the data through descriptive statistics, the researcher first needs the mean
($\bar{X}$), the standard deviation (SD) and the sample size (N) of the data. These must be available for
both the experimental group and the control group.
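A minimal sketch of the t-test for independent means follows, built from exactly the three descriptive statistics mentioned above (mean, standard deviation via the variance, and sample size). It assumes the classic pooled-variance form of the test, which presumes the two groups have roughly equal variances; the notes do not specify a variant. The achievement scores and the function name are hypothetical.

```python
import math
import statistics


def independent_t(group_a, group_b):
    """Return (t, degrees of freedom) for two independent groups.

    Uses the pooled-variance (equal variances assumed) form of the t-test.
    """
    n_a, n_b = len(group_a), len(group_b)
    mean_a, mean_b = statistics.mean(group_a), statistics.mean(group_b)
    var_a, var_b = statistics.variance(group_a), statistics.variance(group_b)

    df = n_a + n_b - 2
    # Weighted average of the two sample variances
    pooled_var = ((n_a - 1) * var_a + (n_b - 1) * var_b) / df
    # Standard error of the difference between the two means
    std_error = math.sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return (mean_a - mean_b) / std_error, df


# Hypothetical French achievement scores, for illustration only
computer_group = [78, 85, 69, 91, 74, 88]
teacher_group = [72, 80, 65, 77, 70, 81]

t_value, df = independent_t(computer_group, teacher_group)
print(f"t = {t_value:.3f} with {df} degrees of freedom")
```

The resulting t-value is then compared with the critical value for the chosen level of significance and degrees of freedom to decide whether the null hypothesis can be rejected.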
ANOVA (one-way analysis of variance):
One-way analysis of variance is used to examine the differences among more than two groups.
The analysis is performed on the variances of the groups, focusing on whether the variability between
the groups is greater than the variability within the groups. The F value is the ratio of the
between-group variance to the within-group variance.
$F = \frac{\text{between-group variance}}{\text{within-group variance}}$
If the variability between the groups is sufficiently greater than the variability within the groups, then
the F value is significant and the researcher can reject the null hypothesis. If the situation is the
inverse, then the F value is not significant.
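The F ratio described above can be sketched as follows. The groups are hypothetical and the function name is made up for this illustration; the sketch only computes F and its degrees of freedom, and leaves the comparison against a critical value to an F table.

```python
def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of groups of scores."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    group_means = [sum(g) / len(g) for g in groups]

    # Between-group sum of squares: group means around the grand mean
    ss_between = sum(
        len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, group_means)
    )
    # Within-group sum of squares: scores around their own group mean
    ss_within = sum(
        (x - m) ** 2 for g, m in zip(groups, group_means) for x in g
    )

    df_between = k - 1
    df_within = n_total - k
    between_variance = ss_between / df_between
    within_variance = ss_within / df_within
    return between_variance / within_variance, df_between, df_within


# Three hypothetical groups, for illustration only
groups = [[1, 2, 3], [2, 3, 4], [3, 4, 5]]
f_value, df_b, df_w = one_way_anova_f(groups)
print(f"F = {f_value:.2f} with ({df_b}, {df_w}) degrees of freedom")
```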
Chi-Square (Non-Parametric Technique)
The chi-square test allows analysis of one, two or more nominal variables. It is based on the comparison
between expected frequencies and actual, obtained frequencies.
A researcher might want to compare how many male and female teachers favor a new curriculum to
be instituted in a particular school district. He asks a sample of 50 teachers whether they favor or
oppose the new curriculum. If they do not differ significantly in their responses, then we would expect
that about the same proportion of males and females would be in favor of (or opposed to) instituting the
curriculum.
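The comparison of expected and obtained frequencies can be sketched as below. The counts for the 50 teachers are hypothetical (the notes give no actual responses), and the sketch computes the plain Pearson chi-square statistic without any continuity correction.

```python
def chi_square(observed):
    """Return (chi-square, degrees of freedom) for a table of observed counts.

    `observed` is a list of rows, e.g. [[male_favor, male_oppose],
                                        [female_favor, female_oppose]].
    """
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    total = sum(row_totals)

    chi2 = 0.0
    for i, row in enumerate(observed):
        for j, obtained in enumerate(row):
            # Frequency expected if the row and column variables were unrelated
            expected = row_totals[i] * col_totals[j] / total
            chi2 += (obtained - expected) ** 2 / expected

    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return chi2, df


# Hypothetical responses of 50 teachers: rows are male/female,
# columns are favor/oppose. Illustration only.
teachers = [[15, 10],
            [12, 13]]

chi2_value, df = chi_square(teachers)
print(f"chi-square = {chi2_value:.3f} with {df} degree(s) of freedom")
```

A small chi-square value means the obtained frequencies are close to the expected ones, which is what we would see if male and female teachers did not differ in their responses.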
Degrees of freedom
The number of scores in a distribution that are free to vary, that is, that are not fixed.