➢ Criteria to Consider when Constructing Good Tests
A. Validity – the degree to which the test measures what it is intended to
measure. It is the usefulness of the test for a given purpose, and it is the most
important criterion of a good examination.
Factors Influencing the Validity of the Tests In General
1. Appropriateness of Test – it should measure the abilities, skills and
information it is supposed to measure.
2. Directions – it should indicate how the learners should answer and
record their answers.
3. Reading Vocabulary and Sentence Structure – it should be based on
the intellectual level of maturity and background experience of the
learners.
4. Difficulty of Items – it should have items that are neither too difficult nor
too easy, so that it can discriminate the bright from the slow pupils.
5. Construction of Test Items – it should not provide clues, so that it will not be
a test on clues, nor be ambiguous, so that it will not be a test on interpretation.
6. Length of the Test – it should be just long enough to measure
what it is supposed to measure. A test that is too short cannot
adequately measure the performance we want to measure.
7. Arrangement of Items – it should have items that are arranged in
ascending level of difficulty, starting with the easy items so that the
pupils are encouraged to continue taking the test.
8. Patterns of Answer – it should not allow the creation of patterns in
answering the test.
Ways in Establishing Validity
1. Face Validity – is done by examining the physical appearance of the test.
2. Content Validity – is done through a careful and critical examination of
the objectives of the test so that it reflects the curricular objectives.
3. Criterion-related Validity – is established statistically such that a set of
scores revealed by a test is correlated with the scores obtained in
another external predictor or measure.
a. Concurrent validity – describes the present status of the individual
by correlating the sets of scores obtained from two measures given
concurrently.
b. Predictive validity – describes the future performance of an
individual by correlating the sets of scores obtained from two
measures given at a longer time interval.
4. Construct Validity – is established statistically by comparing
psychological traits or factors that theoretically influence scores in a test.
a. Convergent Validity – is established if the instrument also reflects
another similar trait other than what it is intended to measure, e.g.
a Critical Thinking Test may be correlated with a Creative Thinking Test.
b. Divergent Validity – is established if an instrument can describe only
the intended trait and not the other traits, e.g. a Critical Thinking Test
may not be correlated with a Reading Comprehension Test.
B. Reliability – it refers to the consistency of scores obtained by the same person
when retested using the same instrument or one that is parallel to it.
Factors Affecting Reliability
1. Length of the Test – as a general rule, the longer the test, the higher the
reliability. A longer test provides a more adequate sample of the behavior
being measured and is less distorted by chance factors like guessing.
2. Difficulty of the Test – ideally, achievement tests should be constructed
such that the average score is 50 percent correct and the scores range from
near zero to perfect. The bigger the spread of the scores, the more reliable the
measured difference is likely to be. A test is considered reliable if the coefficient of
correlation is not less than 0.85.
3. Objectivity – can be obtained by eliminating the bias, opinions or
judgments of the person who checks the test.
Method | Type of Reliability Measure | Procedure | Statistical Measure
A. Test-Retest | Measure of stability | Give a test twice to the same group with any time interval between tests, from several minutes to several years. | Pearson r
B. Equivalent Forms | Measure of equivalence | Give parallel forms of tests with close time intervals between forms. | Pearson r
C. Test-Retest with Equivalent Forms | Measure of stability and equivalence | Give parallel forms of the test with increased time intervals between forms. | Pearson r
D. Split Half | Measure of internal consistency | Give a test once. Score equivalent halves of the test, e.g. odd- and even-numbered items. | Pearson r and Spearman-Brown Formula
E. Kuder-Richardson | Measure of internal consistency | Give the test once, then correlate the proportion/percentage of the students passing and not passing a given item. | Kuder-Richardson Formula 20 and 21
Formulas for Measures of Correlation Used in Establishing Test Validity & Reliability

Pearson r
\[ r = \frac{\dfrac{\sum XY}{N} - \left(\dfrac{\sum X}{N}\right)\left(\dfrac{\sum Y}{N}\right)}{\sqrt{\dfrac{\sum X^2}{N} - \left(\dfrac{\sum X}{N}\right)^2}\;\sqrt{\dfrac{\sum Y^2}{N} - \left(\dfrac{\sum Y}{N}\right)^2}} \]
Where:
X – scores in a test
Y – scores in a retest
N – number of examinees

Spearman Brown Formula
\[ \text{reliability of the whole test} = \frac{2\,r_{oe}}{1 + r_{oe}} \]
Where:
\( r_{oe} \) – reliability coefficient using the split-half or odd-even procedure

Kuder-Richardson Formula 20
\[ KR_{20} = \frac{K}{K-1}\left[1 - \frac{\sum pq}{S^2}\right] \]
Where:
K – no. of items
p – proportion of the examinees who got the item right
q – proportion of the examinees who got the item wrong
\( S^2 \) – variance or the square of the standard deviation

Kuder-Richardson Formula 21
\[ KR_{21} = \frac{K}{K-1}\left[1 - \frac{K\,\bar{p}\,\bar{q}}{S^2}\right] \]
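The two Kuder-Richardson formulas can be sketched in a few lines of Python. This is only an illustration under stated assumptions: items are scored 1 (right) or 0 (wrong), the variance S² is taken as the population variance of the total scores, and the names `kr20`, `kr21` and `responses` are hypothetical, not from the handout.

```python
from statistics import pvariance

def kr20(item_matrix):
    """KR-20 for rows of 0/1 item scores (one row per examinee)."""
    n = len(item_matrix)                 # number of examinees
    k = len(item_matrix[0])              # K, number of items
    totals = [sum(row) for row in item_matrix]
    s2 = pvariance(totals)               # S^2 (assumption: population variance)
    sum_pq = 0.0
    for j in range(k):
        p = sum(row[j] for row in item_matrix) / n   # proportion who got item j right
        sum_pq += p * (1 - p)                        # q = 1 - p
    return (k / (k - 1)) * (1 - sum_pq / s2)

def kr21(item_matrix):
    """KR-21, which uses only the mean and variance of the total scores."""
    n = len(item_matrix)
    k = len(item_matrix[0])
    totals = [sum(row) for row in item_matrix]
    s2 = pvariance(totals)
    p_bar = (sum(totals) / n) / k        # p-bar = mean score / K
    q_bar = 1 - p_bar
    return (k / (k - 1)) * (1 - k * p_bar * q_bar / s2)

# Hypothetical data: 5 examinees, 4 items
responses = [
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 0],
]
print(round(kr20(responses), 3), round(kr21(responses), 3))
```

KR-21 treats all items as equally difficult, so on data like these it comes out lower than KR-20.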
Interpretation of the Pearson r correlation value
 1.0 – Perfect positive correlation
 between 0.5 and 1.0 – High positive correlation
 0.5 – Positive correlation
 between 0 and 0.5 – Low positive correlation
 0 – Zero correlation
 between -0.5 and 0 – Low negative correlation
 -0.5 – Negative correlation
 between -1.0 and -0.5 – High negative correlation
 -1.0 – Perfect negative correlation
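As a rough illustration of the split-half procedure, the sketch below computes Pearson r between odd-item and even-item half scores and then steps it up with the Spearman-Brown formula. The function names and the half scores are hypothetical, chosen only for the example.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson r computed from means of X, Y, XY, X^2 and Y^2."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    mean_xy = sum(a * b for a, b in zip(x, y)) / n
    mean_x2 = sum(a * a for a in x) / n
    mean_y2 = sum(b * b for b in y) / n
    numerator = mean_xy - mean_x * mean_y
    denominator = sqrt(mean_x2 - mean_x ** 2) * sqrt(mean_y2 - mean_y ** 2)
    return numerator / denominator

def spearman_brown(r_oe):
    """Reliability of the whole test from the half-test correlation r_oe."""
    return 2 * r_oe / (1 + r_oe)

# Hypothetical half scores for five examinees
odd_half = [10, 8, 7, 5, 4]
even_half = [9, 9, 6, 5, 3]

r_oe = pearson_r(odd_half, even_half)
whole_test = spearman_brown(r_oe)
print(round(r_oe, 3), round(whole_test, 3))
```

For a positive half-test correlation below 1, the stepped-up coefficient is larger than r itself, which reflects the general rule that a longer test is more reliable.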
C. Administrability – the test should be administered with ease, clarity and
uniformity so that scores obtained are comparable. Uniformity can be obtained
by setting the time limit and oral instructions.
D. Scorability – the test should be easy to score: directions for scoring
are clear, the scoring key is simple, and provisions for answer sheets are made.
E. Economy – the test should be given in the cheapest way, which means that
answer sheets must be provided so the test can be given from time to time.
F. Adequacy – the test should contain a wide sampling of items to determine the
educational outcomes or abilities so that the resulting scores are
representative of the total performance in the areas measured.
G. Authenticity – the test should simulate real-life situations.
➢ Shapes of the Frequency Polygons
1. Normal – bell-shaped curve
2. Positively skewed – most scores are below the mean and there are extremely high scores,
\( \bar{x} > \hat{x} \) (mean is greater than the mode)
3. Negatively skewed – most scores are above the mean and there are extremely low scores,
\( \bar{x} < \hat{x} \) (mean is lower than the mode)
4. Leptokurtic – highly peaked and the tails are more elevated above the baseline
5. Mesokurtic – moderately peaked
6. Platykurtic – flattened peak
7. Bimodal Curve – curve with two peaks or modes
8. Polymodal Curve – curve with three or more modes
9. Rectangular Distribution – there is no mode
➢ Four Types of Measurement Scales
Measurement Scale | Characteristics | Example
1. Nominal | Groups and labels data | Gender (1-male, 2-female)
2. Ordinal | Ranks data; distance between points is indefinite | Income (1-low, 2-average, 3-high)
3. Interval | Distance between points is equal; no absolute zero point | Test scores and temperature (*a score of zero in a test does not mean no knowledge at all)
4. Ratio | All of the above, except that it has an absolute zero point | Height, weight (*a zero weight means no weight at all)
Where (Kuder-Richardson Formula 21): \( \bar{p} = \dfrac{\bar{X}}{K} \); \( \bar{q} = 1 - \bar{p} \)
Measures of Central Tendency and Variability
Assumptions When Used | Appropriate Statistical Tool: Measure of Central Tendency (describes the representative value of a set of data) | Appropriate Statistical Tool: Measure of Variability (describes the degree of spread or dispersion of a set of data)
When the frequency distribution is regular/symmetrical/normal; usually used when the data are numeric (interval or ratio) | Mean – the arithmetic average | Standard Deviation – the root-mean-square of the deviations from the mean
When the frequency distribution is irregular/skewed; usually used when the data are ordinal | Median – the middle score in a group of scores that are ranked | Quartile Deviation – the average deviation of the 1st and 3rd quartiles from the median
When the distribution of scores is normal and a quick answer is needed; usually used when the data are nominal | Mode – the score that occurs most frequently | Range – the difference between the highest and lowest score in a set of observations
I. Procedure in the Computation of the Measures of Central Tendency
A. Mean
Procedure:
1. Mean of Ungrouped Data: used for few cases (N<30)
a. Get the sum of scores (ΣX)
b. Divide the sum by the number of cases (N)
Formula: \( \bar{X} = \dfrac{\sum X}{N} \)
2. Mean of Grouped Data: used for many cases (N>30)
There are two methods for computing the mean of grouped data.
a. Using Midpoint Method
Procedures:
1) Group data in the form of a frequency distribution
2) Compute the midpoints of all class limits (M)
3) Multiply the midpoints by their frequencies (M x F)
4) Get the sum of the products of the midpoints and frequencies (ΣMF)
5) Divide the sum by the number of cases (N)
Formula: \( \bar{X} = \dfrac{\sum MF}{N} \)
b. Using Class Deviation Method
Procedures:
1) Choose your arbitrary starting point or origin from any of the class limits
2) Get the midpoint of the class limit that you have chosen as your starting point. Call this
your assumed mean (AM)
3) Get the deviation (D) of each class limit from the class limit where the assumed mean
is. The deviation of the class limit where the assumed mean is located is 0. Add one
(+1) for each successive class limit higher than this point of origin and subtract one (-1) for
each successive class limit lower than the origin.
4) Multiply the frequencies by their corresponding deviations (FD)
5) Add the products of the frequencies and deviations (ΣFD)
6) Divide the sum by the number of cases (ΣFD/N)
7) Multiply the quotient by the size of the class interval (i)
8) Add the product to the assumed mean
Formula: \( \bar{X} = AM + i\left(\dfrac{\sum FD}{N}\right) \)
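The two grouped-data methods should give the same mean. The sketch below checks this on a small hypothetical frequency distribution. The class limits, frequencies and function names are illustrative assumptions, and the classes must be listed from lowest to highest.

```python
# Hypothetical frequency distribution: (lower limit, upper limit, frequency)
classes = [(10, 14, 2), (15, 19, 5), (20, 24, 8), (25, 29, 4), (30, 34, 1)]

def mean_midpoint(classes):
    """Midpoint method: sum of (midpoint x frequency) divided by N."""
    n = sum(f for _, _, f in classes)
    sum_mf = sum(((lo + hi) / 2) * f for lo, hi, f in classes)
    return sum_mf / n

def mean_class_deviation(classes, origin_index):
    """Class deviation method: AM + i * (sum of FD / N)."""
    n = sum(f for _, _, f in classes)
    lo, hi, _ = classes[origin_index]
    assumed_mean = (lo + hi) / 2                 # AM, midpoint of the chosen class
    i = classes[1][0] - classes[0][0]            # size of the class interval
    # D is 0 at the origin, +1, +2, ... above it and -1, -2, ... below it
    sum_fd = sum(f * (idx - origin_index) for idx, (_, _, f) in enumerate(classes))
    return assumed_mean + i * (sum_fd / n)

print(mean_midpoint(classes))
print(mean_class_deviation(classes, origin_index=2))
```

Any class can serve as the origin. Choosing one near the middle simply keeps the deviations small when working by hand.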
B. Median
1. Median of Ungrouped Data
There are several ways of computing the median for ungrouped data. The process
depends on the case.
Case 1: The total number of cases is an odd number
Procedure:
1.) Arrange the scores from the highest to lowest or vice versa
2.) Get the middlemost score. The score is the median score
Case 2: The total number of cases is an even number
Procedure:
1.) Arrange the scores from highest to lowest or vice versa.
2.) Get the two middlemost scores
3.) Compute the average of the two middlemost scores. The average is the median score.
Case 3: The middlemost score occurs twice, thrice, or more number of times
Procedure:
1.) Get the middlemost score/s, its/their identical score/s and its/their counterparts either
above or below the middlemost score/s
2.) Compute their average and the average score is the median.
2. Median for Grouped Data
Procedure:
1.) Add up or accumulate the frequencies starting from the lowest to the highest class limit. Call
this the cumulative frequency. (CF)
2.) Find one half of the number of cases in the distribution. (N/2)
3.) Find the cumulative frequency which is equal to, or closest to but higher than, half of the
number of cases. The class containing this frequency is the median class.
4.) Find the lowest limit (LL) of the median class.
5.) Get the cumulative frequency of the class below the median class. (CFb)
6.) Subtract this from half of the number of cases in the distribution. (N/2 – CFb)
7.) Get the frequency of the median class. (FMdn)
8.) Find the class interval (i), then follow the formula given below.
Formula:
\[ \tilde{X} = LL + i\left(\frac{\frac{N}{2} - CF_b}{F_{Mdn}}\right) \]
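The steps for the grouped median can be sketched as follows. This is a minimal illustration, assuming that LL is supplied as the lower real limit of each class (for example 19.5 for the class 20-24). The function name and the data are hypothetical.

```python
def grouped_median(lower_limits, freqs, i):
    """Median of grouped data: LL + i * ((N/2 - CFb) / FMdn).

    lower_limits and freqs are listed from the lowest class to the highest.
    """
    n = sum(freqs)
    half = n / 2
    cf_below = 0                      # CFb, cumulative frequency below the current class
    for ll, f in zip(lower_limits, freqs):
        # The median class is the first one whose cumulative frequency reaches N/2
        if f > 0 and cf_below + f >= half:
            return ll + i * ((half - cf_below) / f)
        cf_below += f
    raise ValueError("empty frequency distribution")

# Hypothetical classes 10-14, 15-19, 20-24, 25-29, 30-34 with i = 5
lower_limits = [9.5, 14.5, 19.5, 24.5, 29.5]
freqs = [2, 5, 8, 4, 1]
print(grouped_median(lower_limits, freqs, i=5))
```

Here N/2 = 10 falls in the class 20-24, so the median is 19.5 + 5 × (10 − 7)/8.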
C. Mode
Procedure
1. Mode of Ungrouped Data
• Get the most frequent score
✓ when there are three or more modes, they are called polymodal or multimodal
✓ when there is no mode, it is described as a rectangular distribution.
2. Mode for Grouped Data
a. Crude Mode – refers to the midpoint of the class limit with the highest frequency.
Procedure:
1.) Find the class limit with the highest frequency
2.) Get the midpoint of that class limit
3.) The midpoint of the class limit with the highest frequency is the crude mode
Where (in the formula for the median of grouped data):
LL = lowest limit of the median class
i = class interval
N/2 = half of the number of cases
CFb = cumulative frequency below the median class
FMdn = frequency of the median class
b. Refined Mode – refers to the mode obtained from an ordered arrangement or a class
frequency distribution
Procedure:
1.) Get the mean and the median of the grouped data.
2.) Multiply the median by three (3Mdn)
3.) Multiply the mean by two (2Mn)
4.) Subtract 2Mn from 3Mdn to get the Mode. (Md)
Formula: \( \hat{X} = 3\,Mdn - 2\,Mn \)
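Both versions of the grouped mode are short enough to sketch directly. The example assumes a hypothetical frequency distribution whose mean (21.25) and median (21.375) have already been computed. The function names are illustrative.

```python
# Hypothetical frequency distribution: (lower limit, upper limit, frequency)
classes = [(10, 14, 2), (15, 19, 5), (20, 24, 8), (25, 29, 4), (30, 34, 1)]

def crude_mode(classes):
    """Midpoint of the class with the highest frequency."""
    lo, hi, _ = max(classes, key=lambda c: c[2])
    return (lo + hi) / 2

def refined_mode(mean, median):
    """Refined mode: 3 * median - 2 * mean."""
    return 3 * median - 2 * mean

print(crude_mode(classes))
print(refined_mode(mean=21.25, median=21.375))
```

The refined mode is an estimate based on the usual relationship among the mean, median and mode, so it will not always coincide with the crude mode.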
➢ How will you interpret the Measures of Central Tendency?
1.) The value that represents a set of data will be the basis in determining whether the group is
performing better or poorer than the other groups.
II. Procedure in the computation of the Measures of Variability
A. Range (R)
1. For Ungrouped Data – the difference between the highest and lowest score
2. For Grouped Data – the difference between the highest limit of the highest class limit and
the lowest limit of the lowest class limit.
B. Standard Deviation (SD)
Procedure for Ungrouped Data
1.) Find the mean. (\( \bar{X} \))
2.) Subtract the mean from each score to get the deviation. [\( d = X - \bar{X} \)]
3.) Square the deviation. (d²)
4.) Get the sum of the squared deviations. (Σd²)
5.) Divide the sum by the number of cases minus one. [Σd² / (N – 1)]
6.) Get the square root of the answer.
Formula: \( SD = \sqrt{\dfrac{\sum d^2}{N-1}} \)
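The six steps translate almost line by line into code. The sketch below uses hypothetical scores and compares the result with `statistics.stdev` from the Python standard library, which also divides by N – 1.

```python
from math import sqrt
from statistics import stdev

def standard_deviation(scores):
    """SD of ungrouped data: sqrt(sum of d^2 / (N - 1))."""
    n = len(scores)
    mean = sum(scores) / n                      # step 1
    deviations = [x - mean for x in scores]     # step 2: d = X - mean
    squared = [d * d for d in deviations]       # step 3
    sum_d2 = sum(squared)                       # step 4
    return sqrt(sum_d2 / (n - 1))               # steps 5 and 6

scores = [2, 4, 4, 4, 5, 5, 7, 9]               # hypothetical scores
print(round(standard_deviation(scores), 4))
print(round(stdev(scores), 4))                  # library check
```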
Procedure for Grouped Data
A. Using Class Deviation Method
1.) As in computing the mean by the class deviation method, get the deviation (d) of each
class and the product of the frequency and the deviation. (fd)
2.) Multiply the product of the frequency and the deviation by the deviation. (fd²)
3.) Get the sum of the products of the frequency and squared deviation. (Σfd²)
4.) Compute the standard deviation using the formula below
Formula: \( SD = i\sqrt{\dfrac{\sum fd^2}{N} - \dfrac{\left(\sum fd\right)^2}{N^2}} \)
B. Using Midpoint Method
1.) Multiply each midpoint (M) by its frequency to get FM, then multiply FM by the
midpoint again (this is the same as the frequency times the squared midpoint, M²).
2.) Write these products in another column and label it FM².
3.) Use the formula below to compute the Standard Deviation.
Formula:
\[ SD = \sqrt{\frac{\sum FM^2}{N} - \left(\bar{X}\right)^2} \]
Where (in the class deviation formula):
i = interval
N = number of cases
Σfd = sum of the products of frequency and deviation
Σfd² = sum of the products of the frequency and squared deviation
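The class deviation and midpoint methods for grouped data should agree. The sketch below checks this on a hypothetical distribution with class interval 5. The function names and data are assumptions for illustration. Note that both grouped formulas divide by N, not N – 1.

```python
from math import sqrt

def sd_class_deviation(freqs, i, origin_index):
    """SD = i * sqrt(sum(fd^2)/N - (sum(fd))^2 / N^2)."""
    n = sum(freqs)
    # d is 0 at the origin class, +1, +2, ... above it and -1, -2, ... below it
    d = [idx - origin_index for idx in range(len(freqs))]
    sum_fd = sum(f * dv for f, dv in zip(freqs, d))
    sum_fd2 = sum(f * dv * dv for f, dv in zip(freqs, d))
    return i * sqrt(sum_fd2 / n - (sum_fd ** 2) / (n ** 2))

def sd_midpoint(midpoints, freqs):
    """SD = sqrt(sum(FM^2)/N - mean^2)."""
    n = sum(freqs)
    mean = sum(f * m for f, m in zip(freqs, midpoints)) / n
    sum_fm2 = sum(f * m * m for f, m in zip(freqs, midpoints))
    return sqrt(sum_fm2 / n - mean ** 2)

# Hypothetical classes 10-14, 15-19, 20-24, 25-29, 30-34
midpoints = [12, 17, 22, 27, 32]
freqs = [2, 5, 8, 4, 1]
print(round(sd_class_deviation(freqs, i=5, origin_index=2), 4))
print(round(sd_midpoint(midpoints, freqs), 4))
```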
➢ How will you interpret the standard deviation?
1.) The results will help you determine if the group is homogeneous or not.
2.) The results will also help you determine the number of students that fall below and above
the average performance.
Study how to do this:
• Mean – 1 SD and mean + 1 SD would give the limits of an average ability
• The point right below – 1 SD is the upper limit of the below average ability
• The point right above + 1 SD is the lower limit of the above average ability
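A small sketch of this interpretation: scores between mean – 1 SD and mean + 1 SD are labelled average, and scores outside those limits are labelled below or above average. The function name, the labels and the sample values are hypothetical.

```python
def ability_band(score, mean, sd):
    """Classify a score relative to the limits mean - 1 SD and mean + 1 SD."""
    if score < mean - sd:
        return "below average"
    if score > mean + sd:
        return "above average"
    return "average"            # the limits themselves count as average

# Hypothetical class with mean 5 and SD 2
for score in [2, 3, 5, 7, 8]:
    print(score, ability_band(score, mean=5, sd=2))
```

Counting how many scores land in each band gives the number of students below, within and above the average range.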
C. Quartile Deviation (QD)
1. Procedure in the Computation of QD for Ungrouped Data
1.) Arrange the scores in descending or ascending order.
2.) Compute ¼(N). The result tells the rank of the Q1 score in the ordered
arrangement, counting from the bottom.
3.) Look for the score in this rank.
4.) Compute ¾(N). The result tells the rank of the Q3 score.
5.) Look for the Q3 score in this rank.
6.) Compute the QD.
\[ QD = \frac{Q_3 - Q_1}{2} \]
2. Procedure in the Computation of QD for Grouped Data
1.) Compute the value of the 1st quartile.
\[ Q_1 = LL + \left(\frac{\frac{N}{4} - CF_b}{F_q}\right) i \]
2.) Compute the value of the 3rd quartile.
\[ Q_3 = LL + \left(\frac{\frac{3N}{4} - CF_b}{F_q}\right) i \]
3.) Compute the quartile deviation (half of the interquartile range).
\[ QD = \frac{Q_3 - Q_1}{2} \]
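The grouped quartile deviation can be sketched as below, taking N/4 and 3N/4 as the positions of Q1 and Q3. The sketch assumes that LL is given as the lower real limit of each class. The function names and data are hypothetical.

```python
def grouped_quartile(lower_limits, freqs, i, k):
    """Q_k for grouped data (k = 1 or 3): LL + ((kN/4 - CFb) / Fq) * i."""
    n = sum(freqs)
    target = k * n / 4
    cf_below = 0                        # CFb, cumulative frequency below the class
    for ll, f in zip(lower_limits, freqs):
        # The quartile class is the first whose cumulative frequency reaches kN/4
        if f > 0 and cf_below + f >= target:
            return ll + ((target - cf_below) / f) * i
        cf_below += f
    raise ValueError("empty frequency distribution")

def quartile_deviation(lower_limits, freqs, i):
    q1 = grouped_quartile(lower_limits, freqs, i, 1)
    q3 = grouped_quartile(lower_limits, freqs, i, 3)
    return (q3 - q1) / 2

# Hypothetical classes 10-14, 15-19, 20-24, 25-29, 30-34 with i = 5
lower_limits = [9.5, 14.5, 19.5, 24.5, 29.5]
freqs = [2, 5, 8, 4, 1]
print(quartile_deviation(lower_limits, freqs, i=5))
```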
➢ How will you interpret the quartile deviation?
The results will also tell if the group is homogeneous or not. It will also tell
how many of the students fall below or above the region of acceptable
performance. To do this, study the instructions below.
• Median – 1 QD and Median + 1 QD would give the limits of an average ability
• The point right below – 1 QD is the upper limit of the below average
ability
• The point right above + 1 QD is the lower limit of the above average ability
STANDARD SCORES
• Indicate the pupil's relative position by showing how far his raw score is
above or below average
• Express the pupil's performance in terms of standard units from the mean
• Represented by the normal probability curve or what is commonly called the
normal curve
• Used to have a common unit to compare raw scores from different tests
1. PERCENTILE
• tells the percentage of examinees that
lie below one's score.
Formula: \( P_a = LL + i\left[\dfrac{aN - CF_b}{F_{P_a}}\right] \)
Where (in the quartile formulas for grouped data):
Q1 – stands for the 1st quartile
LL – lowest limit
N/4 – one-fourth of the total number of the population
CFb – cumulative frequency below the quartile class
Fq – frequency of the class where the first quartile score falls
i – interval
Where (in the percentile formula):
LL – lowest limit of the class of a% N
CFb – cumulative frequency below the class of a% N
FPa – frequency of the class of a% N
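A percentile point for grouped data follows the same pattern as the median, with a% of N in place of N/2. The sketch below is illustrative only: it takes `a` as a percentage from 0 to 100, treats LL as the lower real limit of the class, and uses hypothetical names and data.

```python
def grouped_percentile(lower_limits, freqs, i, a):
    """P_a for grouped data: LL + i * ((a% of N - CFb) / FPa)."""
    n = sum(freqs)
    target = a * n / 100                # a% of N
    cf_below = 0                        # CFb
    for ll, f in zip(lower_limits, freqs):
        # The percentile class is the first whose cumulative frequency reaches a% of N
        if f > 0 and cf_below + f >= target:
            return ll + i * ((target - cf_below) / f)
        cf_below += f
    raise ValueError("empty frequency distribution")

# Hypothetical classes 10-14, 15-19, 20-24, 25-29, 30-34 with i = 5
lower_limits = [9.5, 14.5, 19.5, 24.5, 29.5]
freqs = [2, 5, 8, 4, 1]
print(grouped_percentile(lower_limits, freqs, i=5, a=90))
print(grouped_percentile(lower_limits, freqs, i=5, a=50))   # same as the median
```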
2. Z-SCORES
• tells the number of standard deviations equivalent to a given raw score
Formula: \( Z = \dfrac{X - \bar{X}}{SD} \)
Note:
Z-score is negative when \( X < \bar{X} \)
Z-score is positive when \( X > \bar{X} \)
3. T-SCORES
• refers to any set of normally distributed standard scores that has a mean of
50 and a standard deviation of 10.
• computed after converting raw scores to z-scores to get rid of negative values
Formula: \( T\text{-score} = 50 + 10(Z) \)
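The two conversions can be sketched together, since a T-score is computed from the z-score. The raw scores, mean and SD below are hypothetical values chosen for illustration.

```python
def z_score(x, mean, sd):
    """Number of standard deviations a raw score lies from the mean."""
    return (x - mean) / sd

def t_score(x, mean, sd):
    """T-score: 50 + 10 * z, which removes negative values for typical scores."""
    return 50 + 10 * z_score(x, mean, sd)

# Hypothetical test with mean 50 and SD 10
print(z_score(65, mean=50, sd=10), t_score(65, mean=50, sd=10))
print(z_score(40, mean=50, sd=10), t_score(40, mean=50, sd=10))
```

Because both scores are expressed in standard units, a pupil's results on two tests with different means and SDs can be compared directly.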
ASSIGNING GRADES/MARKS/RATINGS
A. Marking/Grading - is the process of assigning value to a performance
B. Mark/Grades/Ratings are symbols which:
Could be in –
• Percent such as: 70%, 75%, 80%, etc.
• Letters such as: A, B, C, D, or F
• Numbers such as: 1, 2, 3, 4, or 5
• Descriptive expressions such as:
Outstanding (O),
Very Satisfactory (VS),
Satisfactory (S),
Moderately Satisfactory (MS),
Needs Improvement (NI), etc.
[Note: Any symbol can be used provided that it has uniform meaning to all concerned]
Could represent –
• How a student is performing in relation to other students (Norm-Referenced
Grading)
• The extent to which a student has mastered a particular body of knowledge
(Criterion-Referenced Grading)
• How a student is performing in relation to a teacher's judgment of his or her
potential. (Grading in Relation to Teacher's Judgment)
Could be for –
• Certification that gives assurance that a student has mastered a specific
content or achieved a certain level of accomplishment.
• Selection that provides a basis for identifying or grouping students for certain
educational paths or programs.
• Direction that provides information for diagnosis and planning.
• Motivation that emphasizes specific material or skills to be learned and
helps students to understand and improve their performance.
Could be based on –
• Examination results or test data
• Observations of student work
• Group evaluation activities
• Class discussions and recitations
• Homework
• Notebooks and note taking
• Reports, themes and research papers
• Discussions and debates
• Portfolios
• Projects
• Attitudes, etc.
Could be assigned by –
• Criterion-referenced grading, or grading based on fixed or absolute
standards, where a grade is assigned based on how a student has met the
criteria or the well-defined objectives of a course that were spelled out in
advance.
It is then up to the student to earn the grade he or she wants to
receive regardless of how other students in the class have performed. This
is done by transmuting test scores into marks or ratings.
• Norm-referenced grading, or grading based on relative standards, where
a student's grade reflects his or her level of achievement relative to the
performance of other students in the class.
In this system the grade is assigned based on the average of test
scores. The rating scales that are used in assigning grades are:
1.) The four point rating scale, which uses the median and quartile deviation
of the test scores to group the scores into four; each group is
assigned the corresponding grade of A, B, C, or D, or 1, 2, 3, or 4.
2.) The five point rating scale, which uses the median and quartile deviation
of the test scores to group the scores into five; each group is assigned
the corresponding grade of A, B, C, D, or F, or 1, 2, 3, 4, or 5.
• Point or Percentage Grading System, whereby the teacher identifies
points or percentages of various tests and class activities depending on
their importance. The total of these points will be the basis for the grade
assigned to the student.
• Contract Grading System, where each student agrees to work for a
particular grade according to agreed-upon standards.
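As a rough sketch of the point or percentage grading system, the example below combines component scores using teacher-assigned percentage weights. The components, weights and scores are entirely hypothetical. A teacher would substitute the activities and weights that fit the course.

```python
def percentage_grade(scores, weights):
    """Weighted total of component scores; weights are percentages summing to 100."""
    if sum(weights.values()) != 100:
        raise ValueError("weights must add up to 100 percent")
    return sum(weights[name] * scores[name] for name in weights) / 100

# Hypothetical weights (percent of the final grade) and one student's scores (0-100)
weights = {"quizzes": 30, "project": 30, "exam": 40}
scores = {"quizzes": 80, "project": 90, "exam": 70}
print(percentage_grade(scores, weights))
```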
➢ Guidelines in Grading Students
1.) Explain your grading system to the students early in the course and remind them
of the grading policies regularly.
2.) Base grades on a predetermined and reasonable set of standards.
3.) Base your grades on as much objective evidence as possible.
4.) Base grades on the student's attitude as well as achievement, especially at the
elementary and high school level.
5.) Base grades on the student's relative standing compared to classmates.
6.) Base grades on a variety of sources.
7.) As a rule, do not change grades.
8.) Become familiar with the grading policy of your school and with your colleagues'
standards.
9.) When failing a student, closely follow school procedures.
10.) Record grades on report cards and cumulative records.
11.) Guard against bias in grading.
12.) Keep pupils informed of their standing in the class.
Historical philosophical, theoretical, and legal foundations of special and i...jaredbarbolino94
Β 
call girls in Kamla Market (DELHI) πŸ” >ΰΌ’9953330565πŸ” genuine Escort Service πŸ”βœ”οΈβœ”οΈ
call girls in Kamla Market (DELHI) πŸ” >ΰΌ’9953330565πŸ” genuine Escort Service πŸ”βœ”οΈβœ”οΈcall girls in Kamla Market (DELHI) πŸ” >ΰΌ’9953330565πŸ” genuine Escort Service πŸ”βœ”οΈβœ”οΈ
call girls in Kamla Market (DELHI) πŸ” >ΰΌ’9953330565πŸ” genuine Escort Service πŸ”βœ”οΈβœ”οΈ9953056974 Low Rate Call Girls In Saket, Delhi NCR
Β 
DATA STRUCTURE AND ALGORITHM for beginners
DATA STRUCTURE AND ALGORITHM for beginnersDATA STRUCTURE AND ALGORITHM for beginners
DATA STRUCTURE AND ALGORITHM for beginnersSabitha Banu
Β 
Gas measurement O2,Co2,& ph) 04/2024.pptx
Gas measurement O2,Co2,& ph) 04/2024.pptxGas measurement O2,Co2,& ph) 04/2024.pptx
Gas measurement O2,Co2,& ph) 04/2024.pptxDr.Ibrahim Hassaan
Β 
EPANDING THE CONTENT OF AN OUTLINE using notes.pptx
EPANDING THE CONTENT OF AN OUTLINE using notes.pptxEPANDING THE CONTENT OF AN OUTLINE using notes.pptx
EPANDING THE CONTENT OF AN OUTLINE using notes.pptxRaymartEstabillo3
Β 
ECONOMIC CONTEXT - PAPER 1 Q3: NEWSPAPERS.pptx
ECONOMIC CONTEXT - PAPER 1 Q3: NEWSPAPERS.pptxECONOMIC CONTEXT - PAPER 1 Q3: NEWSPAPERS.pptx
ECONOMIC CONTEXT - PAPER 1 Q3: NEWSPAPERS.pptxiammrhaywood
Β 
Crayon Activity Handout For the Crayon A
Crayon Activity Handout For the Crayon ACrayon Activity Handout For the Crayon A
Crayon Activity Handout For the Crayon AUnboundStockton
Β 
Presiding Officer Training module 2024 lok sabha elections
Presiding Officer Training module 2024 lok sabha electionsPresiding Officer Training module 2024 lok sabha elections
Presiding Officer Training module 2024 lok sabha electionsanshu789521
Β 
Meghan Sutherland In Media Res Media Component
Meghan Sutherland In Media Res Media ComponentMeghan Sutherland In Media Res Media Component
Meghan Sutherland In Media Res Media ComponentInMediaRes1
Β 
How to Configure Email Server in Odoo 17
How to Configure Email Server in Odoo 17How to Configure Email Server in Odoo 17
How to Configure Email Server in Odoo 17Celine George
Β 
Employee wellbeing at the workplace.pptx
Employee wellbeing at the workplace.pptxEmployee wellbeing at the workplace.pptx
Employee wellbeing at the workplace.pptxNirmalaLoungPoorunde1
Β 
What is Model Inheritance in Odoo 17 ERP
What is Model Inheritance in Odoo 17 ERPWhat is Model Inheritance in Odoo 17 ERP
What is Model Inheritance in Odoo 17 ERPCeline George
Β 
MARGINALIZATION (Different learners in Marginalized Group
MARGINALIZATION (Different learners in Marginalized GroupMARGINALIZATION (Different learners in Marginalized Group
MARGINALIZATION (Different learners in Marginalized GroupJonathanParaisoCruz
Β 
Organic Name Reactions for the students and aspirants of Chemistry12th.pptx
Organic Name Reactions  for the students and aspirants of Chemistry12th.pptxOrganic Name Reactions  for the students and aspirants of Chemistry12th.pptx
Organic Name Reactions for the students and aspirants of Chemistry12th.pptxVS Mahajan Coaching Centre
Β 

Recently uploaded (20)

Introduction to AI in Higher Education_draft.pptx
Introduction to AI in Higher Education_draft.pptxIntroduction to AI in Higher Education_draft.pptx
Introduction to AI in Higher Education_draft.pptx
Β 
ECONOMIC CONTEXT - LONG FORM TV DRAMA - PPT
ECONOMIC CONTEXT - LONG FORM TV DRAMA - PPTECONOMIC CONTEXT - LONG FORM TV DRAMA - PPT
ECONOMIC CONTEXT - LONG FORM TV DRAMA - PPT
Β 
Hierarchy of management that covers different levels of management
Hierarchy of management that covers different levels of managementHierarchy of management that covers different levels of management
Hierarchy of management that covers different levels of management
Β 
MICROBIOLOGY biochemical test detailed.pptx
MICROBIOLOGY biochemical test detailed.pptxMICROBIOLOGY biochemical test detailed.pptx
MICROBIOLOGY biochemical test detailed.pptx
Β 
9953330565 Low Rate Call Girls In Rohini Delhi NCR
9953330565 Low Rate Call Girls In Rohini  Delhi NCR9953330565 Low Rate Call Girls In Rohini  Delhi NCR
9953330565 Low Rate Call Girls In Rohini Delhi NCR
Β 
Historical philosophical, theoretical, and legal foundations of special and i...
Historical philosophical, theoretical, and legal foundations of special and i...Historical philosophical, theoretical, and legal foundations of special and i...
Historical philosophical, theoretical, and legal foundations of special and i...
Β 
call girls in Kamla Market (DELHI) πŸ” >ΰΌ’9953330565πŸ” genuine Escort Service πŸ”βœ”οΈβœ”οΈ
call girls in Kamla Market (DELHI) πŸ” >ΰΌ’9953330565πŸ” genuine Escort Service πŸ”βœ”οΈβœ”οΈcall girls in Kamla Market (DELHI) πŸ” >ΰΌ’9953330565πŸ” genuine Escort Service πŸ”βœ”οΈβœ”οΈ
call girls in Kamla Market (DELHI) πŸ” >ΰΌ’9953330565πŸ” genuine Escort Service πŸ”βœ”οΈβœ”οΈ
Β 
DATA STRUCTURE AND ALGORITHM for beginners
DATA STRUCTURE AND ALGORITHM for beginnersDATA STRUCTURE AND ALGORITHM for beginners
DATA STRUCTURE AND ALGORITHM for beginners
Β 
Gas measurement O2,Co2,& ph) 04/2024.pptx
Gas measurement O2,Co2,& ph) 04/2024.pptxGas measurement O2,Co2,& ph) 04/2024.pptx
Gas measurement O2,Co2,& ph) 04/2024.pptx
Β 
EPANDING THE CONTENT OF AN OUTLINE using notes.pptx
EPANDING THE CONTENT OF AN OUTLINE using notes.pptxEPANDING THE CONTENT OF AN OUTLINE using notes.pptx
EPANDING THE CONTENT OF AN OUTLINE using notes.pptx
Β 
ECONOMIC CONTEXT - PAPER 1 Q3: NEWSPAPERS.pptx
ECONOMIC CONTEXT - PAPER 1 Q3: NEWSPAPERS.pptxECONOMIC CONTEXT - PAPER 1 Q3: NEWSPAPERS.pptx
ECONOMIC CONTEXT - PAPER 1 Q3: NEWSPAPERS.pptx
Β 
Crayon Activity Handout For the Crayon A
Crayon Activity Handout For the Crayon ACrayon Activity Handout For the Crayon A
Crayon Activity Handout For the Crayon A
Β 
Model Call Girl in Bikash Puri Delhi reach out to us at πŸ”9953056974πŸ”
Model Call Girl in Bikash Puri  Delhi reach out to us at πŸ”9953056974πŸ”Model Call Girl in Bikash Puri  Delhi reach out to us at πŸ”9953056974πŸ”
Model Call Girl in Bikash Puri Delhi reach out to us at πŸ”9953056974πŸ”
Β 
Presiding Officer Training module 2024 lok sabha elections
Presiding Officer Training module 2024 lok sabha electionsPresiding Officer Training module 2024 lok sabha elections
Presiding Officer Training module 2024 lok sabha elections
Β 
Meghan Sutherland In Media Res Media Component
Meghan Sutherland In Media Res Media ComponentMeghan Sutherland In Media Res Media Component
Meghan Sutherland In Media Res Media Component
Β 
How to Configure Email Server in Odoo 17
How to Configure Email Server in Odoo 17How to Configure Email Server in Odoo 17
How to Configure Email Server in Odoo 17
Β 
Employee wellbeing at the workplace.pptx
Employee wellbeing at the workplace.pptxEmployee wellbeing at the workplace.pptx
Employee wellbeing at the workplace.pptx
Β 
What is Model Inheritance in Odoo 17 ERP
What is Model Inheritance in Odoo 17 ERPWhat is Model Inheritance in Odoo 17 ERP
What is Model Inheritance in Odoo 17 ERP
Β 
MARGINALIZATION (Different learners in Marginalized Group
MARGINALIZATION (Different learners in Marginalized GroupMARGINALIZATION (Different learners in Marginalized Group
MARGINALIZATION (Different learners in Marginalized Group
Β 
Organic Name Reactions for the students and aspirants of Chemistry12th.pptx
Organic Name Reactions  for the students and aspirants of Chemistry12th.pptxOrganic Name Reactions  for the students and aspirants of Chemistry12th.pptx
Organic Name Reactions for the students and aspirants of Chemistry12th.pptx
Β 

Criteria to consider when constructing good tests

A. Validity – is the degree to which the test measures what it is intended to measure. It is the usefulness of the test for a given purpose. It is the most important criterion of a good examination.

Factors Influencing the Validity of Tests in General
1. Appropriateness of Test – it should measure the abilities, skills and information it is supposed to measure.
2. Directions – it should indicate how the learners should answer and record their answers.
3. Reading Vocabulary and Sentence Structure – it should be based on the intellectual level of maturity and background experience of the learners.
4. Difficulty of Items – it should have items that are neither too difficult nor too easy, so that it can discriminate the bright from the slow pupils.
5. Construction of Test Items – the items should not provide clues, so that it will not be a test on clues, and should not be ambiguous, so that it will not be a test on interpretation.
6. Length of the Test – it should be just long enough to measure what it is supposed to measure, and not so short that it cannot adequately measure the performance we want to measure.
7. Arrangement of Items – it should have items arranged in ascending level of difficulty, starting with the easy ones, so that the pupils are encouraged to continue taking the test.
8. Patterns of Answers – it should not allow the creation of patterns in answering the test.

Ways of Establishing Validity
1. Face Validity – is done by examining the physical appearance of the test.
2. Content Validity – is done through a careful and critical examination of the objectives of the test so that it reflects the curricular objectives.
3. Criterion-related Validity – is established statistically, such that a set of scores revealed by a test is correlated with the scores obtained in another external predictor or measure.
   a. Concurrent Validity – describes the present status of the individual by correlating the sets of scores obtained from two measures given concurrently.
   b. Predictive Validity – describes the future performance of an individual by correlating the sets of scores obtained from two measures given at a longer time interval.
4. Construct Validity – is established statistically by comparing psychological traits or factors that theoretically influence scores in a test.
   a. Convergent Validity – is established if the instrument also describes another, similar trait besides the one it is intended to measure, e.g. scores on a Critical Thinking Test may be correlated with scores on a Creative Thinking Test.
   b. Divergent Validity – is established if the instrument describes only the intended trait and not other traits, e.g. scores on a Critical Thinking Test may not be correlated with scores on a Reading Comprehension Test.

B. Reliability – refers to the consistency of scores obtained by the same person when retested using the same instrument or one that is parallel to it.
Factors Affecting Reliability
1. Length of the Test – as a general rule, the longer the test, the higher the reliability. A longer test provides a more adequate sample of the behavior being measured and is less distorted by chance factors like guessing.
2. Difficulty of the Test – ideally, achievement tests should be constructed such that the average score is 50 percent correct and the scores range from near zero to perfect. The bigger the spread of the scores, the more reliable the measured difference is likely to be. A test is reliable if the coefficient of correlation is not less than 0.85.
3. Objectivity – can be obtained by eliminating the bias, opinions or judgments of the person who checks the test.

| Method | Type of Reliability Measure | Procedure | Statistical Measure |
|---|---|---|---|
| A. Test-Retest | Measure of stability | Give a test twice to the same group, with any time interval between tests from several minutes to several years. | Pearson r |
| B. Equivalent Forms | Measure of equivalence | Give parallel forms of the test with close time intervals between forms. | Pearson r |
| C. Test-Retest with Equivalent Forms | Measure of stability and equivalence | Give parallel forms of the test with increased time intervals between forms. | Pearson r |
| D. Split Half | Measure of internal consistency | Give a test once. Score equivalent halves of the test, e.g. odd- and even-numbered items. | Pearson r and Spearman-Brown Formula |
| E. Kuder-Richardson | Measure of internal consistency | Give the test once, then correlate the proportion/percentage of the students passing and not passing a given item. | Kuder-Richardson Formula 20 and 21 |

Formulas for Measures of Correlation Used in Establishing Test Validity & Reliability

Pearson r
$$r = \frac{\dfrac{\sum XY}{N} - \left(\dfrac{\sum X}{N}\right)\left(\dfrac{\sum Y}{N}\right)}{\sqrt{\dfrac{\sum X^2}{N} - \left(\dfrac{\sum X}{N}\right)^2}\;\sqrt{\dfrac{\sum Y^2}{N} - \left(\dfrac{\sum Y}{N}\right)^2}}$$
Where:
X – scores in a test
Y – scores in a retest
N – number of examinees

Spearman-Brown Formula
$$\text{reliability of the whole test} = \frac{2\,r_{oe}}{1 + r_{oe}}$$
Where:
$r_{oe}$ – reliability coefficient using the split-half or odd-even procedure

Kuder-Richardson Formula 20
$$KR_{20} = \frac{K}{K-1}\left[1 - \frac{\sum pq}{S^2}\right]$$
Where:
K – number of items
p – proportion of the examinees who got the item right
q – proportion of the examinees who got the item wrong
$S^2$ – variance, or the square of the standard deviation
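The three formulas above can be sketched in Python as follows. This is only an illustrative sketch: the function names and the 0/1 item-matrix layout are hypothetical, and the variance in `kr20` is assumed to be the population variance (dividing by N), which the source does not specify.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson r using the raw-score formula given above."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    mean_xy = sum(a * b for a, b in zip(x, y)) / n
    sd_x = sqrt(sum(a * a for a in x) / n - mean_x ** 2)
    sd_y = sqrt(sum(b * b for b in y) / n - mean_y ** 2)
    return (mean_xy - mean_x * mean_y) / (sd_x * sd_y)

def spearman_brown(r_oe):
    """Reliability of the whole test from the split-half coefficient r_oe."""
    return 2 * r_oe / (1 + r_oe)

def kr20(item_matrix):
    """KR-20 for a matrix with one row per examinee and one 0/1 column per item."""
    n = len(item_matrix)        # number of examinees
    k = len(item_matrix[0])     # number of items (K)
    totals = [sum(row) for row in item_matrix]
    mean_total = sum(totals) / n
    # Assumption: S^2 is the population variance of the total scores.
    variance = sum((t - mean_total) ** 2 for t in totals) / n
    sum_pq = 0.0
    for j in range(k):
        p = sum(row[j] for row in item_matrix) / n   # proportion correct
        sum_pq += p * (1 - p)                        # q = 1 - p
    return k / (k - 1) * (1 - sum_pq / variance)

# Hypothetical data for illustration only
test_scores = [1, 2, 3, 4, 5]
retest_scores = [2, 4, 6, 8, 10]
answers = [[1, 1, 1], [1, 1, 0], [1, 0, 0], [0, 0, 0]]
print(pearson_r(test_scores, retest_scores))
print(spearman_brown(0.6))
print(kr20(answers))
```

For the split-half method, `pearson_r` would be applied to the odd-item and even-item half scores first, and its result passed to `spearman_brown`.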
Kuder-Richardson Formula 21
$$KR_{21} = \frac{K}{K-1}\left[1 - \frac{K\,\bar{p}\,\bar{q}}{S^2}\right]$$
Where:
$\bar{p} = \dfrac{\bar{X}}{K}$ ; $\bar{q} = 1 - \bar{p}$

Interpretation of the Pearson r correlation value
• 1 – perfect positive correlation
• between 0.5 and 1 – high positive correlation
• 0.5 – positive correlation
• between 0 and 0.5 – low positive correlation
• 0 – zero correlation
• between −0.5 and 0 – low negative correlation
• −0.5 – negative correlation
• between −1 and −0.5 – high negative correlation
• −1 – perfect negative correlation

C. Administrability – the test should be administered with ease, clarity and uniformity so that the scores obtained are comparable. Uniformity can be obtained by setting the time limit and oral instructions.
D. Scorability – the test should be easy to score, such that the directions for scoring are clear, the scoring key is simple, and provisions for answer sheets are made.
E. Economy – the test should be given in the cheapest way, which means that answer sheets must be provided so that the test can be given from time to time.
F. Adequacy – the test should contain a wide sampling of items to determine the educational outcomes or abilities, so that the resulting scores are representative of the total performance in the areas measured.
G. Authenticity – the test should simulate real-life situations.

Shapes of the Frequency Polygons
1. Normal – bell-shaped curve
2. Positively skewed – most scores are below the mean and there are extremely high scores, $\bar{x} > \hat{x}$ (the mean is greater than the mode)
3. Negatively skewed – most scores are above the mean and there are extremely low scores, $\bar{x} < \hat{x}$ (the mean is lower than the mode)
4. Leptokurtic – highly peaked, and the tails are more elevated above the baseline
5. Mesokurtic – moderately peaked
6. Platykurtic – flattened peak
7. Bimodal Curve – curve with two peaks or modes
8. Polymodal Curve – curve with three or more modes
9. Rectangular Distribution – there is no mode

Four Types of Measurement Scales

| Measurement Scale | Characteristics | Example |
|---|---|---|
| 1. Nominal | Groups and labels data | Gender (1-male, 2-female) |
| 2. Ordinal | Ranks data; the distance between points is indefinite | Income (1-low, 2-average, 3-high) |
| 3. Interval | The distance between points is equal; no absolute zero point | Test scores and temperature (a score of zero in a test does not mean no knowledge at all) |
| 4. Ratio | All of the above, except that it has an absolute zero point | Height, weight (a zero weight means no weight at all) |
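Kuder-Richardson Formula 21, given earlier in this section together with the definitions of p̄ and q̄, needs only the number of items, the mean score and the variance. A minimal sketch, with a hypothetical function name and made-up input values:

```python
def kr21(k, mean_score, variance):
    """KR-21 from the number of items K, the mean score and the variance S^2."""
    p_bar = mean_score / k      # p-bar = mean / K
    q_bar = 1 - p_bar           # q-bar = 1 - p-bar
    return k / (k - 1) * (1 - k * p_bar * q_bar / variance)

# Hypothetical 3-item test with mean 1.5 and variance 1.25
print(kr21(3, 1.5, 1.25))
```

Unlike KR-20, this version does not need item-by-item results, which is why only summary statistics are passed in.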
Measures of Central Tendency and Variability

| Assumptions When Used | Measure of Central Tendency (describes the representative value of a set of data) | Measure of Variability (describes the degree of spread or dispersion of a set of data) |
|---|---|---|
| The frequency distribution is regular/symmetrical/normal; usually used when the data are numeric (interval or ratio) | Mean – the arithmetic average | Standard Deviation – the root-mean-square of the deviations from the mean |
| The frequency distribution is irregular/skewed; usually used when the data are ordinal | Median – the middle score in a group of ranked scores | Quartile Deviation – the average deviation of the 1st and 3rd quartiles from the median |
| The distribution of scores is normal and a quick answer is needed; usually used when the data are nominal | Mode – the score that occurs most frequently | Range – the difference between the highest and lowest scores in a set of observations |

I. Procedure in the Computation of the Measures of Central Tendency

A. Mean
Procedure:
1. Mean of Ungrouped Data: used for few cases (N < 30)
   a. Get the sum of the scores (ΣX)
   b. Divide the sum by the number of cases (N)
   Formula: $$\bar{X} = \frac{\sum X}{N}$$
2. Mean of Grouped Data: used for many cases (N > 30)
   Two methods of computing the mean of grouped data are discussed here.
   a. Using the Midpoint Method
      Procedure:
      1) Group the data in the form of a frequency distribution
      2) Compute the midpoint of each class (M)
      3) Multiply the midpoints by their frequencies (M × F)
      4) Get the sum of the products of the midpoints and frequencies (ΣMF)
      5) Divide the sum by the number of cases (N)
      Formula: $$\bar{X} = \frac{\sum MF}{N}$$
   b. Using the Class Deviation Method
      Procedure:
      1) Choose any class as your arbitrary starting point or origin
      2) Get the midpoint of the class that you have chosen as your starting point. Call this your assumed mean (AM)
      3) Get the deviation (D) of each class from the class where the assumed mean is located. The deviation of the class containing the assumed mean is 0. Add one (+1) for each class higher than this point of origin (+1, +2, ...) and subtract one (−1) for each class lower than the origin (−1, −2, ...)
      4) Multiply the frequencies by their corresponding deviations (FD)
      5) Add the products of the frequencies and deviations (ΣFD)
      6) Divide the sum by the number of cases (ΣFD/N)
      7) Multiply the quotient by the class interval (i), that is, the width of each class
      8) Add the product to the assumed mean
      Formula: $$\bar{X} = AM + i\left(\frac{\sum FD}{N}\right)$$
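The two procedures for the mean of grouped data can be sketched as below. The frequency distribution is hypothetical, and the representation of each class as a `(lower limit, upper limit, frequency)` tuple, ordered from the lowest class and with equal widths, is an assumption made for illustration.

```python
def class_midpoint(lower, upper):
    """Midpoint (M) of a class."""
    return (lower + upper) / 2

def mean_midpoint_method(classes):
    """Mean = sum(M * F) / N."""
    n = sum(f for _, _, f in classes)
    sum_mf = sum(class_midpoint(lo, hi) * f for lo, hi, f in classes)
    return sum_mf / n

def mean_class_deviation_method(classes, origin_index):
    """Mean = AM + i * (sum(F * D) / N), with the origin at classes[origin_index]."""
    n = sum(f for _, _, f in classes)
    lo, hi, _ = classes[origin_index]
    assumed_mean = class_midpoint(lo, hi)          # AM
    i = classes[1][0] - classes[0][0]              # class interval (width)
    # Deviation D is 0 at the origin, +1, +2, ... above it and -1, -2, ... below it.
    sum_fd = sum(f * (idx - origin_index) for idx, (_, _, f) in enumerate(classes))
    return assumed_mean + i * (sum_fd / n)

# Hypothetical frequency distribution: (lower limit, upper limit, frequency)
classes = [(10, 14, 2), (15, 19, 5), (20, 24, 8), (25, 29, 4), (30, 34, 1)]
print(mean_midpoint_method(classes))
print(mean_class_deviation_method(classes, origin_index=2))
```

Both methods should give the same mean whichever class is chosen as the origin; the class deviation method only keeps the hand arithmetic smaller.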
B. Median
1. Median of Ungrouped Data
There are several ways of computing the median of ungrouped data. The process depends on the case.
Case 1: The total number of cases is an odd number
Procedure:
1.) Arrange the scores from highest to lowest or vice versa
2.) Get the middlemost score. This score is the median
Case 2: The total number of cases is an even number
Procedure:
1.) Arrange the scores from highest to lowest or vice versa
2.) Get the two middlemost scores
3.) Compute the average of the two middlemost scores. The average is the median
Case 3: The middlemost score occurs twice, thrice, or more times
Procedure:
1.) Get the middlemost score/s, its/their identical score/s and its/their counterparts either above or below the middlemost score/s
2.) Compute their average. The average score is the median

2. Median of Grouped Data
Procedure:
1.) Add up or accumulate the frequencies starting from the lowest to the highest class. Call this the cumulative frequency (CF)
2.) Find one half of the number of cases in the distribution (N/2)
3.) Find the cumulative frequency that is equal to, or closest to but higher than, half of the number of cases. The class containing this frequency is the median class
4.) Find the lowest limit (LL) of the median class
5.) Get the cumulative frequency of the class below the median class (CFb)
6.) Subtract this from half of the number of cases in the distribution (N/2 – CFb)
7.) Get the frequency of the median class (FMdn)
8.) Find the class interval (i), then follow the formula below
Formula: $$\tilde{X} = LL + i\left(\frac{\frac{N}{2} - CF_b}{F_{Mdn}}\right)$$
Where:
LL = lowest limit of the median class
i = class interval
N/2 = half of the number of cases
CFb = cumulative frequency below the median class
FMdn = frequency of the median class

C. Mode
Procedure:
1. Mode of Ungrouped Data
• Get the most frequent score
• When there are three or more modes, the distribution is called polymodal or multimodal
• When there is no mode, it is described as a rectangular distribution
2. Mode of Grouped Data
a. Crude Mode – refers to the midpoint of the class with the highest frequency
Procedure:
1.) Find the class with the highest frequency
2.) Get the midpoint of that class
3.) The midpoint of the class with the highest frequency is the crude mode
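The grouped-data median and the crude mode can be sketched as follows. The data are hypothetical, and two assumptions are made that the source does not state: scores are whole numbers, so the lowest limit (LL) of a class is taken as its stated lower limit minus 0.5, and all classes have the same width.

```python
def median_grouped(classes, gap=1):
    """Median = LL + i * ((N/2 - CFb) / FMdn) for (lower, upper, frequency) classes."""
    n = sum(f for _, _, f in classes)
    half = n / 2
    i = classes[1][0] - classes[0][0]     # class interval (width)
    cf_below = 0                          # CFb: cumulative frequency below the class
    for lower, _, f in classes:
        # The median class is the first whose cumulative frequency reaches N/2.
        if f > 0 and cf_below + f >= half:
            ll = lower - gap / 2          # assumed exact lower limit of the class
            return ll + i * (half - cf_below) / f
        cf_below += f
    raise ValueError("the distribution has no scores")

def crude_mode(classes):
    """Midpoint of the class with the highest frequency."""
    lower, upper, _ = max(classes, key=lambda c: c[2])
    return (lower + upper) / 2

# Hypothetical frequency distribution, lowest class first
classes = [(10, 14, 2), (15, 19, 5), (20, 24, 8), (25, 29, 4), (30, 34, 1)]
print(median_grouped(classes))
print(crude_mode(classes))
```

If two classes tie for the highest frequency, `max` returns the first one, so a bimodal distribution would need separate handling.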
b. Refined Mode – refers to the mode obtained from an ordered arrangement or a class frequency distribution
Procedure:
1.) Get the mean and the median of the grouped data
2.) Multiply the median by three (3Mdn)
3.) Multiply the mean by two (2Mn)
4.) Subtract 2Mn from 3Mdn to get the mode (Md)
Formula: $$\hat{X} = 3\,Mdn - 2\,Mn$$

How will you interpret the Measures of Central Tendency?
1.) The value that represents a set of data will be the basis in determining whether the group is performing better or poorer than the other groups.

II. Procedure in the Computation of the Measures of Variability

A. Range (R)
1. For Ungrouped Data – the difference between the highest and lowest scores
2. For Grouped Data – the difference between the upper limit of the highest class and the lower limit of the lowest class

B. Standard Deviation (SD)
Procedure for Ungrouped Data
1.) Find the mean ($\bar{X}$)
2.) Subtract the mean from each score to get the deviation ($d = X - \bar{X}$)
3.) Square each deviation ($d^2$)
4.) Get the sum of the squared deviations ($\sum d^2$)
5.) Divide the sum by the number of cases minus one ($\sum d^2 / (N - 1)$)
6.) Get the square root of the answer
Formula: $$SD = \sqrt{\frac{\sum d^2}{N - 1}}$$

Procedure for Grouped Data
A. Using the Class Deviation Method
1.) As in the computation of the mean, get the deviation (d) of each class and the product of the frequency and the deviation (fd)
2.) Multiply the product of the frequency and the deviation by the deviation ($fd^2$)
3.) Get the sum of the products of the frequency and the squared deviation ($\sum fd^2$)
4.) Compute the standard deviation using the formula below
Formula: $$SD = i\,\sqrt{\frac{\sum fd^2}{N} - \frac{\left(\sum fd\right)^2}{N^2}}$$
Where:
i = class interval
N = number of cases
$\sum fd$ = sum of the products of the frequency and the deviation
$\sum fd^2$ = sum of the products of the frequency and the squared deviation

B. Using the Midpoint Method
1.) Multiply each midpoint by its frequency (FM), then multiply that product by the midpoint again. The result is the frequency times the squared midpoint ($FM^2$)
2.) Write these products in another column and label it $FM^2$
3.) Use the formula below to compute the standard deviation
Formula: $$SD = \sqrt{\frac{\sum FM^2}{N} - \left(\bar{X}\right)^2}$$
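The refined mode and the three standard deviation procedures can be sketched as below. Function names and data are hypothetical; the classes are assumed to be `(lower limit, upper limit, frequency)` tuples of equal width, ordered from the lowest class. As in the formulas above, the ungrouped version divides by N − 1 while the grouped versions divide by N.

```python
from math import sqrt

def refined_mode(mean, median):
    """Mode = 3 * Mdn - 2 * Mn."""
    return 3 * median - 2 * mean

def sd_ungrouped(scores):
    """SD = sqrt(sum(d^2) / (N - 1)), where d = X - mean."""
    n = len(scores)
    mean = sum(scores) / n
    return sqrt(sum((x - mean) ** 2 for x in scores) / (n - 1))

def sd_class_deviation(classes, origin_index):
    """SD = i * sqrt(sum(f d^2)/N - (sum(f d))^2 / N^2)."""
    n = sum(f for _, _, f in classes)
    i = classes[1][0] - classes[0][0]     # class interval (width)
    sum_fd = sum(f * (k - origin_index) for k, (_, _, f) in enumerate(classes))
    sum_fd2 = sum(f * (k - origin_index) ** 2 for k, (_, _, f) in enumerate(classes))
    return i * sqrt(sum_fd2 / n - sum_fd ** 2 / n ** 2)

def sd_midpoint(classes):
    """SD = sqrt(sum(F M^2)/N - mean^2), using class midpoints M."""
    n = sum(f for _, _, f in classes)
    mids = [((lo + hi) / 2, f) for lo, hi, f in classes]
    mean = sum(m * f for m, f in mids) / n
    return sqrt(sum(f * m * m for m, f in mids) / n - mean ** 2)

# Hypothetical data for illustration only
scores = [2, 4, 4, 4, 5, 5, 7, 9]
classes = [(10, 14, 2), (15, 19, 5), (20, 24, 8), (25, 29, 4), (30, 34, 1)]
print(sd_ungrouped(scores))
print(sd_class_deviation(classes, origin_index=2))
print(sd_midpoint(classes))
print(refined_mode(mean=21.25, median=21.375))
```

The two grouped-data methods are algebraically equivalent, so they should agree up to rounding whichever class is used as the origin.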
How will you interpret the standard deviation?
1.) The results will help you determine if the group is homogeneous or not.
2.) The results will also help you determine the number of students that fall below and above the average performance. Study how to do this:
• Mean – 1 SD and Mean + 1 SD give the limits of average ability
• The point right below –1 SD is the upper limit of below-average ability
• The point right above +1 SD is the lower limit of above-average ability

C. Quartile Deviation (QD)
1. Procedure in the Computation of QD for Ungrouped Data
1.) Arrange the scores in descending or ascending order
2.) Compute ¼(N). The result tells the rank of the Q1 score in the ordered arrangement, counting from the bottom
3.) Look for the score at this rank. This is Q1
4.) Compute ¾(N). The result tells the rank of the Q3 score
5.) Look for the score at this rank. This is Q3
6.) Compute the QD
$$QD = \frac{Q_3 - Q_1}{2}$$

2. Procedure in the Computation of QD for Grouped Data
1.) Compute the value of the 1st quartile
$$Q_1 = LL + \left(\frac{\frac{N}{4} - CF_b}{F_q}\right) i$$
2.) Compute the value of the 3rd quartile
$$Q_3 = LL + \left(\frac{\frac{3N}{4} - CF_b}{F_q}\right) i$$
3.) Compute the quartile deviation, which is half of the interquartile range
$$QD = \frac{Q_3 - Q_1}{2}$$
Where:
Q1 – stands for the 1st quartile
LL – lowest limit of the quartile class
N/4 – one-fourth of the total number of cases
CFb – cumulative frequency below the quartile class
Fq – frequency of the class where the quartile score falls
i – class interval

How will you interpret the quartile deviation?
The results will also tell if the group is homogeneous or not. They will also tell how many of the students fall below or above the region of acceptable performance. To do this, study the instructions below.
• Median – 1 QD and Median + 1 QD give the limits of average ability
• The point right below –1 QD is the upper limit of below-average ability
• The point right above +1 QD is the lower limit of above-average ability

STANDARD SCORES
• Indicate the pupil's relative position by showing how far his raw score is above or below the average
• Express the pupil's performance in terms of standard units from the mean
• Represented by the normal probability curve, or what is commonly called the normal curve
• Used to have a common unit to compare raw scores from different tests

1. PERCENTILE
• tells the percentage of examinees that lie below one's score
Formula: $$P_a = LL + i\left[\frac{aN - CF_b}{F_{P_a}}\right]$$
Where:
LL – lowest limit of the class of a% of N
CFb – cumulative frequency below the class of a% of N
FPa – frequency of the class of a% of N
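Because Q1 and Q3 are the 25th and 75th percentile points, one helper built on the percentile formula can serve the grouped-data quartile deviation as well. The sketch below uses hypothetical data and names, takes `a` as a proportion (0.25 for 25%), and assumes whole-number scores, so the lowest limit of a class is its stated lower limit minus 0.5.

```python
def percentile_point(classes, a, gap=1):
    """P_a = LL + i * ((a*N - CFb) / F) for (lower, upper, frequency) classes."""
    n = sum(f for _, _, f in classes)
    target = a * n                        # a% of N, with a given as a proportion
    i = classes[1][0] - classes[0][0]     # class interval (width)
    cf_below = 0                          # CFb
    for lower, _, f in classes:
        # The percentile class is the first whose cumulative frequency reaches a*N.
        if f > 0 and cf_below + f >= target:
            ll = lower - gap / 2          # assumed exact lower limit of the class
            return ll + i * (target - cf_below) / f
        cf_below += f
    raise ValueError("the distribution has no scores")

def quartile_deviation(classes):
    """QD = (Q3 - Q1) / 2, with Q1 at N/4 and Q3 at 3N/4."""
    q1 = percentile_point(classes, 0.25)
    q3 = percentile_point(classes, 0.75)
    return (q3 - q1) / 2

# Hypothetical frequency distribution, lowest class first
classes = [(10, 14, 2), (15, 19, 5), (20, 24, 8), (25, 29, 4), (30, 34, 1)]
print(percentile_point(classes, 0.25))   # Q1
print(percentile_point(classes, 0.75))   # Q3
print(quartile_deviation(classes))
```

Calling `percentile_point(classes, 0.5)` gives the grouped-data median, since the median is the 50th percentile point.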
2. Z-SCORES
• tell the number of standard deviations equivalent to a given raw score
Formula: $$Z = \frac{X - \bar{X}}{SD}$$
Note:
The z-score is negative when $X < \bar{X}$
The z-score is positive when $X > \bar{X}$

3. T-SCORES
• refer to any set of normally distributed standard scores that has a mean of 50 and a standard deviation of 10
• computed after converting raw scores to z-scores, to get rid of negative values
Formula: $$T = 50 + 10(Z)$$

ASSIGNING GRADES/MARKS/RATINGS
A. Marking/Grading – is the process of assigning value to a performance
B. Marks/Grades/Ratings are symbols which:

Could be in –
• Percent, such as: 70%, 75%, 80%, etc.
• Letters, such as: A, B, C, D, or F
• Numbers, such as: 1, 2, 3, 4, or 5
• Descriptive expressions, such as: Outstanding (O), Very Satisfactory (VS), Satisfactory (S), Moderately Satisfactory (MS), Needs Improvement (NI), etc.
[Note: Any symbol can be used provided that it has a uniform meaning to all concerned]

Could represent –
• How a student is performing in relation to other students (Norm-Referenced Grading)
• The extent to which a student has mastered a particular body of knowledge (Criterion-Referenced Grading)
• How a student is performing in relation to a teacher's judgment of his or her potential (Grading in Relation to Teacher's Judgment)

Could be for –
• Certification, which gives assurance that a student has mastered a specific content or achieved a certain level of accomplishment
• Selection, which provides a basis for identifying or grouping students for certain educational paths or programs
• Direction, which provides information for diagnosis and planning
• Motivation, which emphasizes specific material or skills to be learned and helps students to understand and improve their performance

Could be based on –
• Examination results or test data
• Observations of student work
• Group evaluation activities
• Class discussions and recitations
• Homework
• Notebooks and note taking
• Reports, themes and research papers
• Discussions and debates
• Portfolios
• Projects
• Attitudes, etc.

Could be assigned by –
• Criterion-referenced grading, or grading based on fixed or absolute standards, where the grade is assigned based on how well a student has met the criteria or the well-defined objectives of a course that were spelled out in advance. It is then up to the student to earn the grade he or she wants to receive, regardless of how other students in the class have performed. This is done by transmuting test scores into marks or ratings.
• Norm-referenced grading, or grading based on relative standards, where a student's grade reflects his or her level of achievement relative to the performance of other students in the class. In this system the grade is assigned based on the average of test scores. The rating scales that are used in assigning grades are:
  1.) The four-point rating scale, which uses the median and quartile deviation of the test scores to group the scores into four, and each group is assigned the corresponding grade of A, B, C, or D, or 1, 2, 3, or 4.
  2.) The five-point rating scale, which uses the median and quartile deviation of the test scores to group the scores into five, and each group is assigned the corresponding grade of A, B, C, D, or F, or 1, 2, 3, 4, or 5.
• Point or Percentage Grading System, whereby the teacher identifies points or percentages for various tests and class activities depending on their importance. The total of these points will be the basis for the grade assigned to the student.
• Contract Grading System, where each student agrees to work for a particular grade according to agreed-upon standards.

Guidelines in Grading Students
1.) Explain your grading system to the students early in the course and remind them of the grading policies regularly.
2.) Base grades on a predetermined and reasonable set of standards.
3.) Base your grades on as much objective evidence as possible.
4.) Base grades on the student's attitude as well as achievement, especially at the elementary and high school level.
5.) Base grades on the student's relative standing compared to classmates.
6.) Base grades on a variety of sources.
7.) As a rule, do not change grades.
8.) Become familiar with the grading policy of your school and with your colleagues' standards.
9.) When failing a student, closely follow school procedures.
10.) Record grades on report cards and cumulative records.
11.) Guard against bias in grading.
12.) Keep pupils informed of their standing in the class.
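When marks from different tests feed into a grade, the z-score and T-score formulas from the standard scores section give a common unit for comparison. A minimal sketch, with hypothetical function names and made-up raw scores, means and standard deviations:

```python
def z_score(raw, mean, sd):
    """Z = (X - mean) / SD."""
    return (raw - mean) / sd

def t_score(z):
    """T = 50 + 10 * Z."""
    return 50 + 10 * z

# Hypothetical pupil: 85 on a test with mean 70 and SD 10,
# and 60 on another test with mean 70 and SD 10.
for raw in (85, 60):
    z = z_score(raw, mean=70, sd=10)
    print(raw, z, t_score(z))
```

A raw score below the mean gives a negative z-score but still a positive T-score, which is the reason the T-score conversion is applied.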