Assessment in education at Secondary level
1. Functions of grading
- This book is about designing classroom grading systems that are both precise and efficient.
One of the first steps to this end is to clarify the basic purpose of grades. How a school or
district defines the purpose of grades dictates much of the form and function of grades.
- Measurement experts such as Peter Airasian (1994) explain that educators use grades
primarily (1) for administrative purposes, (2) to give students feedback about their progress
and achievement, (3) to provide guidance to students about future course work, (4) to
provide guidance to teachers for instructional planning, and (5) to motivate students.
Administrative Purposes
For at least several decades, grades have served a variety of administrative functions (Wrinkle,
1947), most dealing with district-level decisions about students, including
• Student matriculation and retention.
• Placement when students transfer from one school to another.
• Student entrance into college.
Research indicates that some districts explicitly make note of the administrative function of
grades. For example, in a study of school board manuals, district guidelines, and handbooks for
teaching, researchers Susan Austin and Richard McCann (1992) found the explicit mention of
administration as a basic purpose for grades in 7 percent of school board documents, 10 percent
of district guidelines, and 4 percent of handbooks for teachers. Finally, in a survey conducted by
The College Board (1998), over 81 percent of the schools reported using grades for
administrative purposes.
Feedback About Student Achievement
One of the more obvious purposes for grades is to provide feedback about student achievement.
Studies have consistently shown support for this purpose. For example, in 1976, Simon and
Bellanca reported that both educators and noneducators perceived providing information about
student achievement as the primary purpose of grading. In a 1989 study of high school teachers,
Stiggins, Frisbie, and Griswold reported that this grading function—which they refer to as the
information function—was highly valued by teachers. Finally, the study by Austin and McCann
(1992) found that 25 percent of school board documents, 45 percent of district documents, and
65 percent of teacher documents mentioned reporting student achievement as a basic purpose of
grades.
Guidance
When used for guidance purposes, grades help counselors provide direction for students
(Wrinkle, 1947; Terwilliger, 1971). Specifically, counselors use grades to recommend to
individual students courses they should or should not take and schools and occupations they
might consider (Airasian, 1994). Austin and McCann (1992) found that 82 percent of school
board documents, 40 percent of district documents, and 38 percent of teacher documents
identified guidance as an important purpose of grades.
Instructional Planning
Teachers also use grades to make initial decisions about student strengths and weaknesses in
order to group them for instruction. Grading as a tool for instructional planning is not commonly
mentioned by measurement experts. However, the Austin and McCann (1992) study reported
that 44 percent of school board documents, 20 percent of district documents, and 10 percent of
teacher documents emphasized this purpose.
Motivation
Those who advocate using grades to motivate students assume that they encourage students to try
harder both from negative and positive perspectives. On the negative side, receiving a low grade
is believed to motivate students to try harder. On the positive side, it is assumed that receiving a
high grade will motivate students to continue or renew their efforts.
As discussed later in this chapter, some educators object strongly to using grades as motivators.
Rightly or wrongly, however, this purpose is manifested in some U.S. schools. For example,
Austin and McCann (1992) found that 7 percent of school board documents, 15 percent of
district-level documents, and 10 percent of teacher documents emphasized motivation as a
purpose for grades.
Types of Grading: Definition and Historical Background
1.Percentage grading
Using a percentage scale (percent of 100), usually based on percent correct on exams and/or
percent of points earned on assignments
• Most common method in use in high schools and colleges c. 1890–1910.
• Used today as a grading method or as a way of arriving at letter grades (see the sketch after this list).
2.Letter grading and variations
Using a series of letters (often A, B, C, D, F) or letters with plusses and minuses as an ordered
category scale - can be done in a norm-referenced or criterion-referenced (standards-based) manner
• Yale used a four-category system in 1813.
• In the 1920s letter grading was seen as the solution to the problem of the reliability of
percentage grading (fewer categories) and was increasingly adopted.
3.Norm-referenced grading
Comparing students to each other; using class standing as the basis for assigning grades (usually
letter grades)
• Was advocated in the early 1900s as scientific measurement.
• Educational disadvantages were known by the 1930s.
4.Mastery grading
Grading students as “masters” or “passers” when their attainment reaches a prespecified level,
usually allowing different amounts of time for different students to reach mastery
• Originating in the 1920s (e.g., Morrison, 1926) as a grading strategy, it became associated with
the educational strategy of mastery learning (Bloom, Hastings, & Madaus, 1971).
5.Pass/Fail
Using a scale with two levels (pass and fail), sometimes in connection with mastery grading
• In 1851, the University of Michigan experimented with pass/fail grading for classes.
6. Standards (or absolute-standards) grading
Originally, comparing student performance to a preestablished standard (level) of performance;
currently, standards grading sometimes means grading with reference to a list of state or district
content standards according to preestablished performance levels
• Grading according to standards of performance has been championed since the 1930s as more
educationally sound than norm-referenced grading.
• Current advocates of standards grading use the same principle, but the term "standard" is now
used for the criterion itself, not the level of performance.
• Since 2002, the scales on some standards-based report cards use the state accountability
(proficiency) reporting categories instead of letters.
7.Narrative grading
Writing comments about students’ achievement, either in addition to or instead of using numbers
or letters
• Using a normal instructional practice (describing students’ work) in an assessment context.
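To make the link between percentage grading and letter grading concrete, the short Python sketch below converts percentage scores into letter grades. The cut-offs of 90/80/70/60 are assumptions chosen purely for illustration; actual boundaries vary by school and district.

```python
# Minimal sketch: deriving letter grades from percentage scores.
# The cut-offs below (90/80/70/60) are illustrative assumptions only;
# actual boundaries vary by school and district.

CUTOFFS = [(90, "A"), (80, "B"), (70, "C"), (60, "D")]

def percent_to_letter(percent: float) -> str:
    """Map a percentage (0-100) to a letter grade."""
    for floor, letter in CUTOFFS:
        if percent >= floor:
            return letter
    return "F"

if __name__ == "__main__":
    scores = {"exam": 42 / 50 * 100, "assignments": 88.0}  # percent correct / percent of points
    overall = sum(scores.values()) / len(scores)           # simple unweighted average
    print(f"overall = {overall:.1f}% -> grade {percent_to_letter(overall)}")
```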
Progress Report
A written document that explains how much progress is being made on something you have
previously planned.
These reports will be staggered throughout the year and we have done our best to ensure that
key points in a student’s development, such as making option choices, are supported with an
appropriate report.
We do not provide an end of year report for all students as it would not be possible for teachers
to write reports on every student they teach at one time. Furthermore, we do not believe that a
summative* end of year report is as valuable for a student's development as providing a
formative** report that can give them advice on how to improve and crucially, time to work on
those developments.
All reports will provide information on the following:
• Attendance
• A record of the number of positive events
• A record of the number of negative events
• An end of year attainment estimate
• A teacher assessment of current attainment
• A teacher assessment of Learner Characteristics
Further to these points, once a year teachers will write a brief statement about strengths and areas
for development.
The learner characteristics we will grade for each report are:
• Attitude to Learning
• Communication Skills
• Homework Quality
• Personal Organisation
• Presentation of Work
Each of these is assessed on a scale from one to five, with one being ‘unacceptable’ and five
being ‘exceptional’.
*A summative report is given at the end of a period of study. It would state how well the student
has done but would not give advice on how to improve. If a comment on how to improve were
given, the student would not have the opportunity to work on this development.
** A formative report is given during a period of study. It would state how well a student is
doing and would give advice on how to make further progress. The student then has further time
to work on the advice given in the report.
Perspectives on assessment
Assessment is at the centre of the student's experience. It provides a means of evaluating student
progress and achievement; it drives the activity of the student and therefore their learning. This
collection of short presentations is intended to provoke debate about assessment. Over the last
few years, those involved in developing assessment have generated some new perspectives
which, as yet, have not been fully incorporated within mainstream practice. There has been a gap
between the emerging understandings of 'reflective practitioners' and educational developers and
those who are setting assessment policy and defining practice. We would like to close that gap.
In order to do so we have set out some 'challenging perspectives' in short talks. They are
intended to be contentious but well grounded. Each Web talk is an introduction to an idea that we
hope you will pursue using the references provided. The talks may be used by individuals or
serve as a catalyst for a group discussion, for example in a workshop. Please feel free to
comment. We haven't covered all the ground - far from it - and hope that others might add to
this collection.
Purposes of assessment
Teaching and learning
The primary purpose of assessment is to improve students’ learning and teachers’ teaching as
both respond to the information it provides. Assessment for learning is an ongoing process that
arises out of the interaction between teaching and learning.
What makes assessment for learning effective is how well the information is used.
System improvement
Assessment can do more than simply diagnose and identify students’ learning needs; it can be
used to assist improvements across the education system in a cycle of continuous improvement:
• Students and teachers can use the information gained from assessment to determine their next
teaching and learning steps.
• Parents, families and whānau can be kept informed of next plans for teaching and learning and
the progress being made, so they can play an active role in their children’s learning.
• School leaders can use the information for school-wide planning, to support their teachers and
determine professional development needs.
• Communities and Boards of Trustees can use assessment information to assist their
governance role and their decisions about staffing and resourcing.
• The Education Review Office can use assessment information to inform their advice for
school improvement.
• The Ministry of Education can use assessment information to undertake policy review and
development at a national level, so that government funding and policy intervention is
targeted appropriately to support improved student outcomes.
1. Assessment for Learning (Formative)
The purpose of Formative Assessment is to provide students with feedback on how they are
going. The aim is to help students improve their performance and make their next piece of
assessed work better. It is developmental or formative in nature; hence the term "Formative
Assessment".
The feedback students receive is the key component of formative assessment. Feedback is
intended to help them identify weaknesses and build on strengths to improve the quality of their
next piece of assessment. The focus is on comments for improvement, not marks, and the
awarding of marks in formative assessment can actually be counterproductive.
2. Assessment for Certification (Summative)
Another key purpose of assessment is to gather evidence to make a judgment about a student's
level of performance against the specified learning objectives.
Students are usually assessed at the end of an element of learning, such as the end of a module,
mid semester or end of semester. They are awarded results typically as marks or grades to
represent a particular level of achievement (high, medium, low). This judgmental "summative"
process formally provides the evidence to verify or "certify" which students may progress to the
next level of their studies.
3. Protect Academic Standards
Grades from cumulative assessments are used to certify that a person has the necessary
knowledge and skills (and can apply them appropriately) to be awarded a qualification.
Consequently, the quality and integrity of assessment is essential to guarantee the credibility of
qualifications and the academic reputation of the issuing Institution. There is considerable local,
national and international concern to ensure that the ways we protect academic standards stand
up to scrutiny.
4. Feedback for Teaching
The results from both formative and summative assessments can help you track how your
students are going throughout your courses. Closely looking at the results can help you identify
any patterns of difficulties or misunderstandings students might have. This in turn allows you to
alter your approach to teaching and adjust your curriculum accordingly. For example, you may
identify that you need to offer more detailed explanations or provide additional resources in a
particular area.
Continuous and comprehensive evaluation
Concept and Importance
Continuous and comprehensive evaluation was a process of assessment, mandated by the Right to
Education Act of India. This approach to assessment was introduced by state governments in
India, as well as by the Central Board of Secondary Education in India, for students of the sixth to
tenth grades and, in some schools, the twelfth grade. The Karnataka government introduced CCE for
grades 1 through 9; later it was also introduced for 12th grade students. The main aim of CCE is
to evaluate every aspect of the child during their presence at the school. This is believed to help
reduce the pressure on the child during/before examinations, as the student will sit for
multiple tests throughout the year, and no test or the syllabus covered will be repeated at the
end of the year. The CCE method is claimed to bring enormous changes from the
traditional chalk and talk method of teaching, provided it is implemented accurately. In 2017, the
CCE system was cancelled for students appearing in the Class 10 Board Exam for 2017-18,
bringing back compulsory Annual Board Exam and removing the Formative and Summative
Assessments under the Remodeled Assessment Pattern.[1]
As a part of this new system, students' marks will be replaced by grades which will be evaluated
through a series of curricular and extra-curricular evaluations along with academics. The aim is
to decrease the workload on the student by means of continuous evaluation, taking a number of
small tests throughout the year in place of a single test at the end of the academic program. Only
grades are awarded to students, based on work experience skills, dexterity, innovation,
steadiness, teamwork, public speaking, behavior, etc., to evaluate and present an overall measure
of the student's ability. This helps students who are not strong in academics to show their
talent in other fields such as arts, humanities, sports, music, and athletics, and also helps to motivate
students who have a thirst for knowledge.
Unlike CBSE's old pattern of only one test at the end of the academic year, the CCE conducts
several. There are two different types of tests, namely the formative and the summative.
Formative tests will comprise the student's work in class and at home, the student's performance in
oral tests and quizzes and the quality of the projects or assignments submitted by the child.
Formative tests will be conducted four times in an academic session, and they will carry a 40%
weightage for the aggregate. In some schools, an additional written test is conducted instead of
multiple oral tests. However, at least one oral test is conducted.
The summative assessment is a three-hour long written test conducted twice a year. The first
summative or Summative Assessment 1 (SA-1) will be conducted after the first two formatives
are completed. The second (SA-2) will be conducted after the next two formatives. Each
summative will carry a 30% weightage and both together will carry a 60% weightage for the
aggregate. The summative assessment will be conducted by the schools themselves. However, the
question papers will be partially prepared by the CBSE and evaluation of the answer sheets is
also strictly monitored by the CBSE. Once completed, the syllabus of one summative will not be
repeated in the next. A student will have to concentrate on totally new topics for the next
summative.
At the end of the year, the CBSE processes the result by adding the formative score to the
summative score, i.e. 40% + 60% = 100%. Depending upon the percentage obtained, the board
will deduce the CGPA (Cumulative Grade Point Average) and thereby determine the grade
obtained. In addition to the summative assessment, the board will offer an optional online
aptitude test that may also be used as a tool along with the grades obtained in the CCE to help
students decide the choice of subjects in further studies. The board has also instructed the
schools to prepare the report card, which will be duly signed by the principal and the student.
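The weighting arithmetic described above can be illustrated with a short sketch. The 40%/60% split follows the text; the grade bands and the particular scores used here are assumptions for illustration and should not be read as the official CBSE tables.

```python
# Hedged sketch of the CCE aggregation described above.
# Four formative assessments together carry 40% and two summative
# assessments carry 30% each (60% together). The grade bands below
# are assumptions for illustration, not the official CBSE scheme.

def cce_aggregate(formative_pcts, summative_pcts):
    """formative_pcts: four percentages (FA1-FA4); summative_pcts: two percentages (SA1, SA2)."""
    assert len(formative_pcts) == 4 and len(summative_pcts) == 2
    formative_part = sum(formative_pcts) / 4 * 0.40   # formative block -> 40%
    summative_part = sum(summative_pcts) / 2 * 0.60   # summative block -> 60%
    return formative_part + summative_part            # aggregate out of 100

def to_grade(aggregate):
    bands = [(91, "A1"), (81, "A2"), (71, "B1"), (61, "B2"), (51, "C1")]
    for floor, grade in bands:                        # illustrative bands only
        if aggregate >= floor:
            return grade
    return "C2 or below"

if __name__ == "__main__":
    agg = cce_aggregate([80, 75, 90, 85], [70, 78])
    print(f"aggregate = {agg:.1f}%, grade = {to_grade(agg)}")  # 77.4%, B1 under these assumptions
```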
• Deductive Method - What does the student know, and how can he use it to explain a situation?
• Correlation with a real-life situation - Whether the situation given matches any real-life
situation, like tsunamis, floods, tropical cyclones, etc.
• Usage of Information Technology - Can the problem be solved with the use of IT? If yes,
how?
In addition to that, various assignments can be given such as projects, models and charts, group
work, worksheet, survey, seminar, etc. The teacher will also play a major role. For example, they
give remedial help, maintain a term-wise record and checklists, etc.
Assessment for learning
Assessment for Learning is the process of seeking and interpreting evidence for use by learners
and their teachers to decide where the learners are in their learning, where they need to go and
how best to get there.
Assessment for learning is best described as a process by which assessment information is used
by teachers to adjust their teaching strategies, and by students to adjust their learning strategies.
Assessment, teaching, and learning are inextricably linked, as each informs the others.
Assessment is a powerful process that can either optimise or inhibit learning, depending on how
it’s applied.
For teachers
Assessment for learning helps teachers gather information to:
• plan and modify teaching and learning programmes for individual students, groups of
students, and the class as a whole
• pinpoint students’ strengths so that both teachers and students can build on them
• identify students’ learning needs in a clear and constructive way so they can be addressed
• involve parents, families, and whānau in their children's learning.
For students
Assessment for learning provides students with information and guidance so they can plan and
manage the next steps in their learning.
Assessment for learning uses information to lead from what has been learned to what needs to be
learned next.
Describing assessment for learning
Assessment for learning should use a range of approaches. These may include:
• day-to-day activities, such as learning conversations
• a simple mental note taken by the teacher during observation
• student self and peer assessments
• a detailed analysis of a student’s work
• assessment tools, which may be written items, structured interview questions, or items
teachers make up themselves.
What matters most is not so much the form of the assessment, but how the information gathered
is used to improve teaching and learning.
Testing, Assessment, and Measurement: Definitions
The definitions for each are:
Test: A method to determine a student's ability to complete certain tasks or demonstrate
mastery of a skill or knowledge of content. Some types would be multiple choice tests, or a
weekly spelling test. While it is commonly used interchangeably with assessment, or even
evaluation, it can be distinguished by the fact that a test is one form of an assessment.
Assessment: The process of gathering information to monitor progress and make educational
decisions if necessary. As noted in the definition of test, an assessment may include a test, but
also includes methods such as observations, interviews, behavior monitoring, etc.
Measurement: Beyond its general definition, measurement refers to the set of procedures and the
principles for how to use the procedures in educational tests and assessments. Some of the basic
principles of measurement in educational evaluations would be raw scores, percentile ranks,
derived scores, standard scores, etc.
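As an illustration of those basic principles, the sketch below takes an invented set of raw class scores and derives a percentile rank and a standard (z) score for one student.

```python
# Illustrative sketch: turning a raw score into a percentile rank and a
# standard (z) score. The class scores are invented example data.
from statistics import mean, pstdev

class_scores = [34, 41, 45, 45, 50, 52, 55, 58, 60, 67]  # raw scores for the class
student_raw = 55

# Percentile rank: percentage of class scores falling below the student's score.
below = sum(1 for s in class_scores if s < student_raw)
percentile_rank = 100 * below / len(class_scores)

# Standard score: how many standard deviations the student sits above the class mean.
z = (student_raw - mean(class_scores)) / pstdev(class_scores)

print(f"raw = {student_raw}, percentile rank = {percentile_rank:.0f}, z = {z:.2f}")
```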
Assessment
In education, the term assessment refers to the wide variety of methods or tools that educators
use to evaluate, measure, and document the academic readiness, learning progress, skill
acquisition, or educational needs of students.
• Assessment involves the use of empirical data on student learning to refine programs and
improve student learning.
• Assessment is the process of gathering and discussing information from multiple and
diverse sources in order to develop a deep understanding of what students know,
understand, and can do with their knowledge as a result of their educational experiences;
the process culminates when assessment results are used to improve subsequent
learning. Assessment is the systematic basis for making inferences about the learning and
development of students. It is the process of defining, selecting, designing, collecting,
analyzing, interpreting, and using information to increase students’ learning and
development. Assessment is the systematic collection, review, and use of information about
educational programs undertaken for the purpose of improving student learning and
development.
Characteristics
• Learner-Centered
• The primary attention of teachers is focused on observing and improving learning.
• Teacher-Directed
• Individual teachers decide what to assess, how to assess, and how to respond to
the information gained through the assessment
• Teachers do not need to share results with anyone outside of the class.
• Mutually Beneficial
• Students are active participants.
• Students are motivated by the increased interest of faculty in their success as
learners.
• Teachers improve their teaching skills and gain new insights.
• Formative
• Assessments are almost never "graded".
• Assessments are almost always anonymous in the classroom and often
anonymous online.
• Assessments do not provide evidence for evaluating or grading students.
• Context-Specific
• Assessments respond to the particular needs and characteristics of the teachers,
students, and disciplines to which they are applied.
• Customize to meet the needs of your students and course.
• Ongoing
• Classroom assessment is a continuous process.
• Part of the process is creating and maintaining a classroom "feedback loop"
• Each classroom assessment event is of short duration.
• Rooted in Good Teaching Practice
• Classroom assessment builds on good practices by making feedback on students'
learning more systematic, more flexible, and more effective.
Test
• A test or examination (informally, exam or evaluation) is an assessment intended to
measure a test-taker's knowledge, skill, aptitude, physical fitness, or classification in
many other topics (e.g., beliefs).[1]
A test may be administered verbally, on paper, on
a computer, or in a confined area that requires a test taker to physically perform a set of
skills. Tests vary in style, rigor and requirements. For example, in a closed book test, a
test taker is often required to rely upon memory to respond to specific items whereas in
an open book test, a test taker may use one or more supplementary tools such as a
reference book or calculator when responding to an item. A test may be administered
formally or informally. An example of an informal test would be a reading test
administered by a parent to a child. An example of a formal test would be a final
examination administered by a teacher in a classroom or an I.Q. test administered by a
psychologist in a clinic. Formal testing often results in a grade or a test score.[2]
A test
score may be interpreted with regards to a norm or criterion, or occasionally both. The
norm may be established independently, or by statistical analysis of a large number of
participants. An exam is meant to test a child's knowledge of a subject or their willingness to
give time to studying that subject.
• A standardized test is any test that is administered and scored in a consistent manner to
ensure legal defensibility.[3]
Standardized tests are often used in education, professional
certification, psychology (e.g., MMPI), the military, and many other fields.
• A non-standardized test is usually flexible in scope and format, variable in difficulty and
significance. Since these tests are usually developed by individual instructors, the format
and difficulty of these tests may not be widely adopted or used by other instructors or
institutions. A non-standardized test may be used to determine the proficiency level of
students, to motivate students to study, and to provide feedback to students. In some
instances, a teacher may develop non-standardized tests that resemble standardized tests
in scope, format, and difficulty for the purpose of preparing their students for an
upcoming standardized test.[4]
Finally, the frequency and setting in which non-standardized tests are administered are highly
variable and are usually constrained by the duration of the class period. A class instructor may,
for example, administer a test on a
weekly basis or just twice a semester. Depending on the policy of the instructor or
institution, the duration of each test itself may last for only five minutes to an entire class
period.
• In contrast to non-standardized tests, standardized tests are widely used, fixed in terms
of scope, difficulty and format, and are usually significant in consequences. Standardized
tests are usually held on fixed dates as determined by the test developer, educational
institution, or governing body, which may or may not be administered by the instructor,
held within the classroom, or constrained by the classroom period. Although there is little
variability between different copies of the same type of standardized test
(e.g., SAT or GRE), there is variability between different types of standardized tests.
• Any test with important consequences for the individual test taker is referred to as a high-
stakes test.
• A test may be developed and administered by an instructor, a clinician, a governing body,
or a test provider. In some instances, the developer of the test may not be directly
responsible for its administration. For example, Educational Testing Service (ETS), a
nonprofit educational testing and assessment organization, develops standardized tests
such as the SAT but may not directly be involved in the administration or proctoring of
these tests. As with the development and administration of educational tests, the format
and level of difficulty of the tests themselves are highly variable and there is no general
consensus or invariable standard for test formats and difficulty. Often, the format and
difficulty of the test is dependent upon the educational philosophy of the instructor,
subject matter, class size, policy of the educational institution, and requirements of
accreditation or governing bodies. In general, tests developed and administered by
individual instructors are non-standardized whereas tests developed by testing
organizations are standardized.
Characteristics of Test
Reliable
Reliability refers to the accuracy of the obtained test score or to how close the obtained scores
for individuals are to what would be their “true” score, if we could ever know their true score.
Thus, reliability is the lack of measurement error, the less measurement error the better. The
reliability coefficient, similar to a correlation coefficient, is used as the indicator of the reliability
of a test. The reliability coefficient can range from 0 to 1, and the closer to 1 the better.
Generally, experts tend to look for a reliability coefficient in excess of .70. However, many tests
used in public safety screening are what is referred to as multi-dimensional. Interpreting the
meaning of a reliability coefficient for a knowledge test based on a variety of sources requires a
great deal of experience and even experts are often fooled or offer incorrect interpretations.
There are a number of types of reliability, but the type usually reported is internal consistency or
coefficient alpha. All things being equal, one should look for an assessment with strong evidence
of reliability, where information is offered on the degree of confidence you can have in the
reported test score.
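Because the passage singles out internal consistency (coefficient alpha) as the most commonly reported type of reliability, the following sketch computes Cronbach's alpha for a small invented item-score matrix and applies the .70 rule of thumb mentioned above. The data, and therefore the resulting coefficient, are illustrative only.

```python
# Sketch: Cronbach's alpha (coefficient alpha) for an item-score matrix.
# Rows are test takers, columns are items; the data are invented.
import numpy as np

def cronbach_alpha(scores: np.ndarray) -> float:
    """alpha = k/(k-1) * (1 - sum(item variances) / variance(total scores))."""
    k = scores.shape[1]                        # number of items
    item_vars = scores.var(axis=0, ddof=1)     # variance of each item across test takers
    total_var = scores.sum(axis=1).var(ddof=1) # variance of total scores
    return k / (k - 1) * (1 - item_vars.sum() / total_var)

if __name__ == "__main__":
    data = np.array([                          # 6 examinees x 4 items (invented)
        [1, 1, 1, 1],
        [1, 1, 1, 0],
        [1, 1, 0, 0],
        [1, 0, 1, 0],
        [0, 1, 0, 0],
        [0, 0, 0, 0],
    ], dtype=float)
    alpha = cronbach_alpha(data)
    verdict = "acceptable" if alpha >= 0.70 else "below the .70 rule of thumb"
    print(f"alpha = {alpha:.2f} ({verdict})")
```

With this toy data the coefficient falls short of .70, which is exactly the kind of result the text warns needs careful interpretation, particularly for multi-dimensional tests.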
Valid
Validity will be the topic of our third primer in the series. In the selection context, the term
“validity” refers to whether there is an expectation that scores on the test have a demonstrable
relationship to job performance, or other important job-related criteria. Validity may also be used
interchangeably with related terms such as “job related” or “business necessity.” For now, we
will state that there are a number of ways of evaluating validity including:
▪ Content
▪ Criterion-related
▪ Construct
▪ Transfer or transportability
▪ Validity generalization
A good test will offer extensive documentation of the validity of the test.
Practical
A good test should be practical. What defines or constitutes a practical test? Well, this would be
a balancing of a number of factors including:
▪ Length – a shorter test is generally preferred
▪ Time – a test that takes less time is generally preferred
▪ Low cost – speaks for itself
▪ Easy to administer
▪ Easy to score
▪ Differentiates between candidates – a test is of little value if all the applicants obtain the
same score
▪ Adequate test manual – provides a test manual offering adequate information and
documentation
▪ Professionalism – is produced by test developers possessing high levels of expertise
The issue of the practicality of a test is a subjective judgment, which will be impacted by the
constraints facing the public-sector jurisdiction. A test that may be practical for a large city with
10,000 applicants and a large budget, may not be practical for a small town with 10 applicants
and a miniscule testing budget.
Socially Sensitive
A consideration of the social implications and effects of the use of a test is critical in public
sector, especially for high stakes jobs such as public safety occupations. The public safety
assessment professional must be considerate of and responsive to multiple groups of stakeholders.
In addition, in evaluating a test, it is critical that attention be given to:
▪ Avoiding adverse Impact – Recent events have highlighted the importance of balance in
the demographics of safety force personnel. Adverse impact refers to differences in the
passing rates on exams between males and females, or minorities and majority group
members. Tests should be designed with an eye toward the minimization of adverse
impact.
▪ Universal Testing – The concept behind universal testing is that your exams should be able
to be taken by the most diverse set of applicants possible, including those with disabilities
and by those who speak other languages. Having a truly universal test is a difficult, if not
impossible, standard to meet. However, organizations should strive to ensure that testing
locations and environments are compatible with the needs of as wide a variety of
individuals as possible. In addition, organizations should have in place committees and
procedures for dealing with requests for accommodations.
Candidate Friendly
One of the biggest changes in testing over the past twenty years has been the increased attention
paid to the candidate experience. Thus, your tests should be designed to look professional and be
easy to administer. Furthermore, the candidate should see a clear connection between the exams
and the job. As the candidate completes the selection battery, you want the reaction to be “That
was a fair test, I had an opportunity to prove why I deserve the job, and this is the type of
organization where I would like to work.”
Measurement
Measurement is the assignment of a number to a characteristic of an object or event, which can
be compared with other objects or events.
The scope and application of a measurement is dependent on the context and discipline. In
the natural sciences and engineering, measurements do not apply to nominal properties of objects
or events, which is consistent with the guidelines of the International vocabulary of
metrology published by the International Bureau of Weights and Measures.
However, in other fields such as statistics as well as the social and behavioral sciences,
measurements can have multiple levels, which would include nominal, ordinal, interval, and ratio
scales.
Measurement is a cornerstone of trade, science, technology, and quantitative research in many
disciplines. Historically, many measurement systems existed for the varied fields of human
existence to facilitate comparisons in these fields. Often these were achieved by local agreements
between trading partners or collaborators. Since the 18th century, developments progressed
towards unifying, widely accepted standards that resulted in the modern International System of
Units (SI). This system reduces all physical measurements to a mathematical combination of
seven base units. The science of measurement is pursued in the field of metrology.
Characteristic # 1. In educational measurement there is no absolute zero point:
In educational measurement there is no absolute zero point. It is relative to some arbitrary
standard. For example, a student has secured '0' in a test of mathematics. It does not mean that
he has zero knowledge of mathematics, because he may secure 30 in another test which is
easier than the first one. As the zero point is not fixed, we cannot say that a student with a
score of '60' has double the knowledge of a student with a score of '30'.
Characteristic # 2. The units are not definite in educational measurement:
In educational measurement the units are not definite, so we may not obtain the same value for
every person, because tests vary in their content and difficulty level. Therefore, one individual
may perform differently on different tests, and different individuals may perform differently on
one test.
Characteristic # 3. It conveys a sense of infinity:
It means we cannot measure the whole of an attribute of an individual. Generally, the scores
obtained from a measurement are observed scores, which contain measurement error, so the
true score remains unknown.
Characteristic # 4. It is a process of assigning symbols:
Measurement is a process of assigning symbols to observations in some meaningful and
consistent manner. In measurement we generally compare with a certain standard unit or criterion
which has universal acceptability.
Characteristic # 5. It cannot be measured directly:
In the case of educational measurement we cannot measure an attribute directly. It is observed
through behaviour. For example, the reading ability of an individual can only be measured when
he is asked to read a written material.
Characteristic # 6. It is a means to an end but not an end itself:
The objective of educational measurement is not just to measure a particular attribute. Rather it is
done to evaluate to what extent different objectives have been achieved.
Principles of assessment
Reliability
If a particular assessment were totally reliable, assessors acting independently using the same
criteria and mark scheme would come to exactly the same judgment about a given piece of work.
In the interests of quality assurance, standards and fairness, whilst recognising that complete
objectivity is impossible to achieve, when it comes to summative assessment it is a goal worth
aiming for. To this end, what has been described as the 'connoisseur' approach to assessment
(like a wine-taster or tea-blender of many years' experience, not able to describe exactly what
they are looking for but 'knowing it when they find it') is no longer acceptable. Explicitness in
terms of learning outcomes and assessment criteria is vitally important in attempting to achieve
reliability. They should be explicit to the students when the task is set, and where there are
multiple markers they should be discussed, and preferably used on some sample cases prior to
being used 'for real'.
Validity
Just as important as reliability is the question of validity. Does the assessed task actually assess
what you want it to? Just because an exam question includes the instruction 'analyse and
evaluate' does not actually mean that the skills of analysis and evaluation are going to be
assessed. They may be, if the student is presented with a case study scenario and data they have
never seen before. But if they can answer perfectly adequately by regurgitating the notes they
took from the lecture you gave on the subject then little more may be being assessed than the
ability to memorise. There is an argument that all too often in British higher education we assess
the things which are easy to assess, which tend to be basic factual knowledge and comprehension
rather than the higher order objectives of analysis, synthesis and evaluation.
Relevance and transferability
There is much evidence that human beings do not find it easy to transfer skills from one context
to another, and there is in fact a debate as to whether transferability is in itself a separate skill
which needs to be taught and learnt. Whatever the outcome of that, the transfer of skills is
certainly more likely to be successful when the contexts in which they are developed and used
are similar. It is also true to say that academic assessment has traditionally been based on a fairly
narrow range of tasks with arguably an emphasis on knowing rather than doing; it has therefore
tended to develop a fairly narrow range of skills. For these two reasons, when devising an
assessment task it is important that it both addresses the skills you want the student to develop
and that as much as possible it puts them into a recognisable context with a sense of 'real
purpose' behind why the task would be undertaken and a sense of a 'real audience', beyond the
tutor, for whom the task would be done.
Criterion v Norm referenced assessment
In criterion-referenced assessment particular abilities, skills or behaviours are each specified as a
criterion which must be reached. The driving test is the classic example of a criterion-referenced
test. The examiner has a list of criteria each of which must be satisfactorily demonstrated in
order to pass - completing a three-point turn without hitting either kerb for example. The
important thing is that failure in one criterion cannot be compensated for by above average
performance in others; neither can you fail despite meeting every criterion simply because
everybody else that day surpassed the criteria and was better than you.
Norm-referenced assessment makes judgments on how well the individual did in relation to
others who took the test. Often used in conjunction with this is the curve of 'normal distribution'
which assumes that a few will do exceptionally well and a few will do badly and the majority
will peak in the middle as average. Despite the fact that a cohort may not fit this assumption for
any number of reasons (it may have been a poor intake, or a very good intake, they have been
taught well, or badly, or in introductory courses in particular you may have half who have done it
all before and half who are just starting the subject giving a bimodal distribution) there are even
some assessment systems which require results to be manipulated to fit.
The logic of a model of course design built on learning outcomes is that the assessment should
be criterion-referenced at least to the extent that sufficiently meeting each outcome becomes a
'threshold' minimum to passing the course. If grades and marks have to be generated, a more
complex system than pass/fail can be devised by defining the criteria for each grade either
holistically grade by grade, or grade by grade for each criterion (see below).
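The contrast drawn in this section can be expressed in a brief sketch: under criterion-referencing every criterion must be met and failure in one cannot be compensated for, while under norm-referencing the judgment depends on the candidate's standing in the cohort. The criteria, thresholds and cohort scores below are invented for illustration.

```python
# Sketch contrasting criterion-referenced and norm-referenced decisions.
# The criteria, cohort scores and cut points are invented for illustration.

def criterion_referenced_pass(results: dict) -> bool:
    """Pass only if every criterion is met; failure in one cannot be
    compensated for by performance in others (the driving-test model)."""
    return all(results.values())

def norm_referenced_band(score: float, cohort: list) -> str:
    """Band depends on standing relative to the cohort, not on a fixed standard."""
    rank = sum(1 for s in cohort if s < score) / len(cohort)   # proportion of cohort beaten
    return "top third" if rank >= 2 / 3 else "middle" if rank >= 1 / 3 else "bottom third"

if __name__ == "__main__":
    driving = {"three-point turn": True, "parallel parking": True, "emergency stop": False}
    print("criterion-referenced:", "pass" if criterion_referenced_pass(driving) else "fail")

    cohort = [48.0, 55.0, 61.0, 64.0, 70.0, 77.0, 83.0, 90.0]
    print("norm-referenced:", norm_referenced_band(70.0, cohort))
```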
Writing and using assessment criteria
Assessment criteria describe how well a student has to be able to achieve the learning outcome,
either in order to pass (in a simple pass/fail system) or in order to be awarded a particular grade;
essentially they describe standards. Most importantly they need to be more than a set of
headings. Use of theory, for example, is not on its own a criterion. Criteria about theory must
describe what aspects of the use of theory are being looked for. You may value any one of the
following: the students' ability to make an appropriate choice of theory to address a particular
problem, or to give an accurate summary of that theory as it applies to the problem, or to apply it
correctly, or imaginatively, or with originality, or to critique the theory, or to compare and
contrast it with other theories. And remember, as soon as you have more than one assessment
criterion you will also have to make decisions about their relative importance (or weighting).
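The weighting decision mentioned at the end of the paragraph above can be made explicit, as in the sketch below, which combines marks on several criteria using relative weights. The criteria and weights are arbitrary examples rather than a recommended scheme.

```python
# Sketch: combining several assessment criteria using relative weights.
# The criteria and weights are arbitrary examples, not a recommended scheme.

weights = {            # relative importance of each criterion (sums to 1.0)
    "choice of theory": 0.2,
    "accuracy of summary": 0.3,
    "application to the problem": 0.3,
    "critique / originality": 0.2,
}

def weighted_mark(marks: dict) -> float:
    """marks: each criterion scored out of 100; returns the weighted total."""
    return sum(weights[c] * marks[c] for c in weights)

if __name__ == "__main__":
    student = {
        "choice of theory": 70,
        "accuracy of summary": 65,
        "application to the problem": 80,
        "critique / originality": 55,
    }
    print(f"weighted mark = {weighted_mark(student):.1f} / 100")  # 68.5 with these figures
```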
Graded criteria are criteria related to a particular band of marks or honours classification or grade
framework such as Pass, Merit, Distinction. If you write these, be very careful about the
statement at the 'pass' level. Preferably start writing at this level and work upwards. The danger
in starting from, e.g., first class honours, is that as you move downwards, the criteria become more
and more negative. When drafted, ask yourself whether you would be happy for someone
meeting the standard expressed for pass, or third class, to receive an award from your institution.
Where possible, discuss draft assessment activities, and particularly criteria, with colleagues
before issuing them.
Once decided, the criteria and weightings should be given to the students at the time the task is
set, and preferably some time should be spent discussing and clarifying what they mean. Apart
from the argument of fairness, this hopefully then gives the student a clear idea of the standard
they should aim for and increases the chances they will produce a better piece of work (and
hence have learnt what you wanted them to). And feedback to the student on the work produced
should be explicitly in terms of the extent to which each criterion has been met.
Instructional Assessment Process
Instructional Assessment Process involves collection and analysis of data from six sources that,
when combined, present a comprehensive view of the current state of the school as it compares
to the underlying beliefs and principles that make up the Pedagogy of Confidence and lead to
school transformation. The six components are:
• School Background Pre-Interview Questionnaire
• Principal Interview
• Achievement Data
• Teacher Survey
• Student Survey
• Classroom Visits
School Background Pre-Interview Questionnaire
NUA gathers background information using a pre-interview questionnaire submitted by the
principal, the school’s School Improvement Plan (or similar document), and interviewing the
principal in person. The pre-interview questionnaire collects basic demographic data about the
school, the students and the faculty, as well as a brief history of current initiatives, school
organization and scheduling practices, special services, community partnerships and the like.
Principal Interview
NUA meets with the principal to review the questionnaire, to obtain more information about the
school and to learn the principal’s perspectives on the instructional program, students and
staff. Care is taken to ensure that the principal speaks first about the strengths of the school,
unique situations that exist within the school, recent changes that may be affecting the school, his
or her goals for the school and what he or she believes is needed to achieve those goals.
Achievement Data
NUA gathers and analyzes existing achievement data to uncover patterns over time and to correlate
with what constituents say about the school, how achievement data compares to state and district
achievement, and any other relevant comparisons.
Teacher Survey
An NUA representative conducts the teacher survey during a schoolwide faculty meeting to
ensure consistency of administration and to explain to the faculty other data collection activities
that may be taking place at the school. The survey probes teachers’ perspectives on the school’s
climate and instructional program and seeks suggestions about how they, as a faculty, could best
serve their students, especially underachievers. Surveys are anonymous and make use of
multiple choice and open-ended questions that allow teachers leeway to express their inside
perspective on the instructional life of the school; their assessments of and attitudes toward
students, families and administration; recent and needed professional development initiatives;
and their preferred pedagogical approaches.
Student Survey
The student survey contains 20 items and is administered to all students following a prescribed
method of administration. Its purpose is to assess the school’s instructional program from the
students’ perspectives. The items invite response in five areas:
• Perspectives on myself as a learner
• School climate
• My teachers
• Classroom activities
• My preferred learning activities
Students are asked to strongly agree, agree, disagree, or strongly disagree with some statements
and to select their choices among others. NUA provides a summary of student survey responses for
ease of analysis.
Classroom Visits
A team of specially trained NUA representatives conducts classroom visitations that follow a
schedule intended to cover a broad spectrum of classes. Visitors note the activities in which
students are engaged, study the interactions between teacher and students, and attend to other
visible characteristics of the instructional program, including the physical environment of the
rooms. Approximately half the classes in a school are visited to help form a composite picture of
the current state of instruction. Teachers voluntarily participate in the visits and all data is
recorded without identifying individual teachers. Visitors concentrate on elements of effective
instruction that NUA knows to have positive effects on all students’ learning and that NUA finds
particularly important in raising the performance of underachieving students. A sample of these
elements includes:
• The learning engages students. Students comprehend and retain what they are taught most
effectively when they are engaged in classroom activities. Engagement is marked by willing
participation, expressions of interest and displays of enthusiasm, and results when students
find classroom activities and assignments highly meaningful and interesting. Instruction that
engages students has a positive effect on their achievement and increases the likelihood they
will develop into lifelong learners.
• Learning activities guide students to relate lesson content to their lives. Students benefit from
deliberately connecting what they are learning to what they know from their experience as
individuals and as members of the cultural groups with which they most closely identify.
Making such connections between the curriculum and what is personally relevant and
meaningful has a positive influence on students’ motivation to learn, on their confidence as
learners, and on their comprehension and retention of the material. Although the teacher can
suggest such connections, students benefit most by generating and expressing their own
connections.
• The learning includes students interacting with each other as learners. Working
collaboratively in pairs or small groups enables students to pool their knowledge as they
develop their understanding of curriculum material. Interacting productively with peers also
helps students stay attentive in class. In addition, collaborative work can increase students’
motivation to learn because of the support they get from their peers and the enjoyment that
results from peer interaction. Pair or small-group interactions may be used for solving
problems, discussing possible answers to a teacher’s question, generating new questions on a
topic being discussed before sharing ideas with the whole class, representing information that
has been learned in a creative way, and other such purposes.
• The learning promotes high-level thinking about lesson content. High-level thinking about
curriculum content helps students generate deeper and broader understandings while
developing their thinking capacities. Students’ learning is enhanced when they have frequent
opportunities to respond at length to thought-provoking questions, to engage in high-level
conversations with peers, and to ask their own questions about what they are learning to
clarify, refine and extend meanings. High-level thinking includes such mental processes as
hypothesizing, inferring, generalizing, analyzing, synthesizing and evaluating. Opportunities
to engage in such thinking are ideally part of daily instruction as well as integral to long-
term, complex projects.
Types of assessment procedure
• 1. Diagnostic Assessment (as Pre-Assessment)
• One way to think about it: Assesses a student’s strengths, weaknesses, knowledge, and
skills prior to instruction.
• Another way to think about it: A baseline to work from
• 2. Formative Assessment
• One way to think about it: Assesses a student’s performance during instruction, and
usually occurs regularly throughout the instruction process.
• Another way to think about it: Like a doctor’s “check-up” to provide data to revise
instruction
• 3. Summative Assessment
• One way to think about it: Measures a student’s achievement at the end of instruction.
• Another way to think about it: It’s macabre, but if formative assessment is the check-up,
you might think of summative assessment as the autopsy. What happened? Now that it’s
all over, what went right and what went wrong?
• 4. Norm-Referenced Assessment
• One way to think about it: Compares a student’s performance against other students (a
national group or other “norm”)
• Another way to think about it: Group or “Demographic” assessment
• 5. Criterion-Referenced Assessment
• One way to think about it: Measures a student’s performance against a goal, specific
objective, or standard.
• Another way to think about it: a bar to measure all students against
• 6. Interim/Benchmark Assessment
• One way to think about it: Evaluates student performance at periodic intervals, frequently
at the end of a grading period. Can predict student performance on end-of-the-year
summative assessments.
• Another way to think about it: Bar graph growth through a year
• Explanation
• Formative assessments are informal and formal tests given by teachers during the learning
process. These assessments modify the activities done in the classroom so that
there is more student achievement. They identify strengths and weaknesses and target areas
that need work.
• Summative assessment evaluates students' learning at the end of an instructional unit
such as a chapter or specified topic. Final papers, midterms and final exams allow
teachers to determine whether students have correctly understood the material.
• Norm-referenced assessment compares a student's performance against a national or other
"norm" group.
• Performance-based assessment requires students to solve real-world problems or produce
something with real-world application. These assessments allow the educator to
distinguish how well students think critically and analytically. A restricted-response
task is more narrowly defined than an extended-response task; an example would be a
multiple-choice question, as opposed to an extended response, which normally involves
writing a report.
• Authentic assessment is the measurement of accomplishments that are worthwhile
compared to multiple choice standardized tests.
• Selective-response assessment is also referred to as objective assessment, including
multiple choice, matching, and true and false questions. It is a very effective and efficient
method for measuring students' knowledge and a very common form of assessing
students in the classroom.
• Supply response: students must supply an answer to a question prompt.
• Criterion-referenced tests are designed to measure student performance against a fixed set
of predetermined criteria or learning standards.
Instructional decision
• Instructional decisions are made to identify students' instructional needs. This is a
general education initiative, and focuses on instruction by using data about students'
responses to past instruction to guide future educational decisions. Decisions are
proactive approaches of providing early assistance to students with instructional needs
and matching the amount of resources to the nature of the students' needs. This involves:
• 1. Screening all students to ensure early identification of students needing extra assistance;
2. Seamless integration of general and special education services; and 3. A focus on
research-based practices that match students' needs.
• Teachers are constantly collecting informal and formal information about what and how
their students are learning. They check student tests and assignments, listen to small-
group activities, and observe students engaged in structured and unstructured activities.
They use this information for a variety of purposes, ranging from communicating with
parents to meeting standards and benchmarks. However, when teachers systematically
collect the right kinds of information and use it effectively, they can help their students
grow as thinkers and learners.
• Such information may indicate: 1. The need for a complete review of the material; 2. Class
discussion may reveal misunderstanding that must be corrected on the spot; and 3. Interest
in a topic may suggest that more time should be spent on it than originally planned.
Selection assessment
• A selection assessment is the most frequently used type of assessment and part of a
selection procedure. The selection assessment often takes place towards the end of the
procedure, to test the candidates' suitability for the position in question.
• The goal of a selection assessment
• A selection assessment is an attempt to get a better understanding of how the candidate
would perform in the position applied for. The assessment is used based on the idea that
suitability does not really show when using questionnaires, letters and interviews. This is
because candidates often will say what they think the employer wants to hear, so only
practical simulations can clearly demonstrate how a person responds in certain situations.
• Components
• The components of a selection assessment depend on the position being applied for. For
an executive position, the focus will be on testing the candidates' leadership qualities, for
other positions the emphasis can be, for example, on communication skills.
• Frequently used components of an assessment include the mailbox exercise, fact
finding and role-playing. Intelligence tests and interviews are often part of a selection
assessment as well. To prepare for an assessment, you can practice different tests.
• Assessment report
• Following the assessment, a report will be drafted describing the conclusions on each
candidate. As a candidate, you will always be the first to see this assessment report and
you have the right not to agree to the report being sent to the employer. However, if you
do not agree to this, your chances of getting the job will be practically nil.
• Assessment companies
• Selection assessments are often performed by independent companies that conduct
assessments on behalf of different companies. In that case, the assessment will take place
in the offices of the assessment company. Some companies, especially larger ones,
organise their own assessments and in that case the assessment will take place in the
company itself.
• In the case of internal reorganisations, career assessments are often used.
Placement and classification decisions
Selection is a personnel decision whereby an organization decides whether to hire
individuals using each person’s score on a single assessment, such as a test or interview, or a
single predicted performance score based on a composite of multiple assessments. Using this
single score to assign each individual to one of multiple jobs or assignments is referred to as
placement. An example of placement is when colleges assign new students to a particular level
of math class based on a math test score. Classification refers to the situation in which each of a
number of individuals is assigned to one of multiple jobs based on their scores on multiple
assessments. Classification refers to a complex set of personnel decisions and requires more
explanation.
A Conceptual Example
The idea of classification can be illustrated by an example. An organization has 50 openings in
four entry-level jobs: Word processor has 10 openings, administrative assistant has 12 openings,
accounting clerk has 8 openings, and receptionist has 20 openings. Sixty people apply for a job at
this organization and each completes three employment tests: word processing, basic accounting,
and interpersonal skills.
Generally, the goal of classification is to use each applicant’s predicted performance score for
each job to fill all the openings and maximize the overall predicted performance across all four
jobs. Linear computer programming approaches have been developed that make such
assignments within the constraints of a given classification situation such as the number of jobs,
openings or quotas for each job, and applicants. Note that in the example, 50 applicants would be assigned to one of the four jobs and the remaining 10 applicants would not be hired.
Using past scores on the three tests and measures of performance, formulas can be developed to
estimate predicted performance for each applicant in each job. The tests differ in how well they
predict performance in each job. For example, the basic accounting test is fairly predictive of
performance in the accounting clerk job, but is less predictive of performance in the receptionist
job. Additionally, the word processing test is very predictive of performance in the word
processor job but is less predictive of performance in the receptionist job. This means that the
equations for calculating predicted performance for each job give different weights to each test.
For example, the equation for accounting clerk gives its largest weight to basic accounting test
scores, whereas the receptionist equation gives its largest weight to interpersonal skill test scores
and little weight to accounting test scores. Additionally, scores vary across applicants within
each test and across tests within each individual. This means that each individual will have a
different predicted performance score for each job.
One way to assign applicants to these jobs would be to calculate a single predicted performance
score for each applicant, select all applicants who have scores above some cutoff, and randomly
assign applicants to jobs within the constraints of the quotas. However, random assignment
would not take advantage of the possibility that each selected applicant will not perform equally
well on all available jobs. Classification takes advantage of this possibility. Classification
efficiency can be viewed as the difference in overall predicted performance between this
univariate (one score per applicant) strategy and the multivariate (one score per applicant per
job) classification approach that uses a different equation to predict performance for each job.
A number of parameters influence the degree of classification efficiency. An important one is the
extent to which predicted scores for each job are related to each other. The smaller the
relationships among predicted scores across jobs, the greater the potential classification
efficiency. That is, classification efficiency increases to the extent that multiple assessments
capture differences in the individual characteristics that determine performance in each job.
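The example above gives no actual scores or weights, so the following is only a sketch built on invented numbers: it forms a predicted-performance matrix from hypothetical regression weights and randomly generated test scores, then uses SciPy's linear_sum_assignment routine (one standard optimal-assignment method, in the spirit of the linear programming approaches mentioned above) to fill the quotas.

```python
# A sketch of the four-job classification example, with invented weights and
# randomly generated applicant test scores (the example above gives no real data).
import numpy as np
from scipy.optimize import linear_sum_assignment

rng = np.random.default_rng(0)

jobs = ["word processor", "administrative assistant", "accounting clerk", "receptionist"]
openings = [10, 12, 8, 20]            # quotas per job (50 openings in total)
n_applicants = 60

# Scores of each applicant on the three tests:
# word processing, basic accounting, interpersonal skills.
scores = rng.normal(50, 10, size=(n_applicants, 3))

# Hypothetical regression weights (rows = jobs, columns = tests); e.g. the
# accounting clerk equation weights the accounting test most heavily.
weights = np.array([
    [0.70, 0.10, 0.20],   # word processor
    [0.40, 0.30, 0.30],   # administrative assistant
    [0.15, 0.65, 0.20],   # accounting clerk
    [0.10, 0.15, 0.75],   # receptionist
])

# Predicted performance of every applicant in every job (60 x 4 matrix).
predicted = scores @ weights.T

# Expand each job into one column per opening so the quotas are respected,
# then choose the assignment that maximizes total predicted performance.
slot_job = np.repeat(np.arange(len(jobs)), openings)   # 50 slots -> job index
slot_scores = predicted[:, slot_job]                   # 60 applicants x 50 slots
applicants, slots = linear_sum_assignment(slot_scores, maximize=True)

for j, job in enumerate(jobs):
    hired = applicants[slot_job[slots] == j]
    print(f"{job}: {len(hired)} hired, mean predicted performance "
          f"{predicted[hired, j].mean():.1f}")
print(f"not hired: {n_applicants - len(applicants)} applicants")
```

Classification efficiency, as described above, could then be estimated by re-running the assignment with a single composite score per applicant and comparing the total predicted performance of the two strategies.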
Policy decisions
Policy decisions are defined in management theory as those decisions that define the basic
principles of the organization and determine how it will develop and function in the future.
Policies set the limits within which operational decisions are made. Examples include:
• Vision, Mission, Aim
• Budget and Finance Practices
• Allocation of Resources
• Organizational Structure
Policy decisions limit the actions an organization and its members can take without changing the
policy.
In sociocracy, policy decisions are made by consent. Operational decisions are made within the
limits set by policy decisions and may be made autocratically by the person in charge or by other
means determined by the people whom the decisions affect.
Examples of Policy Statements
We set policies in our everyday lives without realizing it or writing them down. Examples
include:
• Deciding not to drink coffee or consume animal products
• Pledging to complete tax forms before their due date
• Sending your children to public schools by choice
• Deciding not to have children to devote time to political causes
In non-profit organizations the policies might include:
• Following the IRS regulations that set requirements for 501c3 status to receive tax-deductible
contributions
• Limiting membership to professionals with a demonstrated expertise
• Serving meals to the homeless
• Using contributions only for administrative costs and not staff salaries
In business they might include:
• Annual and departmental budgets
• Employee compensation schedules
• Union agreements
• Future donations of money and employee time to charitable causes
• Production of certain products and not others
• Limiting sales and marketing to retail or wholesale customers
These are all decisions that define the scope of day-to-day decisions about how we will conduct
our personal or work lives, our operations.
Counseling and guidance decisions
Decision making has always been a fundamental human activity.
At some stage within the career guidance planning process, decisions are made. The decision in
some cases might be to make far reaching changes, or perhaps the decision might be not to
change anything. In some cases, little change might ensue, but a decision has still been made,
even if the result, having considered the consequences, is not to change.
As a guide it is important to take into account that individual participants vary a great deal in
terms of how they make decisions, what factors are important to them, how ready they are to
make them and how far participants are prepared to live with uncertain outcomes.
The traditional way within guidance to handle decision making is to see it as a rational, almost
linear process. This is illustrated by the Janis and Mann model exemplified in the practical
exercise example mentioned below involving balance sheets. The aim is to encourage a rational
approach to planning for the future. Typically this involves an evaluation of available options
with a look at the pros and cons of each, taking account of the participant’s personal
circumstances.
In practice of course the process of making a decision is influenced by all sorts of things. In
everyday terms the decision making may in fact be driven by the irrational, the “quick fix”
solution and in some cases, prejudicial ideas, perhaps based upon ingrained or outdated ideas.
Gerard Egan describes this as the “shadow side” of decision making. De Bono’s thinking hats
exercise (see below) attempts to factor in some of the emotional and other factors linked to
decision making.
As individuals we can vary in the style of decision making we use. For some decisions we might
take a “logical” approach based upon the linear thinking mentioned above. For some decisions
we might make a “no thought” decision, either because the matter is so routine it doesn’t require
any thought, or on some occasions just to make a quick fix so we don't have to think about it any
more. Sometimes participants in guidance interviews may talk about their realisation that they
should have looked into a decision further before rushing into one course of action. Some
individuals employ a hesitant style of decision making, where decisions are delayed as long as
possible, whereas others may make a choice based upon an emotional response, what feels right
subjectively. Finally some participants might make decisions that can be classified as compliant;
that is, based upon the perceived expectations of what other people want. A key role in guidance is to identify how a participant has made previous professional development decisions, and
whether the approach seems to have worked for them. Might there be other ways of deciding that
lead to better decisions?
Using Decision making exercises in a Guidance Setting
There is a broad range of tools to aid the decision making process within a professional
development discussion. Here are two introductory examples. Further examples are available via
the references and web sites below.
Balance sheet
In its simplest form this consists of two columns representing two choices. The advantages and
disadvantages of each choice can simply be listed. Sometimes the very act of writing down pros
and cons can bring clarity.
Sometimes subdividing the headings into Advantages for me, Advantages for others, Disadvantages for me and Disadvantages for others can yield a richer analysis. Janis and Mann suggest this process.
A slightly more sophisticated use of balance sheets might involve the participant completing the
sheet as above initially, then the adviser producing a list of other suggested factors that the
individual may not have considered at first. These can either be included, or ignored by the
participant.
An example of a simple balance sheet
Six Thinking Hats
This tool was created by Edward de Bono in his book "Six Thinking Hats".
How to Use the Tool:
To use Six Thinking Hats to improve the quality of the participant's decision-making, look at the decision "wearing" each of the thinking hats in turn.
Each "Thinking Hat" is a different style of thinking. These are explained below:
White Hat:
With this thinking hat, the participant is encouraged to focus on the data available. They look at the information they have about themselves and see what they can learn from it, looking for gaps in their knowledge and either trying to fill them or taking account of them. This is where the participant is encouraged to analyse past experience, work roles and so on, and to try to learn from them.
Red Hat:
Wearing the red hat, the participant looks at the decision using intuition, gut reaction, and emotion. The idea is also to encourage the participant to think about how other people will react emotionally to the decision being made, and to try to understand the intuitive responses of people who may not fully know their reasoning.
Black Hat:
When using black hat thinking, the participant looks at things pessimistically, cautiously and defensively, trying to see why ideas and approaches might not work. This is important because it highlights the weak points in a plan or course of action. It allows the participant to eliminate them, alter their approach, or prepare contingency plans to counter problems that might arise. Black Hat thinking can be one of the real benefits of using this technique within professional development planning, as participants can get so used to thinking positively that they cannot see problems in advance, leaving them under-prepared for difficulties.
Yellow Hat:
The yellow hat helps you to think positively. It is the optimistic viewpoint that helps you to see
all the benefits of the decision and the value in it, and spot the opportunities that arise from it.
Yellow Hat thinking helps you to keep going when everything looks gloomy and difficult.
Green Hat:
The Green Hat stands for creativity. This is where you can develop creative solutions to a
problem. It is a freewheeling way of thinking, in which there is little criticism of ideas.
Blue Hat:
The Blue Hat stands for process control. This is the hat worn by people chairing meetings. When
running into difficulties because ideas are running dry, they may direct activity into Green Hat
thinking. When contingency plans are needed, they will ask for Black Hat thinking, and so on.
You can use Six Thinking Hats in guidance discussions. It is a way of encouraging participants to look at decision making from different perspectives. This can be done either metaphorically, as in "imagine you are wearing the white hat...", or by having cards, each with the name of a hat and a brief description of the "way of looking at things" that the hat brings with it. The cards can be shuffled and dealt to the participant in turn. By doing this the guide is encouraging the participant to consider a decision from a range of perspectives.
Assembling, Administering and Appraising Classroom Tests and Assessments
Assembling the Test
1. Record items on index cards
2. Double-check all individual test items
3. Double-check the items as a set
4. Arrange items appropriately
5. Prepare directions
6. Reproduce the test
Administering the Test
The guiding principle
• Provide conditions that give all students a fair chance to show what they know
Physical conditions
• Light, ventilation, quiet, etc.
Psychological conditions
• Avoid inducing test anxiety
• Try to reduce test anxiety
• Don’t give test when other events will distract
Suggestions
• Don’t talk unnecessarily before the test
• Minimize interruptions
• Don’t give hints to individuals who ask about items
• Discourage cheating
• Give students plenty of time to take the test.
Appraising the Test
• The step in which the institution's management finds out how effective it has been at conducting and evaluating student assessment.
The process
• Define organizational goals
• Define objectives and continuously monitor performance and progress
• Evaluate and review performance
• Provide feedback
• Conduct performance appraisal (reward / punishment)
Purpose of Classroom tests and assessment
Classroom assessment is one of the most important tools teachers can use to
understand the needs of their students. When executed properly and on an ongoing basis,
classroom assessment should shape student learning and give teachers valuable insights.
Identify Student Strengths and Weaknesses
Assessments help teachers identify student strengths as well as areas where students may
be struggling. This is extremely important during the beginning of the year when students
are entering new grades. Classroom assessments, such as diagnostic tests, help teachers
gauge the students' level of mastery of concepts from the prior grade.
Monitor Student Progress
Throughout the course of a lesson or unit, teachers use classroom assessment to monitor
students' understanding of the concepts being taught. This informs teachers in their lesson
planning, helping them pinpoint areas that need further review. Assessment can be done
in the form of weekly tests, daily homework assignments and special projects.
Assess Student Prior Knowledge
Before beginning a new unit, assessment can inform teachers of their students' prior
experience and understanding of a particular concept or subject matter. These types of
assessments can be done orally through classroom discussion or through written
assignments such as journals, surveys or graphic organizers.
Purposes of assessment
Teaching and learning
The primary purpose of assessment is to improve students’ learning and teachers’ teaching as
both respond to the information it provides. Assessment for learning is an ongoing process that
arises out of the interaction between teaching and learning.
What makes assessment for learning effective is how well the information is used.
System improvement
Assessment can do more than simply diagnose and identify students’ learning needs; it can be
used to assist improvements across the education system in a cycle of continuous improvement:
• Students and teachers can use the information gained from assessment to determine their next
teaching and learning steps.
• Parents, families and whānau can be kept informed of next plans for teaching and learning and
the progress being made, so they can play an active role in their children’s learning.
• School leaders can use the information for school-wide planning, to support their teachers and
determine professional development needs.
• Communities and Boards of Trustees can use assessment information to assist their
governance role and their decisions about staffing and resourcing.
• The Education Review Office can use assessment information to inform their advice for
school improvement.
• The Ministry of Education can use assessment information to undertake policy review and
development at a national level, so that government funding and policy intervention is
targeted appropriately to support improved student outcomes.
Developing specifications for tests and assessment
Definitions
I’ve seen the terms “Test Plan” and “Test Specification” mean slightly different things over the
years. In a formal sense (at this given point in time for me), we can define the terms as follows:
• Test Specification – a detailed summary of what scenarios will be tested, how they will
be tested, how often they will be tested, and so on and so forth, for a given feature.
Examples of a given feature include, “Intellisense, Code Snippets, Tool Window
Docking, IDE Navigator.” Trying to include all Editor Features or all Window
Management Features into one Test Specification would make it too large to effectively
read.
• Test Plan – a collection of all test specifications for a given area. The Test Plan contains
a high-level overview of what is tested (and what is tested by others) for the given feature
area. For example, I might want to see how Tool Window Docking is being tested. I can
glance at the Window Management Test Plan for an overview of how Tool Window
Docking is tested, and if I want more info, I can view that particular test specification.
If you ask a tester on another team what’s the difference between the two, you might receive
different answers. In addition, I use the terms interchangeably all the time at work, so if you see
me using the term “Test Plan”, think “Test Specification.”
Parts of a Test Specification
A Test Specification should consist of the following parts:
• History / Revision – Who created the test spec? Who were the developers and Program
Managers (Usability Engineers, Documentation Writers, etc) at the time when the test
spec was created? When was it created? When was the last time it was updated? What
were the major changes at the time of the last update?
• Feature Description – a brief description of what area is being tested.
• What is tested? – a quick overview of what scenarios are tested, so people looking
through this specification know that they are at the correct place.
• What is not tested? – are there any areas being covered by different people or different
test specs? If so, include a pointer to these test specs.
• Nightly Test Cases – a list of the test cases and high-level description of what is tested
each night (or whenever a new build becomes available). This bullet merits its own blog
entry. I’ll link to it here once it is written.
• Breakout of Major Test Areas – This section is the most interesting part of the test spec
where testers arrange test cases according to what they are testing. Note: in no way do I
claim this to be a complete list of all possible Major Test Areas. These areas are
examples to get you going.
o Specific Functionality Tests – Tests to verify the feature is working according to
the design specification. This area also includes verifying error conditions.
o Security tests – any tests that are related to security. An excellent source for
populating this area comes from the Writing Secure Code book.
o Accessibility Tests – This section shouldn't be a surprise to any of my blog
readers. <grins> See The Fundamentals of Accessibility for more info.
o Stress Tests – This section talks about what tests you would apply to stress the
feature.
o Performance Tests – this section includes verifying any perf requirements for
your feature.
o Edge cases – This is something I do specifically for my feature areas. I like
walking through books like How to Break Software, looking for ideas to better test
my features. I jot those ideas down under this section.
o Localization / Globalization – tests to ensure you’re meeting your product’s
International requirements.
Setting Test Case Priority
A Test Specification may have a couple of hundred test cases, depending on how the test cases
were defined, how large the feature area is, and so forth. It is important to be able to query for
the most important test cases (nightly), the next most important test cases (weekly), the next
most important test cases (full test pass), and so forth. A sample prioritization for test cases may
look like:
• Highest priority (Nightly) – Must run whenever a new build is available
• Second highest priority (Weekly) – Other major functionality tests run once every three
or four builds
• Lower priority – Run once every major coding milestone
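As a sketch only (not the author's actual setup), the priority levels above could be encoded as markers in a pytest-based suite so that each subset can be selected by priority; the marker names and the stubbed feature function are invented for illustration.

```python
# A sketch of encoding the three priority levels as pytest markers so each
# subset can be selected with "-m". Marker names and the stubbed feature
# function are illustrative only.
import pytest

def dock_window(edge: str) -> str:
    """Stand-in for the Tool Window Docking feature under test."""
    return "docked"

@pytest.mark.nightly        # highest priority: run whenever a new build is available
def test_window_docks_to_left_edge():
    assert dock_window("left") == "docked"

@pytest.mark.weekly         # second priority: run once every three or four builds
def test_window_docks_to_right_edge():
    assert dock_window("right") == "docked"

@pytest.mark.full_pass      # lower priority: run once every major coding milestone
def test_window_docks_on_second_monitor():
    assert dock_window("second monitor") == "docked"

# Example selections (markers would be registered in pytest.ini to avoid warnings):
#   pytest -m nightly
#   pytest -m "nightly or weekly"
```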
(OR)
Major Points
1. Your goal is valid, reliable, useful assessment
2. Which requires:
a. Determining what is to be measured
b. Defining it precisely
c. Minimizing measurement of irrelevancies
3. And is promoted by following good procedures
Four Steps in Planning an Assessment
1. Deciding its purpose
2. Developing test specifications
3. Selecting best item types
4. Preparing items
Step 1: Decide the Purpose
What location in instruction?
1. pre-testing
o readiness
i. limited in scope
ii. low difficulty level
iii. serve as basis of remedial work, adapting instruction
o pretest (placement)
i. items similar to outcome measure
ii. but not the same (like an alternative form)
2. during instruction
o formative
i. monitor learning progress
ii. detect learning errors
iii. feedback for teacher and students
iv. limited sample of learning outcomes
v. must assure that mix and difficulty of items sufficient
vi. try to use to make correction prescriptions (e.g., review for whole group,
practice exercises for a few)
o diagnostic
i. enough items needed in each specific area
ii. items in one area should have slight variations
3. end of instruction
o mostly summative –broad coverage of objectives
o can be formative too
Step 2: Develop Test Specifications
• Why? Need good sample!
• How? Table of specifications (2-way chart, "blueprint")
1. Prepare list of learning objectives
2. outline instructional content
3. prepare 2-way chart
4. or, use alternative to 2-way chart when more appropriate
5. doublecheck sampling
6. Sample of a Content Domain (For this course)
1. trends/controversies in assessment
2. interdependence of teaching, learning, and assessment
3. purposes and forms of classroom assessment
4. planning a classroom assessment (item types, table of specs)
5. item types (advantages and limitations)
6. strategies for writing good items
7. compiling and administering classroom assessments
8. evaluating and improving classroom assessments
9. grading and reporting systems
10. uses of standardized tests
11. interpreting standardized test scores
Sample Table of Specifications (For chapters 6 and 7 of this course)
Sample SLOs (you would typically have more), each mapped to the Bloom level at which it would be assessed:
• Identifies definition of key terms (e.g., validity) – Remember
• Identifies examples of threats to test reliability and validity – Understand
• Selects best item type for given objectives – Apply
• Compares the pros and cons of different kinds of tests for given purposes – Analyze
• Evaluates particular educational reforms (e.g., whether they will hurt or help instruction) – Evaluate
• Create a unit test – Create
A complete table of specifications would also record the total number of items planned for each row and column.
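The arithmetic behind such a blueprint can be sketched in a few lines; the content areas, weights and 40-item test length below are invented for illustration and are not taken from the course outline above.

```python
# A sketch of the arithmetic behind a two-way table of specifications; the
# content areas, weights and 40-item test length are invented for illustration.
total_items = 40

content_weights = {            # relative emphasis of each content area (rows)
    "item types": 0.30,
    "reliability and validity": 0.40,
    "grading and reporting": 0.30,
}
bloom_weights = {              # relative emphasis of each Bloom level (columns)
    "Remember": 0.25,
    "Understand": 0.25,
    "Apply": 0.30,
    "Analyze": 0.20,
}

for area, area_w in content_weights.items():
    cells = {level: round(total_items * area_w * level_w)
             for level, level_w in bloom_weights.items()}
    print(f"{area:26s} {cells}  (row total: {sum(cells.values())})")
```

Double-checking the sampling then amounts to confirming that the row and column totals match the intended emphasis and the planned test length.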
Spot the Poor Specific Learning Outcomes (For use with previous table of specifications)
Which entries are better or worse than others? Why? Improve the poor ones.
1. Knowledge
a. Knows correct definitions
b. Able to list major limitations of different types of items
2. Comprehension
a. Selects correct item type for learning outcome
b. Understands limitations of true-false items
c. Distinguishes poor true-false items from good ones
3. Application
a. Applies construction guidelines to a new content area
b. Creates a table of specifications
4. Analysis
a. Identifies flaws in poor items
b. Lists general and specific learning outcomes
5. Synthesis
a. Lists general and specific content areas
b. Provides weights for areas in table of specifications
6. Evaluation
a. Judges quality of procedure/product
b. Justifies product
c. Improves a product
Why are These Better Specific Learning Outcomes?
1. Knowledge
a. Selects correct definitions
b. Lists major limitations of different item types
2. Comprehension
a. Selects proper procedures for assessment purpose
b. Distinguishes poor procedures from good ones
c. Distinguishes poor decisions/products from good ones
3. Application
a. Applies construction guidelines to a new content area
4. Analysis
a. Identifies flaws in procedure/product
b. Lists major and specific content areas
c. Lists general and specific learning outcomes
5. Synthesis
a. Creates a component of the test
b. Provides weights for cells in table of specifications
6. Evaluation
a. Judges quality of procedure/product
b. Justifies product
c. Improves a product
Step 3: Select the Best Types of Items/Tasks
What types to choose from? Many!
1. objective--supply-type
a. short answer
b. completion
2. objective--selection-type
a. true-false
b. matching
c. multiple choice
3. essays
a. extended response
b. restricted response
4. performance-based
a. extended response
b. restricted response
Which type to use? The one that fits best!
1. most directly measures learning outcome
2. where not clear, use selection-type (more objective)
a. multiple choice best (less guessing, fewer clues)
b. matching only if items homogeneous
c. true-false only if only two possibilities
Strengths and Limitations of Objective vs. Essay/Performance
Objective Items
• Strengths
o Can have many items
o Highly structured
o Scoring quick, easy, accurate
• Limitations
o Cannot assess higher level skills (problem formulation, organization, creativity)
Essay/Performance Tasks
• Strengths
o Can assess higher level skills
o More realistic
• Limitations
o Inefficient for measuring knowledge
o Few items (poorer sampling)
o Time consuming
o Scoring difficult, unreliable
Step 4: Prepare Items/Tasks
Strategies to Measure the Domain Well—Reliably and Validly
1. specifying more precise learning outcomes leads to better-fitting items
2. use 2-way table to assure good sampling of complex skills
3. use enough items for reliable measurement of each objective
o number depends on purpose, task type, age
o if performance-based tasks, use fewer but test more often
4. keep in mind how good assessment can improve (not just measure) learning
o signals learning priorities to students
o clarifies teaching goals for teacher
o if perceived as fair and useful
Strategies to Avoid Contamination
1. eliminate barriers that lead good students to get the item wrong
2. don’t provide clues that help poor students get the item correct
General Suggestions for Item Writing
1. use table of specifications as guide
2. write more items than needed
3. write well in advance of testing date
4. task to be performed is clear, unambiguous, unbiased, and calls forth the intended
outcome
5. use appropriate reading level (don’t be testing for ancillary skills)
6. write so that items provide no clues (minimize value of "test-taking skills")
a. a/an
b. avoid specific determiners (always, never, etc.)
c. don’t use more detailed, longer, or textbook language for correct answers
d. don’t have answers in an identifiable pattern
7. write so that item provides no clues to other items
8. seeming clues should lead away from the correct answer
9. experts would agree on the answer
10. if item revised, recheck its relevance
Selecting and constructing appropriate types of items
and assessment tasks
• Different types of tests: limited-choice questions (multiple choice, true/false, matching); open-ended questions (short answer, essay); performance testing (OSCE, OSPE); and action-oriented testing.
• Process of test administration: statement of goals, content outline, table of specification, item selection, item construction, composition of the answer sheet, development of instructions, construction of the answer key, test administration, and test revision.
• Characteristics of a good test: reliability (the test is consistent, uniform and free from extra sources of error), validity (how well the test measures what it is supposed to measure), and utility (the test is cost and time effective).
• Test construction should set out to answer: What kind of test is to be made? What is its precise purpose? Which abilities are to be tested? How detailed and how accurate must the results be? What constraints are set by limited availability of expertise, facilities, and time for construction, administration and scoring? Who will take the test? What is the scope of the test?
• Principles of test construction: 1. Measure all instructional objectives – objectives that are communicated and imparted to the students, designed as an operational control to guide the learning sequences and experiences, and in harmony with the teacher's instructional objectives. 2. Cover all learning tasks – measure a representative part of the learning tasks. 3. Use appropriate testing strategies or items – items that appraise the specific learning outcomes, with measurements or tests based on the domains of learning.
• 4. Make the test valid and reliable – a test is reliable when it produces dependable, consistent and accurate scores, and valid when it measures what it purports to measure. Tests that are written clearly and unambiguously are more reliable; tests with more items are more reliable than tests with fewer items; and tests that are well planned, cover wide objectives and are well executed are more valid.
• 5. Use the test to improve learning – a test is not only an assessment but also a learning experience: going over the test items can help teachers reteach missed items, discussion and clarification of the right choices gives further learning, and revision of the test enables further guidance and modification of teaching. 6. Norm-referenced and criterion-referenced tests – norm-referenced tests target higher and more abstract levels of the cognitive domain, whereas criterion-referenced tests target lower and more concrete levels of learning.
• Planning for a test: 1. Outline the learning objectives or major concepts to be covered – the test should be representative of the objectives and material covered (a major student complaint is that tests do not fairly cover the material that was supposed to be examined). 2. Create a test blueprint. 3. Create questions based on the blueprint. 4. For each question, check against the blueprint (3–4 alternate questions on the same idea/objective should be made). 5. Organise questions by item type. 6. Eliminate similar questions. 7. Re-read the questions and check them from the student's standpoint. 8. Organise the questions logically. 9. Estimate the completion time by having the teacher take the test and multiplying that time by about 4, depending on the level of the students. 10. Analyse the results (item analysis).
• Process of Test Construction
• Preliminary considerations: a) specify the test purposes and describe the domain of content and/or behavior of interest; b) specify the group of examinees (age, gender, socio-economic background, etc.); c) determine the time and financial resources available for constructing and validating the test; d) identify and select qualified staff members; e) specify the initial estimated length of the test (time for developing, validating and completing it by the students).
• Review of content domain/behaviors: a) review the descriptions of the content standards or objectives to determine their acceptability for inclusion in the test; b) select the final group of objectives (i.e. finalize the content standards); c) prepare the item specifications for each objective and review them for completeness, clarity, accuracy and practicability.
• Item/task writing and preparation of scoring rubrics: a) draft a sufficient number of items and/or tasks for field testing; b) carry out item/task editing and review the scoring rubrics.
• Assessment of content validity: a) identify a pool of judges and measurement specialists; b) review the test items and tasks to determine their match to the objectives, their representativeness, and their freedom from stereotyping and potential bias; c) review the test items and/or tasks to determine their technical adequacy.
• Revision of test items/tasks: a) based on the data from steps 4b and 4c, revise the test items/tasks or delete them; b) write additional test items/tasks and repeat step 4.
• Field test administration: a) organize the test items/tasks into forms for field testing; b) administer the test forms to appropriately chosen groups of examinees; c) conduct item analysis and item bias studies ("studies to identify differentially functioning test items"); d) carry out statistical linking or equating of forms if needed.
• Revision of test items/tasks: revise or delete items using the results from step 6c, and check the scoring rubrics for the performance tasks that were field tested.
• Test assembly: determine the test length, the number of forms needed, and the number of items/tasks per objective; select items from the available pool of valid test material; prepare test directions, practice questions, the test booklet layout, scoring keys, answer sheets and so on; and specify any modifications to instructions, medium of presentation, examinee response mode, or time requirements for finishing the items.
• Selection of performance standards: a) performance standards are needed to accomplish the test purpose; b) determine the performance standards; c) initiate and document the performance standards; d) identify alternative test score interpretations for examinees requiring alternative administration or other modifications.
• Pilot test (if possible): a) design the test administration to collect score reliability and validity information; b) administer the test form(s) to appropriately chosen groups of examinees; c) identify and evaluate alternative administrations or other modifications needed to meet individual specific needs that may affect the validity and reliability of the test or its forms; d) evaluate the test administration procedures, test items, and score reliability and validity; e) make final revisions to the forms of the test based on the available data.
• Preparation of manuals: a) prepare the test administrator's manual.
• Additional technical data collection: a) conduct reliability and validity investigations on a continuing basis.
• Item analysis: shortening or lengthening an existing test is done through item analysis, since the validity and reliability of any test depend on the characteristics of its items. There are two types: 1. qualitative analysis and 2. quantitative analysis.
• Qualitative item analysis looks at content validity (the content and form of the items, judged by expert opinion) and at effective item formulation. Quantitative item analysis looks at item difficulty and item discrimination (see the sketch below).
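As an illustration only, both quantitative statistics can be computed from a small made-up score matrix; the data below are invented, and the upper-lower halves method shown is one common way (among several) of estimating discrimination.

```python
# A sketch of the two quantitative item-analysis statistics named above,
# computed from a small made-up score matrix (rows = students, columns = items;
# 1 = correct, 0 = incorrect).
import numpy as np

scores = np.array([
    [1, 1, 0, 1],
    [1, 0, 0, 1],
    [1, 1, 1, 1],
    [0, 0, 0, 1],
    [1, 1, 0, 0],
    [0, 0, 0, 1],
])

total = scores.sum(axis=1)                 # each student's total score

# Item difficulty: proportion of students answering the item correctly.
difficulty = scores.mean(axis=0)

# Item discrimination (upper-lower method): difference in the proportion
# correct between the top and bottom halves of the class, ranked by total score.
order = np.argsort(total)
half = len(scores) // 2
lower, upper = scores[order[:half]], scores[order[-half:]]
discrimination = upper.mean(axis=0) - lower.mean(axis=0)

for i, (p, d) in enumerate(zip(difficulty, discrimination), start=1):
    print(f"item {i}: difficulty={p:.2f}, discrimination={d:+.2f}")
```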
Characteristics of Standardised Tests and Teacher-Made Tests
Standardised Tests
Some characteristics of these tests are:
1. They consist of items of high quality. The items are pretested and selected on the basis of
difficulty value, discrimination power, and relationship to clearly defined objectives in
behavioural terms.
2. As the directions for administering, exact time limit, and scoring are precisely stated, any
person can administer and score the test.
3. Norms, based on representative groups of individuals, are provided as an aid for interpreting
the test scores. These norms are frequently based on age, grade, sex, etc.
4. Information needed for judging the value of the test is provided. Before the test becomes
available, the reliability and validity are established.
5. A manual is supplied that explains the purposes and uses of the test, describes briefly how it
was constructed, provides specific directions for administering, scoring, and interpreting results,
contains tables of norms and summarizes available research data on the test.
No two standardized tests are exactly alike. Each test measures certain specific aspects of behaviour and serves a slightly different purpose. Some tests with similar titles measure aspects of behaviour that differ markedly, whereas other tests with dissimilar titles measure aspects of behaviour that are almost identical. Thus, one has to be careful in selecting a standardised test.
5. They provide information for curriculum planning and for remedial coaching of educationally backward children.
6. They also help the teacher to assess the effectiveness of his or her teaching and of school instructional programmes.
7. They provide data for tracing an individual's growth pattern over a period of years.
8. They help in organising better guidance programmes.
9. They evaluate the influence of courses of study, teachers' activities, teaching methods and other factors considered to be significant for educational practices.
Features of Teacher-Made Tests:
1. The items of the tests are arranged in order of difficulty
2. These are prepared by the teachers which can be used for prognosis and diagnosis purposes.
3. The test covers the whole content area and includes a large number of items.
4. The preparation of the items conforms to the blueprint.
5. Test construction is not a single person's business; rather, it is a co-operative endeavour.
6. A teacher-made test does not cover all the steps of a standardised test.
7. Teacher-made tests may also be employed as a tool for formative evaluation.
8. Preparation and administration of these tests are economical.
9. The test is developed by the teacher to ascertain the student’s achievement and proficiency in
a given subject.
10. Teacher-made tests are least used for research purposes.
11. They do not have norms whereas providing norms is quite essential for standardised tests.
Steps/Principles of Construction of Teacher-made Test:
A teacher-made test does not require the same elaborate preparation as a standardised test. Even then, to make it a more efficient and effective tool of evaluation, careful consideration needs to be given while constructing such tests.
The following steps may be followed for the preparation of teacher-made test:
1. Planning:
Planning of a teacher-made test includes:
a. Determining the purpose and objectives of the test, ‘as what to measure and why to measure’.
b. Deciding the length of the test and portion of the syllabus to be covered.
c. Specifying the objectives in behavioural terms. If needed, a table can even be prepared for
specifications and weightage given to the objectives to be measured.
d. Deciding the number and forms of items (questions) according to blueprint.
e. Having a clear knowledge and understanding of the principles of constructing essay type, short
answer type and objective type questions.
f. Deciding date of testing much in advance in order to give time to teachers for test preparation
and administration.
g. Seeking the co-operation and suggestion of co-teachers, experienced teachers of other schools
and test experts.
2. Preparation of the Test:
Planning is the philosophical aspect of test construction and preparation is the practical aspect. All the practical aspects have to be taken into consideration while constructing the test. It is an art, a technique; one either has it or has to acquire it. It requires much thinking, rethinking and reading before constructing test items.
Different types of objective test items, viz. multiple choice, short-answer type and matching type, can be constructed. After construction, test items should be given to others for review and for their opinions.
Suggestions may be sought on the language, the modalities of the items, the statements given, the correct answers supplied and other possible errors anticipated. The suggestions and views thus gathered will help the test constructor modify and verify the items afresh to make them more acceptable and usable.
After construction of the test, items should be arranged in a simple to complex order. For
arranging the items, a teacher can adopt so many methods viz., group-wise, unit-wise, topic wise
etc. Scoring key should also be prepared forthwith to avoid further delay in scoring.
Direction is an important part of test construction. Without proper directions or instructions, there is a probability of losing the authenticity and reliability of the test, and the students may also be confused.
Thus, the direction should be simple and adequate to enable the students to know:
(i) The time for completion of test,
(ii) The marks allotted to each item,
(iii) Required number of items to be attempted,
(iv) How and where to record the answers; and
(v) The materials, like graph papers or logarithmic table to be used.
Observation Techniques
In carrying out action research to improve teaching and learning, an important role of the
researcher/instructor is to collect data and evidence about the teaching process and student
learning. What follows is an introduction to some of the techniques which can be used for the
said purpose.
Student Assessment
Tests, examinations and continuous assessment can provide valuable data for action research.
For your teaching course, you have to set up a method of student assessment and your students
have to be assessed, so you might as well make use of it in your project.
You should, however, be clear about the nature of the information you can obtain from
examination results or assessment grades. Comparison of one set of results with another often
has limited validity as assignments, examinations, markers and marking schemes are rarely held
constant. In addition most assessment is norm referenced rather than criterion referenced.
You also need to be very clear as to what is being assessed. Examination grades may bear little
relationship to specific qualities you could be investigating. For example, if the theme of an
action research project is encouraging meaningful learning, then the examination results would
only be of value if they truly reflect meaningful learning. They would be of little value if they
consisted of problems which could be solved by substituting numbers into a remembered
formula, or essays which required the reproduction of sections from lecture notes. So think
carefully about the qualities which you wish to test and whether the assessment is a true test of
those qualities.
One way in which answers to assessment questions can be analysed for project purposes is by
dividing them into qualitative categories. A systematic procedure for establishing categories is
the SOLO taxonomy (Biggs and Collis, 1982). The SOLO taxonomy divides answers to written
assessment questions into five categories, judged according to the level of learning: prestructural,
unistructural, multistructural, relational and extended abstract. The five levels correspond to
answers ranging from the incorrect or irrelevant, through use of appropriate data, to integration
of data in an appropriate way, and ending in innovative extensions.
Closed Ended Questionnaires
Closed questionnaires are ones which constrain the responses to a limited number chosen by the
researcher; essentially it is a multiple choice format. Usually respondents are asked the extent to
which they agree or disagree with a given statement. Responses are recorded on a Likert scale,
such as the one in the example below, which ranges from 'definitely agree' to 'definitely
disagree'.
Questions should be carefully constructed so the meaning is clear and unambiguous. It is a good
idea to trial the questionnaire on a limited number of students before giving it to a whole group.
Closed questionnaires are easy to process and evaluate and can give clear answers to specific
questions. However, the questions are defined by the researcher, so could completely miss the
concerns of the respondents. You might therefore draw up the questions after a few exploratory
interviews, or include some open-ended questions to give respondents a chance to raise other
issues of concern.
A section of a typical closed questionnaire used for course evaluation is shown below.
Most institutions now have some form of standard teaching evaluation questionnaire available.
These may be of some help in evaluating a project but in most cases the questions will not be
sufficiently specific to the particular type of innovation which has been introduced. What might
be more helpful are the data banks of optional or additional questions which are available. These
can be used to pick or suggest questions which might be included in a more tailor-made
questionnaire.
Traditionally, questionnaire and survey data are collected using paper questionnaires and answer sheets. With the availability of web technology, there is now also the option of collecting survey data online.
To collect data using paper questionnaires, special answer sheets called OMR forms are often used. Respondents are asked to mark their answers to the questionnaire on the OMR forms, and an optical mark scanner is then used to read the marks. The process produces an electronic data file containing the responses to the questionnaire, which can then be analysed using software such as MS Excel or SPSS. At HKUST, both the optical mark scanner and OMR forms are available from ITSC.
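As a minimal sketch of that last analysis step, assuming the scan has produced a CSV file of Likert responses coded 1 to 5 (the file name and column names below are hypothetical), the item means and response frequencies could be computed with pandas:

```python
# A sketch of analysing scanned questionnaire data, assuming the scanner has
# produced a CSV file "responses.csv" with one row per respondent and one
# column per Likert item coded 1-5 (file name and columns are hypothetical).
import pandas as pd

responses = pd.read_csv("responses.csv")          # e.g. columns Q1, Q2, ..., Q10

# Mean rating per item (with 1 = definitely disagree, 5 = definitely agree).
print(responses.mean(numeric_only=True).round(2))

# Frequency distribution of responses for a single item.
print(responses["Q1"].value_counts().sort_index())
```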
Diary / Journal
Everyone involved in an action learning project should keep a diary or journal in which they
record:
• their initial reflections on the topic of concern
• the plans that were made
• a record of actions which were taken
• observation of the effects of the actions
• impressions and personal opinions about the action taken and reactions to them
• results obtained from other observation techniques
• references for, and notes on, any relevant literature or supporting documents which are
discovered.
Research reports are often very impersonal documents but this should not be the case for an
action learning journal - quite the contrary! It should contain a record of both what you did and
what you thought. In it you should regularly and systematically reflect critically on the effects of
your project and how it is progressing.
Journals act as the starting points for critical reflection at the regular meetings of the project
team. By sharing observations and reflections it is possible to fine-tune the innovation.
Sympathetic but critical discussion can also heighten awareness and contribute to changing
perspectives.
Supporting Documents
Keep copies of any documents which are relevant to the course(s) you are examining. These can
include:
• documents for the course development and accreditation process
• minutes of course committees
• the course syllabus
• memos between course team leaders and members
• handouts to students
• copies of tests and examinations
• lists of test results and student grades.
Interaction Schedules
Interaction schedules are methods for analysing and recording what takes place during a class. A
common approach is to note down at regular intervals (say every minute) who is talking, and to
categorise what they were saying or doing. An alternative to time sampling is event sampling in
which behaviour is noted every time a particular event occurs. Examples of categories could be:
tutor asking question, tutor giving explanation, tutor giving instruction, student answering
question or student asking question. The analysis can be made by an observer at the class or can
be made subsequently from a tape or video recording.
Below are profiles which compare the interactions during two tutorials. An observer noted, at
one minute intervals, who was talking and the type of communication. The plots can be used to
compare the extent to which the tutor dominated the session and the students contributed. The
example is adapted from Williams and Gillard (1986).
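A minimal sketch of how such a profile can be tallied, assuming one category code has been recorded per minute of the session; the codes and category names below are invented for illustration.

```python
# A sketch of summarising a time-sampled interaction schedule: one category
# code is recorded per minute of the tutorial (the codes below are invented),
# then tallied to build a profile like the one described above.
from collections import Counter

minute_codes = [
    "tutor_explains", "tutor_explains", "tutor_asks", "student_answers",
    "tutor_explains", "student_asks", "tutor_explains", "tutor_asks",
    "student_answers", "student_answers", "tutor_explains", "silence",
]

profile = Counter(minute_codes)
total = len(minute_codes)
for category, minutes in profile.most_common():
    print(f"{category:16s} {minutes:2d} min  ({minutes / total:.0%} of the session)")
```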
There are other approaches to recording and analysing happenings in a classroom situation.
McKernan (1991) discusses an extensive range of techniques, gives examples of each and
considers how the data gathered should be analysed.
Interviews
Interviews can provide even more opportunity for respondents to raise their own issues and
concerns, but are correspondingly more time-consuming and can raise difficulties in the collation
and interpretation of information. The format can be on a spectrum from completely open
discussion to tightly structured questions. Semi-structured interviews have a small schedule of
questions to point the interviewee towards an area of interest to the researcher, but then allow
interviewees to raise any items they like within the general topic area. Since interviews give an
opportunity for students to raise their own agenda they are useful when issues are open, or at an
exploratory stage. A small number of interviews can be useful to define issues for subsequent
more tightly structured questionnaires.
Interviews are normally tape recorded. If analysis, rather than just impressions, is required, then
transcripts have to be produced. The transcripts are normally analysed by searching for responses
or themes which commonly occur. Quotations from the transcripts can be used to illuminate or
illustrate findings reported in reports and papers.
There are computer programmes available to assist with the analysis of qualitative data. One
example is the programme NUDIST which has facilities for indexing, text-searching, using
Boolean operations on defined index nodes, and combining data from several initially
independent studies.
Student Learning Inventories
Student learning inventories are examples of empirically derived measuring instruments. There
are many inventories which purport to measure a wide range of characteristics. Student
learning inventories have been highlighted because they examine the quality of learning. In
particular they look at the categories of deep and surface learning. The inventories can be used to
compare groups of students, examine approaches before and after changes to teaching methods,
and to examine correlations with other variables.
The Study Process Questionnaire (SPQ) developed by John Biggs (1987) assesses students'
approaches to learning. Scores are obtained for each student on deep, surface and achieving
approach scales. The SPQ has been widely used in Hong Kong and its cultural applicability
widely researched. A detailed account of usage of the SPQ, together with tables of norms for
Hong Kong students for comparison purposes, is in Biggs (1992). The SPQ is available in
English, Chinese or bilingual versions.
For action learning projects, a suitable way to use the SPQ is to apply it at the start and end of the
innovation. Changes in SPQ scores can then be interpreted as a reflection of the teaching and
learning context. The results will indicate whether the innovation has encouraged meaningful
approaches to learning.
Biggs, J.B. (1987). The Study Process Questionnaire (SPQ): Manual. Hawthorn, Vic.: Australian
Council for Educational Research.
Biggs, J.B. (1992). Why and how do Hong Kong students learn? Using the Learning and Study
Process Questionnaire. Hong Kong: University of Hong Kong.
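A minimal sketch of the start-and-end comparison described above, assuming the SPQ scale scores (deep, surface and achieving) have already been computed for each student and stored in two hypothetical CSV files:

```python
# A sketch of comparing SPQ scale scores before and after an innovation; the
# file names and column names are hypothetical, and the scale scores are
# assumed to have been computed already following Biggs' manual.
import pandas as pd

pre = pd.read_csv("spq_pre.csv")     # columns: student_id, deep, surface, achieving
post = pd.read_csv("spq_post.csv")

merged = pre.merge(post, on="student_id", suffixes=("_pre", "_post"))
for scale in ["deep", "surface", "achieving"]:
    change = merged[f"{scale}_post"] - merged[f"{scale}_pre"]
    print(f"{scale:9s} mean change = {change.mean():+.2f}")
```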
Open Ended Questionnaires
Open questionnaires have a series of specific questions but leave space for respondents to answer
as they see fit. You are therefore more likely to find out the views of students but replies are
more difficult to analyse and collate. The usual procedure is to search for categories of common
responses.
An example of an open questionnaire is shown below.
It is not necessary to have separate questionnaires for open and closed items. The most
successful questionnaires often have both open and closed items.
Diagnosis of Student Conceptions
A good basis for improving your teaching is to diagnose your students' understanding of key
concepts in a course. It is often surprising how students can pass university examinations but still
have fundamental misunderstandings of key concepts. The usual method of diagnosing student
conceptions is to ask a question which applies the concept to an every-day situation: one which
cannot be answered by reproduction or by substitution into formulae. Answers are drawn from
the students in interviews or in written form.
The students' answers can usually be classified into a small number (usually two to five) of conceptions or misconceptions about the phenomenon. As with the analysis of interview data, care needs to be taken when deriving classifications. These do not automatically emerge from the transcript but are shaped by the experiences and knowledge of the researcher.
An example of the type of question, and categories of student conceptions which it uncovered is
given below (Dahlgren, 1984).
Tape Recording
Making tape recordings is a way of collecting a complete, accurate and detailed record of
discussions in class, conversations in interviews or arguments and decisions at meetings. It is
easy to obtain the recording; you simply take along cassettes and a portable recorder, and switch
it on. However, the presence of a tape recorder can inhibit discussion or influence people's
behaviour.
There are a number of ethical issues which need to be addressed over the use of tape recordings.
The group being taped should establish the purpose of making the recording and the way in
which the tapes will be used. If any quotations are made in subsequent reports it is customary to
maintain the anonymity of the source.
If you need to do a detailed analysis of the conversations then it will be necessary to produce a
transcript. This is a time-consuming and painstaking process, so limit the use of tape recordings
to situations where it is really necessary.
Triangulation
Triangulation is not a specific observation technique, but the process of comparing and cross-checking data from one source against another. If you do just a handful of interviews your
conclusions may be viewed with skepticism. But if the interview results concur with findings
from a questionnaire, trends in examination results and evidence from your journal, then the
conclusions are much more convincing. The message is simple; use more than one observation
technique in order to see whether your results are consistent.
Peer Appraisal and self-report techniques
Peer Appraisal definition
Employee assessments conducted by colleagues in the immediate working environment, i.e. the people the employee interacts with regularly. Peer appraisal processes exclude superiors and subordinates.
Peer appraisals are a form of performance appraisal which are designed to monitor and
improve job performance.
Peer appraisals can be broken down into specific measures. Peer ranking involves
workers ranking each member of the group from best to worst, either overall or on
various areas of performance or responsibility. In peer ratings, workers rate colleagues on
performance metrics while peer nomination is a simple nomination of the ‘best’ worker
either overall or on performance metrics.
Commonly-cited advantages of the peer appraisal process include insight and knowledge
– workers have their ‘ear to the ground’ and are often in the best position to appraise a
colleague’s performance. Peer appraisal also encourages a more inclusive team dynamic
as colleagues gain a deeper insight into the challenges their colleagues face, and
encourages development of a shared goal as workers realise they must impress their
colleagues and respond to their ideas, concerns and needs.
Self-report techniques describe methods of gathering data where participants provide information about themselves without interference from the experimenter. Such techniques include questionnaires, interviews, and even diaries, and typically involve responses to pre-set questions.
Evaluation of self-report methods
Strengths:
- Participants can be asked about their feelings and cognitions (i.e. thoughts), which can be more
useful than simply observing behaviour alone.
- Scenarios can be asked about hypothetically without having to physically set them up and
observe participants’ behaviour.
Weaknesses:
- Gathering information about thoughts or feelings is only useful if participants are willing to
disclose them to the experimenter.
- Participants may try to give the 'correct' responses they think researchers are looking for (or deliberately do the opposite), or try to come across in the most socially acceptable way (i.e. social desirability bias), which can lead to untruthful responses.
A self-report study is a type of survey, questionnaire, or poll in which respondents read the
question and select a response by themselves without researcher interference. A self-report is any
method which involves asking a participant about their feelings, attitudes, beliefs and so on.
Examples of self-reports are questionnaires and interviews; self-reports are often used as a way
of gaining participants' responses in observational studies and experiments.
Self-report studies have validity problems. Patients may exaggerate symptoms in order to make
their situation seem worse, or they may under-report the severity or frequency of symptoms in
order to minimize their problems. Patients might also simply be mistaken or misremember the
material covered by the survey.
Questionnaires and interviews
Questionnaires are a type of self-report method which consist of a set of questions usually in a
highly structured written form. Questionnaires can contain both open questions and closed
questions and participants record their own answers. Interviews are a type of spoken
questionnaire where the interviewer records the responses. Interviews can be structured whereby
there is a predetermined set of questions or unstructured whereby no questions are decided in
advance. The main strength of self-report methods is that they allow participants to describe their own experiences rather than having these inferred from observation. Questionnaires and interviews can often study large samples of people fairly easily and quickly. They can examine a large number of variables and ask people to reveal behaviour and feelings that have been experienced in real situations. However, participants
may not respond truthfully, either because they cannot remember or because they wish to present
themselves in a socially acceptable manner. Social desirability bias can be a big problem with
self-report measures as participants often answer in a way to portray themselves in a good light.
Questions are not always clear, and if the respondent has not really understood a question we would not be collecting valid data. If questionnaires are sent out, say via email or through tutor groups, the response rate can be very low. Questions can also be leading; that is, they may unwittingly push the respondent toward a particular reply.
Unstructured interviews can be very time consuming and difficult to carry out whereas structured
interviews can restrict the respondents' replies. Therefore psychologists often carry out semi-structured interviews, which consist of some pre-determined questions followed up with further questions that allow the respondent to develop their answers.
Open and closed questions
Questionnaires and interviews can use open or closed questions, or both.
Closed questions are questions which provide a limited choice (for example, a participant's age
or their favourite type of football team), especially if the answer must be taken from a
predetermined list. Such questions provide quantitative data, which is easy to analyse. However, these questions do not allow the participant to give in-depth insights.
Open questions are those questions which invite the respondent to provide answers in their own
words and provide qualitative data. Although these types of questions are more difficult to
analyse, they can produce more in-depth responses and tell the researcher what the participant
actually thinks, rather than being restricted by categories.
Rating scales
One of the most common rating scales is the Likert scale. A statement is used and the participant decides how strongly they agree or disagree with it. For example, the participant decides whether "Mozzarella cheese is great" with the options of "strongly agree", "agree", "undecided", "disagree", and "strongly disagree". One strength of Likert scales is that they can give an idea of how strongly a participant feels about something, and therefore give more detail than a simple yes/no answer. Another strength is that the data are quantitative, which are
easy to analyse statistically. However, there is a tendency with Likert scales for people to
respond towards the middle of the scale, perhaps to make them look less extreme. As with any
questionnaire, participants may provide the answers that they feel they should. Moreover,
because the data is quantitative, it does not provide in-depth replies.
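A minimal sketch, using the mozzarella statement above and invented responses, of how Likert responses can be coded numerically and summarised:

```python
# Invented responses to the statement "Mozzarella cheese is great".
scale = {"strongly agree": 5, "agree": 4, "undecided": 3, "disagree": 2, "strongly disagree": 1}
responses = ["agree", "undecided", "strongly agree", "undecided", "agree", "undecided"]

scores = [scale[r] for r in responses]
mean_score = sum(scores) / len(scores)
midpoint_share = scores.count(3) / len(scores)  # how many respondents sat in the middle of the scale
print(f"mean = {mean_score:.2f}, proportion choosing the midpoint = {midpoint_share:.0%}")
```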
Fixed-choice questions
Fixed-choice questions are phrased so that the respondent has to make a fixed-choice answer,
usually 'yes' or 'no'.
This type of questionnaire is easy to measure and quantify. However, it also prevents a participant from choosing an option that is not in the list, so respondents may not feel that their desired response is
available. For example, a person who dislikes all alcoholic beverages may feel that it is
inaccurate to choose a favorite alcoholic beverage from a list that includes beer, wine, and liquor,
but does not include none of the above as an option. Answers to fixed-choice questions are not
in-depth.
Reliability
Reliability refers to how consistent a measuring device is. A measurement is said to be reliable
or consistent if the measurement can produce similar results if used again in similar
circumstances. For example, if a speedometer gave the same readings at the same speed it would
be reliable; if it didn't, it would be pretty useless and unreliable. Importantly, the reliability of self-report measures, such as psychometric tests and questionnaires, can be assessed using the split-half method. This involves splitting a test into two halves and having the same participant complete both. If the two halves of the test produce similar results, this would suggest that the
test has internal reliability. There are a number of ways to improve the reliability of self-report
techniques. For example ambiguous questions could be clarified or in the case of interviews the
interviewers could be given training.
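One common way to apply the split-half method is to correlate participants' scores on the two halves of the test (for example, odd- versus even-numbered items) and then apply the Spearman-Brown correction. The sketch below uses invented item scores and the standard-library statistics module (Python 3.10 or later):

```python
from statistics import correlation  # available in the standard library from Python 3.10

# Invented item scores: five participants, eight questionnaire items each.
items = [
    [4, 5, 3, 4, 5, 4, 3, 4],
    [2, 3, 2, 2, 3, 2, 2, 3],
    [5, 5, 4, 5, 4, 5, 5, 4],
    [3, 3, 3, 4, 3, 3, 4, 3],
    [1, 2, 2, 1, 2, 1, 2, 2],
]

odd_half = [sum(row[0::2]) for row in items]    # totals on items 1, 3, 5, 7
even_half = [sum(row[1::2]) for row in items]   # totals on items 2, 4, 6, 8

r = correlation(odd_half, even_half)            # agreement between the two halves
reliability = 2 * r / (1 + r)                   # Spearman-Brown correction for full test length
print(f"split-half reliability = {reliability:.2f}")
```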
Validity
Validity refers to whether a study measures or examines what it claims to measure or examine.
Questionnaires are said to often lack validity for a number of reasons: participants may lie, give the answers they think are desired, and so on. One way of assessing the validity of a self-report measure is to compare its results with those of another self-report on the same topic (this is called concurrent validity). For example, if an interview is used to investigate sixth-grade students' attitudes toward smoking, the scores could be compared with those from a questionnaire on the same students' attitudes toward smoking. There are a number of ways to improve the validity of self-report techniques. For example, leading questions could be avoided, open questions could be
added to allow respondents to expand upon their replies and confidentiality could be reinforced
to allow respondents to give more truthful responses.
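Concurrent validity is often summarised as the correlation between the two measures. A minimal sketch, with invented scores standing in for the smoking-attitude example above:

```python
from statistics import correlation  # Python 3.10+

# Invented attitude scores for the same six students (higher = more negative attitude to smoking).
interview_scores = [12, 18, 9, 15, 20, 11]
questionnaire_scores = [14, 17, 10, 13, 19, 12]

r = correlation(interview_scores, questionnaire_scores)
print(f"concurrent validity coefficient r = {r:.2f}")
```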
Disadvantages
Self-report studies have many advantages, but they also suffer from specific disadvantages due to
the way that subjects generally behave. Self-reported answers may be exaggerated; respondents
may be too embarrassed to reveal private details; various biases may affect the results, like social
desirability bias. Subjects may also forget pertinent details. Self-report studies are inherently
biased by the person's feelings at the time they filled out the questionnaire. If a person feels bad
at the time they fill out the questionnaire, for example, their answers will be more negative. If the
person feels good at the time, then the answers will be more positive.
As with all studies relying on voluntary participation, results can be biased by a lack of
respondents, if there are systematic differences between people who respond and people who do
not. Care must be taken to avoid biases due to interviewers and their demand characteristics.
Types of performance based assessment
Performance based learning is when students participate in performing tasks or activities that are
meaningful and engaging. The purpose of this kind of learning is to help students acquire and
apply knowledge, practice skills, and develop independent and collaborative work habits. The
culminating activity or product for performance-based learning is one that lets a
student demonstrate evidence of understanding through a transfer of skills.
This form of learning is measured through a performance-based assessment, which is open-
ended and without a single, correct answer. The performance-based assessment should
be something that shows authentic learning such as the creation of a newspaper or
class debate. The benefit of these types of performance-based assessments is that when the
students are more actively involved in the learning process, they will absorb and understand the
material at a much deeper level. Other characteristics of performance-based assessments are that
they are complex and time-bound.
In addition, there are learning standards in each discipline that set academic expectations and
define what is proficient in meeting that standard. Performance based activities can integrate two
or more subjects and should also meet 21st Century expectations whenever possible:
• Creativity and Innovation
• Critical Thinking and Problem Solving
• Communication and Collaboration
There are also Information Literacy standards and Media Literacy standards that can be incorporated
into performance based learning.
Performance-based activities can be quite challenging for students to complete. They need to
understand from the beginning exactly what is being asked of them and how they will be
assessed.
Exemplars and models may help, but it is more important to provide detailed criteria that will be used to assess the performance. Those criteria should be incorporated into a scoring rubric.
Observations are an important part of evaluating performance-based assessments. Observations
can be used to provide students with feedback to improve performance. Teachers and students
can both use observations. There may be peer to peer student feedback. There could be a
checklist or a tally in order to record performance.
Students can take their experiences in performance-based learning to use at later points in their
educational, personal, or professional lives. The goal of performance-based learning should be to
enhance what the students have learned, not just have them recall facts.
Following are six different types of activities that can be developed as assessments for
performance-based learning.
Presentations
One easy way to have students complete a performance-based activity is to have them do a
presentation or report of some kind. This could be done by students individually, which takes more time, or in collaborative groups.
The basis for the presentation may be one of the following:
• Providing Information
• Teaching a Skill
• Reporting Progress
• Persuading Others
Students may choose to add visual aids, such as a PowerPoint presentation or Google Slides, to help illustrate elements of their speech. Presentations work well across the curriculum as long as there
is a clear set of expectations for students to work with from the beginning.
Portfolios
Student portfolios can include items that students have created and/or collected over a specific
period of time. Art portfolios are often used for students who want to apply to art programs in
college.
Another example is when students create a portfolio of their written work that shows how they
have progressed from the beginning to the end of class. This writing in a portfolio can be from
any discipline or from a combination of disciplines.
Some teachers have students select those items they feel represent their best work to be included
in a portfolio. The benefit of an activity like this is that it is something that grows over time and
is therefore not just completed and forgotten. A portfolio can provide students with a lasting
selection of artifacts that they can use later in their academic career.
Reflections may be included in student portfolios, in which students make note of their growth based on the materials collected. Portfolios may also include taped presentations, dramatic readings, or digital files.
Performances
Dramatic performances are one kind of collaborative activity that can be used as a performance-based assessment. Students can create, perform, and/or provide a critical response. Examples include dance, recitals, dramatic enactments, and prose or poetry interpretation.
This form of performance based assessment can take time, so there must be a clear pacing guide.
Students must be provided time to address the demands of the activity; resources must be readily
available and meet all safety standards. Students should have opportunities to draft stage work
and practice.
Developing the criteria and the rubric, and sharing these with students beforehand, is critical when assessing a dramatic performance.
Projects
Projects are quite commonly used by teachers as performance-based activities. They can include
everything from research papers to artistic representations of information learned. Projects
may require students to apply their knowledge and skills while completing the assigned task,
using creativity, critical thinking, analysis, and synthesis.
Students might be asked to complete reports, diagrams, and maps. Teachers can also choose to
have students work individually or in groups.
Journals may be part of a performance-based assessment. Journals can be used to record student
reflections. Teachers may require students to complete journal entries. Some teachers may use
journals as a way to record participation.
Exhibits And Fairs
Teachers can expand the idea of performance-based activities by creating exhibits or fairs for
students to display their work. Examples range from history fairs to art exhibitions.
Students work on a product or item that will be publicly exhibited.
Exhibitions show in-depth learning and may include feedback from viewers.
In some cases, students might be required to explain or 'defend' their work to those attending the
exhibition.
Some fairs like science fairs could include the possibility of prizes and awards.
Debates
A debate in the classroom is one form of performance-based learning that teaches students about
varied viewpoints and opinions. Skills associated with debate include research, media and argument literacy, reading comprehension, evidence evaluation, public speaking, and civic skills.
There are many different debate formats. One is the fishbowl debate, in which a handful of students sit in a half-circle facing the other students and debate a topic. The rest of the classmates may pose questions to the panel.
Another form is a mock trial where teams representing the prosecution and defense take on the
roles of attorneys and witnesses. A judge, or judging panel, oversees the courtroom presentation.
Middle schools and high schools can use debates in the classroom, with increasing levels of
sophistication by grade level.
Student Logs
Documenting student participation in physical activity (NASPE Standard 3) is often difficult.
Teachers can assess participation in an activity or skill practice trials completed outside of class
using logs. Practice trials during class that demonstrate student effort can also be documented
with logs. A log records behaviors over a period of time (see figure 14.1). Often the information
recorded shows changes in behavior, trends in performance, results of participation, progress, or
the regularity of physical activity. A student log is an excellent artifact for use in a portfolio.
Because logs are usually self-recorded documents, they are not used for summative assessments unless included as an artifact in a portfolio or for a project. If teachers want to increase the importance placed on a log, a method of verification by an adult or someone in authority should be added.
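A minimal sketch of how a self-recorded activity log might be stored and summarised (the entries, dates, and the verification flag are invented for illustration):

```python
from datetime import date

# Invented log entries: (date, minutes of activity, verified-by-adult flag).
log = [
    (date(2024, 3, 4), 30, True),
    (date(2024, 3, 6), 45, True),
    (date(2024, 3, 9), 20, False),
]

total_minutes = sum(minutes for _, minutes, _ in log)
verified_share = sum(1 for _, _, verified in log if verified) / len(log)
print(f"{len(log)} sessions, {total_minutes} minutes in total, {verified_share:.0%} verified by an adult")
```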
Journals
Journals can be used to record student feelings, thoughts, perceptions, or reflections about actual
events or results. The entries in journals often report social or psychological perspectives, both
positive and negative, and may be used to document the personal meaning associated with one’s
participation (NASPE Standard 6). Journal entries would not be an appropriate summative
assessment by themselves, but might be included as an artifact in a portfolio. Journal entries are
excellent ways for teachers to “take the pulse” of a class and determine whether students are
valuing the content of the class. Teachers must be careful not to assess affective domain journal
entries for the actual content, because doing so may cause students to write what teachers want to
hear (or give credit for) instead of true and genuine feelings. Teachers could hold students
accountable for completing journal entries. Some teachers use journals as a way to log
participation over time.
Using Observation in the Assessment Process
Human performance provides many opportunities for students to exhibit behaviors that may be
directly observed by others, a unique advantage of working in the psychomotor domain. Wiggins
(1998) uses physical activity when providing examples to illustrate complex assessment
concepts, as they are easier to visualize than would be the case with a cognitive example. The
nature of performing a motor skill makes assessment through observational analysis a logical
choice for many physical education teachers. In fact, investigations of measurement practices of
physical educators have consistently shown a reliance on observation and related assessment
methods (Hensley and East 1989; Matanin and Tannehill 1994; Mintah 2003).
Observation is a skill used with several performance-based assessments. It is often used to
provide students with feedback to improve performance. However, without some way to record
results, observation alone is not an assessment. Going back to the definition of assessment
provided earlier in the chapter, assessment is the gathering of information, analyzing the data,
and then using the information to make an evaluation. Therefore, some type of written product
must be produced if the task is to be considered an assessment.
Teachers and peers can assess others using observation. They might use a checklist or some type
of event recording scheme to tally the number of times a behavior occurred. Keeping game play
statistics is an example of recording data using event recording techniques. Students can self-
analyze their own performance and record their performances using criteria provided on a
checklist or a game play rubric. Table 14.1 is an example of a recording form that could be used
for peer assessment.
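A minimal sketch of event recording, tallying invented game-play behaviours with a simple counter:

```python
from collections import Counter

# Invented event-recording data from observing one student's game play.
observed_events = [
    "successful pass", "successful pass", "missed pass",
    "successful pass", "interception", "missed pass",
]

tally = Counter(observed_events)
for behaviour, count in tally.items():
    print(f"{behaviour}: {count}")
```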
When using peer assessment, it is best to have the assessor do only the assessment. When the
person recording assessment results is also expected to take part in the assessment (e.g., tossing
the ball to the person being assessed), he or she cannot both toss and do an accurate observation.
In the case of large classes, teachers might even use groups of four, in which one person is being
evaluated, a second person is feeding the ball, the third person is doing the observation, and a
fourth person is recording the results.
Guidelines for developing effective performance assessment
The performance development system at Wellesley College is designed to provide alignment
between the College’s mission, constituent needs and performance expectations. The program
fosters ongoing two-way communication between employees and managers; supports the
development of clear, consistent, and measurable goals linked directly to Wellesley’s core values
and competencies; helps to articulate and support training needs and career development; and
establishes the criteria for making reward and recognition decisions.
Effective performance development at Wellesley College begins with respect for one another and
ends with excellence in performance. It is the responsibility of every supervisor to communicate
on an ongoing basis with their employees. These conversations should provide clear and honest
role expectations and feedback and should help identify improvement, development, and career
issues. Each employee has a responsibility to participate fully in these conversations, be sure
they understand their role responsibilities and expectations, and communicate any obstacles or
training needed in order to perform their role at an optimum level.
The Performance Development Annual Summary Meeting
Performance development should be happening all year long. When a manager compliments an
employee for a job well done or coaches an employee through a difficult situation, that is part of
performance development. Wellesley’s performance development process includes a summary
review assessment that should bring closure to the performance period and provide a basis for
performance development for the next period. The following suggestions help set the stage for a
productive discussion.
1. Establish the proper climate.
Create a sincere, open, and constructive atmosphere.
Schedule the meeting in advance and stick to it.
Allow enough time to discuss the review.
Locate a private space and guard against interruptions.
2. Make it clear that this is a joint discussion.
Listen and ask for the employee’s opinion.
Avoid words or body language that criticize the employee’s view.
Understand your employee’s point of view. Working together is better than being at odds.
Be willing to modify the Performance Development Document to reflect what is discussed and
agreed upon at the meeting.
3. Discuss the role document and performance requirements.
Explore the competencies required for successful performance.
Update the role document if needed.
4. Discuss goals for the performance review period.
Review whether the goals were met.
Discuss obstacles and roadblocks that affected goal achievement.
5. Discuss opportunities for growth and development in the current role or a different role.
Discuss the employee’s developmental and career goals.
Remember there is also the opportunity for growth and development within the current role.
There are new things to be done and more effective and efficient ways to accomplish work.
Either at this meeting or a separate meeting, develop goals for the coming year. Refer to
Guidelines for Setting Goals and Objectives for additional information on setting goals.
Remember, performance development is about ongoing two-way communication between the
employee and their supervisor. The annual performance appraisal should be a summary of
various meetings throughout the year (interim goal reviews/updates). There should be no
surprises at this summary meeting.
Preparing for Annual Performance Development Discussions
Tips for the Employee: Employees have a responsibility in the performance development process
and should be prepared to give feedback to their manager.
Review your current role document. Does it reflect your current role in the department? If not,
discuss with your supervisor about revising your role document.
Review your goals for the year. Have they been met? Review your achievements. Think about
obstacles/roadblocks you encountered and how you dealt with them.
Is there anyone else your supervisor should speak with before preparing your evaluation? Let your
supervisor know this before the review meeting.
Review the competencies required for administrative staff positions at Wellesley. Identify specific
areas of expertise or skills that you would like to develop or improve. Identify your strengths. In
what areas have you improved? Can you identify any developmental goals for the coming year?
What ideas do you have for changes that would help you perform your role better and/or improve the
operation of the department? Think about obstacles/roadblocks that you face in performing your
responsibilities and what help is needed from your supervisor to overcome them.
If you manage others, what have you done to develop/strengthen your staff’s performance and
skills?
Tips for the Supervisor: The supervisor is responsible for ongoing communication about
performance throughout the year. Performance problems should be addressed as they occur. There
should be no surprises in the end-of-the-year summary. The supervisor is responsible for preparing
the summary documentation.
Review the employee’s role document. Does it reflect their current role in the department?
Review the primary position responsibilities. Has the employee effectively performed these? What is
your overall assessment of how these responsibilities were performed?
Review the employee’s goals from last year. Were goals modified or changed during the review
period? Have the goals been met? Have you been able to provide the employee with the tools and
support to get the job done?
Review last year’s appraisal. How does this year compare to last year? Have there been
improvements?
Consider whether you need to speak with anyone else in order to have a more complete and accurate
picture of your employee’s performance.
Review the competencies required for administrative staff roles at Wellesley. Assess the employee’s
strengths, weaknesses and areas of greatest improvement. Is there a specific area where you would
like to establish a developmental goal?
What suggestions do you have for the employee that will help improve their performance in their
role or the overall operations of the department?
If the employee supervises others, discuss what he or she has done to strengthen their own staff. Ask
about regular communication of information, job expectations, and feedback.
Contact the Human Resources Office for assistance if substantial performance issues exist.
Finalizing the Performance Development Document
The supervisor is responsible for completing the final draft of the Performance Development
Document and forwarding the completed document to Human Resources to become part of the
employee’s personnel file. Send a hard copy so that signatures are included.
The supervisor should provide a copy of the final Performance Development Document to the
employee.
The employee should sign the Performance Development Document. Signing the Performance
Development Document indicates that the employee has met with their supervisor to provide input to
the document, that they have reviewed the document, and that they have met with the supervisor to
discuss it. The employee has the right to respond to the evaluation in writing.
Tips on Ongoing, Effective Feedback
Feedback involves treating each other with respect.
Constructive feedback tries to reinforce the positive and change the negative by:
Identifying what was done well or poorly.
Describing what action or behavior is desired.
Explaining the effects of the observed and desired acts of behavior.
Good feedback is timely. Give the feedback as quickly as possible after the event. Feedback long
delayed is rarely effective.
Feedback involves both parties listening carefully. Check for clarity to ensure that the receiver fully
understands what is being said.
Good feedback should be specific. Generalized feedback does not explain what behavior to repeat or
avoid. Describe exactly what was done well and/or what could be improved. For example, “This
report is well organized and the summary clearly states your conclusions and proposed actions”
rather than “Good report.”
Keep feedback objective. Use factual records and information whenever possible. Include details
that focus on specific actions and results rather than characteristics of the employee. For example,
say “this happened” rather than “you are.” “You hung up the phone without saying good-bye.” rather
than “you are rude.”
Feedback about performance issues is best delivered in person. The employee will have a chance to
respond to any issues raised. Especially avoid delivering negative feedback via e-mail messages.
Performance criteria
The National Center for Research, Evaluation, Standards, and Student Testing (1996) defines
criteria as "guidelines, rules, characteristics, or dimensions that are used to judge the quality of
student performance. Criteria indicate what we value in student responses, products, or
performances."
With performance assessments such as a lab, group project, portfolio, task, or presentation, students
need to clearly know and understand what performance criteria will be used to judge their
performance. Although student interpretations are important, educators need to recognize that on the
basis of cultural and environmental norms, explanations that seem diametrically opposed may be
equally defensible or right.
Because this quality of complexity allows performance assessments to mirror real life, educators
need to explicitly include the exact parameters of the responses they want to elicit in each
assessment task or problem. (For example, educators should make sure students know if the writing
process--rather than punctuation and grammar--is the criterion on which performance will be judged,
or if a paragraph--as opposed to a few words--is the criterion response.)
The problem of interpretation differences that result when performance criteria (requirements) are
ambiguous is compounded when students have diverse experiences based on their ethnicity, primary
language, or gender. In an effort to assess higher-order cognitive skills and complex problem
solving, educators need to develop appropriate learning assessments that have no single right answer
and in which students' interpretation of information or evidence is key in defending their solution.
Scoring Rubrics
Scoring rubrics are descriptive scoring schemes developed to assess any student performance
whether it's written or oral, online or face-to-face. Scoring rubrics are especially well suited for
evaluating complex tasks or assignments such as: written work (e.g., assignments, essay tests,
papers, portfolios); presentations (e.g., debates, role plays); group work; or other types of work
products or performances (e.g., artistic works, portfolios). Scoring rubrics are assignment-
specific; criteria are different for each assignment or test. It is a way to make your criteria and
standards clear to both you and your students.
Good scoring rubrics:
• Consist of a checklist of items, each with an even number of points. For example, two-
point rubrics would indicate that the student either did or did not perform the specified
task. Four or more points in a rubric are common and indicate the degree to which a
student performed a given task.
• Are criterion based. That is, the rubric contains descriptive criteria for acceptable
performance that are meaningful, clear, concise, unambiguous, and credible--thus
ensuring inter-rater reliability.
• Are used to assess only those behaviors that are directly observable.
• Require a single score based on the overall quality of the work or presentation.
• Provide a better assessment and understanding of expected or actual performance.
(A sample rubric for quizzes and homework is not reproduced here.)
Why Develop Scoring Rubrics?
Here are some reasons why taking the time to construct a grading rubric will be worth your time:
• Make grading more consistent and fair.
• Save you time in the grading process.
• Help identify students' strengths and weaknesses so you can teach more effectively.
• Help students understand what and how they need to improve.
Guidelines for Developing a Scoring Rubric
Step 1: Select a project/assignment for assessment.
Example: Work in small groups to write and present a collaborative research paper.
Step 2: What performance skill(s) or competency(ies) are students demonstrating through their
work on this project?
Example: Ability to work as part of a team.
Step 3: List the traits you'll assess when evaluating the project--in other words, ask: "What
counts in my assessment of this work?" Use nouns or noun phrases to name traits, and avoid
evaluative language. Limit the number of traits to no more than seven. Each trait should
represent a key teachable attribute of the overall skill you're assessing.
Example:
Content
Coherence and Organization
Creativity
Graphics and visuals
Delivery
Step 4: Decide on the number of gradations of mastery you'll establish for each trait and the
language you'll use to describe those levels.
Five points of gradation:
5=Proficient 4=Clearly Competent 3=Acceptable 2=Limited 1=Attempted
Four points of gradation:
Exceptional/Excellent Admirable/Good Acceptable/Fair Amateur/Poor
Step 5: For each trait write statements that describe work at each level of mastery. If, for
example, you have seven traits and five gradations, you'll have 35 descriptive statements in your
rubric. Attempt to strike a balance between over-generalizations and task-specificity. For the trait
"coherence and organization" in a four-point rubric:
Exceptional:
Thesis is clearly stated and developed; specific examples are
appropriate and clearly develop thesis; conclusion is clear; ideas flow
together well; good transitions; succinct but not choppy; well-
organized.
Admirable:
Most information presented in logical sequence; generally very
organized, but better transitions between ideas are needed.
Acceptable:
Concept and ideas are loosely connected; lacks clear transitions; flow
and organization are choppy.
Amateur:
Presentation of ideas is choppy and disjointed; doesn't flow;
development of thesis is vague; no apparent logical order to writing.
Step 6: Design a format for presenting the rubric to students and for scoring student work.
Step 7: Test the rubric and fine tune it based on feedback from colleagues and students.
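As an illustration only (the level wording below is condensed from the "coherence and organization" example above, and the "delivery" descriptors are invented), a rubric of this kind can be represented and used to total a score as follows:

```python
# Illustrative rubric: one trait condensed from the example above, one invented.
rubric = {
    "coherence and organization": {
        4: "Thesis clearly stated and developed; ideas flow well; well organized.",
        3: "Mostly logical sequence; generally organized but transitions need work.",
        2: "Ideas loosely connected; lacks clear transitions; choppy flow.",
        1: "Choppy and disjointed; vague thesis; no apparent logical order.",
    },
    "delivery": {
        4: "Clear, confident, well paced.",
        3: "Generally clear; occasional lapses in pacing.",
        2: "Frequently unclear or rushed.",
        1: "Difficult to follow throughout.",
    },
}

# A scorer assigns one level per trait; the overall score is the total across traits.
scores = {"coherence and organization": 3, "delivery": 4}
total = sum(scores.values())
maximum = sum(max(levels) for levels in rubric.values())
print(f"total: {total}/{maximum}")
for trait, level in scores.items():
    print(f"- {trait} ({level}): {rubric[trait][level]}")
```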
Check list and rating scale
What is a checklist?
A checklist is just what it sounds like: a list that educators check off. Using this method is a little
bit like going bird watching. Start with a list of items you want to observe and then check off
each item when appropriate.
One popular choice for educators is to use developmental checklists to record what they have
observed about individual children; these developmental checklists consist of lists of skills from
the different developmental domains for a specific age range.
Why use checklists?
Checklists are quick and easy to use, so they are popular with educators. They can be used to
record observations in virtually any situation, and do not require the educator to spend much time
recording data; in general, a few moments is all it takes. One other advantage is that there are
many different pre-made checklists available for use from a variety of sources. For example,
certain websites connected with early childhood education (ECE) offer developmental checklists that educators can
download and print out. Educators can also create a checklist that exactly meets their needs,
depending on what they want to observe and record.
How do I use a checklist?
As it is such a popular choice for educators, the example we will present here shows how to use a
developmental checklist. These developmental checklists are generally used to record
observations of one child at a time. The list of skills is targeted for a specific age group (e.g. 12
to 24 months). They may be divided into the different developmental domains or focus only on
one aspect of a child’s development.
Once you have chosen or created a checklist, you then observe the child in a variety of natural
contexts and check off all the relevant skills or behaviours. Usually, there is a space to indicate
the relevant date(s) on the checklist, as this might be an important piece of data.
As the checklist method does not allow for the recording of a lot of qualitative data, you might
choose to have a column for comments.
Sample checklist for language development: Two-year-olds
A blank checklist could look something like this:
Child’s Name: Alan
Behaviour/Skill                            Date             Comments
Communicates with gestures and pointing
Shakes head for no
Uses one-word sentences
Uses two-word sentences
Names familiar objects
Follows simple instructions
Enjoys songs and rhymes
Refers to self as "me" or "I"
Once you begin filling in the checklist, it will start to look something like this:
Child’s Name: Alan
Behaviour/Skill                            Date             Comments
Communicates with gestures and pointing    March 9, 2012
Shakes head for no                         March 9, 2012
Uses one-word sentences                    March 10, 2012
Uses two-word sentences                    March 29, 2012   "My book"
Names familiar objects
Follows simple instructions                April 15, 2012
Enjoys songs and rhymes                    March 5, 2012    Loves Hokey Pokey
Refers to self as "me" or "I"              March 20, 2012   Taps self on chest, says "Ayan"
Note that, in general, behaviours and/or skills that you have not yet observed, or that the child
has not yet mastered, are left blank, so that you can update the checklist as needed.
In some cases, you may want to add a comment like the one in the last box in the sample above.
In this example, Alan’s strategies for referring to himself are significant, even if he is not yet
demonstrating the specific behaviour from the checklist.
Using a rating scale
Sometimes educators feel limited by a checklist because this method only allows the observer to
record if a child uses a specific skill or not. In this case, they might choose to add a rating scale
to their observations. By adding a rating scale, an educator can rate the quality, frequency or ease
with which a child uses a certain skill.
If you were to add a rating scale to your checklist, it might look like this:
Child’s Name: Alan
Date: March/April 2012
Behaviour/Skill                            Usually   Frequently   Rarely   Never   Comments
Communicates with gestures and pointing
Shakes head for no
Uses one-word sentences
Uses two-word sentences
Names familiar objects
Follows simple instructions
Enjoys songs and rhymes
Refers to self as "me" or "I"
Once you begin filling it in, it could look something like this:
Child’s Name: Alan
Date: March/April 2012
Behaviour/Skill                            Usually   Frequently   Rarely   Never   Comments
Communicates with gestures and pointing
Shakes head for no
Uses one-word sentences
Uses two-word sentences                                                            "My book"
Names familiar objects
Follows simple instructions
Enjoys songs and rhymes
Refers to self as "me" or "I"                                                      Taps self on chest, says "Ayan"
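A checklist or rating scale like the ones above could also be kept digitally. The sketch below is illustrative only; the ratings shown are invented, since the sample above records only dates and comments:

```python
# Illustrative digital record of a developmental checklist with a rating scale.
# The ratings below are invented; only the dates and comments come from the sample above.
checklist = {
    "child": "Alan",
    "period": "March/April 2012",
    "items": {
        "Uses two-word sentences": {"rating": "Frequently", "date": "March 29, 2012", "comment": '"My book"'},
        'Refers to self as "me" or "I"': {"rating": "Rarely", "date": "March 20, 2012",
                                          "comment": 'Taps self on chest, says "Ayan"'},
        "Names familiar objects": {"rating": None, "date": None, "comment": None},  # not yet observed
    },
}

# Skills left blank (not yet observed or mastered) can be listed for follow-up observation.
to_observe = [skill for skill, record in checklist["items"].items() if record["rating"] is None]
print("Still to observe:", ", ".join(to_observe))
```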
Purpose of portfolio
A student portfolio is a compilation of academic work and other forms of educational evidence
assembled for the purpose of (1) evaluating coursework quality, learning progress, and academic
achievement; (2) determining whether students have met learning standards or other academic
requirements for courses, grade-level promotion, and graduation; (3) helping students reflect on
their academic goals and progress as learners; and (4) creating a lasting archive of academic
work products, accomplishments, and other documentation. Advocates of student portfolios
argue that compiling, reviewing, and evaluating student work over time can provide a richer,
deeper, and more accurate picture of what students have learned and are able to do than more
traditional measures—such as standardized tests, quizzes, or final exams—that only measure
what students know at a specific point in time.
Portfolios come in many forms, from notebooks filled with documents, notes, and graphics to
online digital archives and student-created websites, and they may be used at the elementary,
middle, and high school levels. Portfolios can be a physical collection of student work that
includes materials such as written assignments, journal entries, completed tests, artwork, lab
reports, physical projects (such as dioramas or models), and other material evidence of learning
progress and academic accomplishment, including awards, honors, certifications,
recommendations, written evaluations by teachers or peers, and self-reflections written by
students. Portfolios may also be digital archives, presentations, blogs, or websites that feature the
same materials as physical portfolios, but that may also include content such as student-created
videos, multimedia presentations, spreadsheets, websites, photographs, or other digital artifacts
of learning.
Online portfolios are often called digital portfolios or e-portfolios, among other terms. In some
cases, blogs or online journals may be maintained by students and include ongoing reflections
about learning activities, progress, and accomplishments. Portfolios may also be presented—
publicly or privately—to parents, teachers, and community members as part of a demonstration
of learning, exhibition, or capstone project.
It’s important to note that there are many different types of portfolios in education, and each
form has its own purpose. For example, “capstone” portfolios would feature student work
completed as part of long-term projects or final assessments typically undertaken at the
culmination of a middle school or high school, or at the end of a long-term, possibly multiyear
project. Some portfolios are only intended to evaluate learning progress and achievement in a
specific course, while others are maintained for the entire time a student is enrolled in a school.
And some portfolios are used to assess learning in a specific subject area, while others evaluate
the acquisition of skills that students can apply in all subject areas.
The following arguments are often made by educators who advocate for the use of portfolios in
the classroom:
• Student portfolios are most effective when they are used to evaluate student learning
progress and achievement. When portfolios are used to document and evaluate the
knowledge, skills, and work habits students acquire in school, teachers can use them to
adapt instructional strategies when evidence shows that students either are or are not
learning what they were taught. Advocates typically contend that portfolios should be
integrated into and inform the instructional process, and students should incrementally build
out portfolios on an ongoing basis—i.e., portfolios should not merely be an idle archive of
work products that’s only reviewed at the end of a course or school year.
• Portfolios can help teachers monitor and evaluate learning progress over time. Tests
and quizzes give teachers information about what students know at a particular point in
time, but portfolios can document how students have grown, matured, and improved as
learners over the course of a project, school year, or multiple years. For this reason, some
educators argue that portfolios should not just be compilations of a student’s best work, but
rather they should include evidence and work products that demonstrate how students
improved over time. For example, multiple versions of an essay can show how students
revised and improved their work based on feedback from the teachers or their peers.
• Portfolios help teachers determine whether students can apply what they have learned
to new problems and different subject areas. A test can help teachers determine, for
example, whether students have learned a specific mathematical skill. But can those students
also apply that skill to a complex problem in economics, geography, civics, or history? Can
they use it to conduct a statistical analysis of a large data set in a spreadsheet? Or can they
use it to develop a better plan for a hypothetical business? (Educators may call this ability to
apply skills and knowledge to novel problems and different domains “transfer of
learning”). Similarly, portfolios can also be used to evaluate student work and learning in
non-school contexts. For example, if a student participated in an internship or completed a
project under the guidance of an expert mentor from the community, students could create
portfolios over the course of these learning activities and submit them to their teachers or
school as evidence they have met certain learning expectations or graduation requirements.
• Portfolios can encourage students to take more ownership and responsibility over the
learning process. In some schools, portfolios are a way for students to critique and evaluate
their own work and academic progress, often during the process of deciding what will be
included in their portfolios. Because portfolios document learning growth over time, they
can help students reflect on where they started a course, how they developed, and where
they ended up at the conclusion of the school year. When reviewing a portfolio, teachers
may also ask students to articulate the connection between particular work products and the
academic expectations and goals for a course. For these reasons, advocates of portfolios
often recommend that students be involved in determining what goes into a portfolio, and
that teachers should not unilaterally make the decisions without involving students. For
related discussions, see student engagement and student voice.
• Portfolios can improve communication between teachers and parents. Portfolios can
also help parents become more informed about the education and learning progress of their
children, what is being taught in a particular course, and what students are doing and
learning in the classroom. Advocates may also contend that when parents are more informed
about and engaged in their child’s education, they can play a more active role in supporting
their children at home, which could have a beneficial effect on academic achievement and
long-term student outcomes.
Debate
While portfolios are not generally controversial in concept, it’s possible that skepticism,
criticism, and debate may arise if portfolios are viewed as burdensome, add-on requirements
rather than as a vital instructional strategy and assessment option. Portfolios may also be viewed
negatively if they are poorly designed and executed, if they tend to be filed away and forgotten,
if they are not actively maintained by students, if they are not meaningfully integrated into the
school’s academic program, if educators do not use them to inform and adjust their instructional
techniques, or if sufficient time is not provided during the school day for teachers and students to
review and discuss them. In short, how portfolios are actually used or not used in schools, and
whether they produce the desired educational results, will likely determine how they are
perceived.
Creating, maintaining, and assessing student portfolios can also be a time-consuming endeavor.
For this reason and others, some critics may contend that portfolios are not a practical or feasible
option for use in large-scale evaluations of school and student performance. (Just imagine, for
example, what it would require in terms of funding, time, and human resources to evaluate
dozens or hundreds of pages of academic documentation produced by each of the tens of
thousands of eleventh-grade students scattered across a state in any given year.)
Standardized tests, in contrast, are relatively efficient and inexpensive to score, and test results
are considered more reliable or comparable across students, schools, or states, given that there is
less chance that error, bias, or inconsistency may occur during the scoring process (in large part
because most standardized tests today are scored in full or in part by automated machines,
computers, or online programs). Student portfolios are a comparatively time-consuming—and therefore far more expensive—assessment strategy because they require human scorers, and it is also far more challenging to maintain consistent and reliable evaluations of student achievement
across different scorers. Many advocates would argue, however, that portfolios are not intended
for use in large-scale evaluations of school and student performance, and that they provide the
greatest educational value at the classroom level where teachers have personal relationships and
conversations with students, and where in-depth feedback from teachers can help students grow,
improve, and mature as learners.
Evaluation criteria and using portfolios in instruction and
communication
WHAT IS PORTFOLIO ASSESSMENT?
In program evaluation as in other areas, a picture can be worth a thousand words. As an
evaluation tool for community-based programs, we can think of a portfolio as a kind of
scrapbook or photo album that records the progress and activities of the program and its
participants, and showcases them to interested parties both within and outside of the program.
While portfolio assessment has been predominantly used in educational settings to document the
progress and achievements of individual children and adolescents, it has the potential to be a
valuable tool for program assessment as well.
Many programs do keep such albums, or scrapbooks, and use them informally as a means of
conveying their pride in the program, but most do not consider using them in a systematic way as
part of their formal program evaluation. However, the concepts and philosophy behind portfolios
can apply to community evaluation, where portfolios can provide windows into community
practices, procedures, and outcomes, perhaps better than more traditional measures.
Portfolio assessment has become widely used in educational settings as a way to examine and
measure progress, by documenting the process of learning or change as it occurs. Portfolios
extend beyond test scores to include substantive descriptions or examples of what the student is
doing and experiencing. Fundamental to "authentic assessment" or "performance assessment" in
educational theory is the principle that children and adolescents should demonstrate, rather than
tell about, what they know and can do (Cole, Ryan, & Kick, 1995). Documenting progress
toward higher order goals such as application of skills and synthesis of experience requires
obtaining information beyond what can be provided by standardized or norm-based tests. In
"authentic assessment", information or data is collected from various sources, through multiple
methods, and over multiple points in time (Shaklee, Barbour, Ambrose, & Hansford, 1997).
Contents of portfolios (sometimes called "artifacts" or "evidence") can include drawings, photos,
video or audio tapes, writing or other work samples, computer disks, and copies of standardized
or program-specific tests. Data sources can include parents, staff, and other community members
who know the participants or program, as well as the self-reflections of participants themselves.
Portfolio assessment provides a practical strategy for systematically collecting and organizing
such data.
PORTFOLIO ASSESSMENT IS MOST USEFUL FOR:
*Evaluating programs that have flexible or individualized goals or outcomes. For example,
within a program with the general purpose of enhancing children's social skills, some individual
children may need to become less aggressive while other, shy children may need to become more assertive. Each child's portfolio assessment would be geared to his or her individual needs and goals.
*Allowing individuals and programs in the community (those being evaluated) to be involved in
their own change and decisions to change.
*Providing information that gives meaningful insight into behavior and related change. Because
portfolio assessment emphasizes the process of change or growth, at multiple points in time, it
may be easier to see patterns.
*Providing a tool that can ensure communication and accountability to a range of audiences.
Participants, their families, funders, and members of the community at large who may not have
much sophistication in interpreting statistical data can often appreciate more visual or
experiential "evidence" of success.
*Allowing for the possibility of assessing some of the more complex and important aspects of
many constructs (rather than just the ones that are easiest to measure).
PORTFOLIO ASSESSMENT IS NOT AS USEFUL FOR:
*Evaluating programs that have very concrete, uniform goals or purposes. For example, it would
be unnecessary to compile a portfolio of individualized "evidence" in a program whose sole
purpose is full immunization of all children in a community by the age of five years. The
required immunizations are the same, and the evidence is generally clear and straightforward.
*Allowing you to rank participants or programs in a quantitative or standardized way (although
evaluators or program staff may be able to make subjective judgements of relative merit).
*Comparing participants or programs to standardized norms. While portfolios can (and often do)
include some standardized test scores along with other kinds of "evidence", this is not the main
purpose of the portfolio.
USING PORTFOLIO ASSESSMENT WITH THE STATE STRENGTHENING
EVALUATION GUIDE
Tier 1 - Program Definition
Using portfolios can help you to document the needs and assets of the community of interest.
Portfolios can also help you to clarify the identity of your program and allow you to document
the "thinking" behind the development of and throughout the program. Ideally, the process of
deciding on criteria for the portfolio will flow directly from the program objectives that have
been established in designing the program. However, in a new or existing program where the
original objectives are not as clearly defined as they need to be, program developers and staff
may be able to clarify their own thinking by visualizing what successful outcomes would look
like, and what they would accept as "evidence". Thus, thinking about portfolio criteria may
contribute to clearer thinking and better definition of program objectives.
Tier 2 - Accountability
Critical to any form of assessment is accountability. In the educational arena for example,
teachers are accountable to themselves, their students, families, schools, and society.
The portfolio is an assessment practice that can inform all of these constituents. The process of
selecting "evidence" for inclusion in portfolios involves ongoing dialogue and feedback between
participants and service providers.
Tier 3 - Understanding and Refining
Portfolio assessment of the program or participants provides a means of conducting assessments
throughout the life of the program, as the program addresses the evolving needs and assets of
participants and of the community involved. This helps to maintain focus on the outcomes of the
program and the steps necessary to meet them, while ensuring that the implementation is in line
with the vision established in Tier 1.
Tier 4 - Progress Toward Outcomes
Items are selected for inclusion in the portfolio because they provide "evidence" of progress
toward selected outcomes. Whether the outcomes selected are specific to individual participants
or apply to entire communities, the portfolio documents steps toward achievement. Usually it is
most helpful for this selection to take place at regular intervals, in the context of conferences or
discussions among participants and staff.
Tier 5 - Program Impact
One of the greatest strengths of portfolio assessment in program evaluation may be its power as a
tool to communicate program impact to those outside of the program. While this kind of data
may not take the place of statistics about numbers served, costs, or test scores, many policy
makers, funders, and community members find visual or descriptive evidence of successes of
individuals or programs to be very persuasive.
ADVANTAGES OF USING PORTFOLIO ASSESSMENT
*Allows the evaluators to see the student, group, or community as an individual, each unique with
its own characteristics, needs, and strengths.
*Serves as a cross-section lens, providing a basis for future analysis and planning. By viewing
the total pattern of the community or of individual participants, one can identify areas of
strengths and weaknesses, and barriers to success.
*Serves as a concrete vehicle for communication, providing ongoing communication or
exchanges of information among those involved.
*Promotes a shift in ownership; communities and participants can take an active role in
examining where they have been and where they want to go.
*Portfolio assessment offers the possibility of addressing shortcomings of traditional assessment.
It offers the possibility of assessing the more complex and important aspects of an area or topic.
*Covers a broad scope of knowledge and information, from many different people who know the
program or person in different contexts (e.g., participants, parents, teachers or staff, peers, or
community leaders).
DISADVANTAGES OF USING PORTFOLIO ASSESSMENT
*May be seen as less reliable or fair than more quantitative evaluations such as test scores.
*Can be very time consuming for teachers or program staff to organize and evaluate the contents,
especially if portfolios have to be done in addition to traditional testing and grading.
*Having to develop your own individualized criteria can be difficult or unfamiliar at first.
*If goals and criteria are not clear, the portfolio can be just a miscellaneous collection of artifacts
that don't show patterns of growth or achievement.
*Like any other form of qualitative data, data from portfolio assessments can be difficult to
analyze or aggregate to show change.
HOW TO USE PORTFOLIO ASSESSMENT
Design and Development
Three main factors guide the design and development of a portfolio: 1) purpose, 2) assessment
criteria, and 3) evidence (Barton & Collins, 1997).
1) Purpose
The primary concern in getting started is knowing the purpose that the portfolio will serve.
This decision defines the operational guidelines for collecting materials. For example, is the goal
to use the portfolio as data to inform program development? To report progress? To identify
special needs? For program accountability? For all of these?
2) Assessment Criteria
Once the purpose or goal of the portfolio is clear, decisions are made about what will be
considered success (criteria or standards) and what strategies are necessary to meet the goals. Items are then selected for inclusion in the portfolio because they provide evidence of meeting
criteria, or making progress toward goals.
3) Evidence
In collecting data, many things need to be considered. What sources of evidence should be used?
How much evidence do we need to make good decisions and determinations? How often should
we collect evidence? How congruent should the sources of evidence be? How can we make
sense of the evidence that is collected? How should evidence be used to modify the program and the evaluation? According to Barton and Collins (1997), evidence can include artifacts (items
produced in the normal course of classroom or program
activities), reproductions (documentation of interviews or projects done outside of the
classroom or program), attestations (statements and observations by staff or others about the
participant), and productions (items prepared especially for the portfolio, such as participant
reflections on their learning or choices). Each item is selected because it adds some new
information related to attainment of the goals.
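The four categories of evidence are easier to manage when items are catalogued consistently. The sketch below is a minimal, hypothetical illustration of such a catalogue; the class, field names, and sample entries are invented for this example and are not taken from Barton and Collins (1997).

```python
# Illustrative sketch only: a small catalogue of portfolio items tagged by
# Barton and Collins' four evidence categories and by the goal each item documents.
# Field names and sample entries are hypothetical.
from collections import defaultdict
from dataclasses import dataclass

EVIDENCE_TYPES = {"artifact", "reproduction", "attestation", "production"}

@dataclass
class PortfolioItem:
    description: str
    evidence_type: str  # one of EVIDENCE_TYPES
    goal: str           # the outcome or criterion this item supports
    date_added: str

    def __post_init__(self):
        if self.evidence_type not in EVIDENCE_TYPES:
            raise ValueError(f"unknown evidence type: {self.evidence_type}")

items = [
    PortfolioItem("Essay written in class", "artifact", "writing fluency", "2024-02-10"),
    PortfolioItem("Photos of a community project", "reproduction", "civic engagement", "2024-03-02"),
    PortfolioItem("Mentor's observation notes", "attestation", "teamwork", "2024-03-15"),
    PortfolioItem("Participant reflection on progress", "production", "self-evaluation", "2024-04-01"),
]

# Group evidence by goal to spot outcomes that are thinly documented.
by_goal = defaultdict(list)
for item in items:
    by_goal[item.goal].append(item.evidence_type)
for goal, types in sorted(by_goal.items()):
    print(f"{goal}: {types}")
```

Grouping items by goal in this way reflects the selection rule above: an item earns its place only if it adds new information about progress toward a stated goal.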
Steps of Portfolio Assessment
Although many variations of portfolio assessment are in use, most fall into two basic types:
process portfolios and product portfolios (Cole, Ryan, & Kick, 1995). These are not the only
kinds of portfolios in use, nor are they pure types clearly distinct from each other. It may be more
helpful to think of these as two steps in the portfolio assessment process, as the participant(s) and
staff reflectively select items from their process portfolios for inclusion in the product portfolio.
Step 1: The first step is to develop a process portfolio, which documents growth over time
toward a goal. Documentation includes statements of the end goals, criteria, and plans for the
future. This should include baseline information, or items describing the participant's
performance or mastery level at the beginning of the program. Other items are "works in
progress", selected at many interim points to demonstrate steps toward mastery. At this stage, the
portfolio is a formative evaluation tool, probably most useful for the internal information of the
participant(s) and staff as they plan for the future.
Step 2: The next step is to develop a product portfolio (also known as a "best pieces portfolio"),
which includes examples of the best efforts of a participant, community, or program. These also
include "final evidence", or items which demonstrate attainment of the end goals. Product or
"best pieces" portfolios encourage reflection about change or learning. The program participants,
either individually or in groups, are involved in selecting the content, the criteria for selection, the criteria for judging merit, and the "evidence" that the criteria have been met (Winograd & Jones, 1992). For individuals and communities alike, this provides opportunities for a sense of ownership and strength. It helps to showcase or communicate the accomplishments of the
person or program. At this stage, the portfolio is an example of summative evaluation, and may
be particularly useful as a public relations tool.
Distinguishing Characteristics
Certain characteristics are essential to the development of any type of portfolio used for
assessment. According to Barton and Collins (1997), portfolios should be:
1) Multisourced (allowing for the opportunity to evaluate a variety of specific evidence)
Multiple data sources include both people (statements and observations of participants, teachers
or program staff, parents, and community members), and artifacts (anything from test scores to
photos, drawings, journals, and audio or videotapes of performances).
2) Authentic (context and evidence are directly linked)
The items selected or produced for evidence should be related to program activities, as well as
the goals and criteria. If the portfolio is assessing the effect of a program on participants or
communities, then the "evidence" should reflect the activities of the program rather than skills
that were gained elsewhere. For example, if a child's musical performance skills were gained
through private piano lessons, not through 4-H activities, an audio tape would be irrelevant in his
4-H portfolio. If a 4-H activity involved the same child in teaching other children to play, a tape
might be relevant.
3) Dynamic (capturing growth and change)
An important feature of portfolio assessment is that data or evidence is added at many points in
time, not just as "before and after" measures. Rather than including only the best work, the
portfolio should include examples of different stages of mastery. At least some of the items are
self-selected. This allows a much richer understanding of the process of change.
4) Explicit (purpose and goals are clearly defined)
The students or program participants should know in advance what is expected of them, so that
they can take responsibility for developing their evidence.
5) Integrated (evidence should establish a correspondence between program activities and life
experiences)
Participants should be asked to demonstrate how they can apply their skills or knowledge to real-
life situations.
6) Based on ownership (the participant helps determine evidence to include and goals to be met)
The portfolio assessment process should require that the participants engage in some reflection
and self-evaluation as they select the evidence to include and set or modify their goals. They are
not simply being evaluated or graded by others.
7) Multipurposed (allowing assessment of the effectiveness of the program while assessing
performance of the participant).
A well-designed portfolio assessment process evaluates the effectiveness of your intervention at
the same time that it evaluates the growth of individuals or communities. It also serves as a
communication tool when shared with family, other staff, or community members. In school
settings, it can be passed on to other teachers or staff as a child moves from one grade level to
another.
Analyzing and Reporting Data
As with any qualitative assessment method, analysis of portfolio data can pose challenges.
Methods of analysis will vary depending on the purpose of the portfolio, and the types of data
collected (Patton, 1990). However, if goals and criteria have been clearly defined, the "evidence"
in the portfolio makes it relatively easy to demonstrate that the individual or population has
moved from a baseline level of performance to achievement of particular goals.
It should also be possible to report some aggregated or comparative results, even if participants
have individualized goals within a program. For example, in a teen peer tutoring program, you
might report that "X% of participants met or exceeded two or more of their personal goals within
this time frame", even if one teen's primary goal was to gain public speaking skills and another's
main goal was to raise his grade point average by mastering study skills. Comparing across
programs, you might be able to say that participants in Town X on average mastered 4 new skills in the course of six months, while those in Town Y mastered only 2, and speculate that
lower attendance rates in Town Y could account for the difference.
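To make the arithmetic behind such statements concrete, the following minimal sketch aggregates individualized goals across a hypothetical participant list; the record layout and all numbers are invented for illustration.

```python
# Minimal sketch: turning individualized portfolio goals into program-level
# summary statements. Names, fields, and numbers are hypothetical.
from statistics import mean

participants = [
    {"name": "Teen A", "goals_set": 3, "goals_met": 2, "new_skills": 4},
    {"name": "Teen B", "goals_set": 2, "goals_met": 2, "new_skills": 5},
    {"name": "Teen C", "goals_set": 4, "goals_met": 1, "new_skills": 2},
    {"name": "Teen D", "goals_set": 3, "goals_met": 3, "new_skills": 3},
]

# "X% of participants met or exceeded two or more of their personal goals"
met_two_plus = sum(1 for p in participants if p["goals_met"] >= 2)
print(f"{100 * met_two_plus / len(participants):.0f}% met two or more personal goals")

# Cross-program comparison: average number of new skills mastered in six months
print(f"Average new skills mastered: {mean(p['new_skills'] for p in participants):.1f}")
```

The same tallying can be repeated per site (Town X, Town Y) to support the kind of comparison described above, keeping in mind that differences in attendance or intake may explain part of any gap.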
Subjectivity of judgements is often cited as a concern in this type of assessment (Bateson, 1994).
However, in educational settings, teachers or staff using portfolio assessment often choose to
periodically compare notes by independently rating the same portfolio to see if they are in
agreement on scoring (Barton & Collins, 1997). This provides a simple check on reliability and can be reported very simply. For example, a local program coordinator could say, "To ensure some consistency in assessment standards, every 5th portfolio (or 20%) was assessed by more than one staff member. Agreement between raters, or inter-rater reliability, was 88%".
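The agreement figure in a statement like the one above can be computed directly from the paired ratings. The sketch below is a hypothetical example using exact-match percent agreement on a 1-4 rubric; where rating categories are few, a chance-corrected index such as Cohen's kappa is sometimes reported instead.

```python
# Minimal sketch: percent agreement between two raters who independently scored
# the same sample of portfolios on a 1-4 rubric. All scores are hypothetical.
rater_a = [3, 4, 2, 4, 3, 1, 2, 4]
rater_b = [3, 4, 2, 3, 3, 1, 2, 4]

matches = sum(1 for a, b in zip(rater_a, rater_b) if a == b)
percent_agreement = 100 * matches / len(rater_a)
print(f"Inter-rater agreement: {percent_agreement:.1f}%")  # 7 of 8 identical -> 87.5%
```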
There are many books and articles that address the problems of analyzing and reporting on
qualitative data in more depth than can be covered here. The basic issues of reliability, validity
and generalizability are relevant even when using qualitative methods, and various strategies
have been developed to address them. Those who are considering using portfolio assessment in
evaluation are encouraged to refer to some of the sources listed below for more in-depth
information.

  • 3.
    • Was advocatedin early grading 1900s as scientific measurement. • Educational disadvantages were known by the 1930s. 4.Mastery grading Grading students as “masters” or “passers” when their attainment reaches a prespecified level, usually allowing different amounts of time for different students to reach mastery • Originating in the 1920s (e.g., Morrison, 1926) as a grading strategy, it became associated with the educational strategy of mastery learning (Bloom, Hastings, & Madaus, 1971). 5.Pass/Fail Using a scale with two levels (pass and fail), sometimes in connection with mastery grading • In 1851, the University of Michigan experimented with pass/fail grading for classes. 6.Standards (or Absolute-Standards) gradingOriginally, comparing student performance to a preestablished standard (level) of performance; currently, standards grading sometimes means grading with reference to a list of state or district content standards according to preestablished performance levels • Grading according to standards of performance has been championed since the grading 1930s as more educationally sound than norm-referenced grading. • Current advocates of standards grading use the same principle but the term "standard” is now used for the criterion itself, not the level of performance. • Since 2002, the scales on some standards-based report cards use the state accountability (proficiency) reporting categories instead of letters. 7.Narrative grading Writing comments about students’ achievement, either in addition to or instead of using numbers or letters • Using a normal instructional practice (describing students’ work) in an assessment context. Progress Report a written document that explains how much progress is being made on something you have previously planned: . These reports will be staggered throughout the year and we have done our best to ensure that key points in a student’s development, such as making option choices, are supported with an appropriate report. We do not provide an end of year report for all students as it would not be possible for teachers to write reports on every student they teach at one time. Furthermore, we do not believe that a
  • 4.
    summative* end ofyear report is as valuable for a students’ development as providing a formative** report that can give them advice on how to improve and crucially, time to work on those developments. All reports will provide information on the following: • Attendance • A record of the number of positive events • A record of the number of negative events • An end of year attainment estimate • A teacher assessment of current attainment • A teacher assessment of Learner Characteristics Further to these points once a year teachers will write a brief statement about strengths and areas for development. The learner characteristics we will grade for each report are: • Attitude to Learning • Communication Skills • Homework Quality • Personal Organisation • Presentation of Work Each of these is assessed on a scale from one to five, with one being ‘unacceptable’ and five being ‘exceptional’. Further information and detail is contained within the document below. *A summative report is given at the end of a period of study. It would state how well the student has done but would not give advice on how to improve. If a comment was given on how to improve the student would not have the opportunity to work on this development. ** A formative report is given during a period of study. It would state how well a student is doing and would give advice on how to make further progress. The student then has further time to work on the advice given in the report.
  • 5.
    Perspectives on assessment Assessmentis at the centre of the student's experience. It provides a means of evaluating student progress and achievement; it drives the activity of the student and therefore their learning. This collection of short presentations is intended to provoke debate about assessment. Over the last few years, those involved in developing assessment have generated some new perspectives which, as yet, have not been fully incorporated within mainstream practice. There has been a gap between the emerging understandings of 'reflective practitioners' and educational developers and those who are setting assessment policy and defining practice. We would like to close that gap. In order to do so we have set out some 'challenging perspectives' in short talks. They are intended to be contentious but well grounded. Each Web talk is an introduction to an idea that we hope you will pursue using the references provided. The talks may be used by individuals or serve as a catalyst for a group discussion, for example in a workshop. Please feel free to comment. We haven?t covered all the ground - far from it - and hope that others might add to this collection. Purposes of assessment Teaching and learning The primary purpose of assessment is to improve students’ learning and teachers’ teaching as both respond to the information it provides. Assessment for learning is an ongoing process that arises out of the interaction between teaching and learning. What makes assessment for learning effective is how well the information is used. System improvement Assessment can do more than simply diagnose and identify students’ learning needs; it can be used to assist improvements across the education system in a cycle of continuous improvement: • Students and teachers can use the information gained from assessment to determine their next teaching and learning steps. • Parents, families and whānau can be kept informed of next plans for teaching and learning and the progress being made, so they can play an active role in their children’s learning. • School leaders can use the information for school-wide planning, to support their teachers and determine professional development needs.
  • 6.
    • Communities andBoards of Trustees can use assessment information to assist their governance role and their decisions about staffing and resourcing. • The Education Review Office can use assessment information to inform their advice for school improvement. • The Ministry of Education can use assessment information to undertake policy review and development at a national level, so that government funding and policy intervention is targeted appropriately to support improved student outcomes. 1. Assessment for Learning (Formative) The purpose of Formative Assessment is to provide students with feedback on how they are going. The aim is to help students improve their performance and make their next piece of assessed work better. It is developmental or formative in nature; hence the term "Formative Assessment". The feedback students receive is the key component of formative assessment. Feedback is intended to help them identify weaknesses and build on strengths to improve the quality of their next piece of assessment. The focus is on comments for improvement, not marks, and the awarding of marks in formative assessment can actually be counterproductive. 2. Assessment for Certification (Summative) Another key purpose of assessment is to gather evidence to make a judgment about a student's level of performance; against the specified learning objectives. Students are usually assessed at the end of an element of learning, such as the end of a module, mid semester or end of semester. They are awarded results typically as marks or grades to represent a particular level of achievement (high, medium, low). This judgmental "summative" process formally provides the evidence, to verify or "certify" which students may progress to the next level of their studies. 3. Protect Academic Standards Grades from cumulative assessments are used to certify that a person has the necessary knowledge and skills (and can apply them appropriately) to be awarded a qualification. Consequently, the quality and integrity of assessment is essential to guarantee the credibility of qualifications and the academic reputation of the issuing Institution. There is considerable local, national and international concern to ensure that the ways we protect academic standards stand up to scrutiny. 4. Feedback for Teaching The results from both formative and summative assessments can help you track how your students are going throughout your courses. Closely looking at the results can help you identify any patterns of difficulties or misunderstandings students might have. This in turn allows you alter your approach to teaching and adjust your curriculum accordingly. For example, you may identify that you need to offer more detailed explanations or provide additional resources in a particular area. Continuous and comprehensive evaluation Concept and Importance
  • 7.
    Continuous and comprehensiveevaluationwas a process of assessment, mandated by the Right to Education Act, of India. This approach to assessment has been introduced by state governments in India, as well as by the Central Board of Secondary Education in India, for students of sixth to tenth grades and twelfth in some schools. The Karnataka government has introduced CCE for grades 1 through 9 later it was also introduced for 12th grades students. The main aim of CCE is to evaluate every aspect of the child during their presence at the school. This is believed to help reduce the pressure on the child during/before examinations as the student will have to sit for multiple tests throughout the year, of which no test or the syllabus covered will be repeated at the end of the year, whatsoever. The CCE method is claimed to bring enormous changes from the traditional chalk and talk method of teaching, provided it is implemented accurately. In 2017, the CCE system was cancelled for students appearing in the Class 10 Board Exam for 2017-18, bringing back compulsory Annual Board Exam and removing the Formative and Summative Assessments under the Remodeled Assessment Pattern.[1] As a part of this new system, student's marks will be replaced by grades which will be evaluated through a series of curricular and extra-curricular evaluations along with academics. The aim is to decrease the workload on the student by means of continuous evaluation by taking number of small tests throughout the year in place of single test at the end of the academic program. Only Grades are awarded to students based on work experience skills, dexterity, innovation, steadiness, teamwork, public speaking, behavior, etc. to evaluate and present an overall measure of the student's ability. This helps the students who are not good in academics to show their talent in other fields such as arts, humanities, sports, music, athletics, and also helps to motivate the students who have a thirst of knowledge Unlike CBSE's old pattern of only one test at the end of the academic year, the CCE conducts several. There are two different types of tests. Namely, the formative and the summative. Formative tests will comprise the student's work at class and home, the student's performance in oral tests and quizzes and the quality of the projects or assignments submitted by the child. Formative tests will be conducted four times in an academic session, and they will carry a 40% weightage for the aggregate. In some schools, an additional written test is conducted instead of multiple oral tests. However, at least one oral test is conducted. The summative assessment is a three-hour long written test conducted twice a year. The first summative or Summative Assessment 1 (SA-1) will be conducted after the first two formatives are completed. The second (SA-2) will be conducted after the next two formatives. Each summative will carry a 30% weightage and both together will carry a 60% weightage for the aggregate. The summative assessment will be conducted by the schools itself. However, the question papers will be partially prepared by the CBSE and evaluation of the answer sheets is also strictly monitored by the CBSE. Once completed, the syllabus of one summative will not be repeated in the next. A student will have to concentrate on totally new topics for the next summative. At the end of the year, the CBSE processes the result by adding the formative score to the summative score, i.e. 40% + 60% = 100%. 
Depending upon the percentage obtained, the board will deduce the CGPA (Cumulative Grade Point Average) and thereby deduce the grade obtained. In addition to the summative assessment, the board will offer an optional online aptitude test that may also be used as a tool along with the grades obtained in the CCE to help students to decide the choice of subjects in further studies. The board has also instructed the schools to prepare the report card and it will be duly signed by the principal, the student.
  • 8.
    • Deductive Method- What does the student know and how can he use it to explain a situation. • Co-relation with a real-life situation - Whether the situation given matches any real-life situation, like tsunamis, floods, tropical cyclones, etc. • Usage of Information Technology - Can the problem be solved with the use of IT? If yes, how? In addition to that, various assignments can be given such as projects, models and charts, group work, worksheet, survey, seminar, etc. The teacher will also play a major role. For example, they give remedial help, maintain a term-wise record and checklists, etc. Assessment for learning Assessment for Learning is the process of seeking and interpreting evidence for use by learners and their teachers to decide where the learners are in their learning, where they need to go and how best to get there. Assessment for learning is best described as a process by which assessment information is used by teachers to adjust their teaching strategies, and by students to adjust their learning strategies. Assessment, teaching, and learning are inextricably linked, as each informs the others. Assessment is a powerful process that can either optimise or inhibit learning, depending on how it’s applied. For teachers Assessment for learning helps teachers gather information to: • plan and modify teaching and learning programmes for individual students, groups of students, and the class as a whole • pinpoint students’ strengths so that both teachers and students can build on them • identify students’ learning needs in a clear and constructive way so they can be addressed • involve parents, families, and whānau in their children's learning. For students Assessment for learning provides students with information and guidance so they can plan and manage the next steps in their learning. Assessment for learning uses information to lead from what has been learned to what needs to be learned next. Describing assessment for learning Assessment for learning should use a range of approaches. These may include:
  • 9.
    • day-to-day activities,such as learning conversations • a simple mental note taken by the teacher during observation • student self and peer assessments • a detailed analysis of a student’s work • assessment tools, which may be written items, structured interview questions, or items teachers make up themselves. What matters most is not so much the form of the assessment, but how the information gathered is used to improve teaching and learning Testing, Assessment, Measurement and Definition The definitions for each are: Test: A method to determine a students ability to complete certain tasks or demonstrate masteryof a skill or knowledge of content. Some types would be multiple choice tests, or a weeklyspelling test. While it is commonly used interchangeably with assessment, or even evaluation, itcan be distinguished by the fact that a test is one form of an assessment Assessment: The process of gathering information to monitor progress and make educational decisions if necessary. As noted in my definition of test, an assessment may include a test, butalso includes methods such as observations, interviews, behavior monitoring, etc. Measurement:beyond its general definition, refers to the set of procedures and the principles forhow to use the procedures in educational tests and assessments. Some of the basic principlesof measurement in educational evaluations would be raw scores, percentile ranks, derivedscores, standard scores, etc Assessment In education, the term assessment refers to the wide variety of methods or tools that educators use to evaluate, measure, and document the academic readiness, learning progress, skill acquisition, or educational needs of students. • Assessment involves the use of empirical data on student learning to refine programs and improve student learning. • Assessment is the process of gathering and discussing information from multiple and diverse sources in order to develop a deep understanding of what students know, understand, and can do with their knowledge as a result of their educational experiences; the process culminates when assessment results are used to improve subsequent learning. Assessment is the systematic basis for making inferences about the learning and development of students. It is the process of defining, selecting, designing, collecting, analyzing, interpreting, and using information to increase students’ learning and developmentAssessment is the systematic collection, review, and use of information about educational programs undertaken for the purpose of improving student learning and development.
  • 10.
    Characteristics • Learner-Centered • Theprimary attention of teachers is focused on observing and improving learning. • Teacher-Directed • Individual teachers decide what to assess, how to assess, and how to respond to the information gained through the assessment • Teachers do not need to share results with anyone outside of the class. • Mutually Beneficial • Students are active participants. • Students are motivated by the increased interest of faculty in their success as learners. • Teachers improve their teaching skills and gain new insights. • Formative • Assessments are almost never "graded". • Assessments are almost always anonymous in the classroom and often anonymous online. • Assessments do not provide evidence for evaluating or grading students. • Context-Specific • Assessments respond to the particular needs and characteristics of the teachers, students, and disciplines to which they are applied. • Customize to meet the needs of your students and course. • Ongoing • Classroom assessment is a continuous process. • Part of the process is creating and maintaining a classroom "feedback loop" • Each classroom assessment event is of short duration. • Rooted in Good Teaching Practice • Classroom assessment builds on good practices by making feedback on students' learning more systematic, more flexible, and more effective. Test • A test or examination (informally, exam or evaluation) is an assessment intended to measure a test-taker's knowledge, skill, aptitude, physical fitness, or classification in many other topics (e.g., beliefs).[1] A test may be administered verbally, on paper, on a computer, or in a confined area that requires a test taker to physically perform a set of skills. Tests vary in style, rigor and requirements. For example, in a closed book test, a test taker is often required to rely upon memory to respond to specific items whereas in an open book test, a test taker may use one or more supplementary tools such as a reference book or calculator when responding to an item. A test may be administered formally or informally. An example of an informal test would be a reading test administered by a parent to a child. An example of a formal test would be a final examination administered by a teacher in a classroom or an I.Q. test administered by a psychologist in a clinic. Formal testing often results in a grade or a test score.[2] A test score may be interpreted with regards to a norm or criterion, or occasionally both. The
  • 11.
    norm may beestablished independently, or by statistical analysis of a large number of participants. An exam is meant to test a child's knowledge or willingness to give time to manipulate that subject. • A standardized test is any test that is administered and scored in a consistent manner to ensure legal defensibility.[3] Standardized tests are often used in education, professional certification, psychology (e.g., MMPI), the military, and many other fields. • A non-standardized test is usually flexible in scope and format, variable in difficulty and significance. Since these tests are usually developed by individual instructors, the format and difficulty of these tests may not be widely adopted or used by other instructors or institutions. A non-standardized test may be used to determine the proficiency level of students, to motivate students to study, and to provide feedback to students. In some instances, a teacher may develop non-standardized tests that resemble standardized tests in scope, format, and difficulty for the purpose of preparing their students for an upcoming standardized test.[4] Finally, the frequency and setting by which a non- standardized tests are administered are highly variable and are usually constrained by the duration of the class period. A class instructor may for example, administer a test on a weekly basis or just twice a semester. Depending on the policy of the instructor or institution, the duration of each test itself may last for only five minutes to an entire class period. • In contrasts to non-standardized tests, standardized tests are widely used, fixed in terms of scope, difficulty and format, and are usually significant in consequences. Standardized tests are usually held on fixed dates as determined by the test developer, educational institution, or governing body, which may or may not be administered by the instructor, held within the classroom, or constrained by the classroom period. Although there is little variability between different copies of the same type of standardized test (e.g., SAT or GRE), there is variability between different types of standardized tests. • Any test with important consequences for the individual test taker is referred to as a high- stakes test. • A test may be developed and administered by an instructor, a clinician, a governing body, or a test provider. In some instances, the developer of the test may not be directly responsible for its administration. For example, Educational Testing Service (ETS), a nonprofit educational testing and assessment organization, develops standardized tests such as the SAT but may not directly be involved in the administration or proctoring of these tests. As with the development and administration of educational tests, the format and level of difficulty of the tests themselves are highly variable and there is no general consensus or invariable standard for test formats and difficulty. Often, the format and difficulty of the test is dependent upon the educational philosophy of the instructor, subject matter, class size, policy of the educational institution, and requirements of accreditation or governing bodies. In general, tests developed and administered by individual instructors are non-standardized whereas tests developed by testing organizations are standardized.
  • 12.
    Characteristics of Test Reliable Reliabilityrefers to the accuracy of the obtained test score or to how close the obtained scores for individuals are to what would be their “true” score, if we could ever know their true score. Thus, reliability is the lack of measurement error, the less measurement error the better. The reliability coefficient, similar to a correlation coefficient, is used as the indicator of the reliability of a test. The reliability coefficient can range from 0 to 1, and the closer to 1 the better. Generally, experts tend to look for a reliability coefficient in excess of .70. However, many tests used in public safety screening are what is referred to as multi-dimensional. Interpreting the meaning of a reliability coefficient for a knowledge test based on a variety of sources requires a great deal of experience and even experts are often fooled or offer incorrect interpretations. There are a number of types of reliability, but the type usually reported is internal consistency or coefficient alpha. All things being equal, one should look for an assessment with strong evidence of reliability, where information is offered on the degree of confidence you can have in the reported test score. Valid Validity will be the topic of our third primer in the series. In the selection context, the term “validity” refers to whether there is an expectation that scores on the test have a demonstrable relationship to job performance, or other important job-related criteria. Validity may also be used interchangeably with related terms such as “job related” or “business necessity.” For now, we will state that there are a number of ways of evaluating validity including: ▪ Content ▪ Criterion-related ▪ Construct ▪ Transfer or transportability ▪ Validity generalization A good test will offer extensive documentation of the validity of the test. Practical A good test should be practical. What defines or constitutes a practical test? Well, this would be a balancing of a number of factors including: ▪ Length – a shorter test is generally preferred ▪ Time – a test that takes less time is generally preferred ▪ Low cost – speaks for itself
  • 13.
    ▪ Easy toadminister ▪ Easy to score ▪ Differentiates between candidates – a test is of little value if all the applicants obtain the same score ▪ Adequate test manual – provides a test manual offering adequate information and documentation ▪ Professionalism – is produced by test developers possessing high levels of expertise The issue of the practicality of a test is a subjective judgment, which will be impacted by the constraints facing the public-sector jurisdiction. A test that may be practical for a large city with 10,000 applicants and a large budget, may not be practical for a small town with 10 applicants and a miniscule testing budget. Socially Sensitive A consideration of the social implications and effects of the use of a test is critical in public sector, especially for high stakes jobs such as public safety occupations. The public safety assessment professional must be considerate of and responsive to multiple group of stakeholders. In addition, in evaluating a test, it is critical that attention be given to: ▪ Avoiding adverse Impact – Recent events have highlighted the importance of balance in the demographics of safety force personnel. Adverse impact refers to differences in the passing rates on exams between males and females, or minorities and majority group members. Tests should be designed with an eye toward the minimization of adverse impact.. ▪ Universal Testing – The concept behind universal testing is that your exams should be able to be taken by the most diverse set of applicants possible, including those with disabilities and by those who speak other languages. Having a truly universal test is a difficult, if not impossible, standard to meet. However, organizations should strive to ensure that testing locations and environments are compatible with the needs of as wide a variety of individuals as possible. In addition, organizations should have in place committees and procedures for dealing with requests for accommodations. Candidate Friendly One of the biggest changes in testing over the past twenty years has been the increased attention paid to the candidate experience. Thus, your tests should be designed to look professional and be easy to administer. Furthermore, the candidate should see a clear connection between the exams and the job. As the candidate completed the selection battery, you want the reaction to be “That
  • 14.
    was a fairtest, I had an opportunity to prove why I deserve the job, and this is the type of organization where I would like to work.” Measurement Measurement is the assignment of a number to a characteristic of an object or event, which can be compared with other objects or events. The scope and application of a measurement is dependent on the context and discipline. In the natural sciences and engineering, measurements do not apply to nominal properties of objects or events, which is consistent with the guidelines of the International vocabulary of metrology published by the International Bureau of Weights and Measures. However, in other fields such as statisticsas well as the social and behavioral sciences, measurements can have multiple levels, which would include nominal, ordinal, interval, and ratio scales. Measurement is a cornerstone of trade, science, technology, and quantitative research in many disciplines. Historically, many measurement systems existed for the varied fields of human existence to facilitate comparisons in these fields. Often these were achieved by local agreements between trading partners or collaborators. Since the 18th century, developments progressed towards unifying, widely accepted standards that resulted in the modern International System of Units (SI). This system reduces all physical measurements to a mathematical combination of seven base units. The science of measurement is pursued in the field of metrology. Characteristic # 1. In educational measurement there is no absolute zero point: In educational measurement there is no absolute zero point. It is relative to some arbitrary standard. For example a student has secured ‘O’ in a test of mathematics. It does not mean that he has ‘O’ knowledge in mathematics. Because he may secured 30 in another test, which is easier than the first one. As the zero point is not fixed so we cannot say that a student with a score of ’60’ has doubled the knowledge of a student with a score of ’30’. Characteristic # 2. The units are not definite in educational measurement: In educational measurement the units are not definite, so we may not obtain the same value for every person. Because the test vary in their content and difficulty level. Therefore one individual may perform differently on different tests and different individuals may perform differently on one test. Characteristic # 3. It conveys a sense of infinity: It means we cannot measure the whole of an attribute of an individual. Generally the scores obtained from a measurement are observed scores which contains measurement errors. So that true score is infinite and unknown.
  • 15.
    Characteristic # 4.It is a process of assigning symbols: Measurement is a process of assigning symbols to observations in some meaningful and consistent manner. In measurement generally we compare with certain standard unit or criteria which have an universal acceptability. Characteristic # 5. It cannot be measured directly: In case of educational measurement we cannot measure for attribute directly. It is observed through behaviour. For example (he reading ability of an individual can only be measured when he is asked to read a written material. Characteristic # 6. It is a means to an end but not an end itself: The objective of educational measurement is not just to measure a particular attribute. Rather it is done to evaluate to what extent different objectives have been achieved. Principles of assessment Reliability If a particular assessment were totally reliable, assessors acting independently using the same criteria and mark scheme would come to exactly the same judgment about a given piece of work. In the interests of quality assurance, standards and fairness, whilst recognising that complete objectivity is impossible to achieve, when it comes to summative assessment it is a goal worth aiming for. To this end, what has been described as the 'connoisseur' approach to assessment (like a wine-taster or tea-blender of many years experience, not able to describe exactly what they are looking for but 'knowing it when they find it') is no longer acceptable. Explicitness in terms of learning outcomes and assessment criteria is vitally important in attempting to achieve reliability. They should be explicit to the students when the task is set, and where there are multiple markers they should be discussed, and preferably used on some sample cases prior to be using used 'for real'. Validity Just as important as reliability is the question of validity. Does the assessed task actually assess what you want it to? Just because an exam question includes the instruction 'analyse and evaluate' does not actually mean that the skills of analysis and evaluation are going to be assessed. They may be, if the student is presented with a case study scenario and data they have never seen before. But if they can answer perfectly adequately by regurgitating the notes they took from the lecture you gave on the subject then little more may be being assessed than the ability to memorise. There is an argument that all too often in British higher education we assess
  • 16.
    the things whichare easy to assess, which tend to be basic factual knowledge and comprehension rather than the higher order objectives of analysis, synthesis and evaluation. Relevance and transferability There is much evidence that human beings do not find it easy to transfer skills from one context to another, and there is in fact a debate as to whether transferability is in itself a separate skill which needs to be taught and learnt. Whatever the outcome of that, the transfer of skills is certainly more likely to be successful when the contexts in which they are developed and used are similar. It is also true to say that academic assessment has traditionally been based on a fairly narrow range of tasks with arguably an emphasis on knowing rather than doing; it has therefore tended to develop a fairly narrow range of skills. For these two reasons, when devising an assessment task it is important that it both addresses the skills you want the student to develop and that as much as possible it puts them into a recognisable context with a sense of 'real purpose' behind why the task would be undertaken and a sense of a 'real audience', beyond the tutor, for whom the task would be done. Criterion v Norm referenced assessment In criterion-referenced assessment particular abilities, skills or behaviours are each specified as a criterion which must be reached. The driving test is the classic example of a criterion-referenced test. The examiner has a list of criteria each of which must be satisfactorily demonstrated in order to pass - completing a three-point turn without hitting either kerb for example. The important thing is that failure in one criterion cannot be compensated for by above average performance in others; neither can you fail despite meeting every criterion simply because everybody else that day surpassed the criteria and was better than you. Norm-referenced assessment makes judgments on how well the individual did in relation to others who took the test. Often used in conjunction with this is the curve of 'normal distribution' which assumes that a few will do exceptionally well and a few will do badly and the majority will peak in the middle as average. Despite the fact that a cohort may not fit this assumption for any number of reasons (it may have been a poor intake, or a very good intake, they have been taught well, or badly, or in introductory courses in particular you may have half who have done it all before and half who are just starting the subject giving a bimodal distribution) there are even some assessment systems which require results to be manipulated to fit. The logic of a model of course design built on learning outcomes is that the assessment should be criterion-referenced at least to the extent that sufficiently meeting each outcome becomes a 'threshold' minimum to passing the course. If grades and marks have to be generated, a more
  • 17.
    complex system thanpass/fail can be devised by defining the criteria for each grade either holistically grade by grade, or grade by grade for each criterion (see below). Writing and using assessment criteria Assessment criteria describe how well a student has to be able to achieve the learning outcome, either in order to pass (in a simple pass/fail system) or in order to be awarded a particular grade; essentially they describe standards. Most importantly they need to be more than a set of headings. Use of theory, for example, is not on its own a criterion. Criteria about theory must describe what aspects of the use of theory are being looked for. You may value any one of the following: the students' ability to make an appropriate choice of theory to address a particular problem, or to give an accurate summary of that theory as it applies to the problem, or to apply it correctly, or imaginatively, or with originality, or to critique the theory, or to compare and contrast it with other theories. And remember, as soon as you have more than one assessment criterion you will also have to make decisions about their relative importance (or weighting). Graded criteria are criteria related to a particular band of marks or honours classification or grade framework such as Pass, Merit, Distinction. If you write these, be very careful about the statement at the 'pass' level. Preferably start writing at this level and work upwards. The danger in starting from, eg first class honours, is that as you move downwards, the criteria become more and more negative. When drafted, ask yourself whether you would be happy for someone meeting the standard expressed for pass, or third class, to receive an award from your institution. Where possible, discuss draft assessment activities, and particularly criteria, with colleagues before issuing them. Once decided, the criteria and weightings should be given to the students at the time the task is set, and preferably some time should be spent discussing and clarifying what they mean. Apart from the argument of fairness, this hopefully then gives the student a clear idea of the standard they should aim for and increases the chances they will produce a better piece of work (and hence have learnt what you wanted them to). And feedback to the student on the work produced should be explicitly in terms of the extent to which each criterion has been met. Instructional Assessment Process Instructional Assessment Process involves collection and analysis of data from six sources that, when combined, present a comprehensive view of the current state of the school as it compares to the underlying beliefs and principals that make up the Pedagogy of Confidence and lead to school transformation. The six components are: • School Background Pre-Interview Questionnaire
  • 18.
    • Principal Interview •Achievement Data • Teacher Survey • Student Survey • Classroom Visits School Background Pre-Interview Questionnaire It gathers background information using a pre-interview questionnaire submitted by the principal, the school’s School Improvement Plan (or similar document), and interviewing the principal in person. The pre-interview questionnaire collects basic demographic data about the school, the students and the faculty, as well as a brief history of current initiatives, school organization and scheduling practices, special services, community partnerships and the like. Principal Interview It meets with the principal to review the questionnaire, to obtain more information about the school and to learn the principal’s perspectives on the instructional program, students and staff. Care is taken to ensure that the principal speaks first about the strengths of the school, unique situations that exist within the school, recent changes that may be affecting the school, his or her goals for the school and what he or she believes is needed to achieve those goals. Achievement Data It gathers and analyzes existing achievement data to uncover patterns over time and to correlate with what constituents say about the school, how achievement data compares to state and district achievement, and any other relevant comparisons. Teacher Survey It representative conducts the teacher survey during a schoolwide faculty meeting to ensure consistency of administration and to explain to the faculty other data collection activities that may be taking place at the school. The survey probes teachers’ perspectives on the school’s climate and instructional program and seeks suggestions about how they, as a faculty, could best serve their students, especially underachievers. Surveys are anonymous and make use of multiple choice and open-ended questions that allow teachers leeway to express their inside perspective on the instructional life of the school; their assessments of and attitudes toward students, families and administration; recent and needed professional development initiatives; and their preferred pedagogical approaches. Student Survey The student survey contains 20 items and is administered to all students following a prescribed method of administration. Its purpose is to assess the school’s instructional program from the students’ perspectives. The items invite response in five areas: • Perspectives on myself as a learner
  • 19.
    • School climate •My teachers • Classroom activities • My preferred learning activities Students are asked to strongly agree, agree, disagree, or strongly disagree with some statements and to select their choices among others. It provides a summary of student survey responses for ease of analysis. Classroom Visits A team of specially trained It representatives conducts classroom visitations that follow a schedule intended to cover a broad spectrum of classes. Visitors note the activities in which students are engaged, study the interactions between teacher and students, and attend to other visible characteristics of the instructional program, including the physical environment of the rooms. Approximately half the classes in a school are visited to help form a composite picture of the current state of instruction. Teachers voluntarily participate in the visits and all data is recorded without identifying individual teachers. Visitors concentrate on elements of effective instruction that NUA knows to have positive effects on all students’ learning and that NUA finds particularly important in raising the performance of underachieving students. A sample of these elements includes: • The learning engages students. Students comprehend and retain what they are taught most effectively when they are engaged in classroom activities. Engagement is marked by willing participation, expressions of interest and displays of enthusiasm, and results when students find classroom activities and assignments highly meaningful and interesting. Instruction that engages students has a positive effect on their achievement and increases the likelihood they will develop into lifelong learners. • Learning activities guide students to relate lesson content to their lives. Students benefit from deliberately connecting what they are learning to what they know from their experience as individuals and as members of the cultural groups with which they most closely identify. Making such connections between the curriculum and what is personally relevant and meaningful has a positive influence on students’ motivation to learn, on their confidence as learners, and on their comprehension and retention of the material. Although the teacher can suggest such connections, students benefit most by generating and expressing their own connections. • The learning includes students interacting with each other as learners.Working collaboratively in pairs or small groups enables students to pool their knowledge as they develop their understanding of curriculum material. Interacting productively with peers also helps students stay attentive in class. In addition, collaborative work can increase students’ motivation to learn because of the support they get from their peers and the enjoyment that results from peer interaction. Pair or small-group interactions may be used for solving problems, discussing possible answers to a teacher’s question, generating new questions on a topic being discussed before sharing ideas with the whole class, representing information that has been learned in a creative way, and other such purposes.
• The learning promotes high-level thinking about lesson content. High-level thinking about curriculum content helps students generate deeper and broader understandings while developing their thinking capacities. Students' learning is enhanced when they have frequent opportunities to respond at length to thought-provoking questions, to engage in high-level conversations with peers, and to ask their own questions about what they are learning in order to clarify, refine and extend meanings. High-level thinking includes such mental processes as hypothesizing, inferring, generalizing, analyzing, synthesizing and evaluating. Opportunities to engage in such thinking are ideally part of daily instruction as well as integral to long-term, complex projects.

Types of assessment procedure
1. Diagnostic Assessment (as Pre-Assessment)
• One way to think about it: assesses a student's strengths, weaknesses, knowledge and skills prior to instruction.
• Another way to think about it: a baseline to work from.
2. Formative Assessment
• One way to think about it: assesses a student's performance during instruction, and usually occurs regularly throughout the instruction process.
• Another way to think about it: like a doctor's "check-up", providing data used to revise instruction.
3. Summative Assessment
• One way to think about it: measures a student's achievement at the end of instruction.
• Another way to think about it: it's macabre, but if formative assessment is the check-up, you might think of summative assessment as the autopsy. What happened? Now that it's all over, what went right and what went wrong?
4. Norm-Referenced Assessment
• One way to think about it: compares a student's performance against that of other students (a national group or other "norm").
• Another way to think about it: group or "demographic" assessment.
5. Criterion-Referenced Assessment
• One way to think about it: measures a student's performance against a goal, specific objective, or standard.
• Another way to think about it: a bar against which all students are measured.
6. Interim/Benchmark Assessment
• One way to think about it: evaluates student performance at periodic intervals, frequently at the end of a grading period, and can predict student performance on end-of-year summative assessments.
• Another way to think about it: bar-graph growth through a year.

Explanation
• Formative assessments are informal and formal tests given by teachers during the learning process. They are used to modify classroom activities so that student achievement improves, and they identify strengths, weaknesses and areas that need further work.
• Summative assessment evaluates student learning at the end of an instructional unit such as a chapter or specified topic. Final papers, midterms and final exams allow teachers to determine whether students have understood the material.
• Norm-referenced assessment compares a student's performance against a national or other "norm" group.
• Performance-based assessment requires students to solve real-world problems or produce something with real-world application, allowing the educator to judge how well students think critically and analytically. A restricted-response task is more narrowly defined than an extended-response task: multiple-choice questions are restricted responses, whereas writing a report is an extended response.
• Authentic assessment measures accomplishments that are worthwhile in themselves, in contrast to multiple-choice standardized tests.
• Selected-response assessment, also referred to as objective assessment, includes multiple-choice, matching, and true-false questions. It is a very effective and efficient method for measuring students' knowledge and a very common form of classroom assessment.
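To make the norm-referenced versus criterion-referenced contrast concrete, here is a minimal Python sketch that interprets the same score both ways. The scores, the cut score of 70 and the function names are invented for illustration only.

    # Illustrative only: invented class scores and an invented cut score of 70.

    def percentile_rank(score, group_scores):
        """Norm-referenced view: percent of the norm group scoring below this score."""
        below = sum(1 for s in group_scores if s < score)
        return 100 * below / len(group_scores)

    def meets_criterion(score, cut_score=70):
        """Criterion-referenced view: did the student reach the fixed standard?"""
        return score >= cut_score

    norm_group = [48, 55, 61, 64, 66, 70, 72, 75, 81, 88]   # hypothetical class scores
    student_score = 72

    print(f"Percentile rank: {percentile_rank(student_score, norm_group):.0f}")  # relative standing
    print(f"Meets criterion (cut = 70): {meets_criterion(student_score)}")       # absolute standard

The same score of 72 thus yields a relative interpretation (standing within the group) and an absolute one (whether the standard was reached).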
• Supply-response assessment requires students to supply an answer to a question prompt.
• Criterion-referenced tests are designed to measure student performance against a fixed set of predetermined criteria or learning standards.

Instructional decisions
• Instructional decisions are made to identify students' instructional needs. This is a general education initiative that focuses on instruction by using data about students' responses to past instruction to guide future educational decisions. Such decisions are proactive approaches to providing early assistance to students with instructional needs and to matching the amount of resources to the nature of each student's needs.
• Key features include: 1. screening all students to ensure early identification of those needing extra assistance; 2. seamless integration of general and special education services; and 3. a focus on research-based practices that match students' needs.
• Teachers are constantly collecting informal and formal information about what and how their students are learning. They check student tests and assignments, listen to small-group activities, and observe students engaged in structured and unstructured activities. They use this information for a variety of purposes, ranging from communicating with parents to meeting standards and benchmarks. However, when teachers systematically collect the right kinds of information and use it effectively, they can help their students grow as thinkers and learners.
• Assessment information can, for example, signal: 1. the need for a complete review of the material; 2. misunderstandings revealed in class discussion that must be corrected on the spot; and 3. interest in a topic that suggests more time should be spent on it than originally planned.

Selection assessment
• A selection assessment is the most frequently used type of assessment within a selection procedure. The selection assessment often takes place towards the end of the procedure, to test the candidates' suitability for the position in question.
The goal of a selection assessment
• A selection assessment is an attempt to get a better understanding of how the candidate would perform in the position applied for. The assessment is used on the premise that suitability does not really show through questionnaires, letters and interviews alone, because candidates often say what they think the employer wants to hear; only practical simulations can clearly demonstrate how a person responds in certain situations.
Components
• The components of a selection assessment depend on the position being applied for. For an executive position, the focus will be on testing the candidates' leadership qualities; for other positions the emphasis may be, for example, on communication skills.
• Frequently used components of an assessment include the mailbox exercise, fact finding and role-playing. Intelligence tests and interviews are often part of a selection assessment as well. To prepare for an assessment, candidates can practise different tests, for example a free IQ test.
Assessment report
• Following the assessment, a report is drafted describing the conclusions on each candidate. As a candidate, you will always be the first to see this assessment report, and you have the right to refuse to have it sent to the employer. However, if you refuse, your chances of getting the job will be practically nil.
Assessment companies
• Selection assessments are often performed by independent companies that conduct assessments on behalf of different organisations; in that case, the assessment takes place in the offices of the assessment company. Some companies, especially larger ones, organise their own assessments, and then the assessment takes place in the company itself.
• In the case of internal reorganisations, career assessments are often used instead.

Placement and classification decisions
Selection is a personnel decision whereby an organization decides whether to hire individuals using each person's score on a single assessment, such as a test or interview, or a single predicted performance score based on a composite of multiple assessments. Using this single score to assign each individual to one of multiple jobs or assignments is referred to as placement. An example of placement is when colleges assign new students to a particular level of math class based on a math test score.
Classification refers to the situation in which each of a number of individuals is assigned to one of multiple jobs based on their scores on multiple assessments. Classification is a more complex set of personnel decisions and requires more explanation.

A Conceptual Example
The idea of classification can be illustrated by an example. An organization has 50 openings across four entry-level jobs: word processor has 10 openings, administrative assistant has 12, accounting clerk has 8, and receptionist has 20. Sixty people apply, and each completes three employment tests: word processing, basic accounting, and interpersonal skills. The goal of classification is to use each applicant's predicted performance score for each job to fill all the openings while maximizing overall predicted performance across all four jobs. Linear programming approaches have been developed that make such assignments within the constraints of a given classification situation, such as the number of jobs, the openings or quotas for each job, and the number of applicants. Note that in this example, 50 applicants would be assigned to one of the four jobs and 10 applicants would not be hired.

Using past scores on the three tests and measures of performance, formulas can be developed to estimate predicted performance for each applicant in each job. The tests differ in how well they predict performance in each job. For example, the basic accounting test is fairly predictive of performance in the accounting clerk job but less predictive of performance in the receptionist job, and the word processing test is very predictive of performance in the word processor job but less predictive of performance in the receptionist job. This means that the equations for calculating predicted performance give different weights to each test for each job: the accounting clerk equation gives its largest weight to basic accounting scores, whereas the receptionist equation gives its largest weight to interpersonal skill scores and little weight to accounting scores. In addition, scores vary across applicants within each test and across tests within each individual, so each individual will have a different predicted performance score for each job.

One way to assign applicants to these jobs would be to calculate a single predicted performance score for each applicant, select all applicants whose scores exceed some cutoff, and randomly assign the selected applicants to jobs within the quota constraints. However, random assignment would not take advantage of the fact that each selected applicant will not perform equally well in all available jobs. Classification takes advantage of this. Classification efficiency can be viewed as the difference in overall predicted performance between this univariate strategy (one score per applicant) and the multivariate classification approach (one score per applicant per job) that uses a different equation to predict performance for each job. A number of parameters influence the degree of classification efficiency. An important one is the extent to which predicted scores for the different jobs are related to each other: the smaller the relationships among predicted scores across jobs, the greater the potential classification efficiency. That is, classification efficiency increases to the extent that the multiple assessments capture differences in the individual characteristics that determine performance in each job.
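The assignment step described above can be sketched in a few lines of Python. This is only an illustration under invented numbers: the predicted performance scores are randomly generated rather than estimated from real test and criterion data, and scipy's linear_sum_assignment is used as one standard solver for this kind of assignment problem.

    # A minimal sketch of the classification example above, assuming predicted
    # performance scores are already available for every applicant x job pair.
    # Scores here are random; in practice they would come from regression
    # equations built on past test and performance data.
    import numpy as np
    from scipy.optimize import linear_sum_assignment

    rng = np.random.default_rng(0)
    n_applicants = 60
    jobs = {"word processor": 10, "administrative assistant": 12,
            "accounting clerk": 8, "receptionist": 20}      # 50 openings in total

    # Hypothetical predicted performance, one score per applicant per job.
    predicted = rng.normal(50, 10, size=(n_applicants, len(jobs)))

    # Expand each job into one column per opening, plus zero-valued "not hired"
    # columns so the matrix is square (60 applicants x 60 seats).
    seat_scores = np.hstack(
        [np.repeat(predicted[:, [j]], openings, axis=1)
         for j, openings in enumerate(jobs.values())]
        + [np.zeros((n_applicants, n_applicants - sum(jobs.values())))]
    )

    rows, cols = linear_sum_assignment(seat_scores, maximize=True)
    print("Total predicted performance:", seat_scores[rows, cols].sum())

Applicants matched to one of the 50 real seat columns are classified into the corresponding job; those matched to the ten zero-valued columns correspond to the "not hired" outcome.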
Policy decisions
Policy decisions are defined in management theory as those decisions that define the basic principles of the organization and determine how it will develop and function in the future. Policies set the limits within which operational decisions are made. Examples include:
• Vision, mission, aims
• Budget and finance practices
• Allocation of resources
• Organizational structure
Policy decisions limit the actions an organization and its members can take without changing the policy. In sociocracy, policy decisions are made by consent. Operational decisions are made within the limits set by policy decisions and may be made autocratically by the person in charge or by other means determined by the people whom the decisions affect.
Examples of Policy Statements
We set policies in our everyday lives without realizing it or writing them down. Examples include:
• Deciding not to drink coffee or consume animal products
• Pledging to complete tax forms before their due date
• Sending your children to public schools by choice
• Deciding not to have children in order to devote time to political causes
In non-profit organizations, policies might include:
• Following the IRS regulations that set requirements for 501(c)(3) status to receive tax-deductible contributions
• Limiting membership to professionals with demonstrated expertise
• Serving meals to the homeless
• Using contributions only for administrative costs and not staff salaries
In business, they might include:
• Annual and departmental budgets
• Employee compensation schedules
• Union agreements
• Future donations of money and employee time to charitable causes
• Production of certain products and not others
• Limiting sales and marketing to retail or wholesale customers
These are all decisions that define the scope of the day-to-day operational decisions about how we will conduct our personal or work lives.
    Counseling and guidancedecisions Decision making has always been a fundamental human activity. At some stage within the career guidance planning process, decisions are made. The decision in some cases might be to make far reaching changes, or perhaps the decision might be not to change anything. In some cases, little change might ensue, but a decision has still been made, even if the result, having considered the consequences, is not to change. As a guide it is important to take into account that individual participants vary a great deal in terms of how they make decisions, what factors are important to them, how ready they are to make them and how far participants are prepared to live with uncertain outcomes. The traditional way within guidance to handle decision making is to see it as a rational, almost linear process. This is illustrated by the Janis and Mann model exemplified in the practical exercise example mentioned below involving balance sheets. The aim is to encourage a rational approach to planning for the future. Typically this involves an evaluation of available options with a look at the pros and cons of each, taking account of the participant’s personal circumstances. In practice of course the process of making a decision is influenced by all sorts of things. In everyday terms the decision making may in fact be driven by the irrational, the “quick fix” solution and in some cases, prejudicial ideas, perhaps based upon ingrained or outdated ideas. Gerard Egan describes this as the “shadow side” of decision making. De Bono’s thinking hats exercise (see below) attempts to factor in some of the emotional and other factors linked to decision making. As individuals we can vary in the style of decision making we use. For some decisions we might take a “logical” approach based upon the linear thinking mentioned above. For some decisions we might make a “no thought” decision, either because the matter is so routine it doesn’t require any thought, or in some occasions just to make a quick fix so we don’t have to think about it any more. Sometimes participants in guidance interviews may talk about their realisation that they should have looked into a decision further before rushing into one course of action. Some individuals employ a hesitant style of decision making, where decisions are delayed as long as possible, whereas others may make a choice based upon an emotional response, what feels right subjectively. Finally some participants might make decisions that can be classified as compliant; that is based upon the perceived expectations of what other people want. A key role in guidance is to identify how a participant has made previous professional development decisions- and whether the approach seems to have worked for them. Might there be other ways of deciding that lead to better decisions? Using Decision making exercises in a Guidance Setting There is a broad range of tools to aid the decision making process within a professional development discussion. Here are two introductory examples. Further examples are available via the references and web sites below.
    Balance sheet In itssimplest form this consists of two columns representing two choices. The advantages and disadvantages of each choice can simply be listed. Sometimes the very act of writing down pros and cons can bring clarity. Sometimes subdividing the headings into Advantages for me, Advantages for others, disadvantages for me, disadvantages for others can yield a richer analysis. Janis and Mann suggest this process. A slightly more sophisticated use of balance sheets might involve the participant completing the sheet as above initially, then the adviser producing a list of other suggested factors that the individual may not have considered at first. These can either be included, or ignored by the participant. An example of a simple balance sheet Six thinking Hats This tool was created by Edward de Bono in his book "6 Thinking Hats". How to Use the Tool: To use Six Thinking Hats to improve the quality of the participant’s decision-making; look at the decision "wearing" each of the thinking hats in turn. Each "Thinking Hat" is a different style of thinking. These are explained below: White Hat: With this thinking hat, the participant is encouraged to focus on the data available. Look at the information they have about themselves and see what they can learn from it. Look for gaps in your knowledge, and either try to fill them or take account of them. This is where the participant is encouraged to analyze past experience, work roles etc. and try to learn from this Red Hat: Wearing the red hat, the participant looks at the decision using intuition, gut reaction, and emotion. The idea is also to encourage the participant to try to think how other people will react emotionally to the decision being made, and try to understand the intuitive responses of people who may not fully know your reasoning. Black Hat: When using black hat thinking, look at things pessimistically, cautiously and defensively. Try to see why ideas and approaches might not work. This is important because it highlights the weak points in a plan or course of action. It allows the participant to eliminate them, alter your approach, or prepare contingency plans to counter problems that might arise. Black Hat thinking can be one of the real benefits of using this technique within professional development planning,
    as sometimes participantscan get so used to thinking positively that often they cannot see problems in advance, leaving them under-prepared for difficulties. Yellow Hat: The yellow hat helps you to think positively. It is the optimistic viewpoint that helps you to see all the benefits of the decision and the value in it, and spot the opportunities that arise from it. Yellow Hat thinking helps you to keep going when everything looks gloomy and difficult. Green Hat: The Green Hat stands for creativity. This is where you can develop creative solutions to a problem. It is a freewheeling way of thinking, in which there is little criticism of ideas. Blue Hat: The Blue Hat stands for process control. This is the hat worn by people chairing meetings. When running into difficulties because ideas are running dry, they may direct activity into Green Hat thinking. When contingency plans are needed, they will ask for Black Hat thinking, and so on. You can use Six Thinking Hats in guidane discussions. It is a way of encouraging participants to look at decision making from different perspectives. This can be done either metaphorically -as in “imagine you are wearing the white hat...” - or by having cards each with the name of the hat and a brief description of the “way of looking at things” that the hat brings with it. The cards can be shuffled and dealt to the participant in turn. By doing this the guide is encouraging the participant to consider a decision from a range of perspectives Assembling, Administering and Appraising Classroom Test and Assessment Assembling the Test 1. Record items on index cards 2. Double-check all individual test items 3. Double-check the items as a set 4. Arrange items appropriately 5. Prepare directions 6. Reproduce the test Administering the Test The guiding principle • Provide conditions that give all students a fair chance to show what they know Physical conditions
• Light, ventilation, quiet, etc.
Psychological conditions
• Avoid inducing test anxiety
• Try to reduce test anxiety
• Don't give a test when other events will distract
Suggestions
• Don't talk unnecessarily before the test
• Minimize interruptions
• Don't give hints to individuals who ask about items
• Discourage cheating
• Give students plenty of time to take the test
Appraising the Test
• The step in which the institution's management finds out how effective it has been in conducting and evaluating student assessment.
The process
• Define organizational goals
• Define objectives and continuously monitor performance and progress
• Performance evaluation / reviews
• Provide feedback
• Performance appraisal (reward / punishment)
Purpose of Classroom Tests and Assessment
Classroom assessment is one of the most important tools teachers can use to understand the needs of their students. When executed properly and on an ongoing basis, classroom assessment should shape student learning and give teachers valuable insights.
Identify Student Strengths and Weaknesses
Assessments help teachers identify student strengths as well as areas where students may be struggling. This is extremely important at the beginning of the year, when students are entering new grades. Classroom assessments, such as diagnostic tests, help teachers gauge students' level of mastery of concepts from the prior grade.
Monitor Student Progress
Throughout the course of a lesson or unit, teachers use classroom assessment to monitor students' understanding of the concepts being taught. This informs teachers' lesson planning, helping them pinpoint areas that need further review. Assessment can take the form of weekly tests, daily homework assignments and special projects.
Assess Student Prior Knowledge
Before beginning a new unit, assessment can inform teachers of their students' prior experience and understanding of a particular concept or subject matter. These types of assessments can be done orally through classroom discussion or through written assignments such as journals, surveys or graphic organizers.
Purposes of assessment
Teaching and learning
The primary purpose of assessment is to improve students' learning and teachers' teaching as both respond to the information it provides. Assessment for learning is an ongoing process that arises out of the interaction between teaching and learning. What makes assessment for learning effective is how well the information is used.
System improvement
Assessment can do more than simply diagnose and identify students' learning needs; it can be used to assist improvements across the education system in a cycle of continuous improvement:
• Students and teachers can use the information gained from assessment to determine their next teaching and learning steps.
• Parents, families and whānau can be kept informed of plans for teaching and learning and the progress being made, so they can play an active role in their children's learning.
• School leaders can use the information for school-wide planning, to support their teachers and to determine professional development needs.
• Communities and Boards of Trustees can use assessment information to assist their governance role and their decisions about staffing and resourcing.
• The Education Review Office can use assessment information to inform its advice for school improvement.
• The Ministry of Education can use assessment information to undertake policy review and development at a national level, so that government funding and policy intervention are targeted appropriately to support improved student outcomes.
    Developing specifications fortests and assessment Definitions I’ve seen the terms “Test Plan” and “Test Specification” mean slightly different things over the years. In a formal sense (at this given point in time for me), we can define the terms as follows: • Test Specification – a detailed summary of what scenarios will be tested, how they will be tested, how often they will be tested, and so on and so forth, for a given feature. Examples of a given feature include, “Intellisense, Code Snippets, Tool Window Docking, IDE Navigator.” Trying to include all Editor Features or all Window Management Features into one Test Specification would make it too large to effectively read. • Test Plan – a collection of all test specifications for a given area. The Test Plan contains a high-level overview of what is tested (and what is tested by others) for the given feature area. For example, I might want to see how Tool Window Docking is being tested. I can glance at the Window Management Test Plan for an overview of how Tool Window Docking is tested, and if I want more info, I can view that particular test specification. If you ask a tester on another team what’s the difference between the two, you might receive different answers. In addition, I use the terms interchangeably all the time at work, so if you see me using the term “Test Plan”, think “Test Specification.” Parts of a Test Specification A Test Specification should consist of the following parts: • History / Revision – Who created the test spec? Who were the developers and Program Managers (Usability Engineers, Documentation Writers, etc) at the time when the test spec was created? When was it created? When was the last time it was updated? What were the major changes at the time of the last update? • Feature Description – a brief description of what area is being tested. • What is tested? – a quick overview of what scenarios are tested, so people looking through this specification know that they are at the correct place. • What is not tested? – are there any areas being covered by different people or different test specs? If so, include a pointer to these test specs. • Nightly Test Cases – a list of the test cases and high-level description of what is tested each night (or whenever a new build becomes available). This bullet merits its own blog entry. I’ll link to it here once it is written. • Breakout of Major Test Areas – This section is the most interesting part of the test spec where testers arrange test cases according to what they are testing. Note: in no way do I
claim this to be a complete list of all possible major test areas; these areas are examples to get you going.
o Specific Functionality Tests – tests to verify the feature is working according to the design specification. This area also includes verifying error conditions.
o Security Tests – any tests related to security. An excellent source for populating this area is the book Writing Secure Code.
o Accessibility Tests – this section shouldn't be a surprise to any of my blog readers. <grins> See The Fundamentals of Accessibility for more info.
o Stress Tests – the tests you would apply to stress the feature.
o Performance Tests – verifying any performance requirements for your feature.
o Edge Cases – something I do specifically for my feature areas. I like walking through books like How to Break Software, looking for ideas to better test my features, and I jot those ideas down under this section.
o Localization / Globalization – tests to ensure you're meeting your product's international requirements.
Setting Test Case Priority
A test specification may have a couple of hundred test cases, depending on how the test cases were defined, how large the feature area is, and so forth. It is important to be able to query for the most important test cases (nightly), the next most important test cases (weekly), the next most important test cases (full test pass), and so forth. A sample prioritization for test cases may look like:
• Highest priority (Nightly) – must run whenever a new build is available
• Second highest priority (Weekly) – other major functionality tests, run once every three or four builds
• Lower priority – run once every major coding milestone
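One lightweight way to make such a priority scheme queryable in an automated suite is to tag test cases with custom markers. The sketch below uses pytest; the marker names (nightly, weekly, full_pass) and the trivial feature under test are invented for illustration and are not part of pytest itself.

    # test_priorities.py -- a minimal sketch of priority-tagged test cases.
    import pytest

    def add(a, b):            # stand-in for the feature under test
        return a + b

    @pytest.mark.nightly      # must run whenever a new build is available
    def test_add_basic():
        assert add(2, 3) == 5

    @pytest.mark.weekly       # run once every three or four builds
    def test_add_negative_numbers():
        assert add(-2, -3) == -5

    @pytest.mark.full_pass    # run once per major coding milestone
    def test_add_large_inputs():
        assert add(10**9, 10**9) == 2 * 10**9

The markers would then be registered under the markers option in pytest.ini, and a nightly run would select only the highest-priority cases with "pytest -m nightly".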
Major Points
1. Your goal is valid, reliable, useful assessment
2. Which requires:
   a. Determining what is to be measured
   b. Defining it precisely
   c. Minimizing measurement of irrelevancies
3. And is promoted by following good procedures
Four Steps in Planning an Assessment
1. Deciding its purpose
2. Developing test specifications
3. Selecting the best item types
4. Preparing items
Step 1: Decide the Purpose
What location in instruction?
1. Pre-testing
   o Readiness
     i. Limited in scope
     ii. Low difficulty level
     iii. Serves as a basis for remedial work and adapting instruction
   o Pretest (placement)
     i. Items similar to the outcome measure
     ii. But not the same (like an alternative form)
2. During instruction
   o Formative
     i. Monitor learning progress
     ii. Detect learning errors
     iii. Feedback for teacher and students
     iv. Limited sample of learning outcomes
     v. Must ensure that the mix and difficulty of items is sufficient
     vi. Try to use results to make corrective prescriptions (e.g., review for the whole group, practice exercises for a few)
   o Diagnostic
     i. Enough items needed in each specific area
     ii. Items in one area should have slight variations
3. End of instruction
   o Mostly summative – broad coverage of objectives
   o Can be formative too
Step 2: Develop Test Specifications
• Why? Need a good sample!
• How? Table of specifications (two-way chart, "blueprint")
1. Prepare a list of learning objectives
2. Outline the instructional content
3. Prepare the two-way chart
4. Or use an alternative to the two-way chart when more appropriate
5. Double-check the sampling
Sample of a Content Domain (for this course)
1. Trends/controversies in assessment
2. Interdependence of teaching, learning, and assessment
3. Purposes and forms of classroom assessment
4. Planning a classroom assessment (item types, table of specs)
5. Item types (advantages and limitations)
6. Strategies for writing good items
7. Compiling and administering classroom assessments
8. Evaluating and improving classroom assessments
9. Grading and reporting systems
10. Uses of standardized tests
11. Interpreting standardized test scores
Sample Table of Specifications (for chapters 6 and 7 of this course)
Each sample SLO (you would typically have more) is crossed with the Bloom level it targets; an X in the original two-way chart marks the targeted level, and a final row totals the number of items per cell.
• Identifies definitions of key terms (e.g., validity) – Remember
• Identifies examples of threats to test reliability and validity – Understand
• Selects the best item type for given objectives – Apply
• Compares the pros and cons of different kinds of tests for given purposes – Analyze
• Evaluates particular educational reforms (e.g., whether they will hurt or help instruction) – Evaluate
• Creates a unit test – Create
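A table of specifications can also be kept as a simple data structure so that sampling can be double-checked automatically. The sketch below is only illustrative: the SLO-to-level mapping follows the reconstruction above, and the planned item counts are invented.

    # A minimal sketch of a table of specifications as a data structure.
    # Keys are (SLO, Bloom level) cells; values are invented planned item counts.
    blueprint = {
        ("Identifies definitions of key terms", "Remember"): 4,
        ("Identifies threats to reliability and validity", "Understand"): 5,
        ("Selects best item type for given objectives", "Apply"): 5,
        ("Compares pros and cons of test types", "Analyze"): 4,
        ("Evaluates particular educational reforms", "Evaluate"): 3,
        ("Creates a unit test", "Create"): 1,
    }

    total_items = sum(blueprint.values())
    items_per_level = {}
    for (_slo, level), n in blueprint.items():
        items_per_level[level] = items_per_level.get(level, 0) + n

    print("Total number of items:", total_items)
    print("Items per Bloom level:", items_per_level)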
Spot the Poor Specific Learning Outcomes (for use with the previous table of specifications)
Which entries are better or worse than others? Why? Improve the poor ones.
1. Knowledge
   a. Knows correct definitions
   b. Able to list major limitations of different types of items
2. Comprehension
   a. Selects correct item type for learning outcome
   b. Understands limitations of true-false items
   c. Distinguishes poor true-false items from good ones
3. Application
   a. Applies construction guidelines to a new content area
   b. Creates a table of specifications
4. Analysis
   a. Identifies flaws in poor items
   b. Lists general and specific learning outcomes
5. Synthesis
   a. Lists general and specific content areas
   b. Provides weights for areas in table of specifications
6. Evaluation
   a. Judges quality of procedure/product
   b. Justifies product
   c. Improves a product
Why Are These Better Specific Learning Outcomes?
1. Knowledge
   a. Selects correct definitions
   b. Lists major limitations of different item types
2. Comprehension
   a. Selects proper procedures for the assessment purpose
   b. Distinguishes poor procedures from good ones
   c. Distinguishes poor decisions/products from good ones
3. Application
   a. Applies construction guidelines to a new content area
4. Analysis
   a. Identifies flaws in procedure/product
   b. Lists major and specific content areas
   c. Lists general and specific learning outcomes
5. Synthesis
   a. Creates a component of the test
   b. Provides weights for cells in table of specifications
6. Evaluation
   a. Judges quality of procedure/product
   b. Justifies product
   c. Improves a product
Step 3: Select the Best Types of Items/Tasks
What types to choose from? Many!
1. Objective – supply type
   a. Short answer
   b. Completion
2. Objective – selection type
   a. True-false
   b. Matching
   c. Multiple choice
3. Essays
   a. Extended response
   b. Restricted response
4. Performance-based
   a. Extended response
   b. Restricted response
Which type to use? The one that fits best!
1. Most directly measures the learning outcome
2. Where that is not clear, use a selection type (more objective)
   a. Multiple choice is best (less guessing, fewer clues)
   b. Matching only if items are homogeneous
   c. True-false only if there are only two possibilities
Strengths and Limitations of Objective vs. Essay/Performance
Objective Items
• Strengths
   o Can have many items
   o Highly structured
   o Scoring is quick, easy, accurate
• Limitations
   o Cannot assess higher-level skills (problem formulation, organization, creativity)
Essay/Performance Tasks
• Strengths
   o Can assess higher-level skills
   o More realistic
• Limitations
   o Inefficient for measuring knowledge
   o Few items (poorer sampling)
   o Time consuming
   o Scoring is difficult and unreliable
Step 4: Prepare Items/Tasks
Strategies to Measure the Domain Well – Reliably and Validly
1. Specifying more precise learning outcomes leads to better-fitting items
2. Use the two-way table to ensure good sampling of complex skills
3. Use enough items for reliable measurement of each objective
   o The number depends on purpose, task type, and age
   o If using performance-based tasks, use fewer items but test more often
4. Keep in mind how good assessment can improve (not just measure) learning
   o It signals learning priorities to students
   o It clarifies teaching goals for the teacher
   o Provided it is perceived as fair and useful
Strategies to Avoid Contamination
1. Eliminate barriers that lead good students to get the item wrong
2. Don't provide clues that help poor students get the item correct
General Suggestions for Item Writing
1. Use the table of specifications as a guide
2. Write more items than needed
3. Write well in advance of the testing date
4. Make sure the task to be performed is clear, unambiguous, unbiased, and calls forth the intended outcome
5. Use an appropriate reading level (don't test for ancillary skills)
6. Write so that items provide no clues (minimize the value of "test-taking skills")
   a. a/an
   b. Avoid specific determiners (always, never, etc.)
   c. Don't use more detailed, longer, or textbook language for correct answers
   d. Don't put answers in an identifiable pattern
7. Write so that one item provides no clues to other items
8. Seeming clues should lead away from the correct answer
9. Experts should agree on the answer
10. If an item is revised, recheck its relevance
Selecting and constructing appropriate types of items and assessment tasks
• Different types of tests: limited-choice questions (multiple choice, true/false, matching), open-ended questions (short answer, essay), performance testing (OSCE, OSPE), and action-oriented testing.
• Process of test administration: statement of goals, content outline, table of specification, item selection, item construction, composition of the answer sheet, development of instructions, construction of the answer key, test administration, and test revision.
• Characteristics of a good test:
   – Reliability: consistent, uniform, and free from sources of error.
   – Validity: how well a test measures what it is supposed to measure.
   – Utility: cost- and time-effective.
• Test construction should answer: What kind of test is to be made? What is its precise purpose? What abilities are to be tested? How detailed and how accurate must the results be? What constraints are set by the availability of expertise, facilities, and time for construction, administration and scoring? Who will take the test? What is the scope of the test?
• Principles of test construction:
   1. Measure all instructional objectives – objectives that are communicated and imparted to the students, designed as an operational control to guide learning sequences and experiences, and harmonious with the teacher's instructional objectives.
   2. Cover all learning tasks – measure a representative part of the learning tasks.
   3. Use appropriate testing strategies or items – items that appraise the specific learning outcome, with measurements or tests based on the domains of learning.
   4. Make the test valid and reliable – a test is reliable when it produces dependable, consistent and accurate scores, and valid when it measures what it purports to measure. Tests that are written clearly and unambiguously are more reliable; tests with more items are more reliable than tests with fewer items; and tests that are well planned, cover wide objectives and are well executed are more valid.
   5. Use the test to improve learning – a test is not only an assessment but also a learning experience. Going over the test items may help teachers reteach missed items, discussion and clarification of the right choice gives further learning, and revision of the test enables further guidance and modification of teaching.
   6. Norm-referenced and criterion-referenced tests – norm-referenced tests address higher, more abstract levels of the cognitive domain; criterion-referenced tests address lower, more concrete levels of learning.
• Planning for a test:
   1. Outline the learning objectives or major concepts to be covered by the test – the test should be representative of the objectives and materials covered (a major student complaint is that tests don't fairly cover the material that was supposed to be covered).
   2. Create a test blueprint.
   3. Create questions based on the blueprint.
   4. For each question, check it against the blueprint (3-4 alternate questions on the same idea/objective should be made).
   5. Organize questions by item type.
   6. Eliminate similar questions.
   7. Re-read the questions and check them from the student's standpoint.
   8. Organize the questions logically.
   9. Check the completion time by taking the test yourself and multiplying that time by about four, depending on the level of the students.
   10. Analyze the results / conduct item analysis.
• Process of Test Construction
1. Preliminary considerations
   a) Specify the test purposes and describe the domain of content and/or behavior of interest
   b) Specify the group of examinees (age, gender, socio-economic background, etc.)
   c) Determine the time and financial resources available for constructing and validating the test
   d) Identify and select qualified staff members
   e) Specify an initial estimate of the length of the test (time for developing, validating and completion by the students)
2. Review of content domain/behaviors
   a) Review the descriptions of the content standards or objectives to determine their acceptability for inclusion in the test
   b) Select the final group of objectives (i.e., finalize the content standards)
   c) Prepare the item specification for each objective and review it for completeness, clarity, accuracy and practicability
3. Item/task writing and preparation of scoring rubrics
   a) Draft a sufficient number of items and/or tasks for field testing
   b) Carry out item/task editing and review the scoring rubrics
4. Assessment of content validity
   a) Identify a pool of judges and measurement specialists
   b) Review the test items and tasks to determine their match to the objectives, their representativeness, and their freedom from stereotyping and potential biases
   c) Review the test items and/or tasks to determine their technical adequacy
5. Revision of test items/tasks
   a) Based on the data from steps 4b and 4c, revise the test items/tasks or delete them
   b) Write additional test items/tasks and repeat step 4
6. Field test administration
   a) Organize the test items/tasks into forms for field testing
   b) Administer the test forms to appropriately chosen groups of examinees
   c) Conduct item analysis and item bias studies ("studies to identify differentially functioning test items")
   d) Carry out statistical linking or equating of forms if needed
7. Revision of test items/tasks
   a) Revise or delete items using the results from step 6c
   b) Check the scoring rubrics for the performance tasks being field tested
8. Test assembly
   a) Determine the test length, the number of forms needed, and the number of items/tasks per objective
   b) Select the items from the available pool of valid test material
   c) Prepare test directions, practice questions, test booklet layout, scoring keys, answer sheets and so on
   d) Specify modifications to instructions, medium of presentation, or examinee response, and the time requirement for finishing the items
9. Selection of performance standards
   a) Performance standards are needed to accomplish the test purpose
   b) Determine the performance standards
   c) Initiate and document the performance standards
   d) Identify alternative test score interpretations for examinees requiring alternative administration or other modalities
10. Pilot test (if possible)
   a) Design the test administration to collect score reliability and validity information
   b) Administer the test form(s) to appropriately chosen groups of examinees
   c) Identify and evaluate alternative administrations or other modifications to meet individual specific needs that may affect the validity and reliability of the test or its forms
   d) Evaluate the test administration procedures, test items, and score reliability and validity
   e) Make final revisions to the forms of the test based on the available data
11. Preparation of manuals
   a) Prepare the test administrator's manual
12. Additional technical data collection
   a) Conduct reliability and validity investigations on a continuing basis
Item analysis
• Shortening or lengthening an existing test is done through item analysis.
• The validity and reliability of any test depend on the characteristics of its items.
• There are two types: 1. qualitative analysis and 2. quantitative analysis.
• Qualitative item analysis considers content validity (the content and form of the items, judged by expert opinion) and effective item formulation.
• Quantitative item analysis examines item difficulty and item discrimination.
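As a concrete illustration of the two quantitative indices just listed, the Python sketch below computes item difficulty (the proportion answering correctly) and a simple discrimination index (difficulty in the upper-scoring half minus difficulty in the lower-scoring half). The response matrix is invented; real analyses often use 27 percent upper and lower groups or point-biserial correlations instead.

    # Illustrative item analysis on an invented 0/1 response matrix
    # (rows = students, columns = items; 1 = correct answer).
    responses = [
        [1, 1, 0, 1],
        [1, 0, 0, 1],
        [1, 1, 1, 1],
        [0, 0, 0, 1],
        [1, 1, 0, 0],
        [0, 0, 0, 0],
    ]

    n_students = len(responses)
    totals = [sum(row) for row in responses]

    # Split students into upper and lower halves by total score.
    order = sorted(range(n_students), key=lambda i: totals[i], reverse=True)
    half = n_students // 2
    upper, lower = order[:half], order[half:]

    for item in range(len(responses[0])):
        difficulty = sum(row[item] for row in responses) / n_students
        p_upper = sum(responses[i][item] for i in upper) / len(upper)
        p_lower = sum(responses[i][item] for i in lower) / len(lower)
        discrimination = p_upper - p_lower
        print(f"Item {item + 1}: difficulty = {difficulty:.2f}, "
              f"discrimination = {discrimination:.2f}")

Items with very high or very low difficulty, or with low or negative discrimination, would be the candidates for revision or deletion.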
Characteristics of Standardised Tests and Teacher-Made Tests
Standardised Tests
Some characteristics of these tests are:
1. They consist of items of high quality. The items are pretested and selected on the basis of difficulty value, discrimination power, and relationship to clearly defined objectives in behavioural terms.
2. As the directions for administering, exact time limits, and scoring are precisely stated, any person can administer and score the test.
3. Norms, based on representative groups of individuals, are provided as an aid for interpreting the test scores. These norms are frequently based on age, grade, sex, etc.
4. Information needed for judging the value of the test is provided. Before the test becomes available, its reliability and validity are established.
5. A manual is supplied that explains the purposes and uses of the test, describes briefly how it was constructed, provides specific directions for administering, scoring, and interpreting results, contains tables of norms, and summarizes available research data on the test.
No two standardized tests are exactly alike. Each test measures certain specific aspects of behaviour and serves a slightly different purpose. Some tests with similar titles measure aspects of behaviour that differ markedly, whereas other tests with dissimilar titles measure aspects of behaviour that are almost identical. Thus, one has to be careful in selecting a standardised test.
Standardised tests also serve several further uses:
• They provide information for curriculum planning and for remedial coaching for educationally backward children.
• They help the teacher to assess the effectiveness of his or her teaching and the school's instructional programmes.
• They provide data for tracing an individual's growth pattern over a period of years.
• They help in organising better guidance programmes.
• They evaluate the influence of courses of study, teachers' activities, teaching methods and other factors considered significant for educational practice.
Features of Teacher-Made Tests:
1. The items of the test are arranged in order of difficulty.
2. They are prepared by teachers and can be used for prognosis and diagnosis purposes.
3. The test covers the whole content area and includes a large number of items.
4. The preparation of the items conforms to the blueprint.
5. Test construction is not a single person's business; rather, it is a co-operative endeavour.
6. A teacher-made test does not go through all the steps of a standardised test.
7. Teacher-made tests may also be employed as a tool for formative evaluation.
8. Preparation and administration of these tests are economical.
9. The test is developed by the teacher to ascertain students' achievement and proficiency in a given subject.
10. Teacher-made tests are rarely used for research purposes.
11. They do not have norms, whereas providing norms is essential for standardised tests.
Steps/Principles of Construction of a Teacher-Made Test:
A teacher-made test does not require the elaborate preparation of a standardised test. Even so, to make it a more efficient and effective tool of evaluation, careful consideration is needed while constructing such tests. The following steps may be followed in the preparation of a teacher-made test:
1. Planning:
Planning of a teacher-made test includes:
a. Determining the purpose and objectives of the test: what to measure and why to measure it.
b. Deciding the length of the test and the portion of the syllabus to be covered.
c. Specifying the objectives in behavioural terms. If needed, a table of specifications can be prepared and weightage given to the objectives to be measured.
d. Deciding the number and forms of items (questions) according to the blueprint.
e. Having a clear knowledge and understanding of the principles of constructing essay-type, short-answer-type and objective-type questions.
f. Deciding the date of testing well in advance, to give teachers time for test preparation and administration.
g. Seeking the co-operation and suggestions of co-teachers, experienced teachers from other schools, and test experts.
2. Preparation of the Test:
Planning is the philosophical aspect of test construction, and preparation is the practical aspect. All practical considerations have to be taken into account while constructing the test. It is an art and a technique that one must have or acquire, and it requires much thinking, rethinking and reading before constructing test items.
Different types of objective test items, viz. multiple-choice, short-answer and matching type, can be constructed. After construction, test items should be given to others for review and for their opinions. Suggestions may be sought on language, the modalities of the items, the statements given, the correct answers supplied, and other possible errors. The suggestions and views thus gathered help the test constructor modify and verify the items afresh to make them more acceptable and usable.
After construction of the test, items should be arranged from simple to complex. For arranging the items, a teacher can adopt many methods, viz. group-wise, unit-wise, topic-wise, etc. The scoring key should also be prepared forthwith to avoid delay in scoring.
Directions are an important part of test construction. Without proper directions or instructions, the authenticity and reliability of the test may be lost, and students may be confused. Thus, the directions should be simple and adequate to enable the students to know:
(i) the time allowed for completion of the test,
(ii) the marks allotted to each item,
(iii) the required number of items to be attempted,
(iv) how and where to record the answers, and
(v) the materials to be used, such as graph paper or logarithmic tables.
Observation Techniques
In carrying out action research to improve teaching and learning, an important role of the researcher/instructor is to collect data and evidence about the teaching process and student learning. What follows is an introduction to some of the techniques which can be used for this purpose.
Student Assessment
Tests, examinations and continuous assessment can provide valuable data for action research. For your teaching course, you have to set up a method of student assessment and your students have to be assessed, so you might as well make use of it in your project. You should, however, be clear about the nature of the information you can obtain from examination results or assessment grades. Comparison of one set of results with another often has limited validity, as assignments, examinations, markers and marking schemes are rarely held constant. In addition, most assessment is norm referenced rather than criterion referenced.
You also need to be very clear as to what is being assessed. Examination grades may bear little relationship to the specific qualities you are investigating. For example, if the theme of an action research project is encouraging meaningful learning, then the examination results would only be of value if they truly reflect meaningful learning. They would be of little value if they consisted of problems which could be solved by substituting numbers into a remembered formula, or essays which required the reproduction of sections from lecture notes. So think carefully about the qualities which you wish to test and whether the assessment is a true test of those qualities.
One way in which answers to assessment questions can be analysed for project purposes is by dividing them into qualitative categories. A systematic procedure for establishing categories is the SOLO taxonomy (Biggs and Collis, 1982). The SOLO taxonomy divides answers to written assessment questions into five categories, judged according to the level of learning: prestructural, unistructural, multistructural, relational and extended abstract.
The five levels correspond to answers ranging from the incorrect or irrelevant, through the use of appropriate data, to the integration of data in an appropriate way, and ending in innovative extensions.
Closed-Ended Questionnaires
Closed questionnaires constrain the responses to a limited number chosen by the researcher; essentially this is a multiple-choice format. Usually respondents are asked the extent to which they agree or disagree with a given statement. Responses are recorded on a Likert scale which ranges, for example, from 'definitely agree' to 'definitely disagree'. Questions should be carefully constructed so the meaning is clear and unambiguous. It is a good idea to trial the questionnaire on a limited number of students before giving it to a whole group.
Closed questionnaires are easy to process and evaluate and can give clear answers to specific questions. However, the questions are defined by the researcher, so they could completely miss the concerns of the respondents. You might therefore draw up the questions after a few exploratory interviews, or include some open-ended questions to give respondents a chance to raise other issues of concern.
Most institutions now have some form of standard teaching evaluation questionnaire available. These may be of some help in evaluating a project, but in most cases the questions will not be sufficiently specific to the particular type of innovation which has been introduced. What might be more helpful are the data banks of optional or additional questions which are available. These can be used to pick or suggest questions which might be included in a more tailor-made questionnaire.
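Because closed questionnaires are easy to process, their analysis can be automated in a few lines. The sketch below tallies Likert-scale responses; the items, the four-point coding and the response data are invented for illustration.

    # A minimal sketch of processing closed-questionnaire (Likert) responses.
    # The item wording, the 4-point coding and the data are invented.
    scale = {"definitely agree": 4, "agree": 3,
             "disagree": 2, "definitely disagree": 1}

    responses = {
        "The tutorials helped me understand the lectures":
            ["definitely agree", "agree", "agree", "disagree"],
        "The assessment rewarded memorisation rather than understanding":
            ["agree", "definitely agree", "disagree", "definitely disagree"],
    }

    for item, answers in responses.items():
        coded = [scale[a] for a in answers]
        mean = sum(coded) / len(coded)
        agree_pct = 100 * sum(1 for c in coded if c >= 3) / len(coded)
        print(f"{item}\n  mean = {mean:.2f}, % agreeing = {agree_pct:.0f}%")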
Traditionally, questionnaire and survey data are collected using copies of a paper questionnaire and answer sheets. With the availability of web technology, there is now the option of collecting survey data online. To collect data using a paper questionnaire, special answer sheets called OMR forms are often used. Respondents are asked to mark their answers to the questionnaire on the OMR forms, and an optical mark scanner is then used to read the marks. The process produces an electronic data file containing the responses to the questionnaire, which can then be analysed using software such as MS Excel or SPSS. At HKUST, both the optical mark scanner and OMR forms are available from ITSC.
Diary / Journal
Everyone involved in an action learning project should keep a diary or journal in which they record:
• their initial reflections on the topic of concern
• the plans that were made
• a record of actions which were taken
• observations of the effects of the actions
• impressions and personal opinions about the actions taken and reactions to them
• results obtained from other observation techniques
• references for, and notes on, any relevant literature or supporting documents which are discovered.
Research reports are often very impersonal documents, but this should not be the case for an action learning journal, quite the contrary! It should contain a record of both what you did and what you thought. In it you should regularly and systematically reflect critically on the effects of your project and how it is progressing. Journals act as starting points for critical reflection at the regular meetings of the project team. By sharing observations and reflections it is possible to fine-tune the innovation. Sympathetic but critical discussion can also heighten awareness and contribute to changing perspectives.
Supporting Documents
Keep copies of any documents which are relevant to the course(s) you are examining. These can include:
• documents for the course development and accreditation process
• minutes of course committees
• the course syllabus
• memos between course team leaders and members
• handouts to students
• copies of tests and examinations
• lists of test results and student grades.
Interaction Schedules
Interaction schedules are methods for analysing and recording what takes place during a class. A common approach is to note down at regular intervals (say every minute) who is talking, and to categorise what they are saying or doing. An alternative to time sampling is event sampling, in which behaviour is noted every time a particular event occurs. Examples of categories could be: tutor asking a question, tutor giving an explanation, tutor giving an instruction, student answering a question, or student asking a question. The analysis can be made by an observer in the class or subsequently from a tape or video recording. Profiles built from such observations can compare the interactions during two tutorials: an observer notes, at one-minute intervals, who is talking and the type of communication, and the resulting plots show the extent to which the tutor dominated the session and the students contributed. The example is adapted from Williams and Gillard (1986).
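To show how a time-sampled interaction schedule can be summarised, the sketch below tallies a one-minute-interval observation log. The category codes and the log itself are invented for illustration.

    # Summarising a time-sampled interaction schedule.
    # One code per one-minute interval; the codes and the log are invented.
    from collections import Counter

    codes = {
        "TQ": "tutor asking question",
        "TE": "tutor giving explanation",
        "TI": "tutor giving instruction",
        "SA": "student answering question",
        "SQ": "student asking question",
    }

    # Hypothetical 15-minute observation log for one tutorial.
    log = ["TE", "TE", "TQ", "SA", "TE", "TI", "TE", "TQ", "SA", "SQ",
           "TE", "TE", "TQ", "SA", "TE"]

    counts = Counter(log)
    tutor_minutes = sum(n for code, n in counts.items() if code.startswith("T"))

    for code, n in counts.most_common():
        print(f"{codes[code]:<28} {n:>2} min ({100 * n / len(log):.0f}%)")
    print(f"Tutor talk overall: {100 * tutor_minutes / len(log):.0f}% of intervals")

Comparing such summaries for two tutorials gives the same kind of contrast as the plotted profiles described above.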
There are other approaches to recording and analysing what happens in a classroom situation. McKernan (1991) discusses an extensive range of techniques, gives examples of each and considers how the data gathered should be analysed.
Interviews
Interviews can provide even more opportunity for respondents to raise their own issues and concerns, but they are correspondingly more time-consuming and can raise difficulties in the collation and interpretation of information. The format can sit on a spectrum from completely open discussion to tightly structured questions. Semi-structured interviews have a small schedule of questions to point the interviewee towards an area of interest to the researcher, but then allow interviewees to raise any items they like within the general topic area. Since interviews give students an opportunity to raise their own agenda, they are useful when issues are open, or at an exploratory stage. A small number of interviews can be useful to define issues for subsequent, more tightly structured questionnaires.
Interviews are normally tape recorded. If analysis, rather than just impression, is required, then transcripts have to be produced. The transcripts are normally analysed by searching for responses or themes which commonly occur. Quotations from the transcripts can be used to illuminate or illustrate findings reported in reports and papers. There are computer programmes available to assist with the analysis of qualitative data. One example is the programme NUDIST, which has facilities for indexing, text-searching, using Boolean operations on defined index nodes, and combining data from several initially independent studies.
Student Learning Inventories
Student learning inventories are examples of empirically derived measuring instruments. There are many inventories which purport to measure a wide range of characteristics. Student learning inventories have been highlighted here because they examine the quality of learning; in particular they look at the categories of deep and surface learning. The inventories can be used to compare groups of students, to examine approaches before and after changes to teaching methods, and to examine correlations with other variables.
The Study Process Questionnaire (SPQ) developed by John Biggs (1987) assesses students' approaches to learning. Scores are obtained for each student on deep, surface and achieving approach scales. The SPQ has been widely used in Hong Kong and its cultural applicability widely researched. A detailed account of the usage of the SPQ, together with tables of norms for Hong Kong students for comparison purposes, is in Biggs (1992). The SPQ is available in English, Chinese or bilingual versions.
For action learning projects, a suitable way to use the SPQ is to apply it at the start and end of the innovation. Changes in SPQ scores can then be interpreted as a reflection of the teaching and learning context. The results will indicate whether the innovation has encouraged meaningful approaches to learning.
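A pre/post comparison of this kind can be summarised very simply. The sketch below computes mean deep and surface approach scores before and after an innovation; the numbers are invented, and the real SPQ items, scoring key and norms are those given in Biggs (1987, 1992).

    # A minimal sketch of a pre/post comparison of approach scores, assuming
    # deep and surface scale scores have already been computed per student.
    pre  = {"deep": [28, 31, 25, 30, 27], "surface": [34, 30, 33, 35, 32]}
    post = {"deep": [33, 34, 29, 35, 31], "surface": [30, 27, 31, 30, 29]}

    def mean(xs):
        return sum(xs) / len(xs)

    for scale in ("deep", "surface"):
        change = mean(post[scale]) - mean(pre[scale])
        print(f"{scale:>7}: pre = {mean(pre[scale]):.1f}, "
              f"post = {mean(post[scale]):.1f}, change = {change:+.1f}")

A rise on the deep scale together with a fall on the surface scale would be the hoped-for pattern, though with small groups any change should be interpreted cautiously.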
Open-Ended Questionnaires
Open questionnaires have a series of specific questions but leave space for respondents to answer as they see fit. You are therefore more likely to find out the views of students, but replies are more difficult to analyse and collate. The usual procedure is to search for categories of common responses. An example of an open questionnaire is shown below. It is not necessary to have separate questionnaires for open and closed items; the most successful questionnaires often have both open and closed items.
Diagnosis of Student Conceptions
A good basis for improving your teaching is to diagnose your students' understanding of key concepts in a course. It is often surprising how students can pass university examinations but still have fundamental misunderstandings of key concepts. The usual method of diagnosing student conceptions is to ask a question which applies the concept to an everyday situation: one which cannot be answered by reproduction or by substitution into formulae. Answers are drawn from the students in interviews or in written form. The students' answers can usually be classified into a small number (usually two to five) of conceptions or misconceptions about the phenomenon. As with the analysis of interview data, care needs to be taken when deriving classifications. These do not automatically emerge from the transcript but are subject to the experiences and knowledge of the researcher.
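Whether the responses come from an open questionnaire item or from a diagnostic question about a concept, the collation step usually amounts to a frequency count of the categories the researcher has assigned. A minimal Python sketch, with invented category codes and data:

    from collections import Counter

    # Invented example: each response has been read and assigned a category
    # code by the researcher (the codes themselves are illustrative).
    coded_responses = [
        "wants more worked examples", "lectures too fast", "wants more worked examples",
        "assessment criteria unclear", "lectures too fast", "wants more worked examples",
        "positive about tutorials",
    ]

    frequencies = Counter(coded_responses)
    total = len(coded_responses)
    for category, n in frequencies.most_common():
        print(f"{category}: {n} of {total} ({n / total:.0%})")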
An example of the type of question, and the categories of student conceptions which it uncovered, is given below (Dahlgren, 1984).
Tape Recording
Making tape recordings is a way of collecting a complete, accurate and detailed record of discussions in class, conversations in interviews, or arguments and decisions at meetings. It is easy to obtain the recording; you simply take along cassettes and a portable recorder, and switch it on. However, the presence of a tape recorder can inhibit discussion or influence people's behaviour.
There are a number of ethical issues which need to be addressed over the use of tape recordings. The group being taped should establish the purpose of making the recording and the way in which the tapes will be used. If any quotations are made in subsequent reports, it is customary to maintain the anonymity of the source.
If you need to do a detailed analysis of the conversations then it will be necessary to produce a transcript. This is a time-consuming and painstaking process, so limit the use of tape recordings to situations where it is really necessary.
Triangulation
Triangulation is not a specific observation technique, but the process of comparing and corroborating data from one source against another. If you do just a handful of interviews, your conclusions may be viewed with skepticism. But if the interview results concur with findings from a questionnaire, trends in examination results and evidence from your journal, then the conclusions are much more convincing. The message is simple: use more than one observation technique in order to see whether your results are consistent.
Peer Appraisal and Self-Report Techniques
Peer appraisal: definition
Peer appraisals are employee assessments conducted by colleagues in the immediate working environment, i.e. the people the employee interacts with regularly; the process excludes superiors and subordinates. Peer appraisals are a form of performance appraisal designed to monitor and improve job performance.
Peer appraisals can be broken down into specific measures. Peer ranking involves workers ranking each member of the group from best to worst, either overall or on various areas of performance or responsibility. In peer ratings, workers rate colleagues on performance metrics, while peer nomination is a simple nomination of the 'best' worker, either overall or on particular performance metrics.
Commonly cited advantages of the peer appraisal process include insight and knowledge: workers have their 'ear to the ground' and are often in the best position to appraise a colleague's performance. Peer appraisal also encourages a more inclusive team dynamic, as colleagues gain a deeper insight into the challenges their colleagues face, and encourages the development of a shared goal as workers realise they must impress their colleagues and respond to their ideas, concerns and needs.
Self-report techniques are methods of gathering data where participants provide information about themselves without interference from the experimenter. Such techniques can include questionnaires, interviews, or even diaries, and ultimately require giving responses to pre-set questions.
Evaluation of self-report methods
Strengths:
- Participants can be asked about their feelings and cognitions (i.e. thoughts), which can be more useful than simply observing behaviour alone.
- Scenarios can be asked about hypothetically, without having to physically set them up and observe participants' behaviour.
Weaknesses:
- Gathering information about thoughts or feelings is only useful if participants are willing to disclose them to the experimenter.
- Participants may try to give the 'correct' responses they think researchers are looking for (or deliberately do the opposite), or try to come across in the most socially acceptable way (i.e. social desirability bias), which can lead to untruthful responses.
A self-report study is a type of survey, questionnaire, or poll in which respondents read the questions and select responses by themselves, without researcher interference. A self-report is any method which involves asking a participant about their feelings, attitudes, beliefs and so on. Examples of self-reports are questionnaires and interviews; self-reports are often used as a way of gaining participants' responses in observational studies and experiments.
Self-report studies have validity problems. Patients may exaggerate symptoms in order to make their situation seem worse, or they may under-report the severity or frequency of symptoms in order to minimize their problems. Patients might also simply be mistaken or misremember the material covered by the survey.
Questionnaires and interviews
Questionnaires are a type of self-report method which consist of a set of questions, usually in a highly structured written form. Questionnaires can contain both open questions and closed questions, and participants record their own answers. Interviews are a type of spoken questionnaire where the interviewer records the responses. Interviews can be structured, whereby there is a predetermined set of questions, or unstructured, whereby no questions are decided in advance.
The main strength of self-report methods is that they allow participants to describe their own experiences rather than having these inferred from observation. Questionnaires and interviews are often able to study large samples of people fairly easily and quickly. They can examine a large number of variables and can ask people to reveal behaviour and feelings which have been experienced in real situations.
However, participants may not respond truthfully, either because they cannot remember or because they wish to present themselves in a socially acceptable manner. Social desirability bias can be a big problem with self-report measures, as participants often answer in a way that portrays themselves in a good light. Questions are not always clear, and if the respondent has not really understood the question then the data collected will not be valid. If questionnaires are sent out, say via email or through tutor groups, the response rate can be very low. Questions can also be leading; that is, they may unwittingly push the respondent towards a particular reply. Unstructured interviews can be very time-consuming and difficult to carry out, whereas structured interviews can restrict the respondents' replies. Therefore psychologists often carry out semi-structured interviews, which consist of some pre-determined questions followed up with further questions that allow the respondent to develop their answers.
Open and closed questions
Questionnaires and interviews can use open or closed questions, or both. Closed questions are questions which provide a limited choice (for example, a participant's age or their favourite football team), especially if the answer must be taken from a predetermined list. Such questions provide quantitative data, which are easy to analyse. However, these questions do not allow the participant to give in-depth insights.
Open questions are those which invite the respondent to provide answers in their own words, and they produce qualitative data. Although these types of questions are more difficult to analyse, they can produce more in-depth responses and tell the researcher what the participant actually thinks, rather than being restricted by categories.
Rating scales
One of the most common rating scales is the Likert scale. A statement is presented and the participant decides how strongly they agree or disagree with it. For example, the participant decides whether "Mozzarella cheese is great" with the options "strongly agree", "agree", "undecided", "disagree", and "strongly disagree". One strength of Likert scales is that they can give an idea of how strongly a participant feels about something, and therefore give more detail than a simple yes/no answer. Another strength is that the data are quantitative and easy to analyse statistically. However, there is a tendency with Likert scales for people to respond towards the middle of the scale, perhaps to make themselves look less extreme. As with any questionnaire, participants may provide the answers that they feel they should. Moreover, because the data are quantitative, they do not provide in-depth replies.
Fixed-choice questions
Fixed-choice questions are phrased so that the respondent has to make a fixed-choice answer, usually 'yes' or 'no'. This type of questionnaire is easy to measure and quantify. It also prevents a participant from choosing an option that is not in the list. However, respondents may not feel that their desired response is available. For example, a person who dislikes all alcoholic beverages may feel that it is inaccurate to choose a favourite alcoholic beverage from a list that includes beer, wine, and liquor, but does not include 'none of the above' as an option. Answers to fixed-choice questions are also not in-depth.
Reliability
Reliability refers to how consistent a measuring device is. A measurement is said to be reliable or consistent if it produces similar results when used again in similar circumstances. For example, if a speedometer gave the same readings at the same speed it would be reliable; if it didn't, it would be pretty useless and unreliable. Importantly, the reliability of self-report measures, such as psychometric tests and questionnaires, can be assessed using the split-half method. This involves splitting a test into two and having the same participants complete both halves. If the two halves of the test produce similar results, this suggests that the test has internal reliability. There are a number of ways to improve the reliability of self-report techniques: for example, ambiguous questions could be clarified or, in the case of interviews, the interviewers could be given training.
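The split-half method lends itself to a short worked example. The Python sketch below splits a ten-item questionnaire into odd- and even-numbered items and correlates the two half-scores; the data are invented, and the Spearman-Brown adjustment shown at the end is a common (though not universal) extra step for estimating the reliability of the full-length test.

    from statistics import correlation  # available from Python 3.10

    # Invented data: each row is one participant's item scores (e.g. Likert
    # ratings coded 1-5) on a ten-item questionnaire.
    scores = [
        [4, 5, 4, 4, 5, 3, 4, 4, 5, 4],
        [2, 1, 2, 3, 2, 2, 1, 2, 2, 3],
        [3, 3, 4, 3, 3, 4, 3, 3, 4, 3],
        [5, 4, 5, 5, 4, 5, 5, 4, 5, 5],
        [1, 2, 1, 2, 2, 1, 2, 1, 1, 2],
        [4, 4, 3, 4, 4, 3, 4, 4, 3, 4],
    ]

    # Split each test into two halves (odd- and even-numbered items) and total each half.
    odd_totals = [sum(row[0::2]) for row in scores]
    even_totals = [sum(row[1::2]) for row in scores]

    r_half = correlation(odd_totals, even_totals)
    # Spearman-Brown correction: estimates the reliability of the full-length test.
    r_full = (2 * r_half) / (1 + r_half)

    print(f"Half-test correlation: {r_half:.2f}")
    print(f"Estimated split-half reliability: {r_full:.2f}")

The same correlation idea underlies concurrent validity, discussed next: there the scores being correlated come from two different self-report measures of the same topic rather than from two halves of one test.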
Validity
Validity refers to whether a study measures or examines what it claims to measure or examine. Questionnaires are often said to lack validity for a number of reasons: participants may lie, give the answers they think are desired, and so on. A way of assessing the validity of self-report measures is to compare the results of the self-report with another self-report on the same topic (this is called concurrent validity). For example, if an interview is used to investigate sixth-grade students' attitudes toward smoking, the scores could be compared with a questionnaire of former sixth graders' attitudes toward smoking. There are a number of ways to improve the validity of self-report techniques. For example, leading questions could be avoided, open questions could be added to allow respondents to expand upon their replies, and confidentiality could be reinforced to allow respondents to give more truthful responses.
Disadvantages
Self-report studies have many advantages, but they also suffer from specific disadvantages due to the way that subjects generally behave. Self-reported answers may be exaggerated; respondents may be too embarrassed to reveal private details; and various biases, such as social desirability bias, may affect the results. Subjects may also forget pertinent details. Self-report studies are inherently biased by the person's feelings at the time they filled out the questionnaire. If a person feels bad at the time, their answers will be more negative; if the person feels good, the answers will be more positive. As with all studies relying on voluntary participation, results can be biased by a lack of respondents if there are systematic differences between people who respond and people who do not. Care must also be taken to avoid biases due to interviewers and their demand characteristics.
Types of performance-based assessment
Performance-based learning is when students participate in performing tasks or activities that are meaningful and engaging. The purpose of this kind of learning is to help students acquire and apply knowledge, practice skills, and develop independent and collaborative work habits. The culminating activity or product for performance-based learning is one that lets a student demonstrate evidence of understanding through a transfer of skills.
This form of learning is measured through a performance-based assessment, which is open-ended and without a single, correct answer. The performance-based assessment should be something that shows authentic learning, such as the creation of a newspaper or a class debate. The benefit of these types of performance-based assessments is that when students are more actively involved in the learning process, they absorb and understand the material at a much deeper level. Other characteristics of performance-based assessments are that they are complex and time-bound. In addition, there are learning standards in each discipline that set academic expectations and define what is proficient in meeting that standard.
Performance-based activities can integrate two or more subjects and should also meet 21st-century expectations whenever possible:
• Creativity and Innovation
• Critical Thinking and Problem Solving
• Communication and Collaboration
There are also Information Literacy standards and Media Literacy standards that can be incorporated into performance-based learning.
Performance-based activities can be quite challenging for students to complete. They need to understand from the beginning exactly what is being asked of them and how they will be assessed. Exemplars and models may help, but it is more important to provide detailed criteria that will be used to assess the performance-based assessment. Those criteria should be incorporated into a scoring rubric.
Observations are an important part of evaluating performance-based assessments. Observations can be used to provide students with feedback to improve performance, and both teachers and students can use them. There may be peer-to-peer student feedback, and a checklist or a tally can be used to record performance.
Students can take their experiences in performance-based learning and use them at later points in their educational, personal, or professional lives. The goal of performance-based learning should be to enhance what students have learned, not just have them recall facts. Following are six different types of activities that can be developed as assessments for performance-based learning.
Presentations
One easy way to have students complete a performance-based activity is to have them do a presentation or report of some kind. This could be done by individual students, which takes time, or in collaborative groups. The basis for the presentation may be one of the following:
• Providing Information
• Teaching a Skill
• Reporting Progress
• Persuading Others
Students may choose to add visual aids, a PowerPoint presentation or Google Slides to help illustrate elements in their speech. Presentations work well across the curriculum as long as there is a clear set of expectations for students to work with from the beginning.
Portfolios
Student portfolios can include items that students have created and/or collected over a specific period of time. Art portfolios are often used for students who want to apply to art programs in college. Another example is when students create a portfolio of their written work that shows how they have progressed from the beginning to the end of a class. The writing in a portfolio can be from any discipline or from a combination of disciplines.
Some teachers have students select the items they feel represent their best work to be included in a portfolio. The benefit of an activity like this is that it grows over time and is therefore not just completed and forgotten. A portfolio can provide students with a lasting selection of artifacts that they can use later in their academic career. Reflections may be included in student portfolios, in which students make note of their growth based on the materials in the portfolio. Portfolios may also include taped presentations, dramatic readings, or digital files.
Performances
Dramatic performances are one kind of collaborative activity that can be used as a performance-based assessment. Students can create, perform, and/or provide a critical response. Examples include dance, recitals and dramatic enactments; there may also be prose or poetry interpretation. This form of performance-based assessment can take time, so there must be a clear pacing guide. Students must be given time to address the demands of the activity, resources must be readily available and meet all safety standards, and students should have opportunities to draft stage work and practice. Developing the criteria and the rubric, and sharing these with students beforehand, is critical before assessing a dramatic performance.
Projects
Projects are quite commonly used by teachers as performance-based activities. They can include everything from research papers to artistic representations of information learned. Projects may require students to apply their knowledge and skills while completing the assigned task, using creativity, critical thinking, analysis, and synthesis. Students might be asked to complete reports, diagrams, and maps. Teachers can also choose to have students work individually or in groups.
Journals may be part of a performance-based assessment. Journals can be used to record student reflections, teachers may require students to complete journal entries, and some teachers may use journals as a way to record participation.
Exhibits and Fairs
Teachers can expand the idea of performance-based activities by creating exhibits or fairs for students to display their work. Examples range from history fairs to art exhibitions. Students work on a product or item that will be publicly exhibited.
Exhibitions show in-depth learning and may include feedback from viewers. In some cases, students might be required to explain or 'defend' their work to those attending the exhibition. Some fairs, like science fairs, could include the possibility of prizes and awards.
Debates
A debate in the classroom is one form of performance-based learning that teaches students about varied viewpoints and opinions. Skills associated with debate include research, media and argument literacy, reading comprehension, evidence evaluation, public speaking, and civic skills.
There are many different formats for debate. One is the fishbowl debate, in which a handful of students sit in a half-circle facing the other students and debate a topic; the rest of the class may pose questions to the panel. Another form is a mock trial, where teams representing the prosecution and defense take on the roles of attorneys and witnesses. A judge, or judging panel, oversees the courtroom presentation. Middle schools and high schools can use debates in the classroom, with increased levels of sophistication by grade level.
Student Logs
Documenting student participation in physical activity (NASPE Standard 3) is often difficult. Teachers can use logs to assess participation in an activity or skill practice trials completed outside of class. Practice trials during class that demonstrate student effort can also be documented with logs. A log records behaviors over a period of time (see figure 14.1). Often the information recorded shows changes in behavior, trends in performance, results of participation, progress, or the regularity of physical activity. A student log is an excellent artifact for use in a portfolio. Because logs are usually self-recorded documents, they are not used for summative assessments except as an artifact in a portfolio or for a project. If teachers wanted to increase the importance placed on a log, a method of verification by an adult or someone in authority should be added.
Journals
Journals can be used to record student feelings, thoughts, perceptions, or reflections about actual events or results. The entries in journals often report social or psychological perspectives, both positive and negative, and may be used to document the personal meaning associated with one's participation (NASPE Standard 6). Journal entries would not be an appropriate summative assessment by themselves, but might be included as an artifact in a portfolio.
Journal entries are excellent ways for teachers to "take the pulse" of a class and determine whether students are valuing the content of the class. Teachers must be careful not to assess affective-domain journal entries for their actual content, because doing so may cause students to write what teachers want to hear (or will give credit for) instead of true and genuine feelings. Teachers could hold students accountable for completing journal entries, and some teachers use journals as a way to log participation over time.
Using Observation in the Assessment Process
Human performance provides many opportunities for students to exhibit behaviors that may be directly observed by others, a unique advantage of working in the psychomotor domain. Wiggins (1998) uses physical activity when providing examples to illustrate complex assessment concepts, as they are easier to visualize than would be the case with a cognitive example. The nature of performing a motor skill makes assessment through observational analysis a logical choice for many physical education teachers. In fact, investigations of the measurement practices of physical educators have consistently shown a reliance on observation and related assessment methods (Hensley and East 1989; Matanin and Tannehill 1994; Mintah 2003).
Observation is a skill used with several performance-based assessments. It is often used to provide students with feedback to improve performance. However, without some way to record results, observation alone is not an assessment. Going back to the definition of assessment provided earlier in the chapter, assessment is the gathering of information, analyzing the data, and then using the information to make an evaluation. Therefore, some type of written product must be produced if the task is to be considered an assessment.
Teachers and peers can assess others using observation. They might use a checklist or some type of event-recording scheme to tally the number of times a behavior occurred. Keeping game-play statistics is an example of recording data using event-recording techniques. Students can self-analyze their own performance and record their performances using criteria provided on a checklist or a game-play rubric. Table 14.1 is an example of a recording form that could be used for peer assessment.
When using peer assessment, it is best to have the assessor do only the assessment. When the person recording assessment results is also expected to take part in the assessment (e.g., tossing the ball to the person being assessed), he or she cannot both toss and do an accurate observation. In the case of large classes, teachers might even use groups of four, in which one person is being evaluated, a second person is feeding the ball, a third person is doing the observation, and a fourth person is recording the results.
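A minimal Python sketch of the event-recording idea: an observer (teacher or peer) tallies each practice trial against a few checklist criteria. The criteria and trial data below are invented for illustration; they are not the recording form shown in table 14.1.

    from collections import defaultdict

    # Invented example of event recording during a peer observation: each
    # practice trial is checked against simple criteria from a checklist.
    criteria = ["moves to the ball", "uses correct form", "follows through to target"]

    tallies = defaultdict(lambda: {"met": 0, "not met": 0})

    observed_trials = [
        {"moves to the ball": True,  "uses correct form": True,  "follows through to target": False},
        {"moves to the ball": True,  "uses correct form": False, "follows through to target": False},
        {"moves to the ball": False, "uses correct form": True,  "follows through to target": True},
        {"moves to the ball": True,  "uses correct form": True,  "follows through to target": True},
    ]

    for trial in observed_trials:
        for criterion in criteria:
            key = "met" if trial[criterion] else "not met"
            tallies[criterion][key] += 1

    for criterion in criteria:
        met, missed = tallies[criterion]["met"], tallies[criterion]["not met"]
        print(f"{criterion}: {met} of {met + missed} trials")

Because the tally produces a written record, the observation meets the definition of assessment given above rather than remaining informal feedback.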
Guidelines for developing effective performance assessment
The performance development system at Wellesley College is designed to provide alignment between the College's mission, constituent needs and performance expectations. The program fosters ongoing two-way communication between employees and managers; supports the development of clear, consistent, and measurable goals linked directly to Wellesley's core values and competencies; helps to articulate and support training needs and career development; and establishes the criteria for making reward and recognition decisions.
Effective performance development at Wellesley College begins with respect for one another and ends with excellence in performance. It is the responsibility of every supervisor to communicate on an ongoing basis with their employees. These conversations should provide clear and honest role expectations and feedback and should help identify improvement, development, and career issues. Each employee has a responsibility to participate fully in these conversations, be sure they understand their role responsibilities and expectations, and communicate any obstacles or training needed in order to perform their role at an optimum level.
The Performance Development Annual Summary Meeting
Performance development should be happening all year long. When a manager compliments an employee for a job well done or coaches an employee through a difficult situation, that is part of performance development. Wellesley's performance development process includes a summary review assessment that should bring closure to the performance period and provide a basis for performance development in the next period. The following suggestions help set the stage for a productive discussion.
1. Establish the proper climate. Create a sincere, open, and constructive atmosphere. Schedule the meeting in advance and stick to it. Allow enough time to discuss the review. Locate a private space and guard against interruptions.
2. Make it clear that this is a joint discussion. Listen and ask for the employee's opinion. Avoid words or body language that criticize the employee's view. Understand your employee's point of view; working together is better than being at odds. Be willing to modify the Performance Development Document to reflect what is discussed and agreed upon at the meeting.
3. Discuss the role document and performance requirements. Explore the competencies required for successful performance. Update the role document if needed.
4. Discuss goals for the performance review period. Review whether the goals were met. Discuss obstacles and roadblocks that affected goal achievement.
5. Discuss opportunities for growth and development in the current role or a different role. Discuss the employee's developmental and career goals. Remember there is also opportunity for growth and development within the current role: there are new things to be done and more effective and efficient ways to accomplish work.
Either at this meeting or at a separate meeting, develop goals for the coming year. Refer to Guidelines for Setting Goals and Objectives for additional information on setting goals. Remember, performance development is about ongoing two-way communication between the employee and their supervisor. The annual performance appraisal should be a summary of various meetings throughout the year (interim goal reviews and updates). There should be no surprises at this summary meeting.
Preparing for Annual Performance Development Discussions
Tips for the Employee:
Employees have a responsibility in the performance development process and should be prepared to give feedback to their manager. Review your current role document. Does it reflect your current role in the department? If not, talk with your supervisor about revising it. Review your goals for the year. Have they been met? Review your achievements. Think about obstacles and roadblocks you encountered and how you dealt with them. Is there anyone else your supervisor should speak with before preparing your evaluation? Let your supervisor know this before the review meeting. Review the competencies required for administrative staff positions at Wellesley. Identify specific areas of expertise or skills that you would like to develop or improve. Identify your strengths. In what areas have you improved? Can you identify any developmental goals for the coming year? What ideas do you have for changes that would help you perform your role better and/or improve the operation of the department? Think about obstacles and roadblocks that you face in performing your responsibilities and what help is needed from your supervisor to overcome them. If you manage others, what have you done to develop and strengthen your staff's performance and skills?
Tips for the Supervisor:
The supervisor is responsible for ongoing communication about performance throughout the year. Performance problems should be addressed as they occur; there should be no surprises in the end-of-year summary. The supervisor is responsible for preparing the summary documentation. Review the employee's role document. Does it reflect their current role in the department? Review the primary position responsibilities. Has the employee effectively performed these? What is your overall assessment of how these responsibilities were performed? Review the employee's goals from last year. Were goals modified or changed during the review period? Have the goals been met? Have you been able to provide the employee with the tools and support to get the job done? Review last year's appraisal. How does this year compare to last year? Have there been improvements? Consider whether you need to speak with anyone else in order to have a more complete and accurate picture of your employee's performance.
Review the competencies required for administrative staff roles at Wellesley. Assess the employee's strengths, weaknesses and areas of greatest improvement. Is there a specific area where you would like to establish a developmental goal? What suggestions do you have for the employee that will help improve their performance in their role or the overall operations of the department? If the employee supervises others, discuss what he or she has done to strengthen their own staff. Ask about regular communication of information, job expectations, and feedback. Contact the Human Resources Office for assistance if substantial performance issues exist.
Finalizing the Performance Development Document
The supervisor is responsible for completing the final draft of the Performance Development Document and forwarding the completed document to Human Resources to become part of the employee's personnel file. Send a hard copy so that signatures are included. The supervisor should provide a copy of the final Performance Development Document to the employee, and the employee should sign it. Signing the Performance Development Document indicates that the employee has met with their supervisor to provide input to the document, that they have reviewed the document, and that they have met with the supervisor to discuss it. The employee has the right to respond to the evaluation in writing.
Tips on Ongoing, Effective Feedback
Feedback involves treating each other with respect. Constructive feedback tries to reinforce the positive and change the negative by identifying what was done well or poorly, describing what action or behavior is desired, and explaining the effects of the observed and desired behavior.
Good feedback is timely. Give the feedback as quickly as possible after the event; feedback long delayed is rarely effective. Feedback involves both parties listening carefully. Check for clarity to ensure that the receiver fully understands what is being said.
Good feedback should be specific. Generalized feedback does not explain what behavior to repeat or avoid. Describe exactly what was done well and/or what could be improved. For example, "This report is well organized and the summary clearly states your conclusions and proposed actions" rather than "Good report."
Keep feedback objective. Use factual records and information whenever possible. Include details that focus on specific actions and results rather than characteristics of the employee. For example, say "this happened" rather than "you are": "You hung up the phone without saying good-bye" rather than "you are rude."
Feedback about performance issues is best delivered in person, so that the employee has a chance to respond to any issues raised. Especially avoid delivering negative feedback via e-mail messages.
Performance criteria
The National Center for Research, Evaluation, Standards, and Student Testing (1996) defines criteria as "guidelines, rules, characteristics, or dimensions that are used to judge the quality of student performance. Criteria indicate what we value in student responses, products, or performances." With performance assessments such as a lab, group project, portfolio, task, or presentation, students need to clearly know and understand what performance criteria will be used to judge their performance.
Although student interpretations are important, educators need to recognize that, on the basis of cultural and environmental norms, explanations that seem diametrically opposed may be equally defensible or right. Because this quality of complexity allows performance assessments to mirror real life, educators need to explicitly include the exact parameters of the responses they want to elicit in each assessment task or problem. (For example, educators should make sure students know whether the writing process, rather than punctuation and grammar, is the criterion on which performance will be judged, or whether a paragraph, as opposed to a few words, is the expected response.) The problem of interpretation differences that results when performance criteria (requirements) are ambiguous is compounded when students have diverse experiences based on their ethnicity, primary language, or gender. In an effort to assess higher-order cognitive skills and complex problem solving, educators need to develop appropriate learning assessments that have no single right answer and in which students' interpretation of information or evidence is key in defending their solution.
Scoring Rubrics
Scoring rubrics are descriptive scoring schemes developed to assess any student performance, whether it is written or oral, online or face-to-face. Scoring rubrics are especially well suited to evaluating complex tasks or assignments such as written work (e.g., assignments, essay tests, papers, portfolios); presentations (e.g., debates, role plays); group work; or other types of work products or performances (e.g., artistic works, portfolios). Scoring rubrics are assignment-specific; criteria are different for each assignment or test. A rubric is a way to make your criteria and standards clear to both you and your students.
Good scoring rubrics:
• Consist of a checklist of items, each with an even number of points. For example, a two-point rubric would indicate that the student either did or did not perform the specified task. Four or more points in a rubric are common and indicate the degree to which a student performed a given task.
• Are criterion-based. That is, the rubric contains descriptive criteria for acceptable performance that are meaningful, clear, concise, unambiguous, and credible, thus ensuring inter-rater reliability.
• Are used to assess only those behaviors that are directly observable.
• Require a single score based on the overall quality of the work or presentation.
• Provide a better assessment and understanding of expected or actual performance.
Sample Rubric for Quizzes and Homework
Why Develop Scoring Rubrics?
Here are some reasons why taking the time to construct a grading rubric will be worth your time:
• Make grading more consistent and fair.
• Save you time in the grading process.
• Help identify students' strengths and weaknesses so you can teach more effectively.
• Help students understand what and how they need to improve.
Guidelines for Developing a Scoring Rubric
Step 1: Select a project or assignment for assessment. Example: Work in small groups to write and present a collaborative research paper.
Step 2: Ask what performance skill(s) or competency(ies) students are demonstrating through their work on this project. Example: Ability to work as part of a team.
Step 3: List the traits you'll assess when evaluating the project--in other words, ask: "What counts in my assessment of this work?" Use nouns or noun phrases to name traits, and avoid evaluative language. Limit the number of traits to no more than seven. Each trait should represent a key teachable attribute of the overall skill you're assessing. Example: Content; Coherence and Organization; Creativity; Graphics and Visuals; Delivery.
Step 4: Decide on the number of gradations of mastery you'll establish for each trait and the language you'll use to describe those levels.
Five points of gradation:
5 = Proficient, 4 = Clearly Competent, 3 = Acceptable, 2 = Limited, 1 = Attempted.
Four points of gradation: Exceptional/Excellent, Admirable/Good, Acceptable/Fair, Amateur/Poor.
Step 5: For each trait, write statements that describe work at each level of mastery. If, for example, you have seven traits and five gradations, you'll have 35 descriptive statements in your rubric. Attempt to strike a balance between over-generalization and task-specificity. For the trait "coherence and organization" in a four-point rubric:
Exceptional: Thesis is clearly stated and developed; specific examples are appropriate and clearly develop the thesis; conclusion is clear; ideas flow together well; good transitions; succinct but not choppy; well organized.
Admirable: Most information presented in logical sequence; generally very organized, but better transitions between ideas are needed.
Acceptable: Concepts and ideas are loosely connected; lacks clear transitions; flow and organization are choppy.
Amateur: Presentation of ideas is choppy and disjointed; doesn't flow; development of thesis is vague; no apparent logical order to the writing.
Step 6: Design a format for presenting the rubric to students and for scoring student work.
Step 7: Test the rubric and fine-tune it based on feedback from colleagues and students.
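One way to hold the pieces from Steps 3 to 5 together is as a simple data structure: traits, level labels, and a recorded level for each trait. The Python sketch below uses the example traits and the four-point labels from above; the student ratings and the summing of trait scores into a total are illustrative choices, not part of the guidelines themselves.

    # Sketch of a four-point rubric built from the traits listed in Step 3.
    # Level labels follow the four-point example; the ratings are invented.
    LEVELS = {4: "Exceptional", 3: "Admirable", 2: "Acceptable", 1: "Amateur"}

    TRAITS = [
        "Content",
        "Coherence and Organization",
        "Creativity",
        "Graphics and Visuals",
        "Delivery",
    ]

    def score_report(ratings):
        """ratings: dict mapping trait -> level number (1-4)."""
        total = sum(ratings[t] for t in TRAITS)
        maximum = max(LEVELS) * len(TRAITS)
        lines = [f"{t}: {ratings[t]} ({LEVELS[ratings[t]]})" for t in TRAITS]
        lines.append(f"Total: {total}/{maximum}")
        return "\n".join(lines)

    student_ratings = {
        "Content": 4,
        "Coherence and Organization": 3,
        "Creativity": 3,
        "Graphics and Visuals": 2,
        "Delivery": 3,
    }
    print(score_report(student_ratings))

In practice each trait-and-level cell would also carry the descriptive statement written in Step 5, so that both teacher and student can see why a particular level was awarded.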
Checklists and Rating Scales
What is a checklist?
A checklist is just what it sounds like: a list that educators check off. Using this method is a little bit like going bird watching: start with a list of items you want to observe and then check off each item when appropriate. One popular choice for educators is to use developmental checklists to record what they have observed about individual children; these developmental checklists consist of lists of skills from the different developmental domains for a specific age range.
Why use checklists?
Checklists are quick and easy to use, so they are popular with educators. They can be used to record observations in virtually any situation, and do not require the educator to spend much time recording data; in general, a few moments is all it takes. Another advantage is that many different pre-made checklists are available from a variety of sources. For example, certain websites connected with early childhood education (ECE) offer developmental checklists that educators can download and print out. Educators can also create a checklist that exactly meets their needs, depending on what they want to observe and record.
How do I use a checklist?
As it is such a popular choice for educators, the example presented here shows how to use a developmental checklist. Developmental checklists are generally used to record observations of one child at a time. The list of skills is targeted at a specific age group (e.g. 12 to 24 months). They may be divided into the different developmental domains or focus on only one aspect of a child's development. Once you have chosen or created a checklist, you then observe the child in a variety of natural contexts and check off all the relevant skills or behaviours. Usually there is a space to indicate the relevant date(s) on the checklist, as this might be an important piece of data. As the checklist method does not allow for the recording of much qualitative data, you might choose to have a column for comments.
Sample checklist for language development: two-year-olds
A blank checklist could look something like this:

Child's Name: Alan
Behaviour/Skill                            Date              Comments
Communicates with gestures and pointing
Shakes head for no
Uses one-word sentences
Uses two-word sentences
Names familiar objects
Follows simple instructions
Enjoys songs and rhymes
Refers to self as "me" or "I"
Once you begin filling in the checklist, it will start to look something like this:

Child's Name: Alan
Behaviour/Skill                            Date              Comments
Communicates with gestures and pointing    March 9, 2012
Shakes head for no                         March 9, 2012
Uses one-word sentences                    March 10, 2012
Uses two-word sentences                    March 29, 2012    "My book"
Names familiar objects
Follows simple instructions                April 15, 2012
Enjoys songs and rhymes                    March 5, 2012     Loves Hokey Pokey
Refers to self as "me" or "I"              March 20, 2012    Taps self on chest, says "Ayan"

Note that, in general, behaviours and/or skills that you have not yet observed, or that the child has not yet mastered, are left blank, so that you can update the checklist as needed. In some cases, you may want to add a comment like the one in the last row of the sample above. In this example, Alan's strategy for referring to himself is significant, even if he is not yet demonstrating the specific behaviour from the checklist.
Using a rating scale
Sometimes educators feel limited by a checklist because this method only allows the observer to record whether a child uses a specific skill or not. In this case, they might choose to add a rating scale to their observations. By adding a rating scale, an educator can rate the quality, frequency or ease with which a child uses a certain skill.
If you were to add a rating scale to your checklist, it might look like this:

Child's Name: Alan    Date: March/April 2012
Behaviour/Skill                            Usually  Frequently  Rarely  Never  Comments
Communicates with gestures and pointing
Shakes head for no
Uses one-word sentences
Uses two-word sentences
Names familiar objects
Follows simple instructions
Enjoys songs and rhymes
Refers to self as "me" or "I"

Once you begin filling it in, it could look something like this:

Child's Name: Alan    Date: March/April 2012
Behaviour/Skill                            Usually  Frequently  Rarely  Never  Comments
Communicates with gestures and pointing
Shakes head for no
Uses one-word sentences
Uses two-word sentences                                                        "My book"
Names familiar objects
Follows simple instructions
Enjoys songs and rhymes
Refers to self as "me" or "I"                                                  Taps self on chest, says "Ayan"
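For educators who keep such records electronically, a developmental checklist with an optional rating column can be represented very simply. The Python sketch below reuses the behaviours, dates and comments from the sample checklist for Alan; the single rating shown is invented for illustration, since the sample above does not indicate which rating column was ticked.

    from dataclasses import dataclass
    from typing import Optional

    # A checklist entry: date and comments stay blank until the behaviour is observed;
    # the rating field is only used if a rating scale has been added.
    @dataclass
    class ChecklistItem:
        behaviour: str
        date: Optional[str] = None
        rating: Optional[str] = None   # e.g. "usually", "frequently", "rarely", "never"
        comments: str = ""

    checklist = [
        ChecklistItem("Communicates with gestures and pointing", "March 9, 2012"),
        ChecklistItem("Shakes head for no", "March 9, 2012", rating="usually"),  # rating invented
        ChecklistItem("Uses one-word sentences", "March 10, 2012"),
        ChecklistItem("Uses two-word sentences", "March 29, 2012", comments='"My book"'),
        ChecklistItem("Names familiar objects"),  # not yet observed, left blank
        ChecklistItem("Follows simple instructions", "April 15, 2012"),
        ChecklistItem("Enjoys songs and rhymes", "March 5, 2012", comments="Loves Hokey Pokey"),
        ChecklistItem("Refers to self as 'me' or 'I'", "March 20, 2012",
                      comments="Taps self on chest, says 'Ayan'"),
    ]

    for item in checklist:
        parts = [item.date or "not yet observed"]
        if item.rating:
            parts.append(f"rating: {item.rating}")
        if item.comments:
            parts.append(item.comments)
        print(f"{item.behaviour}: " + "; ".join(parts))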
Purpose of portfolios
A student portfolio is a compilation of academic work and other forms of educational evidence assembled for the purpose of (1) evaluating coursework quality, learning progress, and academic achievement; (2) determining whether students have met learning standards or other academic requirements for courses, grade-level promotion, and graduation; (3) helping students reflect on their academic goals and progress as learners; and (4) creating a lasting archive of academic work products, accomplishments, and other documentation. Advocates of student portfolios argue that compiling, reviewing, and evaluating student work over time can provide a richer, deeper, and more accurate picture of what students have learned and are able to do than more traditional measures, such as standardized tests, quizzes, or final exams, which only measure what students know at a specific point in time.
Portfolios come in many forms, from notebooks filled with documents, notes, and graphics to online digital archives and student-created websites, and they may be used at the elementary, middle, and high school levels. Portfolios can be a physical collection of student work that includes materials such as written assignments, journal entries, completed tests, artwork, lab reports, and physical projects (such as dioramas or models), along with other material evidence of learning progress and academic accomplishment, including awards, honors, certifications, recommendations, written evaluations by teachers or peers, and self-reflections written by students. Portfolios may also be digital archives, presentations, blogs, or websites that feature the same materials as physical portfolios, but that may also include content such as student-created videos, multimedia presentations, spreadsheets, websites, photographs, or other digital artifacts of learning. Online portfolios are often called digital portfolios or e-portfolios, among other terms. In some cases, blogs or online journals may be maintained by students and include ongoing reflections about learning activities, progress, and accomplishments. Portfolios may also be presented, publicly or privately, to parents, teachers, and community members as part of a demonstration of learning, exhibition, or capstone project.
It's important to note that there are many different types of portfolios in education, and each form has its own purpose. For example, "capstone" portfolios feature student work completed as part of long-term projects or final assessments, typically undertaken at the culmination of middle school or high school, or at the end of a long-term, possibly multiyear project. Some portfolios are only intended to evaluate learning progress and achievement in a specific course, while others are maintained for the entire time a student is enrolled in a school. And some portfolios are used to assess learning in a specific subject area, while others evaluate the acquisition of skills that students can apply in all subject areas.
The following arguments are often made by educators who advocate for the use of portfolios in the classroom:
• Student portfolios are most effective when they are used to evaluate student learning progress and achievement. When portfolios are used to document and evaluate the knowledge, skills, and work habits students acquire in school, teachers can use them to adapt instructional strategies when evidence shows that students either are or are not learning what they were taught. Advocates typically contend that portfolios should be integrated into and inform the instructional process, and that students should build out their portfolios incrementally on an ongoing basis; that is, portfolios should not merely be an idle archive of work products that is only reviewed at the end of a course or school year.
• Portfolios can help teachers monitor and evaluate learning progress over time. Tests and quizzes give teachers information about what students know at a particular point in time, but portfolios can document how students have grown, matured, and improved as learners over the course of a project, school year, or multiple years. For this reason, some educators argue that portfolios should not just be compilations of a student's best work, but should also include evidence and work products that demonstrate how students improved over time. For example, multiple versions of an essay can show how students revised and improved their work based on feedback from teachers or peers.
• Portfolios help teachers determine whether students can apply what they have learned to new problems and different subject areas. A test can help teachers determine, for example, whether students have learned a specific mathematical skill. But can those students also apply that skill to a complex problem in economics, geography, civics, or history? Can they use it to conduct a statistical analysis of a large data set in a spreadsheet? Or can they use it to develop a better plan for a hypothetical business? (Educators may call this ability to apply skills and knowledge to novel problems and different domains "transfer of learning".) Similarly, portfolios can be used to evaluate student work and learning in non-school contexts. For example, if a student participated in an internship or completed a project under the guidance of an expert mentor from the community, the student could create a portfolio over the course of these learning activities and submit it to their teachers or school as evidence that they have met certain learning expectations or graduation requirements.
• Portfolios can encourage students to take more ownership of and responsibility for the learning process. In some schools, portfolios are a way for students to critique and evaluate their own work and academic progress, often during the process of deciding what will be included in their portfolios. Because portfolios document learning growth over time, they can help students reflect on where they started a course, how they developed, and where they ended up at the conclusion of the school year. When reviewing a portfolio, teachers may also ask students to articulate the connection between particular work products and the academic expectations and goals for a course. For these reasons, advocates of portfolios often recommend that students be involved in determining what goes into a portfolio, and that teachers should not make these decisions unilaterally without involving students. For related discussions, see student engagement and student voice.
• Portfolios can improve communication between teachers and parents.
Portfolios can also help parents become more informed about the education and learning progress of their children, what is being taught in a particular course, and what students are doing and learning in the classroom. Advocates may also contend that when parents are more informed about and engaged in their child's education, they can play a more active role in supporting their children at home, which could have a beneficial effect on academic achievement and long-term student outcomes.
Debate
While portfolios are not generally controversial in concept, skepticism, criticism, and debate may arise if portfolios are viewed as burdensome, add-on requirements rather than as a vital instructional strategy and assessment option. Portfolios may also be viewed negatively if they are poorly designed and executed, if they tend to be filed away and forgotten, if they are not actively maintained by students, if they are not meaningfully integrated into the school's academic program, if educators do not use them to inform and adjust their instructional techniques, or if sufficient time is not provided during the school day for teachers and students to review and discuss them. In short, how portfolios are actually used or not used in schools, and whether they produce the desired educational results, will likely determine how they are perceived.
Creating, maintaining, and assessing student portfolios can also be a time-consuming endeavor. For this reason and others, some critics contend that portfolios are not a practical or feasible option for use in large-scale evaluations of school and student performance. (Just imagine, for example, what it would require in terms of funding, time, and human resources to evaluate dozens or hundreds of pages of academic documentation produced by each of the tens of thousands of eleventh-grade students scattered across a state in any given year.) Standardized tests, in contrast, are relatively efficient and inexpensive to score, and test results are considered more reliable or comparable across students, schools, or states, given that there is less chance of error, bias, or inconsistency during the scoring process (in large part because most standardized tests today are scored in full or in part by automated machines, computers, or online programs). Student portfolios are a comparatively time-consuming, and therefore far more expensive, assessment strategy because they require human scorers, and it is also far more challenging to maintain consistent and reliable evaluations of student achievement across different scorers. Many advocates would argue, however, that portfolios are not intended for use in large-scale evaluations of school and student performance, and that they provide the greatest educational value at the classroom level, where teachers have personal relationships and conversations with students, and where in-depth feedback from teachers can help students grow, improve, and mature as learners.
Evaluation criteria and using portfolios in instruction and communication
WHAT IS PORTFOLIO ASSESSMENT?
In program evaluation, as in other areas, a picture can be worth a thousand words. As an evaluation tool for community-based programs, we can think of a portfolio as a kind of scrapbook or photo album that records the progress and activities of the program and its participants, and showcases them to interested parties both within and outside of the program.
While portfolio assessment has been predominantly used in educational settings to document the progress and achievements of individual children and adolescents, it has the potential to be a valuable tool for program assessment as well. Many programs do keep such albums or scrapbooks and use them informally as a means of conveying their pride in the program, but most do not consider using them in a systematic way as part of their formal program evaluation. However, the concepts and philosophy behind portfolios can apply to community evaluation, where portfolios can provide windows into community practices, procedures, and outcomes, perhaps better than more traditional measures.
Portfolio assessment has become widely used in educational settings as a way to examine and measure progress, by documenting the process of learning or change as it occurs. Portfolios extend beyond test scores to include substantive descriptions or examples of what the student is doing and experiencing. Fundamental to "authentic assessment" or "performance assessment" in educational theory is the principle that children and adolescents should demonstrate, rather than tell about, what they know and can do (Cole, Ryan, & Kick, 1995). Documenting progress toward higher-order goals such as application of skills and synthesis of experience requires obtaining information beyond what can be provided by standardized or norm-based tests. In "authentic assessment", information or data is collected from various sources, through multiple methods, and over multiple points in time (Shaklee, Barbour, Ambrose, & Hansford, 1997).
Contents of portfolios (sometimes called "artifacts" or "evidence") can include drawings, photos, video or audio tapes, writing or other work samples, computer disks, and copies of standardized or program-specific tests. Data sources can include parents, staff, and other community members who know the participants or program, as well as the self-reflections of participants themselves. Portfolio assessment provides a practical strategy for systematically collecting and organizing such data.
PORTFOLIO ASSESSMENT IS MOST USEFUL FOR:
*Evaluating programs that have flexible or individualized goals or outcomes. For example, within a program with the general purpose of enhancing children's social skills, some individual children may need to become less aggressive while other, shy children may need to become more assertive. Each child's portfolio assessment would be geared to his or her individual needs and goals.
*Allowing individuals and programs in the community (those being evaluated) to be involved in their own change and decisions to change.
*Providing information that gives meaningful insight into behavior and related change. Because portfolio assessment emphasizes the process of change or growth, at multiple points in time, it may be easier to see patterns.
• Providing a tool that can ensure communication and accountability to a range of audiences. Participants, their families, funders, and members of the community at large who may not have much sophistication in interpreting statistical data can often appreciate more visual or experiential "evidence" of success.
• Allowing for the possibility of assessing some of the more complex and important aspects of many constructs (rather than just the ones that are easiest to measure).

PORTFOLIO ASSESSMENT IS NOT AS USEFUL FOR:
• Evaluating programs that have very concrete, uniform goals or purposes. For example, it would be unnecessary to compile a portfolio of individualized "evidence" in a program whose sole purpose is full immunization of all children in a community by the age of five years. The required immunizations are the same, and the evidence is generally clear and straightforward.
• Allowing you to rank participants or programs in a quantitative or standardized way (although evaluators or program staff may be able to make subjective judgments of relative merit).
• Comparing participants or programs to standardized norms. While portfolios can (and often do) include some standardized test scores along with other kinds of "evidence", this is not the main purpose of the portfolio.

USING PORTFOLIO ASSESSMENT WITH THE STATE STRENGTHENING EVALUATION GUIDE

Tier 1 - Program Definition
Using portfolios can help you to document the needs and assets of the community of interest. Portfolios can also help you to clarify the identity of your program and allow you to document the "thinking" behind the program's development and throughout its life. Ideally, the process of deciding on criteria for the portfolio will flow directly from the program objectives established in designing the program. However, in a new or existing program where the original objectives are not as clearly defined as they need to be, program developers and staff may be able to clarify their own thinking by visualizing what successful outcomes would look like and what they would accept as "evidence". Thus, thinking about portfolio criteria may contribute to clearer thinking and better definition of program objectives.

Tier 2 - Accountability
Accountability is critical to any form of assessment. In the educational arena, for example, teachers are accountable to themselves, their students, students' families, their schools, and society. The portfolio is an assessment practice that can inform all of these constituents. The process of selecting "evidence" for inclusion in portfolios involves ongoing dialogue and feedback between participants and service providers.
Tier 3 - Understanding and Refining
Portfolio assessment of the program or participants provides a means of conducting assessments throughout the life of the program, as the program addresses the evolving needs and assets of participants and of the community involved. This helps to maintain focus on the outcomes of the program and the steps necessary to meet them, while ensuring that the implementation is in line with the vision established in Tier 1.

Tier 4 - Progress Toward Outcomes
Items are selected for inclusion in the portfolio because they provide "evidence" of progress toward selected outcomes. Whether the outcomes selected are specific to individual participants or apply to entire communities, the portfolio documents steps toward achievement. Usually it is most helpful for this selection to take place at regular intervals, in the context of conferences or discussions among participants and staff.

Tier 5 - Program Impact
One of the greatest strengths of portfolio assessment in program evaluation may be its power as a tool to communicate program impact to those outside of the program. While this kind of data may not take the place of statistics about numbers served, costs, or test scores, many policy makers, funders, and community members find visual or descriptive evidence of the successes of individuals or programs to be very persuasive.

ADVANTAGES OF USING PORTFOLIO ASSESSMENT
• Allows the evaluators to see each student, group, or community as an individual, unique in its characteristics, needs, and strengths.
• Serves as a cross-sectional lens, providing a basis for future analysis and planning. By viewing the total pattern of the community or of individual participants, one can identify areas of strength and weakness, and barriers to success.
• Serves as a concrete vehicle for communication, providing ongoing exchanges of information among those involved.
• Promotes a shift in ownership; communities and participants can take an active role in examining where they have been and where they want to go.
• Offers the possibility of addressing shortcomings of traditional assessment by assessing the more complex and important aspects of an area or topic.
• Covers a broad scope of knowledge and information, from many different people who know the program or person in different contexts (e.g., participants, parents, teachers or staff, peers, or community leaders).

DISADVANTAGES OF USING PORTFOLIO ASSESSMENT
• May be seen as less reliable or fair than more quantitative evaluations such as test scores.
• Can be very time consuming for teachers or program staff to organize and evaluate the contents, especially if portfolios have to be done in addition to traditional testing and grading.
• Having to develop your own individualized criteria can be difficult or unfamiliar at first.
• If goals and criteria are not clear, the portfolio can become just a miscellaneous collection of artifacts that don't show patterns of growth or achievement.
• Like any other form of qualitative data, data from portfolio assessments can be difficult to analyze or aggregate to show change.

HOW TO USE PORTFOLIO ASSESSMENT

Design and Development
Three main factors guide the design and development of a portfolio: 1) purpose, 2) assessment criteria, and 3) evidence (Barton & Collins, 1997).

1) Purpose
The primary concern in getting started is knowing the purpose that the portfolio will serve, because this decision defines the operational guidelines for collecting materials. For example, is the goal to use the portfolio as data to inform program development? To report progress? To identify special needs? For program accountability? For all of these?

2) Assessment Criteria
Once the purpose or goal of the portfolio is clear, decisions are made about what will be considered success (criteria or standards) and what strategies are necessary to meet the goals. Items are then selected for inclusion in the portfolio because they provide evidence of meeting criteria or making progress toward goals.

3) Evidence
In collecting data, many things need to be considered. What sources of evidence should be used? How much evidence do we need to make good decisions and determinations? How often should we collect evidence? How congruent should the sources of evidence be? How can we make sense of the evidence that is collected? How should evidence be used to modify the program and the evaluation? According to Barton and Collins (1997), evidence can include artifacts (items produced in the normal course of classroom or program activities), reproductions (documentation of interviews or projects done outside of the classroom or program), attestations (statements and observations by staff or others about the participant), and productions (items prepared especially for the portfolio, such as participant reflections on their learning or choices). Each item is selected because it adds some new information related to attainment of the goals.

Steps of Portfolio Assessment
Although many variations of portfolio assessment are in use, most fall into two basic types: process portfolios and product portfolios (Cole, Ryan, & Kick, 1995). These are not the only kinds of portfolios in use, nor are they pure types clearly distinct from each other. It may be more helpful to think of them as two steps in the portfolio assessment process, as the participant(s) and staff reflectively select items from their process portfolios for inclusion in the product portfolio.

Step 1: The first step is to develop a process portfolio, which documents growth over time toward a goal. Documentation includes statements of the end goals, criteria, and plans for the future. It should include baseline information, or items describing the participant's performance or mastery level at the beginning of the program. Other items are "works in progress", selected at many interim points to demonstrate steps toward mastery. At this stage, the portfolio is a formative evaluation tool, probably most useful for the internal information of the participant(s) and staff as they plan for the future.

Step 2: The next step is to develop a product portfolio (also known as a "best pieces portfolio"), which includes examples of the best efforts of a participant, community, or program, along with "final evidence", or items that demonstrate attainment of the end goals. Product or "best pieces" portfolios encourage reflection about change or learning. The program participants, either individually or in groups, are involved in selecting the content, the criteria for selection, the criteria for judging merit, and the "evidence" that the criteria have been met (Winograd & Jones, 1992). For individuals and communities alike, this provides opportunities for a sense of ownership and strength, and it helps to showcase or communicate the accomplishments of the person or program. At this stage, the portfolio is an example of summative evaluation, and it may be particularly useful as a public relations tool.
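To make the design factors and the two-step process above concrete, the sketch below models a portfolio as a stated purpose, a set of criteria, and a collection of typed evidence items, and shows one way a product portfolio might be drawn from a process portfolio. This is only an illustrative outline under assumptions, not part of any cited framework: the class names (EvidenceItem, Portfolio), fields, and the simple selection rule are invented for the example.

# Illustrative sketch only: a minimal data model for portfolio assessment.
# All names and the selection rule are assumptions made for this example.
from dataclasses import dataclass, field
from datetime import date
from typing import List

# The four evidence categories described by Barton and Collins (1997).
EVIDENCE_TYPES = {"artifact", "reproduction", "attestation", "production"}

@dataclass
class EvidenceItem:
    description: str               # e.g., "audio tape of peer-tutoring session"
    evidence_type: str             # one of EVIDENCE_TYPES
    goal: str                      # the criterion or outcome this item documents
    collected_on: date             # evidence is added at many points in time
    meets_criterion: bool = False  # judged against the agreed standard

    def __post_init__(self) -> None:
        if self.evidence_type not in EVIDENCE_TYPES:
            raise ValueError(f"unknown evidence type: {self.evidence_type}")

@dataclass
class Portfolio:
    owner: str                 # participant, group, or program name
    purpose: str               # design factor 1: why the portfolio is kept
    criteria: List[str]        # design factor 2: what counts as success
    items: List[EvidenceItem] = field(default_factory=list)  # design factor 3

    def add(self, item: EvidenceItem) -> None:
        """Add evidence at any interim point (process portfolio, Step 1)."""
        self.items.append(item)

def build_product_portfolio(process: Portfolio) -> Portfolio:
    """Select 'best pieces' that demonstrate attainment of goals (Step 2).

    Here the rule is simply 'item meets its criterion'; in practice,
    participants and staff choose items together in conferences.
    """
    best = [item for item in process.items if item.meets_criterion]
    return Portfolio(owner=process.owner,
                     purpose="showcase attainment of end goals",
                     criteria=process.criteria,
                     items=best)

A real program would, of course, layer conferences, reflection, and self-selection on top of a mechanical rule like this; the point is only that purpose, criteria, and typed evidence can be written down explicitly before collection begins.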
Distinguishing Characteristics
Certain characteristics are essential to the development of any type of portfolio used for assessment. According to Barton and Collins (1997), portfolios should be:

1) Multisourced (allowing for the opportunity to evaluate a variety of specific evidence)
Multiple data sources include both people (statements and observations of participants, teachers or program staff, parents, and community members) and artifacts (anything from test scores to photos, drawings, journals, and audio or video tapes of performances).

2) Authentic (context and evidence are directly linked)
The items selected or produced as evidence should be related to program activities, as well as to the goals and criteria. If the portfolio is assessing the effect of a program on participants or communities, then the "evidence" should reflect the activities of the program rather than skills that were gained elsewhere. For example, if a child's musical performance skills were gained through private piano lessons, not through 4-H activities, an audio tape would be irrelevant in his 4-H portfolio. If a 4-H activity involved the same child in teaching other children to play, a tape might be relevant.

3) Dynamic (capturing growth and change)
An important feature of portfolio assessment is that data or evidence is added at many points in time, not just as "before and after" measures. Rather than including only the best work, the portfolio should include examples of different stages of mastery. At least some of the items are self-selected. This allows a much richer understanding of the process of change.

4) Explicit (purpose and goals are clearly defined)
The students or program participants should know in advance what is expected of them, so that they can take responsibility for developing their evidence.

5) Integrated (evidence establishes a correspondence between program activities and life experiences)
Participants should be asked to demonstrate how they can apply their skills or knowledge to real-life situations.

6) Based on ownership (the participant helps determine the evidence to include and the goals to be met)
The portfolio assessment process should require that participants engage in some reflection and self-evaluation as they select the evidence to include and set or modify their goals. They are not simply being evaluated or graded by others.

7) Multipurposed (allowing assessment of the effectiveness of the program while assessing the performance of the participant)
A well-designed portfolio assessment process evaluates the effectiveness of your intervention at the same time that it evaluates the growth of individuals or communities. It also serves as a communication tool when shared with family, other staff, or community members. In school settings, it can be passed on to other teachers or staff as a child moves from one grade level to another.
Analyzing and Reporting Data
As with any qualitative assessment method, analysis of portfolio data can pose challenges. Methods of analysis will vary depending on the purpose of the portfolio and the types of data collected (Patton, 1990). However, if goals and criteria have been clearly defined, the "evidence" in the portfolio makes it relatively easy to demonstrate that the individual or population has moved from a baseline level of performance to achievement of particular goals. It should also be possible to report some aggregated or comparative results, even if participants have individualized goals within a program. For example, in a teen peer tutoring program, you might report that "X% of participants met or exceeded two or more of their personal goals within this time frame", even if one teen's primary goal was to gain public speaking skills and another's main goal was to raise his grade point average by mastering study skills. Comparing across programs, you might be able to say that participants in Town X mastered an average of four new skills in the course of six months while those in Town Y mastered only two, and speculate that lower attendance rates in Town Y could account for the difference.

Subjectivity of judgments is often cited as a concern in this type of assessment (Bateson, 1994). However, in educational settings, teachers or staff using portfolio assessment often choose to periodically compare notes by independently rating the same portfolio to see if they agree on the scoring (Barton & Collins, 1997). This provides a simple check on reliability and can be very simply reported. For example, a local program could report: "To ensure some consistency in assessment standards, every fifth portfolio (or 20 percent) was assessed by more than one staff member. Agreement between raters, or inter-rater reliability, was 88 percent."

There are many books and articles that address the problems of analyzing and reporting on qualitative data in more depth than can be covered here. The basic issues of reliability, validity, and generalizability are relevant even when using qualitative methods, and various strategies have been developed to address them. Those who are considering using portfolio assessment in evaluation are encouraged to refer to some of the sources listed below for more in-depth information.
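As a rough illustration of the kind of aggregate reporting and reliability check described above, the sketch below computes the share of participants who met at least two of their personal goals and a simple percent-agreement figure for double-scored portfolios. The data, names, and the use of plain percent agreement (rather than a chance-corrected statistic such as Cohen's kappa) are assumptions made for the example, not a prescribed procedure.

# Illustrative sketch only: aggregate reporting for individualized goals and a
# simple percent-agreement reliability check. All data and names are invented.
from typing import Dict, List, Tuple

def share_meeting_goals(goals_met: Dict[str, int], threshold: int = 2) -> float:
    """Fraction of participants who met at least `threshold` personal goals."""
    if not goals_met:
        return 0.0
    met = sum(1 for count in goals_met.values() if count >= threshold)
    return met / len(goals_met)

def percent_agreement(double_scored: List[Tuple[str, str]]) -> float:
    """Simple percent agreement between two raters on the same portfolios.

    Each tuple holds the two raters' ratings (e.g., "meets", "does not meet").
    Percent agreement ignores chance agreement; a chance-corrected statistic
    such as Cohen's kappa is a common refinement.
    """
    if not double_scored:
        return 0.0
    agreed = sum(1 for a, b in double_scored if a == b)
    return agreed / len(double_scored)

# Example: a hypothetical teen peer-tutoring program with individualized goals.
goals_met = {"teen_A": 3, "teen_B": 1, "teen_C": 2, "teen_D": 2, "teen_E": 0}
print(f"{share_meeting_goals(goals_met):.0%} of participants met or exceeded "
      "two or more of their personal goals.")

# Every 5th portfolio was rated independently by a second staff member.
double_scored = [("meets", "meets"), ("meets", "does not meet"),
                 ("does not meet", "does not meet"), ("meets", "meets")]
print(f"Inter-rater agreement: {percent_agreement(double_scored):.0%}")

Even a small check like this, reported alongside descriptive evidence from the portfolios, can address some of the concerns about subjectivity without turning the portfolio into a standardized test.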