Development and Evaluation of a Competence Based
Exam for Prospective Driving Instructors
Erik Roelofs¹, Maria Bolsinova¹, Marieke van Onna¹, Jan Vissers²
¹ Cito, National Institute for Educational Measurement, The Netherlands
² Royal Haskoning DHV, Amersfoort, The Netherlands
Introduction
A growing consensus among driver training and road safety researchers is that driver training
should place greater emphasis on higher-order, cognitive and motivational functions
underlying driving behaviour (Mayhew and Simpson, 2002; Hatakka et al., 2002). This
changed conception of driver training has been laid down in the Goals for Driver Education
matrix (Hatakka et al., 2002). Recent research seems to support this consensus (Isler, Starkey,
& Sheppard, 2011; Beanland, Goode, Salmon, & Lenné, 2013). Innovative training initiatives
appear to counteract overconfidence and address motivational factors such as driving anger,
sensation seeking, and boredom (e.g. Isler et al., 2009).
Parallel to the doubts raised about the quality of driver training, the quality of driver instructor
preparation programs has been criticized. The MERIT review study (Bartl, Gregersen, & Sanders,
2005) showed that huge variations existed in the quality of driver instructor education throughout
Europe. The content did not cover higher-order skills, and most programs relied on teacher-
focused approaches, which seem to fall short in developing higher-order skills.
In many European countries the quality of the education of driving instructors is regulated by
means of the instructor exam. One may view this as a problem, but on the other hand, it also
offers opportunities for improvement. A valid and reliable exam that only allows proficient
prospective instructors to enter the profession, may have a positive backwash effect on driver
instructor education programs, as in other fields of education: teachers teach and students
learn what will be tested (Crooks, 1988; Fredericksen & Collins, 1989; Madaus, 1988).
In the Netherlands, the first steps in this direction have been made in the last ten years. As
part of a new law on driving education in 2003, competence-based outcome standards for
prospective driving instructors have been formulated (Nägele, Vissers, & Roelofs, 2006). The
most far-reaching change underlying these standards has been the emphasis on performance
in critical job-situations with real learner drivers. In addition, supporting knowledge was
defined in terms of relevant concepts, principles and decision making skills to be applied in
authentic instructional situations.
Based on the standards, a two stage competence-based exam was designed and put into action
in the fall of 2009. Since then, over 4000 prospective driving instructors (PDI’s) have gone
through one or more tests. The question is whether the assessments have resulted in valid and
fair decisions about PDI’s. Regarding the tenability of decisions, this paper focuses on the
separate theoretical assessments, comprising stage 1. In addition, their predictive value for
instructor performance as demonstrated at the final performance assessment lesson (stage 2) is
studied.
The fairness question concentrates on the comparability of different versions of the assessments.
In the exam under study, item banks are used, from which different sets of items are drawn to
compose exam versions, to prevent effects of item exposure and cheating. The question then
arises whether one cut-off score implies the same level of required proficiency for different
versions. To address this problem, psychometric equating methods are commonly used to determine
how scores on two different tests can be projected onto one (latent) scale (Kolen & Brennan,
1995).
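The equating idea can be sketched with a minimal mean-sigma linking example. This is an illustration only, not the operational procedure of the exam; the anchor-item difficulty values and the function name are hypothetical. Difficulties of items shared by two versions define a linear transformation that places scores from one version on the scale of the other.

```python
# Illustrative sketch of mean-sigma linking: map abilities estimated on
# form X onto the scale of form Y using the difficulties of shared items.
from statistics import mean, stdev

def mean_sigma_link(common_b_x, common_b_y):
    """Return slope A and intercept B so that A*b_x + B lies on form Y's scale."""
    A = stdev(common_b_y) / stdev(common_b_x)
    B = mean(common_b_y) - A * mean(common_b_x)
    return A, B

# Hypothetical difficulty estimates for five anchor items on two versions.
b_x = [-1.2, -0.4, 0.1, 0.8, 1.5]
b_y = [-1.0, -0.2, 0.3, 1.0, 1.7]
A, B = mean_sigma_link(b_x, b_y)
theta_x = 0.5                  # an ability estimated on form X ...
theta_y = A * theta_x + B      # ... expressed on form Y's scale
```

With these made-up values the two forms differ only by a constant shift, so the link reduces to adding 0.2 to every form-X ability.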
In summary, four research questions are addressed:
1. To what extent are the individual parts of the exam psychometrically reliable?
2. To what extent do the different theoretical tests intercorrelate?
3. To what extent do results on theoretical tests and performance assessment for
instructional ability correlate?
4. Do the cut-off scores used across different versions of the theoretical tests reflect
equivalent required levels of proficiency?
Design Features of the Competence Based Exam
The exam consists of two stages. The first stage comprises the assessment of the theoretical
knowledge base of prospective instructors regarding driving and driving pedagogy. After
having passed the first stage PDI’s receive a provisional instructor license enabling them to
enrol in a half year internship at a (certified) professional driving school. In the second stage,
after having finished their internships, PDI’s are judged on their professional instructional
abilities, during a masterpiece lesson involving one of their own learner drivers, whom they
have been teaching as an intern. If they pass, they receive a full license for the next five years.
Below the design features are described. A summary of all exam parts is provided in Table 1.
The Design Process
All individual assessments of the exam were designed by means of the evidence-centred
design model (ECD; Almond, Steinberg, & Mislevy, 2002; Mislevy & Haertel, 2006). The
ECD model identifies five layers in the design process: domain analysis, domain modelling,
the conceptual assessment framework, assessment implementation, and assessment delivery.
These design layers were gone through successively, with a continuous dialogue between
the assessment designers and different stakeholders: a board of instructor educators, exam
institutes, ICT specialists, psychometricians, educational scholars, academic teacher
educators, driving examiners, and driving instructors.
The Conceptual Assessment Framework
In the exam the layer of the conceptual assessment framework for assessment task design was
of central importance. The conceptual assessment framework helps to sort out the
relationships among attributes of a candidate’s competence, observations which show
competence and situations which elicit relevant driver performance. The central models for
task design are the Competence or Student Model, the Evidence Model, and the Task Model.
The driving instructor competence model. The Competence Model encompasses variables
representing the aspects of instructor competence that are the targets of inference in the
assessment and their inter-relationships. Starting from a literature search on what constitutes
good teaching in general, and driving instruction in particular, a competence model was
constructed. This resulted in the formulation of four domains of competence summarized in
Table 1: 1) Conscious traffic participation as first and second driver, 2) Lesson preparation, 3)
Instruction and coaching, 4) Evaluation, reflection and revision.
A model of competent task performance formed the basis for two competence models: driving
competence (Roelofs, Van Onna, Vissers, 2010) and instructional competence (Roelofs &
Sanders, 2007). A basic tenet in the model (see Figure 1) is that instructor competence is
reflected in the consequences of an instructor’s actions. The most important consequences are
students’ learning activities during delivery of instruction or coaching, and safety, flow and
comfort in traffic during driving.
Starting from the consequences, the remaining elements of the model can be mapped
backwards. First, the component ‘actions’ refers to professional activities, e.g. delivering
instruction or providing coaching to students. Second, any instructor activity takes place
within a specific context in which many decisions need to be made, on a long-term basis
(planning ahead) or immediately during an in-car situation. For instance, instructors will have
to plan their instruction and adapt it depending on differing circumstances (e.g. different
learning paces, traffic situations). Third, when making decisions and performing activities,
teachers will have to draw from a professional knowledge base (e.g. pedagogical principles,
psychology of driving, rules and regulations).
[Insert Figure 1]
Assessment task models and exam assembly. The Task Model describes the kinds of
assessment tasks (items) that embody an assessment (test). They follow directly from the
cognitive and interactive activities mentioned in the competence model. Three types of
assessment tasks were designed.
1) Case based items. (Schuwirth et al. 2000; Norman, Swanson, & Case, 1996). These items
address knowledge of concepts and cause-effect rules in cases embedded in a rich driving
instruction context. These questions were used for theoretical assessment in all four task
domains. An example of an item is: “An instructor starts a lesson with an explanation of the
first topic. He does not preview what the learner driver is going to be able to do at the end of
the lesson. What is the most likely consequence of his approach?” To respond, the PDI would
choose one of four options: A. the learner will learn less than desirable, B. the learner will not
fully understand the explanation, C. the learner will have less time to practice new driving
tasks, D. the learner cannot direct his attention to the essential parts of the lesson (correct).
2) Situational judgment items. (Whetzel & McDaniel, 2009). These items address decision
making skills in a rich driving instruction context. An example regarding lesson planning is:
“The learner driver is starting to learn manoeuvring in traffic situations with low traffic
intensity as illustrated on the picture. During the upcoming lesson you are going to instruct
how to park backwards into a parking bay. Which of the parking bays [pictures shown with or
without other vehicle parked alongside an empty bay] can you choose best for a learner driver
in this stage of driver education?” To respond, the PDI would choose one out of four pictures.
One option is optimal, another is suboptimal, and the remaining two are unacceptable.
3) Performance assessment assignments. To address the PDI’s own driving competence and
his ability to verbalize mental task processes (perception, anticipation, decision, action
execution), the PDI had to complete a 60-minute drive, during which his performance was judged
by a trained assessor using a standardized scoring form. Five performance criteria were
employed on this form: 1) safety, 2) aiding traffic flow, 3) driving socially considerately, 4)
eco-friendly driving, and 5) vehicle control (Roelofs, Van Onna, & Vissers, 2010). At two
intermediate stops the PDI was asked to verbalize his mental processes that went on while
solving the previous traffic situations.
Finally, professional actions regarding instruction, coaching and evaluation were judged by
having the PDI carry out a lesson with a real learner driver. The lesson performance was
judged by trained assessors. To this end a 34 item scoring form was used to judge the quality
of coaching and instruction.
Evidence model. The Evidence Model describes how to extract the key items of evidence
from instructor behaviour, and the relationship of these observable variables to the
competence model variables. All three theory tests consisted of 60 multiple choice items. The
Theory of Driving Test consisted of one-best-answer type items, scored dichotomously (0,1).
The cut-off score for passing the test was 42 items correct. Both the Theory of Lesson
Preparation Test and the Theory of Instruction and Coaching Test included situational
judgment items, which were scored with partial credit: the best answer yielded 7 points, a
suboptimal answer 3 points, and the distractors no points. The case-based knowledge items had
one best answer; these items were scored 0 or 7, for incorrect and correct answers respectively.
The cut-off score for passing each of these tests was 266 points (out of 420 points).
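The stage-1 scoring rules just described can be sketched as follows. The function names are illustrative, not part of the exam software; the rules themselves (7/3/0 for situational judgment items, 7/0 for case-based items, pass at 266 of 420 points) are taken from the text above.

```python
# Sketch of the stage-1 scoring rules: SJT items score 7/3/0,
# case-based items 7/0, and a test is passed at 266 of 420 points.
def score_sjt(chosen, best, suboptimal):
    """Situational judgment item: 7 for the best option, 3 for the suboptimal one."""
    if chosen == best:
        return 7
    if chosen == suboptimal:
        return 3
    return 0

def score_case(chosen, best):
    """Case-based item: one best answer, scored 0 or 7."""
    return 7 if chosen == best else 0

def passes_theory(total_points, cutoff=266):
    return total_points >= cutoff

# A candidate answering all 60 items with the best option earns 60 * 7 = 420.
assert passes_theory(60 * 7)
```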
The items on the Performance Assessment Lesson were scored on a three point scale,
representing ‘counterproductive performance’, ‘beginning productive performance’ and
‘optimal performance’. A detailed scoring guide was available for assessors. Initial rater
agreement scores using Gower’s similarity index (Gower, 1971) showed acceptable levels of
agreement: .67 for instruction and .75 for coaching. The cut-off score for passing the
performance assessment was 71 points (out of 102 points), under the condition that the results
fell below the specified category cut-off score (mean = 2.0) on no more than one of the seven
categories.
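For ordinal ratings such as the 3-point scale above, Gower's similarity index per item is one minus the absolute difference divided by the scale range, averaged over items. How exactly the index was applied in this study is an assumption here; the rater scores below are hypothetical.

```python
# Sketch of rater agreement via Gower's similarity index on a 1-3 scale:
# per item, similarity = 1 - |difference| / range; agreement = mean over items.
def gower_agreement(rater_a, rater_b, scale_range=2):
    """Mean Gower similarity between two raters' item scores."""
    sims = [1 - abs(a - b) / scale_range for a, b in zip(rater_a, rater_b)]
    return sum(sims) / len(sims)

# Hypothetical scores from two assessors on six items.
a = [3, 2, 2, 1, 3, 2]
b = [3, 2, 1, 1, 2, 2]
agreement = gower_agreement(a, b)   # two items differ by one scale point
```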
Methods
Subjects and Data
All data from candidates who enrolled in the program between January 1st 2010 and October 1st
2012 were selected. Data from 2009 were discarded, because the computer-based
assessment platform was not completely stable at that time. This resulted in assessment data
for 4741 prospective driving instructors, 79 per cent of whom were male and 21 per cent
female. The mean age was 34.9 years (SD=10.9). Of them, 3079 (74.4%) were born in the
Netherlands. The remaining 25.6% originally came from 79 different countries, the majority
being immigrants from Morocco (n=199), Suriname (n=190), Turkey (n=151),
Afghanistan (n=112), Iraq (n=89), and Iran (n=46).
A total of 4644 PDI’s completed at least one of the theory tests. Of them, 2977 passed all their
theory exams, and 1941 of these took part in the Performance Assessment of instruction and
coaching. Of the remaining PDI’s, about half (n=508) had not participated in the Performance
Assessment Lesson within a year of their last successful theory test. The
remaining part (n=528) did not finish their internship. A total of 368 PDI’s received
dispensation to participate in the performance assessment, although they had failed one of the
theory tests. In total, 2315 PDI’s participated at least once in the Performance Assessment
Lesson.
Analyses
IRT analyses were performed on the data of the three theory tests. Each test had been
administered in many different versions, based on representative samples of 60 items drawn
from an item bank. Available software does not allow IRT analysis on a very large number of
versions of a test (there were more than 700 versions of each test), especially if some of these
versions were taken by only one PDI. Therefore, for each of the theory tests only versions
with a substantive number of participants were selected. Table 2 shows the number of test
versions, the total number of items, and the sample size chosen for analysis.
[Insert Table 2]
For the tests which included items with partial scoring (0, 3, 7), a partial credit model was
fitted first. This model had a very poor fit in both cases. Responses to all items in the Theory
of Instruction and Coaching and Theory of Lesson Preparation tests were therefore
dichotomized, because this allows fitting more flexible IRT models to the data. All responses
for which 7 points were given were recoded as 1 (correct response), and all others as 0
(incorrect response). The number of items correct had a very high correlation with the original
number of points (.986 for both tests). A new cut-off score for the number of items correct
was chosen in such a way that the number of misclassifications was minimal: the number of
subjects having fewer than 266 points but passing the test with the new cut-off score in terms
of items correct, plus the number of subjects having more than 266 points but failing the test
according to the new criterion. For both tests, the new cut-off score was 35 items correct,
which gave 4.65% misclassifications for the Theory of Instruction and Coaching Test and
3.95% for the Theory of Lesson Preparation Test.
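The recoding and cut-off search just described can be sketched as follows. This is a minimal illustration with hypothetical response data, not the operational analysis code; the function names are made up.

```python
# Sketch: dichotomize partial-credit responses (7 -> 1, else 0) and pick the
# items-correct cut-off that minimizes disagreement with the 266-point rule.
def dichotomize(points_per_item):
    return [1 if p == 7 else 0 for p in points_per_item]

def best_cutoff(candidates, old_cutoff=266, n_items=60):
    """candidates: one list of per-item points per candidate. Returns the
    items-correct cut-off minimizing pass/fail misclassifications."""
    totals = [(sum(c), sum(dichotomize(c))) for c in candidates]
    def misclassified(k):
        return sum((pts >= old_cutoff) != (correct >= k)
                   for pts, correct in totals)
    return min(range(n_items + 1), key=misclassified)
```

A usage example with four hypothetical candidates (280, 275, 268, and 261 points) yields a cut-off of 34 items correct with no misclassifications.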
A one parameter logistic model (OPLM, Verhelst, & Glas, 1995; Verhelst, Glas, &
Verstralen, 1995) was fitted to the data of each theory test. In the Theory of Driving Test, one
of the items was excluded from the analysis because it was answered correctly by all PDI’s. In
the Theory of Instruction and Coaching Test, two items were excluded because they had
strong negative correlations with the test score. The model had a reasonable fit for all three
tests.
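In the OPLM, discrimination indices are fixed integer constants rather than estimated parameters, which is what distinguishes it from the Rasch and 2PL models. The item response function can be sketched as follows; the item values are hypothetical.

```python
# Sketch of the OPLM item response function:
# P(X_i = 1 | theta) = exp(a_i * (theta - b_i)) / (1 + exp(a_i * (theta - b_i))),
# with a_i a fixed integer discrimination index and b_i the item difficulty.
from math import exp

def oplm_prob(theta, a, b):
    """Probability of a correct response under the one parameter logistic model."""
    z = a * (theta - b)
    return exp(z) / (1 + exp(z))

# An item with discrimination index 3 separates abilities around its
# difficulty (b = 0) more sharply than an item with index 1.
p_flat = oplm_prob(0.5, 1, 0.0)    # ≈ 0.62
p_sharp = oplm_prob(0.5, 3, 0.0)   # ≈ 0.82
```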
From the perspective of IRT, the concept of reliability differs from that of classical test
theory: measurement precision depends on the position on the latent ability scale. The
ability of a subject for whom all test items are too difficult will be measured with a larger
error than the ability of someone for whom the difficulty of the items is appropriate. In the
case of three theory tests, it is important to measure accurately at the ability level
corresponding to the cut-off score, representing the pass-fail boundary. A standard error of
estimate of ability corresponding to the required number of items correct (SE) has been
chosen as a measure of reliability. To enable interpretation of this value, it has to be compared
with the standard deviation of the ability in the population (SD). As a measure, we look at the
so-called proportion of true variance: (SD² − SE²)/SD². Additionally, as a measure of global
accuracy of measurement, the MAcc index reported by the OPLM software was used, which
is the expected reliability coefficient alpha, as defined in classical test theory, for a given
population of PDI’s.
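Written out as a small computation, the proportion of true variance at the cut-off combines the population ability spread with the standard error at the cut-off. The numbers below use the population SD of 15 and, as an example, the standard error of 10.01 reported for the Theory of Driving Test in the Results section.

```python
# Proportion of true variance at the cut-off: (SD^2 - SE^2) / SD^2,
# where SD is the population ability spread and SE the standard error
# of the ability estimate at the cut-off score.
def proportion_true_variance(sd, se):
    return (sd ** 2 - se ** 2) / sd ** 2

# With the reported scale (SD = 15) and the Theory of Driving Test's SE:
ptv = proportion_true_variance(15, 10.01)   # ≈ 0.55
```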
Using an IRT model makes it possible to locate all different versions of the test on the same
scale, and therefore to compare different versions by their difficulty and the level of ability
needed to pass the test. For each test version, the level of ability at the cut-off score was
estimated. Unlike the raw test scores, the latent estimates from different test versions are
directly comparable. All latent abilities were scaled in such a way that the population
distribution had a mean of 100 and a standard deviation of 15.
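The final scaling step is a linear transformation; a minimal sketch with hypothetical theta estimates:

```python
# Sketch: linearly rescale latent ability estimates so the population
# distribution has mean 100 and standard deviation 15.
from statistics import mean, pstdev

def rescale(thetas, target_mean=100, target_sd=15):
    m, s = mean(thetas), pstdev(thetas)
    return [target_mean + target_sd * (t - m) / s for t in thetas]

raw = [-1.1, -0.3, 0.0, 0.4, 1.0]   # hypothetical theta estimates
scaled = rescale(raw)               # mean 100, SD 15 by construction
```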
To analyse the relations between the three theory tests, a subsample of n=1980 PDI’s who had
taken all three tests was selected. Correlations between the three latent ability scores were
computed.
To determine the reliability of the final Performance Assessment Lesson, all available raw
data for PDI’s were obtained and processed into data files. The exam institute only kept the
final pass/fail outcome in their data files, but still had the score forms of 580 PDI’s.
Principal component analysis was carried out to reduce the data to a limited number of
interpretable factors, yielding maximally reliable criterion variables. Correlations between the
three latent abilities and the scores on the resulting scales for instruction and coaching were
computed.
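The reliability of such composed scale scores is typically expressed as Cronbach's alpha, the classical-test-theory coefficient reported for the scales in the Results section. A minimal sketch of that computation, with hypothetical 1-3 ratings:

```python
# Cronbach's alpha = k/(k-1) * (1 - sum of item variances / variance of totals),
# for k items; a standard internal-consistency estimate for a composed scale.
from statistics import pvariance

def cronbach_alpha(item_scores):
    """item_scores: list of per-item score lists (one list per item)."""
    k = len(item_scores)
    totals = [sum(vals) for vals in zip(*item_scores)]
    item_var = sum(pvariance(v) for v in item_scores)
    return k / (k - 1) * (1 - item_var / pvariance(totals))

# Hypothetical 1-3 ratings of five candidates on a three-item scale.
items = [[3, 2, 3, 1, 2],
         [3, 2, 2, 1, 2],
         [2, 2, 3, 1, 3]]
alpha = cronbach_alpha(items)
```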
Results
Figure 2 shows a normal distribution of ability scores (mean=100; SD=15). In this figure, the
cut-off scores for the 14 most frequently administered versions of the Theory of Driving Test
are plotted. The dots represent the cut-off scores for each test version expressed in terms of
ability (theta). The lines represent the standard errors around the cut-off score. Two things
can be noted. First, the cut-off scores for all test versions fall below the average ability in the
total population. The mean cut-off score (M=85.3, see Table 2) is almost one standard
deviation below the ability mean of the population, which means that a relatively low ability
(M=86.5) was needed to pass the test. Second, there are small differences between the
required ability levels for different test versions, but the variation of the cut-off levels
(SD=3.22) across versions is small compared to the standard error.
[Insert Figure 2]
Similarly, Figure 3 shows the plotted cut-off ability scores for the 15 most frequently
administered versions of the Theory of Lesson Preparation Test. As can be noted from the
figure, the differences between the cut-off scores of the versions are larger than for the Theory
of Driving Test. Versions 1 and 6 differ considerably: version 1 requires an ability level of 78,
whereas version 6 requires 92. The mean cut-off score (M=85.3) is again below the ability
mean of the population.
[Insert Figure 3]
[Insert Figure 4]
The results regarding the Theory of Instruction and Coaching Test show a similar pattern (see
Figure 4): the required ability levels are below the mean ability (M=90.2, see Table 3), but
less pronounced than for the other theory tests. The different test versions show different
levels of required ability, but the differences are small compared to the standard error.
For the Theory of Driving Test, the standard error at the cut-off score amounts to 10.01,
whereas the values for the Theory of Lesson Preparation and the Theory of Instruction and
Coaching tests amount to 9.56 and 7.74 respectively (see Table 3). All values fall within the
region of good reliability. The MAcc coefficients reflect reasonable overall reliability for the
whole test, whereas the average proportion of true variance at the cut-off score reflects
questionable reliability for the first two tests and reasonable reliability for the Theory of
Instruction and Coaching Test.
[Insert Table 3]
The correlation coefficients between the ability scores on the three theory tests show a
moderate correlation of .56 between the Theory of Driving Test and the Theory of Lesson
Preparation Test. Ability scores on the Theory of Driving Test correlate .43 with the ability
scores on the Theory of Instruction and Coaching Test. Finally, ability scores on the Theory
of Lesson Preparation Test correlate .29 with ability levels on the Theory of Instruction and
Coaching Test.
Table 4 shows a psychometric report for the Final Performance Assessment Lesson.
[Insert Table 4]
Principal component analysis resulted in three clearly interpretable factors. Three fairly
reliable scale scores could be composed (see Table 4), representing aspects of coaching and
instruction: Motivational Support (alpha=.77), Diagnosis and Task Support (alpha=.79), and
Instruction (alpha=.83). The Motivational Support scale correlated .46 (p<.001) with
Diagnosis and Task Support and .45 (p<.001) with Instructional Skill. Diagnosis and Task
Support correlated .66 (p<.001) with Instructional Skill.
[Insert Table 5]
Table 5 shows the correlations between the ability scores on the theory tests and the scores on
the Performance Assessment Lesson, for the three subscales and the overall assessment score.
The correlation coefficients for the Theory of Driving Test do not differ significantly from
zero. The ability scores for Lesson Preparation and Instruction/Coaching show six low but
significant correlations with the subscales and the overall scale of the final performance
assessment lesson (between .12 and .14; p<.05).
Discussion
The central question in this study was whether the decisions made about prospective driving
instructors, as they follow from the results on the theory tests of the innovated exam, are valid
and fair.
First it can be concluded that the overall reliability of estimated ability scores on the theory
tests shows acceptable levels. The reliability around the cut-off scores was also acceptable,
which seems most important, because here the pass/fail decisions are made. The Final
Performance Assessment (lesson) also shows acceptable reliability in terms of the alpha
value, used in classical test theory.
Second, the IRT models showed an acceptable fit, suggesting that the tests represent separable
one-dimensional abilities. The moderate correlation between the Theory of Driving Test and
the Theory of Lesson Preparation Test shows that knowledge of the traffic task is an
important predictor for knowledge and decision making regarding lesson preparation. To a
lesser extent, high (or poor) achievement on lesson preparation is related to similar
performance on instruction and coaching.
Third, the predictive value of theory test performance for in-car instructional and coaching
performance was very low. The ability scores for lesson planning and instruction and
coaching only show very low, although significant, correlations with the in-car performance
for coaching and instruction. An explanation for this finding may be that only those who
passed the stage one theory exams are allowed to go through the final assessment. In addition,
the effect of half a year of internship may have washed out initial differences between PDI’s.
Regarding fairness, the question was whether the different versions of the theory tests
required the same level of proficiency to pass. A first finding was that the cut-off scores for
pass-fail decisions on the theory tests were well below the average ability level of the
population, implying relatively low ability requirements. The Theory of Coaching and
Instruction Test and the Theory of Driving Test had comparable cut-off scores across versions
and were hence equivalent in their ability requirements. For Lesson Preparation there were
larger differences in required ability across test versions, although the differences fell within
the range of the standard errors.
As far as the construction and delivery process is concerned, some problems need further
attention. Many of these are related to the way the assessments are delivered. The exam is
computer-based and administered at an exam office, involving item banks from which
different versions are drawn.
To ensure representativeness, all versions need to reflect all of the distinguished subdomains
(at least 9 for each test), mental activities (perception, decision making, action, cause-effect
reasoning, concept recognition), and critical situations (learner characteristics, stage of
acquisition, traffic situation). The relatively small size of the item bank resulted in frequent
reuse of items, which may have led to overexposure and, in turn, to a lowering of item
difficulties.
In addition, we observed that a part of the items had poor quality, i.e. low or strongly negative
item-test correlation coefficients and extreme p-values (near zero or one). In the current
examination practice, poor items were not excluded a posteriori from the tests, because
shorter test versions would not have been accepted by stakeholders. However, it would have
been defensible to estimate ability levels based on a smaller ‘cleaned’ subset of items,
yielding a more reliable and still representative score.
An optimal approach to warrant acceptable item quality is to pre-test all items on a
representative sample of target candidates before putting them into item banks. This however
seems problematic because of the risk of early item exposure. In addition, exam costs would
rise. However, in general it can be recommended to use exam data to redesign and improve
the exam on the fly. Following the evidence-centred design model of Mislevy and colleagues,
many questions can be answered along the way: does the competence model, as reflected in
the IRT model, fit? Are any changes needed? Do the cut-off scores represent what we want
PDI’s to know and be able to do? Can certain item characteristics be traced back to the way
the item was designed? In short, using an evidence-centred design model, in combination with
on-the-fly improvements, can improve our decisions about prospective driving instructors.
In follow-up research, we intend to take a closer look at other parts of the exam, the
functioning of different item types, the way items are presented, the stimuli used in items, the
responses that are asked and the way these are related to estimates of PDI’s abilities.
To evaluate the long term effects of the innovated exam for instructional practice, learner
driver gain and crash involvement, longitudinal research will be necessary. In such a study
one should take into account the quality of all subsequent educational interventions and
related driver activities to determine whether there is a case for driver training (Beanland et
al., 2013).
References
Almond, R. G., Steinberg, L. S., & Mislevy, R. J. (2002). Enhancing the design and delivery
of assessment systems: A four-process architecture. Journal of Technology, Learning,
and Assessment, 1(5). http://www.bc.edu/research/intasc/jtla/journal/v1n5.shtml.
Bartl, G., Gregersen, N.P., & Sanders, N. (2005) EU MERIT Project: Minimum Requirements
for Driving Instructor Training. Vienna: Institut Gute Fahrt.
Beanland, V., Goode, N., Salmon, P.M., Lenné, M.G. (2013). Is there a case for driver
training? A review of the efficacy of pre- and post-licence driver training. Safety
Science, 51, 127-137.
Crooks, T. J. (1988). The impact of classroom evaluation practices on students. Review of
Educational Research, 58, 438-481.
Fredericksen, J. R., & Collins, A. (1989). A systems approach to educational testing.
Educational Researcher, 18(9), 27-32.
Gower, J.C. (1971). A General Coefficient of Similarity and some of its Properties,
Biometrics, 27, 857-871.
Hatakka, M., Keskinen, E., Gregersen, N.P., Glad, A., Hernetkoski, K. (2002). From control
of the vehicle to personal self-control; broadening the perspectives of driver education.
Transportation Research Part F: Psychology and Behaviour, 5(3), 201–215.
Isler, R.B., Starkey, N.J., & Sheppard, P. (2011). Effects of higher-order driving skill training
on young, inexperienced drivers’ on-road driving performance. Accident Analysis &
Prevention, 43, 1818–1827.
Isler, R.B., Starkey, N.J., & Williamson, A.R. (2009). Video-based road commentary training
improves hazard perception of young drivers in a dual task. Accident Analysis &
Prevention, 41, 445–452.
Kolen, M.J., & Brennan, R.L. (1995). Test Equating. New York: Springer.
Madaus, G. F. (1988). The influences of testing on the curriculum. In L.N. Tanner (Ed.),
Critical issues in curriculum (pp. 83-121). Eighty-Seventh Yearbook of the National
Society for the Study of Education, Part I. Chicago: University of Chicago Press.
Mayhew, D.R., & Simpson, H.M. (2002). The safety value of driver education and training.
Injury Prevention, 8, 3–8.
Mislevy, R. J., Steinberg, L. S., & Almond, R. G. (2003). On the structure of educational
assessments. Measurement: Interdisciplinary Research and Perspectives, 1, 3–67.
Nägele, R., Vissers, J., & Roelofs, E.C. (2006). Herziening WRM: een model voor
competentiegericht examineren. Eindrapport [Revision of the law on motor vehicle
education: a model for a competence-based exam. Final report]. Dossier: X4691.01.001;
Registratienummer: MV-SE20060477. Amersfoort/Arnhem: DHV, Cito.
Norman, G., Swanson, D.B., & Case, S.M. (1996). Conceptual and methodological issues in
studies comparing assessment formats. Teaching and Learning in Medicine, 8(4), 208-216.
Roelofs, E., & Vissers, J. (2008). Toetsspecificaties theoretische deeltoetsen WRM-examen
[Specifications of the theoretical subtests of the exam for driver instructor candidates].
Arnhem: Cito; Amersfoort: DHV groep.
Roelofs, E.C., Onna, M. van, Vissers, J. (2010). Development of the Driver Performance
Assessment: Informing Learner Drivers of their Driving Progress. In L. Dorn (Ed.)
Driver behavior and training, volume IV (pp. 37-50). Hampshire: Ashgate Publishing
Limited.
Roelofs, E. & Sanders, P. (2007). Towards a framework for assessing teacher competence.
European Journal for Vocational Training, 40(1), 123-139.
Schuwirth, L.W.T., Verheggen, M.M., van der Vleuten, C.P.M., Boshuizen, H.P.A., & Dinant,
G.J. (2000). Validation of short case-based testing using a cognitive psychological
methodology. Medical Education, 35, 348–356.
Verhelst, N.D., & Glas, C.A.W. (1995). The one parameter logistic model. In: G.H. Fischer &
I.W. Molenaar (Eds.). Rasch models: Foundations, recent developments and
applications (pp. 215-239). New York: Springer.
Verhelst, N.D., Glas, C.A.W. & Verstralen, H.H.F.M. (1995). OPLM: One Parameter
Logistic Model. Computer program and manual. Arnhem: Cito.
Wang, M., Haertel, G., & Walberg, H. (1993). Toward a knowledge base for school learning.
Review of Educational Research, 63, 249-294.
Whetzel, D.L, McDaniel, M.A. (2009). Situational judgment tests: An overview of current
research. Human Resource Management Review, 19, 188–202.
Table 1. Instructor competence profile and the tests used in the Dutch prospective driving instructor exam

Cluster 1. Competent in conscious traffic participation
  Task domains:
  1.1 Driving responsibly as a first driver. The driving instructor is able to drive a vehicle safely, smoothly, in a socially considerate way, and eco-friendly, according to Dutch driving standards.
  1.2 Verbalizing mental processes of driving. The driving instructor is able to verbalize the mental task processes that take place when carrying out driving tasks in different traffic situations.
  Stage 1 (computer-based: knowledge base and cognitive skills): Theory of Driving test (60 items): knowledge of traffic participation rules, case-based and situational judgment items (task domain 1.1).
  Stage 2 (on the job: performance assessment): performance assessment drive as a first and a second driver (all task domains of cluster 1).

Cluster 2. Competent in lesson preparation
  Task domains:
  2.1 Adaptive planning. The driving instructor is able to construct an educational program for the long term (curriculum) and for the short term (lesson design), adapted to the needs of the individual learner driver (LD).
  2.2 Elaborating driving pedagogy. The driving instructor is able to prepare a driving-specific pedagogical learning environment for learner drivers.
  2.3 Organizing learning. The driving instructor is able to organize lessons in such a way that activities run smoothly and without interruptions, ensuring a maximum amount of productive learning time.
  Stage 1: Theory of Lesson Preparation test (60 items): case-based concept application, reasoning and situational judgment items (all task domains of cluster 2).
  Stage 2 (all task domains of clusters 2, 3 and 4): (1) performance assessment lesson with a real learner driver; (2) self-reflection report on the internship; (3) reflective interview on the internship.

Cluster 3. Competent in instruction and coaching
  Task domains:
  3.1 Providing instruction. The driving instructor is able to provide instruction that is geared to the actual developmental level of the learner driver and enables the LD to progress towards self-regulated performance in increasingly complex tasks.
  3.2 Providing coaching. The driving instructor is able to monitor learner driver development and guide the LD towards self-regulation in solving driving tasks and driving-related tasks.
  Stage 1: Theory of Instruction and Coaching test (60 items): case-based concept application, reasoning and situational judgment items (all task domains of cluster 3).

Cluster 4. Competent in evaluation, reflection and revision
  Task domains:
  4.1 Assessing learner progress. The driving instructor is able to assess progress in driver competence by judging the level of performance himself and by using the expertise of professional colleagues.
  4.2 Reflection and revision. The driving instructor is able to reflect on his own actions and to use the results of this reflection to adapt his approach.
Table 1. Number of test versions, total number of items and sample size chosen

| Test | Test versions selected | Minimum responses per version | Maximum responses per version | Total number of items | Sample size |
|---|---|---|---|---|---|
| Theory of Driving | 14 | 99 | 484 | 211 | 3013 |
| Theory of Lesson Preparation | 15 | 38 | 586 | 201 | 2524 |
| Theory of Instruction and Coaching | 15 | 32 | 551 | 148 | 2771 |

Table 2. Cut-off scores and cut-off reliability for the three theoretical tests

| Test | Cut-off score: Mean | SD | Min | Max | Average SE at the cut-off score | Average proportion of true variance at the cut-off score | MAcc |
|---|---|---|---|---|---|---|---|
| Theory of Driving | 86.5 | 3.22 | 81.9 | 92.9 | 10.01 | .55 | .70 |
| Theory of Lesson Preparation | 85.3 | 5.47 | 77.3 | 93.1 | 9.56 | .59 | .75 |
| Theory of Instruction and Coaching | 90.2 | 2.52 | 86.93 | 95.5 | 7.74 | .73 | .83 |
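To give a feel for what the average standard error (SE) at the cut-off score in Table 2 implies, the sketch below converts it into pass probabilities under a simple normal measurement-error assumption. This is a back-of-the-envelope illustration only, not the OPLM-based procedure used in the study; the function name is ours, and the plugged-in numbers (cut-off 86.5, average SE 10.01) are the Theory of Driving values from Table 2.

```python
import math

def pass_probability(true_score: float, cut_off: float, se: float) -> float:
    """P(observed score >= cut_off) for a candidate with the given true score,
    assuming normally distributed measurement error with standard deviation se."""
    z = (true_score - cut_off) / se
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

# Illustrative values from Table 2 (Theory of Driving): cut-off 86.5, average SE 10.01.
# A candidate whose true score sits exactly at the cut-off passes half the time;
# one full SE above or below shifts that to roughly 84% or 16%.
for true_score in (76.5, 86.5, 96.5):
    p = pass_probability(true_score, cut_off=86.5, se=10.01)
    print(f"true score {true_score:5.1f}: P(pass) = {p:.2f}")
```

A smaller SE at the cut-off, as for the Theory of Instruction and Coaching test (7.74), sharpens this curve and is consistent with the higher classification accuracy (MAcc) reported in the last column.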
Table 3. Psychometric report for the Final Performance Assessment Lesson

| (Sub)scale | N | Minimum | Maximum | Mean | SD | Alpha |
|---|---|---|---|---|---|---|
| Coaching: motivational support (6 items) | 580 | 9.0 | 18.0 | 14.3 | 2.2 | 0.77 |
| Coaching: diagnosis and task support (8 items) | 580 | 9.0 | 24.0 | 16.9 | 2.8 | 0.79 |
| Instructional skill (15 items) | 580 | 21.0 | 44.0 | 34.7 | 4.3 | 0.77 |
| Exam score (34 items) | 580 | 50.0 | 99.0 | 78.2 | 8.9 | 0.88 |
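The Alpha column in Table 3 reports Cronbach's alpha, the internal-consistency estimate for each (sub)scale. As a reference for how such a value is computed, here is a minimal sketch; the function name and the toy data are ours for illustration, not the study's assessment data.

```python
from typing import Sequence

def cronbach_alpha(scores: Sequence[Sequence[float]]) -> float:
    """Cronbach's alpha for a candidates-by-items matrix of item scores."""
    n_items = len(scores[0])

    def sample_var(xs: Sequence[float]) -> float:
        mean = sum(xs) / len(xs)
        return sum((x - mean) ** 2 for x in xs) / (len(xs) - 1)

    # Ratio of summed per-item variances to the variance of the total scores.
    item_vars = sum(sample_var([row[i] for row in scores]) for i in range(n_items))
    total_var = sample_var([sum(row) for row in scores])
    return n_items / (n_items - 1) * (1.0 - item_vars / total_var)

# Toy data: four candidates scored on three dichotomous items.
toy = [[1, 1, 1], [1, 1, 0], [0, 0, 0], [1, 0, 0]]
print(round(cronbach_alpha(toy), 2))  # 0.75
```

By the usual rule of thumb, the reported values of .77 to .88 indicate acceptable to good internal consistency for an observation-based performance assessment.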
Table 4. Correlations between performance on the theory tests and the Final Performance Assessment Lesson

| (Sub)scale of the Final Performance Assessment Lesson | Theory of Driving | Theory of Lesson Preparation | Theory of Instruction and Coaching |
|---|---|---|---|
| Coaching: motivational support (6 items) | .01 | .06 | .13* |
| Coaching: diagnosis and support of task process (8 items) | .07 | .12* | .09 |
| Instructional skill (15 items) | .10 | .12* | .11* |
| Exam score (34 items) | .07 | .12* | .14** |
Figure 1. Model of competent task performance. [Figure: within a universe of task situations and task conditions, the model links 1. Basis (knowledge, skills, attitudes, moods, emotions, personality traits), 2. Perception & decision making, 3. Action execution, and 4. Consequences.]

Figure 2. Cut-off scores and standard errors for 13 versions of the Theory of Driving Test.
Figure 3. Cut-off scores and standard errors for 15 versions of the Theory of Lesson Preparation Test.

Figure 4. Cut-off scores and standard errors for 15 versions of the Theory of Instruction and Coaching Test.

Competence based exam for prospective driving instructors

  • 1. 1 Development and Evaluation of a Competence Based Exam for Prospective Driving Instructors Erik Roelofs1 , Maria Bolsinova1 , Marieke van Onna1 , Jan Vissers2 1 Cito, National Institute for Educational Measurement, The Netherlands 2 Royal Haskoning DHV, Amersfoort, The Netherlands Introduction A growing consensus among driver training and road safety researchers is that driver training should place greater emphasis on higher-order, cognitive and motivational functions underlying driving behaviour (Mayhew and Simpson, 2002; Hatakka et al., 2002). This changed conception of driver training has been laid down in the Goals for Driver Education matrix (Hatakka et al., 2002). Recent research seems to support this consensus (Isner, Starkey, & Shepard, 2011; Beanland, Goode, Salmon, & Lenné, 2013). Innovative training initiatives appear to counteract overconfidence and address motivational factors such as driving anger, sensation seeking, and boredom (e.g. Isner et al., 2009). Parallel to the doubts raised about the quality of driver training, the quality of driver instructor preparation programs is criticized. The MERIT review study (Bartl, Gregeresen, & Sanders, 2005) showed that huge variations existed in quality of driver instructor education throughout Europe. The content did not cover higher order skills. Most programs relied on teacher- focused approaches, which seem to fall short in developing higher order skills. In many European countries the quality of the education of driving instructors is regulated by means of the instructor exam. One may view this as a problem, but on the other side, this also offers opportunities for improvement. 
A valid and reliable exam that only allows proficient prospective instructors to enter the profession, may have a positive backwash effect on driver instructor education programs, as in other fields of education: teachers teach and students learn what will be tested (Crooks, 1988; Fredericksen & Collins, 1989; Madaus, 1988). In the Netherlands, the first steps in this direction have been made in the last ten years. As part of a new law on driving education in 2003, competence-based outcome standards for prospective driving instructors have been formulated (Nägele, Vissers, & Roelofs, 2006). The most far-reaching change underlying these standards has been the emphasis on performance in critical job-situations with real learner drivers. In addition, supporting knowledge was defined in terms of relevant concepts, principles and decision making skills to be applied in authentic instructional situations. Based on the standards, a two stage competence-based exam was designed and put into action in the fall of 2009. Since then, over 4000 prospective driving instructors (PDI’s) have gone through one or more tests. The question is whether the assessments have resulted in valid and fair decisions about PDI’s. Regarding the tenability of decisions, this paper focuses on the separate theoretical assessments, comprising stage 1. In addition, their predictive value for instructor performance as demonstrated at the final performance assessment lesson (stage 2) is studied.
  • 2. 2 The fairness question concentrates on the comparability of different versions of assessments. In the exam under study, items banks are used, from which different sets are drawn to compose exam versions to prevent effects of item exposure and cheating. The question then arises whether one cut-off score implies the same level of required proficiency for different versions. To solve this problem, psychometrical equating methods are common to determine how scores on two different tests can be projected on one (latent) scale (Kolen, & Brennan, 1995). In summary, four research questions are addressed: 1. To what extent are the individual parts of the exam psychometrically reliable? 2. To what extent do the different theoretical tests inter correlate? 3. To what extent do results on theoretical tests and performance assessment for instructional ability correlate? 4. Do the used cut-off score across different versions of the theoretical tests reflect equivalent required levels of proficiency? Design Features of the Competence Based Exam The exam consists of two stages. The first stage comprises the assessment of the theoretical knowledge base of prospective instructors regarding driving and driving pedagogy. After having passed the first stage PDI’s receive a provisional instructor license enabling them to enrol in a half year internship at a (certified) professional driving school. In the second stage, after having finished their internships, PDI’s are judged on their professional instructional abilities, during a masterpiece lesson involving one of their own learner drivers, whom they have been teaching as an intern. If they pass, the will get a full license for the next five years. Below the design features are described. A summary of all exam parts is provided in Table 1. The Design Process All individual assessments of the exam were designed by means of the evidence-centred design model (ECD, Almond, Steinberg, & Mislevy,2002; Mislevy, & Haertel, 2006). 
The ECD model identifies five layers in the design process: domain analysis, domain modelling, the conceptual assessment framework, assessment implementation, and assessment delivery. These design layers were worked through successively, in a continuous dialogue between the assessment designers and different stakeholders: a board of instructor educators, exam institutes, ICT specialists, psychometricians, educational scholars, academic teacher educators, driving examiners, and driving instructors.

The Conceptual Assessment Framework

Within the exam design, the layer of the conceptual assessment framework was of central importance. The conceptual assessment framework helps to sort out the relationships among attributes of a candidate's competence, observations that show competence, and situations that elicit relevant performance. The central models for task design are the Competence (or Student) Model, the Evidence Model, and the Task Model.

The driving instructor competence model. The Competence Model encompasses variables representing the aspects of instructor competence that are the targets of inference in the assessment, together with their inter-relationships. Starting from a literature search on what comprises
good teaching in general and, more specifically, driving instruction, a competence model was constructed. This resulted in the formulation of four domains of competence, summarized in Table 1: 1) conscious traffic participation as first and second driver, 2) lesson preparation, 3) instruction and coaching, and 4) evaluation, reflection and revision. A model of competent task performance formed the basis for two competence models: driving competence (Roelofs, Van Onna, & Vissers, 2010) and instructional competence (Roelofs & Sanders, 2007).

A basic tenet of the model (see Figure 1) is that instructor competence is reflected in the consequences of an instructor's actions. The most important consequences are students' learning activities during delivery of instruction or coaching, and safety, flow and comfort in traffic during driving. Starting from the consequences, the remaining elements of the model can be mapped backwards. First, the component 'actions' refers to professional activities, e.g. delivering instruction or providing coaching to students. Second, any instructor activity takes place within a specific context in which many decisions need to be made, either on a long-term basis (planning ahead) or immediately in an in-car situation. For instance, instructors have to plan their instruction and adapt it to differing circumstances (e.g. different learning paces, traffic situations). Third, when making decisions and performing activities, instructors have to draw on a professional knowledge base (e.g. pedagogical principles, psychology of driving, rules and regulations).

[Insert Figure 1]

Assessment task models and exam assembly. The Task Model describes the kinds of assessment tasks (items) that make up an assessment (test). They follow directly from the cognitive and interactive activities named in the competence model. Three types of assessment tasks were designed.

1) Case-based items (Schuwirth et al.
2000; Norman, Swanson, & Case, 1996). These items address knowledge of concepts and cause-effect rules through cases embedded in a rich driving instruction context. These questions were used for theoretical assessment in all four task domains. An example item is: "An instructor starts a lesson with an explanation of the first topic. He does not preview what the learner driver will be able to do at the end of the lesson. What is the most likely consequence of his approach?" To respond, the PDI chooses one of four options: A. the learner will learn less than desirable; B. the learner will not fully understand the explanation; C. the learner will have less time to practice new driving tasks; D. the learner cannot direct his attention to the essential parts of the lesson (correct).

2) Situational judgment items (Whetzel & McDaniel, 2009). These items address decision-making skills in a rich driving instruction context. An example regarding lesson planning is: "The learner driver is starting to learn manoeuvring in traffic situations with low traffic intensity, as illustrated in the picture. During the upcoming lesson you are going to instruct how to park backwards into a parking bay. Which of the parking bays [pictures shown with or without another vehicle parked alongside an empty bay] is the best choice for a learner driver at this stage of driver education?" To respond, the PDI chooses one out of four pictures. One option is optimal, one is suboptimal, and the remaining two are unacceptable.

3) Performance assessment assignments. To address the PDI's own driving competence and his ability to verbalize mental task processes (perception, anticipation, decision, action
execution), the PDI had to complete a 60-minute drive, on which his performance was judged by a trained assessor using a standardized scoring form. Five performance criteria were employed on this form: 1) safety, 2) aiding traffic flow, 3) driving socially considerately, 4) eco-friendly driving, and 5) vehicle control (Roelofs, Van Onna, & Vissers, 2010). At two intermediate stops the PDI was asked to verbalize the mental processes that went on while solving the preceding traffic situations. Finally, professional actions regarding instruction, coaching and evaluation were judged by having the PDI carry out a lesson with a real learner driver. The lesson performance was judged by trained assessors using a 34-item scoring form covering the quality of coaching and instruction.

Evidence model. The Evidence Model describes how to extract the key items of evidence from instructor behaviour, and the relationship of these observable variables to the competence model variables. All three theory tests consisted of 60 multiple-choice items. The Theory of Driving Test consisted of one-best-answer items, scored dichotomously (0, 1); the cut-off score for passing the test was 42 items correct. Both the Theory of Lesson Preparation Test and the Theory of Instruction and Coaching Test included situational judgment items, which were scored with partial credit: the best answer yielded 7 points, a suboptimal answer 3 points, and the distractors no points. The case-based knowledge items had one best answer; these items were scored 0 or 7, for incorrect and correct answers respectively. The cut-off score for passing these tests was 266 points (out of 420).

The items on the Performance Assessment Lesson were scored on a three-point scale, representing 'counterproductive performance', 'beginning productive performance' and 'optimal performance'. A detailed scoring guide was available for assessors.
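To make the scoring rule concrete, the partial-credit scheme and pass/fail decision described above can be sketched as follows. This is a minimal illustration under stated assumptions: the function names and the three-item answer key are hypothetical, not the exam institute's actual software.

```python
def score_item(chosen, best, suboptimal=None):
    """Return 7 for the best answer, 3 for the suboptimal one (if any), else 0."""
    if chosen == best:
        return 7
    if suboptimal is not None and chosen == suboptimal:
        return 3
    return 0

def score_test(responses, key, cutoff=266):
    """responses: chosen options per item; key: (best, suboptimal) per item.
    Returns (points, passed). The real tests have 60 items and 420 points."""
    points = sum(score_item(r, b, s) for r, (b, s) in zip(responses, key))
    return points, points >= cutoff

# Hypothetical three-item key (a real test has 60 items):
key = [("A", "B"), ("C", None), ("D", "A")]
print(score_test(["A", "C", "A"], key))  # (17, False): 7 + 7 + 3 points
```

A case-based item simply has no suboptimal option, so it contributes either 7 or 0, as in the second item above.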
Initial rater agreement scores using Gower's similarity index (Gower, 1971) showed acceptable levels of agreement: .67 for instruction and .75 for coaching. The cut-off score for passing the performance assessment was 71 points (out of 102), under the condition that on no more than one of the seven categories the results fell below the specified category cut-off (Mean=2.0).

Methods

Subjects and Data

All data from candidates who enrolled in the program between January 1st 2010 and October 1st 2012 were selected. Data from 2009 were discarded, because the computer-based assessment platform was not completely stable at that time. This resulted in assessment data for 4741 prospective driving instructors, of whom 79 per cent were male and 21 per cent female. The mean age was 34.9 years (SD=10.9). Of them, 3079 (74.4%) were born in the Netherlands; the remaining 25.6% originally came from 79 different countries, the majority being immigrants from Morocco (n=199), Suriname (n=190), Turkey (n=151), Afghanistan (n=112), Iraq (n=89), and Iran (n=46).

A total of 4644 PDIs completed at least one of the theory tests. Of them, 2977 passed all their theory tests, and of these, 1941 PDIs took part in the Performance Assessment of instruction and coaching. Of the remaining PDIs, about half (n=508) had not participated in the Performance Assessment Lesson within a year after their last successful theory test; the rest (n=528) did not finish their internship. In addition, 368 PDIs were granted dispensation to
participate in the performance assessment although they had failed one of the theory tests. In total, 2315 PDIs participated at least once in the Performance Assessment Lesson.

Analyses

IRT analyses were performed on the data of the three theory tests. Each test had been administered in many different versions, each based on a representative sample of 60 items drawn from an item bank. Available software does not allow IRT analysis on a very large number of versions of a test - there were more than 700 versions for each test - especially when some versions were taken by only one PDI. Therefore, for each of the theory tests only versions with a substantial number of participants were selected. Table 2 shows the number of test versions, the total number of items, and the sample size chosen for analysis.

[Insert Table 2]

For the tests which included items with partial scoring (0, 3, 7), a partial credit model was fitted first. This model had a very poor fit in both cases. Responses to all items in the Theory of Instruction and Coaching Test and the Theory of Lesson Preparation Test were therefore dichotomized, because this allows more flexible IRT models to be fitted. All responses awarded 7 points were recoded as 1 (correct response), and all others as 0 (incorrect response). The number of items correct correlated very highly with the original number of points (.986 for both tests). A new cut-off score in terms of the number of items correct was chosen such that the number of misclassifications was minimal: that is, the number of subjects scoring fewer than 266 points but passing under the new cut-off, plus the number of subjects scoring at least 266 points but failing under the new criterion. For both tests the new cut-off score was 35 items correct, which gave 4.65% misclassifications for the Theory of Instruction and Coaching Test and 3.95% for the Theory of Lesson Preparation Test.
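The relocation of the cut-off just described can be read as a small discrete minimisation problem. The sketch below illustrates it with made-up candidate data; the study's actual response records are not reproduced here.

```python
def best_number_correct_cutoff(points, n_correct, points_cutoff=266):
    """Search the number-correct cut-off whose pass/fail decisions disagree
    least with the original points-based decision (pass if points >= 266).
    points and n_correct are parallel per-candidate lists."""
    passed = [p >= points_cutoff for p in points]
    best_c, best_err = None, None
    for c in range(max(n_correct) + 2):  # try every possible cut-off
        errors = sum((nc >= c) != ok for nc, ok in zip(n_correct, passed))
        if best_err is None or errors < best_err:
            best_c, best_err = c, errors
    return best_c, best_err / len(points)  # cut-off, misclassification rate

# Five hypothetical candidates (points on the 0-420 scale, items correct of 60):
points = [270, 260, 300, 240, 266]
n_correct = [36, 34, 40, 30, 35]
print(best_number_correct_cutoff(points, n_correct))  # (35, 0.0)
```

With real data the two score orders are not perfectly consistent, so the minimum misclassification rate is positive, as the reported 4.65% and 3.95% show.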
A one parameter logistic model (OPLM; Verhelst & Glas, 1995; Verhelst, Glas, & Verstralen, 1995) was fitted to the data of each theory test. In the Theory of Driving Test, one item was excluded from the analysis because it was answered correctly by all PDIs. In the Theory of Instruction and Coaching Test, two items were excluded because they had strong negative correlations with the test score. The model had a reasonable fit for all three tests.

From the perspective of IRT, the concept of reliability differs from that of classical test theory: measurement precision depends on the position on the latent ability scale. The ability of a subject for whom all test items are too difficult is measured with a larger error than the ability of someone for whom the difficulty of the items is appropriate. In the case of the three theory tests, it is important to measure accurately at the ability level corresponding to the cut-off score, which represents the pass/fail boundary. The standard error of the ability estimate at the required number of items correct (SE) was therefore chosen as a measure of reliability. To interpret this value, it has to be compared with the standard deviation of the ability in the population (SD); as a summary measure we use the so-called proportion of true variance: (SD² − SE²)/SD². Additionally, as a measure of global accuracy of measurement, the MAcc index reported by the OPLM software was used, which is the expected reliability coefficient alpha as defined in classical test theory for a given population of PDIs.
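The proportion of true variance is a one-line computation. With the standard errors at the cut-off reported in the Results, and the population SD of 15 on the reporting scale, it reproduces the tabled proportions:

```python
def true_variance_proportion(sd, se):
    """Share of ability variance not attributable to measurement error
    at a given scale point: (SD^2 - SE^2) / SD^2."""
    return (sd ** 2 - se ** 2) / sd ** 2

# SEs at the cut-off for the three theory tests; population SD is 15
# on the reporting scale used in this study.
for test, se in [("Theory of Driving", 10.01),
                 ("Theory of Lesson Preparation", 9.56),
                 ("Theory of Instruction and Coaching", 7.74)]:
    print(test, round(true_variance_proportion(15, se), 2))
# prints .55, .59 and .73 respectively, matching the reported proportions
```

The measure is analogous to a classical reliability coefficient, but evaluated locally at the pass/fail boundary rather than averaged over the whole ability range.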
Using an IRT model makes it possible to locate all versions of a test on the same scale, and therefore to compare versions by their difficulty and by the level of ability needed to pass. For each test version, the ability level at the cut-off score was estimated. Unlike the raw test scores, these latent estimates from different test versions are directly comparable. All latent abilities were scaled such that the population distribution had a mean of 100 and a standard deviation of 15.

To analyse the relations between the three theory tests, a subsample of n=1980 PDIs who had taken all three tests was selected, and correlations between the three latent ability scores were computed. To determine the reliability of the final Performance Assessment Lesson, all available raw data (n=580) were obtained and processed into data files; the exam institute had kept only the final pass/fail outcome in its data files, but still had the score forms of 580 PDIs. Principal component analysis was carried out to reduce the data to a limited number of interpretable factors, yielding maximally reliable criterion variables. Correlations between the three latent abilities and the scores on the resulting scales for instruction and coaching were computed.

Results

Figure 2 shows a normal distribution of ability scores (mean=100; SD=15), in which the cut-off scores for the 14 most frequently administered versions of the Theory of Driving Test are plotted. The dots represent the cut-off scores for each test version expressed in terms of ability (theta); the lines represent the standard errors around the cut-off score. Two things can be noted. First, the cut-off scores for all test versions fall below the average ability in the total population. The mean cut-off score (M=86.5, see Table 3) is almost one standard deviation below the ability mean of the population, meaning that relatively low ability was needed to pass the test.
Second, there are small differences between the required ability levels for different test versions, but the variation of the cut-off levels across versions (SD=3.22) is small compared to the standard error.

[Insert Figure 2]

Similarly, Figure 3 shows the cut-off ability scores for the 15 most frequently administered versions of the Theory of Lesson Preparation Test. As the figure shows, the differences between the cut-off scores of versions are larger than for the Theory of Driving Test: versions 1 and 6 differ considerably, version 1 requiring an ability level of 78 where version 6 requires 92. The mean cut-off score (M=85.3) is again below the ability mean of the population.

[Insert Figure 3]

The results for the Theory of Instruction and Coaching Test show a similar pattern (see Figure 4): the required ability levels are below the mean ability (M=90.2, see Table 3), though less markedly so than for the other theory tests. The different test versions show different levels of required ability, but these differences are relatively small compared to the standard error.

[Insert Figure 4]
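The reporting metric used for these cut-offs follows from a simple linear rescaling of the estimated latent scale. A minimal sketch, assuming the latent population mean and SD have already been estimated from the IRT calibration:

```python
def to_reporting_scale(theta, pop_mean, pop_sd):
    """Map a latent ability estimate to the reporting scale on which the
    population distribution has mean 100 and SD 15."""
    return 100 + 15 * (theta - pop_mean) / pop_sd

# A cut-off one population SD below the latent mean lands at 85,
# close to the mean cut-offs observed for the theory tests:
print(to_reporting_scale(-1.0, 0.0, 1.0))  # 85.0
```

Because the transformation is the same for every version, differences between version cut-offs on this scale directly reflect differences in required latent ability.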
For the Theory of Driving Test, the standard error at the cut-off score amounts to 10.01; the corresponding values for the Theory of Lesson Preparation and Theory of Instruction and Coaching Tests amount to 9.56 and 7.74 respectively (see Table 3). All values fall within the region of good reliability. The MAcc coefficients reflect reasonable overall reliability for the whole tests, whereas the average proportion of true variance at the cut-off score reflects questionable reliability for the first two tests and reasonable reliability for the Theory of Instruction and Coaching Test.

[Insert Table 3]

The ability scores on the three theory tests show a moderate correlation of .56 between the Theory of Driving Test and the Theory of Lesson Preparation Test. Ability scores on the Theory of Driving Test correlate .43 with those on the Theory of Instruction and Coaching Test. Finally, ability scores on the Theory of Lesson Preparation Test correlate .29 with those on the Theory of Instruction and Coaching Test.

Table 4 shows a psychometric report for the Final Performance Assessment Lesson.

[Insert Table 4]

Principal component analysis resulted in three clearly interpretable factors, from which three fairly reliable scale scores could be composed (see Table 4), representing two aspects of coaching, Motivational Support (alpha=.77) and Diagnosis and Task Support (alpha=.79), and Instruction (alpha=.83). The Motivational Support scale correlated .46 (p<.001) with Diagnosis and Task Support and .45 (p<.001) with Instructional Skill; Diagnosis and Task Support correlated .66 (p<.001) with Instructional Skill.

[Insert Table 5]

Table 5 shows the correlations between the ability scores on the theory tests and the scores on the Performance Assessment Lesson, for the three subscales and the overall assessment score. The correlation coefficients for the Theory of Driving Test do not differ significantly from zero.
The ability scores for Lesson Preparation and Instruction/Coaching show six low but significant correlations with the subscales and the overall scale of the final Performance Assessment Lesson (between .12 and .14; p<.05).

Discussion

The central question in this study was whether the decisions made about prospective driving instructors on the basis of the theory tests of the innovated exam are valid and fair. First, it can be concluded that the overall reliability of the estimated ability scores on the theory tests is acceptable. The reliability around the cut-off scores, which matters most because that is where the pass/fail decisions are made, was also acceptable. The Final Performance Assessment Lesson likewise shows acceptable reliability in terms of the alpha coefficient of classical test theory.
Second, the IRT models showed an acceptable fit, suggesting that the tests represent separable one-dimensional abilities. The moderate correlation between the Theory of Driving Test and the Theory of Lesson Preparation Test shows that knowledge of the traffic task is an important predictor of knowledge and decision making regarding lesson preparation. To a lesser extent, high (or poor) achievement on lesson preparation is related to similar performance on instruction and coaching.

Third, the predictive value of theory test performance for in-car instructional and coaching performance was very low: the ability scores for lesson preparation and for instruction and coaching show only very low, though significant, correlations with in-car performance on coaching and instruction. An explanation for this finding may be that only those who passed the stage-one theory exams were allowed to take the final assessment. In addition, the effect of half a year of internship may have washed out initial differences between PDIs.

Regarding fairness, the question was whether the different versions of the theory tests required the same level of proficiency to pass. A first finding was that the cut-off scores for the pass/fail decisions on the theory tests were well below the average ability level of the population, implying relatively low ability requirements. The Theory of Instruction and Coaching Test and the Theory of Driving Test had comparable cut-off scores across versions and were hence equivalent in their ability requirements. For Lesson Preparation there were larger differences in required ability across test versions, although these differences fell within the range of the standard errors.

As far as the construction and delivery process is concerned, some problems need further attention. Many of these are related to the way the assessments are delivered.
The exam is computer-based, administered at an exam office, and draws on item banks from which different versions are composed. To be representative, every version needs to reflect all the distinguished sub-domains (at least 9 for each test), mental activities (perception, decision making, action, cause-effect reasoning, concept recognition), and critical situations (learner characteristics, stage of acquisition, traffic situation). The relatively small size of the item bank resulted in frequent reuse of items, which may have led to overexposure and, in turn, to a lowering of item difficulties. In addition, we observed that some items were of poor quality, i.e. had low or strongly negative item-test correlations or extreme p-values (near zero or one). In current examination practice, poor items were not excluded a posteriori from the tests, because shorter test versions would not have been accepted by stakeholders. However, it would have been defensible to estimate ability levels on a smaller 'cleaned' subset of items, yielding a more reliable and still representative score.

An optimal approach to warrant acceptable item quality is to pre-test all items on a representative sample of target candidates before putting them into the item banks. This, however, is problematic because of the risk of early item exposure; in addition, exam costs would rise. In general, though, it can be recommended to use exam data to redesign and improve the exam on the fly. Following the evidence-centred design model of Mislevy and colleagues, many questions can be answered on the fly: does the competence model, as reflected in the IRT model, fit? Are any changes needed? Do the cut-off scores represent what we want PDIs to know and be able to do? Can certain item characteristics be traced back to the way the item
was designed? In short, using the evidence-centred design model in combination with on-the-fly improvements can improve our decisions about prospective driving instructors.

In follow-up research, we intend to take a closer look at other parts of the exam, the functioning of different item types, the way items are presented, the stimuli used in the items, the responses that are asked for, and the way these relate to estimates of PDIs' abilities. To evaluate the long-term effects of the innovated exam on instructional practice, learner driver gains, and crash involvement, longitudinal research will be necessary. Such a study should take into account the quality of all subsequent educational interventions and related driver activities, to determine whether there is a case for driver training (Beanland et al., 2013).

References

Almond, R. G., Steinberg, L. S., & Mislevy, R. J. (2002). Enhancing the design and delivery of assessment systems: A four-process architecture. Journal of Technology, Learning, and Assessment, 1(5). http://www.bc.edu/research/intasc/jtla/journal/v1n5.shtml.

Bartl, G., Gregersen, N.P., & Sanders, N. (2005). EU MERIT Project: Minimum Requirements for Driving Instructor Training. Final Report. Vienna: Institut Gute Fahrt.

Beanland, V., Goode, N., Salmon, P.M., & Lenné, M.G. (2013). Is there a case for driver training? A review of the efficacy of pre- and post-licence driver training. Safety Science, 51, 127-137.

Crooks, T. J. (1988). The impact of classroom evaluation practices on students. Review of Educational Research, 58, 438-481.

Fredericksen, J. R., & Collins, A. (1989). A systems approach to educational testing. Educational Researcher, 18(9), 27-32.

Gower, J.C. (1971). A general coefficient of similarity and some of its properties. Biometrics, 27, 857-871.
Hatakka, M., Keskinen, E., Gregersen, N.P., Glad, A., & Hernetkoski, K. (2002). From control of the vehicle to personal self-control; broadening the perspectives of driver education. Transportation Research Part F: Traffic Psychology and Behaviour, 5(3), 201-215.

Isner, R.B., Starkey, N.J., & Sheppard, P. (2011). Effects of higher-order driving skill training on young, inexperienced drivers' on-road driving performance. Accident Analysis & Prevention, 43, 1818-1827.

Isner, R.B., Starkey, N.J., & Williamson, A.R. (2009). Video-based road commentary training improves hazard perception of young drivers in a dual task. Accident Analysis & Prevention, 41, 445-452.

Kolen, M.J., & Brennan, R.L. (1995). Test equating. New York: Springer.

Madaus, G. F. (1988). The influences of testing on the curriculum. In L.N. Tanner (Ed.), Critical issues in curriculum (pp. 83-121). Eighty-seventh Yearbook of the National Society for the Study of Education, Part I. Chicago: University of Chicago Press.

Mayhew, D.R., & Simpson, H.M. (2002). The safety value of driver education and training. Injury Prevention, 8, 3-8.

Mislevy, R. J., Steinberg, L. S., & Almond, R. G. (2003). On the structure of educational assessments. Measurement: Interdisciplinary Research and Perspectives, 1, 3-67.

Nägele, R., Vissers, J., & Roelofs, E.C. (2006). Herziening WRM: een model voor competentiegericht examineren. Eindrapport [Revision of the law on motor vehicle
education: a model for a competence-based exam]. Dossier: X4691.01.001; Registratienummer: MV-SE20060477. Amersfoort/Arnhem: DHV, Cito.

Norman, G., Swanson, D.B., & Case, S.M. (1996). Conceptual and methodological issues in studies comparing assessment formats. Teaching and Learning in Medicine, 8(4), 208-216.

Roelofs, E., & Vissers, J. (2008). Toetsspecificaties theoretische deeltoetsen WRM-examen [Specifications of the theoretical exams for driver instructor candidates]. Arnhem: Cito; Amersfoort: DHV groep.

Roelofs, E.C., Van Onna, M., & Vissers, J. (2010). Development of the Driver Performance Assessment: informing learner drivers of their driving progress. In L. Dorn (Ed.), Driver behaviour and training, volume IV (pp. 37-50). Hampshire: Ashgate Publishing Limited.

Roelofs, E., & Sanders, P. (2007). Towards a framework for assessing teacher competence. European Journal of Vocational Training, 40(1), 123-139.

Schuwirth, L.W.T., Verheggen, M.M., van der Vleuten, C.P.M., Boshuizen, H.P.A., & Dinant, G.J. (2000). Validation of short case-based testing using a cognitive psychological methodology. Medical Education, 35, 348-356.

Verhelst, N.D., & Glas, C.A.W. (1995). The one parameter logistic model. In G.H. Fischer & I.W. Molenaar (Eds.), Rasch models: Foundations, recent developments and applications (pp. 215-239). New York: Springer.

Verhelst, N.D., Glas, C.A.W., & Verstralen, H.H.F.M. (1995). OPLM: One Parameter Logistic Model. Computer program and manual. Arnhem: Cito.

Wang, M., Haertel, G., & Walberg, H. (1993). Toward a knowledge base for school learning. Review of Educational Research, 63, 249-294.

Whetzel, D.L., & McDaniel, M.A. (2009). Situational judgment tests: An overview of current research. Human Resource Management Review, 19, 188-202.
Table 1. Instructor competence profile and the assessments used in the Dutch prospective driving instructor exam

1. Competent in conscious traffic participation
1.1 Driving responsibly as a first driver. The driving instructor is able to drive a vehicle safely, smoothly, socially considerately, and in an eco-friendly way according to Dutch driving standards.
1.2 Verbalizing mental processes of driving. The driving instructor is able to verbalize the mental task processes that take place when carrying out driving tasks in different traffic situations.
Assessment, stage 1 (computer-based): Theory of Driving Test (60 items): knowledge of traffic participation rules, case-based and situational judgment items (task domain 1.1). Stage 2 (on the job): performance assessment drive as a first and a second driver (all task domains of cluster 1).

2. Competent in lesson preparation
2.1 Adaptive planning. The driving instructor is able to construct an educational program for the long term (curriculum) and for the short term (lesson design), adapted to the needs of the individual learner driver (LD).
2.2 Elaborating driving pedagogy. The driving instructor is able to prepare a driving-specific pedagogical learning environment for learner drivers.
2.3 Organizing learning. The driving instructor is able to organize lessons in such a way that activities run smoothly and without interruptions, ensuring a maximum amount of productive learning time.
Assessment, stage 1 (computer-based): Theory of Lesson Preparation Test (60 items): case-based concept application, reasoning and situational judgment items (all task domains of cluster 2). Stage 2 (on the job): 1. performance assessment lesson with a real learner driver, 2. self-reflection report on the internship, 3. reflective interview on the internship (all task domains of clusters 2, 3 and 4).

3.
Competent in instruction and coaching
3.1 Providing instruction. The driving instructor is able to provide instruction that is geared to the actual developmental level of the learner driver, enabling the LD to progress towards self-regulated performance in increasingly complex tasks.
3.2 Providing coaching. The driving instructor is able to monitor learner driver development and guide the LD towards self-regulation in solving driving tasks and driving-related tasks.
Assessment, stage 1 (computer-based): Theory of Instruction and Coaching Test (60 items): case-based concept application, reasoning and situational judgment items (all task domains of cluster 3).

4. Competent in evaluation, reflection and revision
4.1 Assessing learner progress. The driving instructor is able to assess progress in driver competence by judging the level of performance himself and by using the expertise of professional colleagues.
4.2 Reflection and revision. The driving instructor is able to reflect on his own actions and use the results of this reflection to adapt his approach.
Table 2. Number of test versions, total number of items and sample size chosen for analysis

Test | Versions selected | Min responses per version | Max responses per version | Total items | Sample size
Theory of Driving | 14 | 99 | 484 | 211 | 3013
Theory of Lesson Preparation | 15 | 38 | 586 | 201 | 2524
Theory of Instruction and Coaching | 15 | 32 | 551 | 148 | 2771

Table 3. Cut-off scores and cut-off reliability for the three theory tests

Test | Cut-off mean | SD | Min | Max | Average SE at cut-off | Average proportion of true variance at cut-off | MAcc
Theory of Driving | 86.5 | 3.22 | 81.9 | 92.9 | 10.01 | .55 | .70
Theory of Lesson Preparation | 85.3 | 5.47 | 77.3 | 93.1 | 9.56 | .59 | .75
Theory of Instruction and Coaching | 90.2 | 2.52 | 86.93 | 95.5 | 7.74 | .73 | .83
Table 4. Psychometric report for the Final Performance Assessment Lesson

Scale | N | Min | Max | Mean | SD | Alpha
Coaching: Motivational Support (6 items) | 580 | 9.0 | 18.0 | 14.3 | 2.2 | .77
Coaching: Diagnosis and Task Support (8 items) | 580 | 9.0 | 24.0 | 16.9 | 2.8 | .79
Instructional Skill (15 items) | 580 | 21.0 | 44.0 | 34.7 | 4.3 | .77
Exam score (34 items) | 580 | 50.0 | 99.0 | 78.2 | 8.9 | .88

Table 5. Correlations between performance on the theory tests and the Performance Assessment Lesson

(Sub)scale of the Final Performance Assessment Lesson | Theory of Driving | Theory of Lesson Preparation | Theory of Instruction and Coaching
Coaching: Motivational Support (6 items) | .01 | .06 | .13*
Coaching: Diagnosis and Task Support (8 items) | .07 | .12* | .09
Instructional Skill (15 items) | .10 | .12* | .11*
Exam score (34 items) | .07 | .12* | .14**
* p < .05
Figure 1. Model of competent task performance. [Diagram showing, within a universe of task situations: 1. basis (knowledge, skills, attitudes, moods, emotions, personality traits); 2. perception and decision making; 3. action execution and task conditions; 4. consequences.]

Figure 2. Cut-off scores and standard errors for 14 versions of the Theory of Driving Test.
Figure 3. Cut-off scores and standard errors for 15 versions of the Theory of Lesson Preparation Test.

Figure 4. Cut-off scores and standard errors for 15 versions of the Theory of Instruction and Coaching Test.