33.
          Ai                         not Ai                         Sum
Bi        P(Ai and Bi) = 0,25        P(not Ai and Bi) = 0,125       P(Bi) = 0,375
not Bi    P(Ai and not Bi) = 0,25    P(not Ai and not Bi) = 0,375   P(not Bi) = 0,625
Sum       P(Ai) = 0,5                P(not Ai) = 0,5                1
P(Ai|Bi)
38. 33
P(A|B)
      A            not A
B     A and B      not A and B
      A                                     not A                                         Sum
B     [P(Ai|Bi)*…*P(Ak|Bk)]                 [P(not Ai|Bi)*…*P(not Ak|Bk)]                 P(B)
      * [P(Ai|not Bi)*…                     * [P(not Ai|not Bi)*…
         *P(A(n-k)|not B(n-k))]                *P(not A(n-k)|not B(n-k))]
Table 4
Good morning, ladies and gentlemen. I am Bruno Ronsivalle, R&D manager at ABI, the Italian Banking Association. Today we are going to talk about a calculation model for defining passing thresholds in evaluation. [CLIC]
The passing threshold almost always determines the outcome of an assessment test. [CLIC] This value is the limit that establishes whether a performance is good or bad. [CLIC]
But how can this value be identified? [CLIC] Which criteria can establish how many correct answers a candidate must give to demonstrate that the competencies have been attained? [CLIC]
Many trainers base their assessment strategies on traditional school and university criteria. [CLIC] But very often, instructional designers can hardly describe [CLIC] how they make their assessment choices, [CLIC] why they opt for a certain passing threshold, [CLIC] or what their theoretical frame of reference is. [CLIC]
Most of them adopt the “sufficiency” criterion: [CLIC] a student passes the test only if he or she correctly answers at least 60 out of 100 questions. [CLIC] The belief is that answering 60% of the test correctly is evidence that a student has the competence under consideration, [CLIC] as if it were a rule of nature!
Clearly, such a criterion is not based on a scientific argument, [CLIC] because there is no rule of nature that can determine a reliable “passing score” value. [CLIC]
We believe that nothing can be established “a priori” [CLIC] and that the “passing score criterion” is not only unfounded but, in some cases, false and meaningless. [CLIC]
In fact, every time I ask for a rational explanation of this assumption, I get the same answer: [CLIC]
“Just common sense”. Isn’t that simple?
To get a less elusive answer to this question, we are going to present an alternative, more rigorous method for defining test passing thresholds. [CLIC]
Let’s start by defining “passing threshold”. [CLIC] What does “identifying a passing threshold” mean? [CLIC] It means determining a balance point at which the probability that students have attained the competencies can be formally certified. [CLIC]
But several problems can affect the objectivity of the “passing score criterion”. [CLIC] For example, students could answer at random, the passing threshold could be inadequate, and so on. [CLIC] All these variables affect the validity of the test. So, how can we determine this balance point? [CLIC]
The method I propose is based on the calculation of conditional probabilities and rests on a general premise: [CLIC] it is not possible to define “a priori” a universal, always-valid passing threshold, because the significance of a test is conditioned by at least four factors: [CLIC]
the expected level of competence; [CLIC] the soundness of the test; [CLIC] the “luck” factor; [CLIC] the number of items. [CLIC]
Let’s start with the first factor: the expected level of competence. [CLIC]
What do we know about our students’ competencies before administering the test? [CLIC] What data are our expectations based on? [CLIC]
We can reformulate this variable in a more schematic way: given a competence “A”, [CLIC] that is, the knowledge and abilities into which this competence can be broken down (for example A1, A2, A3, …, An), [CLIC] P(A) expresses the marginal probability that a randomly chosen examinee taking the test has the competence A. [Be careful: we speak of “marginal probability” because we are referring to an “a priori” probability, defined before the test is administered through statistical analysis of the results of similar tests.] [CLIC]
The second factor is the test’s ability to effectively measure the competence “A”. [CLIC]
In other words, we have to verify [CLIC] whether an examinee who actually has the competence “A” [CLIC] is also able to correctly answer any question of a test “B”. [CLIC]
The value of the probability P(B|A) expresses a degree of certainty (from 0 to 1) about the “validity” of our measurement tool. [CLIC] To determine this value, we need to administer the test to a survey sample and conduct an item analysis to interpret the data. [CLIC]
A third element is the so-called “luck” factor. [CLIC]
Are we completely sure that students who answer the test correctly aren’t just “lucky”? [CLIC] This variable can be expressed by the conditional probability that non-competent students (not A) solve any item of the test by choosing the correct answer at random. [CLIC]
The value of the probability P(B|not A) depends on the item type and is directly tied to the number of wrong options, which makes the test more or less difficult to solve. [CLIC] For items with two options (for example, a true/false item) this probability equals 0,5; [CLIC] if we decide to use only multiple-choice items with four options (only one of which is correct), the probability is 0,25. [CLIC]
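As a minimal sketch of this "luck" factor, assuming the examinee picks among the options uniformly at random (the function name `guess_probability` is hypothetical, introduced only for illustration):

```python
from fractions import Fraction


def guess_probability(num_options: int, num_correct: int = 1) -> Fraction:
    """Probability of hitting a correct option with one uniformly random pick.

    Assumption: every option is equally likely to be chosen.
    """
    if num_options < 1 or not 1 <= num_correct <= num_options:
        raise ValueError("need 1 <= num_correct <= num_options")
    return Fraction(num_correct, num_options)


print(float(guess_probability(2)))  # true/false item: 0.5
print(float(guess_probability(4)))  # four options, one correct: 0.25
```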
There is a fourth relevant factor: the number of items. [CLIC]
The steps defined to observe the competence A [CLIC] correspond to specific knowledge and abilities related to that competence. [CLIC] This variable is essential in determining the accuracy and completeness of the measurement of competence A: [CLIC] the larger the number of items, the higher the probability of correctly measuring the coherence of the assessment system.
So, let’s imagine a test composed of 100 multiple-choice items, each with one correct option and three wrong ones. [CLIC] Each item relates to just one knowledge element or ability “Ai” that is part of the competence “A”. [CLIC] Let’s also imagine that, since we have never administered the test before, we know nothing about the expected level of competence [CLIC] or about the validity of the items. [CLIC]
On the basis of these premises: the marginal probability that students already have the knowledge “Ai” equals the maximum degree of uncertainty, [CLIC] hence P(Ai) = 0,5; the conditional probability that students who have the knowledge/ability (Ai) correctly answer the corresponding item (Bi) also equals the maximum degree of uncertainty, [CLIC] hence P(Bi|Ai) = 0,5; the conditional probability that students who do not have the knowledge/ability (Ai) give the correct answer to the corresponding item (Bi) at random equals the ratio between the number of attempts, 1, and the number of available options, 4, [CLIC] hence P(Bi|not Ai) = ¼ = 0,25. [CLIC]
At this point, let’s describe the calculation model for determining the passing threshold. [CLIC] First of all, this model is based on Bayes’ theorem on conditional probabilities. [CLIC] Following Bayes’ theorem, the balance value of the passing threshold must combine several conditions in order to effectively check the chances that a student has actually attained the required competencies. This balance point has to be related to the type of test and recalculated each time, depending on the factors that may perturb the assessment phase. [CLIC]
Now, let’s come back to our example. [CLIC] We have to define the minimum threshold for passing a test composed of 100 multiple-choice items and so identify the examinees who have the competence A. [CLIC] How do we do that? [CLIC]
The model we propose includes a four-step method for calculating and combining the different probabilistic conditions to obtain the test’s passing threshold. [CLIC]
In the first step we calculate the probability [CLIC] that a student who gives the correct answer (Bi) [CLIC] has the ability Ai. [CLIC]
To do so, we need to make all the possible combinations explicit. [CLIC] The table shows four combinations: [CLIC] Ai and Bi: the examinee gives the correct answer and has the competence. [CLIC] not Ai and Bi: the examinee gives the correct answer but doesn’t have the competence. [CLIC] Ai and not Bi: the examinee gives the wrong answer but has the competence. [CLIC] not Ai and not Bi: the examinee gives the wrong answer and doesn’t have the competence. [CLIC]
[CLIC] After that, we can determine the probability of each combination. [CLIC]
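The four combination probabilities can be sketched in Python as follows. This is only an illustration under the example's premises (P(Ai) = 0,5, P(Bi|Ai) = 0,5, P(Bi|not Ai) = 0,25); the helper name `joint_table` and the dictionary layout are hypothetical choices, and exact fractions are used to avoid rounding.

```python
from fractions import Fraction


def joint_table(p_a, p_b_given_a, p_b_given_not_a):
    """Joint probabilities of (has Ai or not) x (answers Bi correctly or not)."""
    p_not_a = 1 - p_a
    return {
        ("Ai", "Bi"): p_a * p_b_given_a,
        ("not Ai", "Bi"): p_not_a * p_b_given_not_a,
        ("Ai", "not Bi"): p_a * (1 - p_b_given_a),
        ("not Ai", "not Bi"): p_not_a * (1 - p_b_given_not_a),
    }


# Premises of the example: maximum uncertainty plus four-option items.
table = joint_table(Fraction(1, 2), Fraction(1, 2), Fraction(1, 4))
for (a, b), p in table.items():
    print(f"P({a} and {b}) = {float(p)}")

# The four cells cover every case, so they must add up to 1.
assert sum(table.values()) == 1
```

The printed values are 0.25, 0.125, 0.25 and 0.375, in the order the combinations are listed.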
[CLIC] Finally, we can calculate the probability that a student who gives the correct answer Bi has the ability Ai. [CLIC]
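A minimal sketch of this calculation with Bayes' theorem, using the example's premises. The function name `posterior_given_correct` is hypothetical, and the result of 2/3 follows from those premises rather than being quoted from the slides.

```python
from fractions import Fraction


def posterior_given_correct(p_a, p_b_given_a, p_b_given_not_a):
    """P(Ai|Bi) via Bayes' theorem: P(Ai and Bi) / P(Bi)."""
    p_a_and_b = p_a * p_b_given_a
    # Total probability of a correct answer, competent or not.
    p_b = p_a_and_b + (1 - p_a) * p_b_given_not_a
    return p_a_and_b / p_b


p = posterior_given_correct(Fraction(1, 2), Fraction(1, 2), Fraction(1, 4))
print(p)  # 2/3, i.e. 0,25 / 0,375
```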
In the second step, on the basis of the previous table, Bayes’ theorem helps us calculate the probability [CLIC] that, despite a wrong answer (not Bi), [CLIC] the examinee has the knowledge Ai. [CLIC]
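The same reasoning can be sketched for a wrong answer. Again this assumes the example's premises, and `posterior_given_wrong` is a hypothetical name. With those values the result is 2/5, that is 0,25 / 0,625.

```python
from fractions import Fraction


def posterior_given_wrong(p_a, p_b_given_a, p_b_given_not_a):
    """P(Ai|not Bi) via Bayes' theorem: P(Ai and not Bi) / P(not Bi)."""
    p_a_and_not_b = p_a * (1 - p_b_given_a)
    # Total probability of a wrong answer, competent or not.
    p_not_b = p_a_and_not_b + (1 - p_a) * (1 - p_b_given_not_a)
    return p_a_and_not_b / p_not_b


q = posterior_given_wrong(Fraction(1, 2), Fraction(1, 2), Fraction(1, 4))
print(q)  # 2/5
```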
Having defined the probability for a single item, the third step is to define the formula for calculating [CLIC] the probability that a student who gives the correct answers to the test B [CLIC] has the competence A. So, in this step, the analysis must be extended to the whole test. How? [CLIC]
[CLIC] Our goal is to define an algorithm, [CLIC] based on Bayes’ theorem, to calculate the probability P(A|B), [CLIC] where B is the compound event consisting of the k correct answers to the test items. [CLIC]
[CLIC] So, it will be necessary to: [CLIC] determine the possible combinations of the variable B with “A” and “not A”; [CLIC] determine the formula and define the probability of every single combination; [CLIC]
[CLIC] and, finally, determine the general algorithm to calculate P(A|B). In this specific case, the formula will be this. [CLIC] [CLIC] Here “n” is the number of items we decided to include in our test: in this case, 100 multiple-choice items. [CLIC]
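The formula shown on the slide is not reproduced in this script. As a hedged sketch, one possible reading of Table 4 is that the "A" cell is the product of the per-item values P(Ai|Bi) over the k correct answers and P(Ai|not Bi) over the n-k wrong ones, the "not A" cell is the product of their complements, and P(A|B) is the "A" cell divided by the sum of the two. That reading, the assumption that all items share the same per-item values, and the name `posterior_whole_test` are assumptions made for illustration.

```python
from fractions import Fraction


def posterior_whole_test(k, n, p_a_given_b, p_a_given_not_b):
    """P(A|B) for k correct answers out of n, under the assumed reading of Table 4."""
    if not 0 <= k <= n:
        raise ValueError("k must lie between 0 and n")
    # "A" column: k correct answers and n-k wrong answers, competence present.
    cell_a = p_a_given_b ** k * p_a_given_not_b ** (n - k)
    # "not A" column: same answers, competence absent.
    cell_not_a = (1 - p_a_given_b) ** k * (1 - p_a_given_not_b) ** (n - k)
    return cell_a / (cell_a + cell_not_a)


# Per-item values implied by the example's premises: 2/3 and 2/5.
for k in (0, 37, 50, 100):
    value = posterior_whole_test(k, 100, Fraction(2, 3), Fraction(2, 5))
    print(k, float(value))
```

With a single item (n = 1) the function falls back to the per-item values, 2/3 for a correct answer and 2/5 for a wrong one, which is a quick consistency check on the reading.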
In the last step we have to determine the threshold “k”. Once the formula for P(A|B) has been defined, we need to identify the value of “k” [CLIC] that is the minimum passing threshold. This value must satisfy some conditions: [CLIC] it must be lower than “n”, the total number of items (in this case 100); [CLIC] the passing threshold must guarantee that the examinee certainly has the competence A, which means that P(A|B) equals 1; [CLIC] finally, k must be the minimum number of correct answers, among the set of values, for which P(A|B) equals 1. The general formula will be this [CLIC]. [CLIC] But how can we identify the exact point where P(A|B) equals 1? [CLIC]
[CLIC] Thanks to the graph of the function f(k), [CLIC] we can geometrically identify the exact point where P(A|B) reaches the value 1. [CLIC] In this case, this point “k” lies in the interval between 68 and 70. [CLIC] That means every examinee must give at least 69 correct answers out of 100 in order to show for certain that he or she has the competence A. The problem has been solved! [CLIC]
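The search for k can also be sketched numerically instead of graphically. The sketch below rests on several assumptions that are not stated in the talk: P(A|B) is read from Table 4 as the "A" cell divided by the sum of the "A" and "not A" cells, every item shares the per-item values 2/3 and 2/5 implied by the example's premises, and "equals 1" is taken to mean "within a tolerance of 10^-15", roughly the resolution of double-precision arithmetic. Under this reading P(A|B) approaches 1 without ever reaching it exactly, so some tolerance is needed. The function names are hypothetical.

```python
from fractions import Fraction


def posterior_whole_test(k, n, p_a_given_b, p_a_given_not_b):
    """P(A|B) for k correct answers out of n (assumed reading of Table 4)."""
    cell_a = p_a_given_b ** k * p_a_given_not_b ** (n - k)
    cell_not_a = (1 - p_a_given_b) ** k * (1 - p_a_given_not_b) ** (n - k)
    return cell_a / (cell_a + cell_not_a)


def minimum_threshold(n, p_a_given_b, p_a_given_not_b, tolerance):
    """Smallest k whose posterior is within `tolerance` of 1, or None."""
    for k in range(n + 1):
        if 1 - posterior_whole_test(k, n, p_a_given_b, p_a_given_not_b) < tolerance:
            return k
    return None


# Assumed tolerance: 1e-15, kept exact by using fractions throughout.
k_min = minimum_threshold(100, Fraction(2, 3), Fraction(2, 5), Fraction(1, 10**15))
print(k_min)  # 69 under these assumptions
```

With this tolerance the sketch returns 69, in line with the interval read off the graph. A looser or stricter tolerance moves the threshold, so the value depends on how "equals 1" is operationalised.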
For further information, you can write to me at this e-mail address. Thank you very much for your attention!