The methods we apply to measure the ability of subjects to discriminate two stimuli.
The Experiment for detection theory We analyze experiments that measure the ability to distinguish between stimuli. Characteristic: observers can be more or less accurate. Measurement: sensitivity measures High sensitivity better discrimination Low sensitivity poorer discrimination The Yes-No Experiment
Example 1 : Face Recognition Understanding Yes-no data
Summarizing the Data Hit: Correctly recognizing an Old item. Miss: Failing to recognize an Old item. False alarm: Mistakenly recognizing a New item as old. Correct rejection: Correctly responding “no” to New item.
Only 2 of 4 numbers in the table provide independent information about the participant’s performance: The hit rate : H=P(“yes”|S2) The False alarm rate : F=P(“yes”|S1) denoted as (false-alarm, hit) pair : (F,H)=(.8, .4)
Measuring Sensitivity The simplest possibility is to ignore one of our two response rates using H to measure performance clearly inadequate Two important characteristic: it can only be measured between two alternative stimuli and must therefore depend on both H and F. S1 and S2 trials should have equal importance.
Therefore… Sensitivity measure/ index/ statistic: A function of H and F that attempts to capture this ability of the observer.
Two Simple Solution We look for the measure that goes up when H goes up, down when F goes up, and assigns equal importance to these statistics: Simply subtracting F from H sensitivity=H-F The proportion of correct responses, p(c) p(c) = 0.5[H+(1-F)] = 0.5(H-F)+0.5 p*(c) = (hit+CR)/total trials...number of two trials are unequal
Which is better? Neither, because p(c) depends directly on H-F, and not on either H or F separately. The two measures that are monotonically related in this way are said to be equivalent measures of accuracy.
A Detection Theory Solution The most widely used sensitivity measure of detection theory : d’ (“dee-prime”) d’ = z(H) - z(F) Bears an obvious family resemblance with p(c). The Z transformation: A symmetry property : z(1-P) = -z(p) We often refer to normal-distribution models by the abbreviation SDT.
Calculate d’ with p(c) It is sometimes important to calculate d’ when only p(c) is known: Assumption the hit rate equals the correct rejection rate. As a result, H=1-F d’=2*z[p(c)] This calculation is not correct in general.
About d’… When observers cannot discriminate at all, H=F and d’=0. The largest possible finite value of d’ depends on the number of decimal places to which H and F are carried. When H=.99 and F=.01, d’=4.65; many experimenters consider this an effective ceiling. Perfect accuracy implies an infinite d’
To avoid infinite d’ value… Two adjustment: Convert proportions of 0 and 1 to 1/(2N) and 1-1/(2N). To add 0.5 to all data cells regardless of whether zeroes are present. Most experiment avoid chance and perfect performance.
Implied Rocs ROC Space and Isosensitivity Curves
ROC space and Isosensitivity Curves A good sensitivity measure should be invariant when factors other than sensitivity change. Detection theory: assumed to have a fixed sensitivityeven under different response bias. When differ in response bias, the participants can produce the various performance pairs that leads to the fixed d’.
Isosensitivity curve Isosensitivity curve: the locus of (false-alarm, hit) pairs yielding a constant d’. All points on the curve have the same sensitivity. We show it in the unit square called ROC square.
ROC space ROC: Receiver/Relative Operating Characteristic ROC space: the region in which ROCs must lie, is the unit square.
ROC curve Different curves represent different d’. When performance is at chance (d’=0), the ROC is the major diagnosal the chance line. The curves shift toward the upper left corner , where accuracy is perfect, as d’ increase. These ROC curves summarize the predictions of detection theory.
The characteristics of isosensitivity curves The price of complete success in recognizing one stimulus class is complete failure in recognizing the other. The slope of these curves decreases as the tendency to respond “yes” increases.
ROCs in Transformed Coordinates To find an algebraic expression for the ROC, we would need to solve this equation for H as a function of F: The Transformed ROC, a zROC: z(H) = z(F)+d’ The range of values in these new units is from minus to plus infinity.
Shape: a straight line with unit slope The linearity of zROCs can be used to predict how much the FA rate will raise if the hit rate increases. d’ : the intercept of the straight-line ROC
ROC Implied by p(c) Use the equation: H = F+[2p(c)-1] a straight line of unit slope with intercepts equal 2p(c)-1. Predict how much the FA rate will go up if the hit rate increases. different from detection theory.
Which Implied ROCs Are Correct? The detection theory curves do a much better job than those for p(c). The unit slope of zROC curves are not always observed experimentally. d’ might change along response bias. The unit slope property reflects the equal importance of S1 and S2 trials to the corresponding sensitivity measure.
When ROCs do have unit slope, they are symmetrical around the minor diagonal. d’(1-H, 1-F) = d’(F, H)
Sensitivity as Perceptual Distance d’ has the mathematical properties of distance measures: Distance between an object and itself is 0 All distances are positive (positivity) Distance between object x and y is the same as between y and x (symmetry) d’(x,w) ≦d’(x,y) + d’(y,w) ratio scaling properties unboundedness
The triangle inequality can be replaced by: d’(x,w)n = d’(x,y)n + d’(y,w)n When n=2, this is the Euclidean distance formula. When n=1, this is the “city-block” metric
The signal detection model Underlying Distributions and the Decision Space
Underlying Distributions and the Decision Space Detection theory assumes that a participant in our memory experiment is judging its familiarity. Repeated presentation generate a distribution of values instead of the same result all the time.
The two distributions together comprise the decision space – the internal or underlying problem facing the observer. The participant can assess the familiarity value of the stimulus, but does not know which distribution led to that value. What is the best strategy foe deciding on a response?
Response Selection in the Decision Space The optimal rule is to establish a criterion that divides the familiarity dimension into two parts.
The decision space provides an interpretation of how ROCs are produced. The participant can change the proportion of “yes” responses, and generate different points on an ROC, by moving the criterion. The distribution most often used satisfy this requirement by assuming that any value on the decision axis can arise from either distribution.
Sensitivity in the Decision Space The mean difference between two distributions is a measure of sensitivity. This distance is in fact identical to d’ Distance along a line can be measured from any zero point so we measure mean distances relative to the criterion k. The mean difference= (M2-k)-(M1-k)
Underlying Distribution and Transformations The cumulative distribution function When M-k=0, the yes rate is 50%; large positive differences correspond to high “yes” rates and large negative differences to low ones.
Cumulative distribution function d’: the distance between these abscissa points, z(H)-z(F). It transforms a proportion into distance.
Thus, we conclude… The definition of the detection theory: a theory relating choice behavior to a psychological decision space. An observer’s choices are determined by sensitivities and response biases.
You can find d’ using the “inverse normal” functions of speadsheet program. A more complex program is d’ plus.
Essay: The Provenance of Detection Theory The term “psychophysics was invented by Gustav Fechner. He first take a mathematical approach to relating the internal and external worlds on the basis of experimental data. Fechner developed experimental methods for estimating the difference threshold, or jnd.
Attempts to measure jnds led to two complications: The threshold appeared not to be a fixed quantity. Different methods produced different values for the jnd.
Redefinition to survive the first problem Retained the literal notion of a sensory threshold, building mechanical and mathematical models to explain the gradual nature of observed function. Substituted a continuum of experience for the discrete processes of the threshold.
Statistical decision theory Purpose: to solve the problem that stimulus environments are noisy does not tell an observer how best to cope with them. Statistical decision theory pointed out that information derived from noisy signals could lead to action only when evaluated against well-defined goals. Decisions should depend not only on the stimulus, but on the expected outcomes of action.
By separating the world of stimuli and their perturbations from that of the decision process, detection theory was able to offer measures of performance that were not specific to procedure and that were independent of motivation. Procedure and motivation could influence data, but affected only the decision process