Adding human expertise to the quantitative analysis of fingerprints            Busey and Chen


                                   PROGRAM NARRATIVE

                                        A. Research Question

    Machine learning algorithms take a number of approaches to the quantitative analysis of

fingerprints. These include identifying and matching minutiae (refs), matching patterns of local

orientation based on dynamic masks (refs), and neural network approaches that attempt to learn

the structure of fingerprints (refs). While these techniques provide good results in biometric

applications and serve a screening role in forensic cases, they are less useful when applied to

severely degraded fingerprints, which must be matched by human experts. Indeed, statistical

approaches and human experts have different strengths. Despite the enormous computational

power available today for use by computer analysis systems, the human visual system remains

unequaled in its flexibility and pattern recognition abilities. Three possible reasons for this

success come from the experts' knowledge of where the most important regions are located on a

particular set of prints, the ability to tune their visual systems to specific features, and the

integration of information across different features. In the present project, we propose to

integrate the knowledge of experts into the quantitative analysis of fingerprints to a degree not

achieved by other approaches. There is much that fingerprint examiners can add to machine

learning algorithms and, as we describe below, many ways in which statistical learning
algorithms can assist human experts. Thus the central research question of this proposal is: How

can the integration of information derived from experts improve the quantitative analysis of

fingerprints?


                                B. Research goals and objectives

    The goal of the present proposal is to integrate data from human experts with statistical

learning algorithms to improve the quantitative analysis of inked and latent prints. We introduce

a novel procedure developed by one investigator (Tom Busey) and use it to guide the input to

statistical learning algorithms developed and extended by our other investigator (Chen Yu). The

fundamental idea behind our approach is that the quantitative evaluation of the information


contained in latent and inked prints can be vastly improved by using elements of human

expertise to assist the statistical modeling, as well as to introduce a new dimension of time that is

not contained in the static latent print analysis. The main benefit, as we discuss in sections C.x.x,

is that the format of the data extracted from experts allows the application of novel quantitative

models that are adapted from related areas. To apply this knowledge derived from experts, we

will use our backgrounds in vision, perception, machine learning and behavioral testing to design

experiments that extract relevant information from experts and use this to improve the

quantitative analysis techniques applied to fingerprints by integrating the two sources of

information.

   Our research interests differ somewhat from existing approaches and reflect the

adaptations that are necessary to incorporate human expert knowledge. Existing statistical

algorithms developed to match fingerprints rely on several different classes of algorithms. Some

extract minutiae and other robust sources of information such as the number of ridges between

minutiae (refs). Others rely on the computation of local curvature of the ridges, and then partition

these into different classes (MASK refs). Virtually all approaches make reasoned and reasonable

guesses as to what the important sources of information might be, such as minutiae, local ridge

orientation or local ridge width (dgs paper). The present approach takes a more agnostic

approach to what might be the important sources of information in fingerprints, and we will
develop statistical models that take advantage of the data derived from experts. However, a

major goal of the grant is to demonstrate how expert knowledge can be applied to any extant

model, and to suggest how this might be accomplished. Thus we will spend substantial time

documenting our application of expert knowledge for our statistical models. In addition, we will

make all of our expert data available for other researchers and practitioners. It is likely that the

data will have implications for training, although this is not the focus of the present proposal.


                                C. Research design and methods

   At the heart of our approach is the idea that human expertise, properly represented, can improve

the quantitative analyses of fingerprints. In a later section we describe how we apply human


expert knowledge to various statistical analyses, but first we need to answer the question of

whether human experts can add something to the quantitative analyses of prints.

   The answer to this question can be broken down into two parts. First, do human visual

systems in general possess attributes not captured by current statistical approaches, and second,

do human experts have additional capacities not shared by novices, capacities that could further

inform statistical approaches? Below we briefly summarize what the visual science literature tells

us about how humans recognize patterns, and then describe our own work that has addressed the

differences between experts and novices. As we will show, human experts have much to add to

quantitative approaches.

   We should stress that while we will gather data from human experts to improve our

quantitative analyses of fingerprints, the goal of this grant is not to study human experts in order

to determine whether or how they differ from novices, nor are we interested in questions about

the reliability or accuracy of human experts. Instead, we will generalize our previous results that

demonstrate strong differences in the visual processing of fingerprints in experts, and apply this

expertise to our own statistical analyses. As a result, we will only gather data from human

experts (latent print examiners with at least 5 years of post-apprentice work in the field) under

the assumption that this will provide maximum improvement to our statistical methods. We can

demonstrate the effectiveness of this knowledge by simply re-running the statistical analyses
without the benefit of knowledge from experts. There are various metrics attached to each

analysis technique that demonstrate the superiority of expert-enhanced analyses, such as the

correct recognition/false recognition tradeoff graphs, or the dimensionality

reduction/reconstruction successes of data reduction techniques.

   We will also apply novel approaches adapted from the related domain of language analyses.

It might seem odd to apply techniques developed for linguistic analyses to a visual domain such

as pattern recognition, but the principles that underlie both domains are very similar. Both

involve large numbers of features that have complex statistical relations. In the case of language,

the features are often words, phonemes or other acoustical signals. Fingerprints are defined by a



complex but very regular dictionary of features that also share a complex and meaningful

correlational structure. One of us (Chen) is a highly-published expert in the field of machine

learning algorithms as applied to multimodal data, and several papers included as appendices

detail this expertise. His work on multimodal applications between visual and auditory domains

makes him well-suited to address the relation between human data and machine learning

algorithms. Both linguistic and visual information contain highly-structured data that consist of

regularities that are extracted by perceivers, and this is not unlike the temporal sequence that

experts go through when they perform a latent print examination, as we describe in a later

section. First, however, we address how we might document the principles of human expertise.


Can we use elements of the human visual system to improve our statistical analyses?

   The answer to this question is straightforward, in part because of the overwhelming evidence

that human-based recognition systems contain processes that are not captured by current

statistical approaches. One of us (Busey) has published many articles addressing different

aspects of human sensation, perception and cognition, and thus is well-suited to manage the

acquisition and application of human expertise to statistical approaches. Below we briefly

summarize the properties of the human visual system and in a later section we describe how we

plan to extract fundamental principles from this design in order to improve our statistical

analyses of fingerprints.
   An analysis of the human visual system by vision scientists demonstrates that the recognition

process proceeds via a hierarchical series of stages, each with important non-linearities (nature

ref), that produce areas that respond to objects of greater and greater complexity. This process

also provides increasing spatial independence, allowing brain areas to integrate over larger and

larger regions. This will become important for holistic or configural processing, as discussed in a

later section.

   A second benefit of this hierarchical approach is that objects achieve limited scale and

contrast invariance. Statistical approaches often deal with this through local contrast or

brightness normalization, but this is a separate process. Scale invariance is often achieved by


explicitly measuring the width of ridges (grayscale ref), again a separate process.

   A third strength of the human visual system is that it appears to have the ability to form new

feature templates through an analysis of the statistical information contained in the fingerprints.

This process, called unitization, will tend to improve feature detection in noisy environments as

is often found with latent prints.


Do forensic scientists have visual capabilities not shared by novices?

   The prior summary of the elements of the human visual system suggests that current

statistical approaches can be improved by adapting some of the principles underlying the human

visual system. There are, however, other processes that are specifically developed by latent print

examiners that may also be profitably applied to statistical models. Below we summarize the

results of two empirical studies that have recently been published in the highly respected journal

Vision Research (Busey & Vanderkolk, 2005). The results demonstrate not only that experts are

better than novices, but suggest the nature of the processes that produce this superior

performance.

   Visual expertise takes many forms. It could be different for different parts of the

identification process, and may not even be verbalizable by the expert since many elements of

perceptual expertise remain cognitively impenetrable (refs). A major focus of our research is to

capture elements of this expertise and use this as a training signal for our statistical learning
algorithms. What is novel to our approach is our ability to capture the expertise at a very deep

and rich level. In the next section we describe our prior work documenting the nature of the

processes that enable experts to perform at levels much superior to novices, and then in Section

C.2 we describe how we capture this expertise in a way that we can use it to improve our

statistical learning algorithms.


C.1. Documenting expertise in human latent print examiners

   Initially, experts tend to focus on the entire print, which leads to benefits that we have

previously identified as configural processing (Busey & Vanderkolk, 2005). Configural



processing takes several forms, but the basic idea behind this process is that instead of focusing

on individual features or minutiae, the observer instead integrates information over a large

region, to identify important relations such as relative locations of features or curvature of ridge

flow. Fingerprint examiners often talk about 'viewing the image in its totality', which is different

language for the same process.

   While configural processing reveals the overall structure of an image and selects important

regions for further inspection, the real work comes in comparing small regions in one print to

regions in the other. These regions may be selected on the basis of minutiae identified in the

print, or high-quality Level 3 detail. We know from related work on perceptual learning in the

visual system that one of the processes by which expertise develops is through the development

of new feature detectors. Experts spend a great deal of time viewing prints, and this has the

potential to result in profound changes in how their visual systems process fingerprints. (config

processing refs)

   One process by which experts could improve how they extract latent print information from

noisy prints is termed unitization, in which novel feature detectors are created through experience

(unitization refs). Fingerprints contain remarkable regularities, and the human visual system appears well suited to exploit these regularities by forming such detectors.


C.1.a. Do experts have information valuable to training networks or documenting the

quantitative nature of fingerprints?

   Fingerprint examiners have received almost no attention in the perceptual learning or

expertise literatures, and thus the PI began a series of studies in consultation with John

Vanderkolk, of the Indiana State Police Forensic Sciences Laboratory in Fort Wayne, Indiana.

Our first study addressed the nature of the expertise effects in a behavioral experiment, and then

we followed up evidence for configural processing with an electrophysiological study. The

discussion below describes the experiments in some detail, in part because extensions of this

work are proposed in Section D, and a complete description here illustrates the technical rigor

and converging methods of our approach.




C.1.b. Behavioral evidence for configural processing

   In our first experiment, we abstracted what we felt were the essential elements of the fingerprint examination process into an X-AB task that could be accomplished in relatively short order. This work is described in Busey and Vanderkolk (2005), but we briefly describe the methods here since they illustrate how our approach seeks to find a paradigm that is less time-consuming than fully realistic forensic examinations (which can take hours to days to complete) yet still maintains enough ecological validity to tap the expertise of the examiners.

   Figure 1. Sequence of events in a behavioral experiment with fingerprint experts and novices. Note that the study image has a different orientation and is slightly brighter to reduce reliance on low-level cues.

   Figure 1 shows the stimuli used in the

experiment as well as a timeline of one trial. We cropped out fingerprint fragments from inked

prints, grouped them into pairs, and briefly presented one of the two for 1 second. This was
followed by a mask for either 200 or 5200 ms, and then the expert or novice subject made a

forced-choice response indicating which of the two test prints they believed was shown at study.

We introduced orientation and brightness jitter at study, and the construction of the pairs was

done to reduce the reliance on idiosyncratic features such as lint or blotches.

   At test, we introduced two manipulations that we thought captured aspects of latent prints, as

shown in Figure 2. First, latent prints are often embedded in visual noise from the texture of the

surface, dust, and other sources. One expert, in describing how he approached latent prints,

stated that his job was to 'see through the noise.' To simulate at least elements of this noise, we

embedded half of our test prints in white visual noise. While this may have a spatial distribution



that differs from the noise typically encountered by experts, we hoped that it would tap whatever facilities experts may have developed to deal with noise.

   Figure 2. Four types of test trials: clear fragments, partially masked fragments, fragments presented in noise, and partially masked fragments presented in noise.

   The second manipulation was motivated by the observation that latent prints are rarely complete copies of their inked counterparts. They often appear patchy if made on an irregular surface, and sections may be partially masked out. To simulate this, we created partially-masked

fingerprint fragments as shown in the upper-right panel of Figure 2. Note that the partially-

masked print and its complement each contain exactly half of the information of the full print

and the full print can be recovered by summing the two partial prints pixel-by-pixel. We use this

property to test for configural effects as described in a later section.
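The pixel-wise complementarity of the two partial prints can be sketched in a few lines (a toy illustration with a synthetic image and a regular striped mask; the actual stimulus masks were patchy, and all names here are ours):

```python
import numpy as np

rng = np.random.default_rng(0)
full_print = rng.random((8, 8))      # synthetic grayscale "print"

# A binary mask selecting exactly half the pixels, and its complement.
mask = np.zeros((8, 8), dtype=bool)
mask[:, ::2] = True                  # alternating columns, for illustration only
partial_a = np.where(mask, full_print, 0.0)
partial_b = np.where(~mask, full_print, 0.0)

# Each partial print contains exactly half of the pixels of the full print,
# and summing the two partial prints pixel-by-pixel recovers the full print.
assert mask.sum() == full_print.size // 2
assert np.allclose(partial_a + partial_b, full_print)
```

Because the two halves carry equal amounts of information, any performance advantage for the full image beyond what the two halves predict can be attributed to integration across them.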

   Figure 3. Behavioral Experiment Data. Error bars represent one standard error of the mean (SEM). Panels plot percent correct for full and partial images, with and without added noise, for experts and novices at short and long delays.

   All three manipulations (delay between study and test, added noise, and partial masking) were fully crossed to create 8 conditions. The data are shown in Figure 3, which shows main effects for all three factors for novices. Somewhat surprising is the finding that while experts show effects of added noise and partial masking, they show no effect of delay, which suggests that they are able to re-code their visual information into a more durable store resistant to decay, or have better visual memories. Experts also show an interaction between added noise and



partial masking, but novices do not. This interaction seen with the experts may reflect very

strong performance for full images embedded in noise, and may arise from configural processes.

To test this in a scale-invariant manner, we developed a multinomial model which makes a

prediction for full-image performance given partial-image performance using principles similar

to probability summation. The complete results are found in Busey & Vanderkolk (2005), but to

summarize, when partial image performance is around 65%, the model predicts full image

performance to be about 75%, whereas observed performance is almost 90%, significantly above the probability

summation prediction. Thus it appears that when both halves of an image are present (as in the

full image) experts are much more efficient at extracting information from each half.
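The logic of this benchmark can be sketched with a simple high-threshold probability-summation calculation (our simplification for exposition, not the multinomial model actually reported in Busey & Vanderkolk, 2005):

```python
def probability_summation(partial_acc: float) -> float:
    """Predict full-image 2AFC accuracy from partial-image accuracy, assuming
    the two halves of the print are processed independently."""
    d = 2 * partial_acc - 1        # correct 2AFC accuracy for guessing
    d_full = 1 - (1 - d) ** 2      # succeed if either half yields the answer
    return (1 + d_full) / 2        # convert back to 2AFC accuracy

# Partial-image accuracy of 65% predicts roughly 75% for full images;
# experts' observed full-image accuracy of ~90% exceeds this benchmark.
print(round(probability_summation(0.65), 3))  # 0.755
```

Performance above this prediction is the signature of integration across the two halves, i.e. configural processing.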

   The results of this experiment lay the groundwork for a more complete investigation of

perceptual expertise in fingerprint examiners. From this work we have evidence that:

   1) Experts perform much better than novices overall, despite the fact that the testing

conditions were time-limited and somewhat different than those found in a traditional latent print

examination.

   2) Experts appear immune to longer delays between study and test images, suggesting better

information re-coding strategies and/or better visual memories.

   3) Experts may have adopted configural processing abilities over the course of their training

and practice. All observers have similar facilities for faces as a consequence of the ecological
importance of faces and our quotidian exposure as a result of social interactions. Experts may

have extended this ability to the domain of fingerprints, since configural processing is seen as

one mechanism underlying expertise (e.g. Gauthier & Tarr, 1997).


C.1.c. Electrophysiological evidence for configural processing

   To provide converging evidence that fingerprint experts process full fingerprints

configurally, we turned to an electrophysiological paradigm based on work from the face

recognition literature. This experiment is described more fully in Busey and Vanderkolk (2005),

which is included as an appendix. However, these results support the prior conclusions described

above, and demonstrate that the configural processing observed with fingerprint examiners is a


result of profound and qualitative changes that occur in the very earliest stages of their

perceptual processing of fingerprints.


C.2. Elements of human expertise that could improve quantitative analyses

   The two studies described above are important because they illustrate that configural

information is one process that could be adapted for use in the quantitative analyses of

fingerprints. Existing quantitative models of fingerprints incorporate some elements of the

expertise seen above, but many elements could be added that would improve the recognition

accuracy of existing programs. The two major approaches to fingerprint matching rely on local

features such as minutiae detection (refs), and more global approaches such as dynamic masks

applied to orientation computed at many locations on a grid overlaying the print (refs). Of these

two approaches, the dynamic mask approach comes closer to the idea of configural processing,

although it does not compute minutiae directly.

   Neither approach takes advantage of the temporal information that expresses elements of

expertise in the human matching process. Quantitative information such as fingerprint data, when

represented in pixel form, has a high-dimensional structure. The two techniques described

above reduce this dimensionality by either extracting salient points such as minutiae, or

computing orientation only at discrete locations. Both of these approaches throw out a great deal

of information that could otherwise be used to train a statistical model on the elemental features
that allow for matches. Part of the reason this is necessary is that the high-dimensional space is

difficult to work in: all prints are more or less equally similar without this dimensionality

reduction, and by reducing the dimensionality computations such as similarity become tractable.

The key, then, is to reduce the dimensionality while preserving the essential features that allow

for discrimination among prints. One technique that has been explored in language acquisition is

the concept of "starting small" (Elman ref). In this procedure, machine learning approaches such

as neural network analyses are given very coarse information at first, which helps the network

find an appropriate starting point. Gradually, more and more detailed information is added, which

allows the network to make finer and finer discriminations.
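A minimal sketch of such a coarse-to-fine schedule follows (the block-averaging "blur" and the loop are our own illustration; `factor` controls how coarse the presented view is, and the commented-out fit call stands in for whatever learner is being trained):

```python
import numpy as np

def coarsen(image: np.ndarray, factor: int) -> np.ndarray:
    """Return a coarse view of the image by block-averaging, then upsampling
    back to the original size so the learner always sees the same input shape."""
    h, w = image.shape
    blocks = image.reshape(h // factor, factor, w // factor, factor)
    coarse = blocks.mean(axis=(1, 3))
    return np.kron(coarse, np.ones((factor, factor)))

rng = np.random.default_rng(1)
print_img = rng.random((16, 16))     # synthetic "print"

# "Starting small": present coarse structure first, then add detail.
for factor in (8, 4, 2, 1):
    view = coarsen(print_img, factor)
    withheld = float(np.abs(view - print_img).mean())  # detail still hidden
    # model.partial_fit(view.ravel())  # hypothetical incremental learning step
```

The schedule presents the same print at progressively finer resolution, mirroring the coarse-then-fine order in which information would be fed to the network.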


   We discuss these ideas more fully in section X.Xx, but we mention them here to motivate the

empirical methods described next. Experts likely select which information they choose to

initially examine based on the need to organize their search processes. Thus they likely acquire

information that may not immediately lead to a definitive conclusion of confirmation or

rejection, but guides the later acquisition process. In the scene perception literature, this process

is known as 'gist acquisition' (refs), which suggests that the order in which a system (machine or

human) learns information matters. In the section below we describe how we acquire both spatial

and temporal information from experts, and then describe how this knowledge can be

incorporated into quantitative models.


C.3. Capturing the information acquisition process: The moving window paradigm

   To identify the nature of the information used by experts, and the order in which it is

gathered, we have begun to use a technique called a moving window procedure. In the sections

below we describe this procedure and how it can be extended to address the role of configural or

gist information in human experts.


C.3.a. The moving window paradigm

   The moving window paradigm is a software tool that simulates the relative acuity of the foveal

and peripheral visual systems. As we look around the world, there is a region of high acuity at
the location our eyes are currently pointing. Regions outside the foveal viewing cone are

represented less well. In the moving window paradigm we represent this state by slightly

blurring the image and reducing the contrast.

   http://cognitrn.psych.indiana.edu/busey/FingerprintExample/
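One frame of such a display might be rendered as follows (a dependency-free sketch: we reduce contrast rather than blur, since a true Gaussian blur would require an image-processing library; function and parameter names are ours, not those of the actual software):

```python
import numpy as np

def moving_window_frame(image: np.ndarray, cx: int, cy: int, radius: int,
                        degrade: float = 0.6) -> np.ndarray:
    """Degraded image with a clear circular aperture centered at (cx, cy).
    The aperture follows the mouse; everything outside it is pulled toward
    the mean gray level to mimic the reduced representation of peripheral
    vision."""
    h, w = image.shape
    mean_gray = image.mean()
    degraded = mean_gray + (1 - degrade) * (image - mean_gray)
    yy, xx = np.mgrid[0:h, 0:w]
    inside = (xx - cx) ** 2 + (yy - cy) ** 2 <= radius ** 2
    return np.where(inside, image, degraded)

# Each mouse-move event re-renders the frame at the new (cx, cy) while the
# (timestamp, cx, cy) triple is logged for later analysis.
frame = moving_window_frame(np.random.default_rng(2).random((64, 64)),
                            cx=32, cy=32, radius=10)
```

Logging the aperture position on every mouse movement is what yields the millisecond-resolution spatial record analyzed below.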








   Figure 4. The moving window paradigm allows the user to move the circle of interest around to different
   locations on the two prints. This circle provides high-quality information, and allows the expert the
   opportunity to demonstrate, in a procedure that is very similar to an actual latent print examination, which
   sections of the prints they believe are most informative. This procedure also records the order in which
   different sites are visited.


   Figure 4 shows several frames of the moving window program, captured at different points in

time. The two images have been degraded by a blurring operation that somewhat mimics the

reduced representation of peripheral vision. The exception is a clear circle that responds in real

time to the movement of the mouse. This dynamic display forces the user to move the clear

window to regions of the display that warrant special interest. The blurred portions provide some

context for where to move the window. By recording the position of the mouse each time it is



moved, we can reconstruct a complete record of the manner in which the user examined the

prints. This method has some drawbacks in that the eyes move faster than the mouse. However,

we find that with practice the experts report very few limitations with this procedure, and it has

the benefit of precise spatial localization. A major benefit of this procedure is that it can be done

over the web, reaching dozens of experts and producing a massive dataset. Many related

information theoretic approaches such as latent semantic analysis find that a large corpus of data

is necessary in order to reveal the underlying structure of the representation of information, and a

web-based approach provides sufficient data.

    The data produced by this paradigm are vast: x/y coordinates for the clear window at each

millisecond. We have begun to analyze this data using several different techniques. The first

analysis we designed creates a mask that is black for regions the observer never visited and clear

for areas visited most often. Figure 5 shows an example of this kind of analysis. Areas visited

less often are somewhat darkened. The left panels of Figure 5 show two masked images, which

show not only where the experts visited, but how long they spent inspecting each location. Thus

it represents a window into the regions the experts believed informative.

    The right panels give a slightly different view, where unvisited areas are represented in red.

This illustrates that experts actually spend most of their time in relatively small regions of the

prints.
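The mask construction can be sketched along the following lines; this is a minimal numpy sketch, and the image size, window radius, and darkening floor are illustrative assumptions rather than values from our software.

```python
import numpy as np

def visitation_mask(image, samples, radius=20, floor=0.15):
    """Darken an image according to dwell time from (x, y, ms) mouse samples.

    Unvisited regions are darkened toward `floor`; the regions visited
    longest show through at full contrast.
    """
    h, w = image.shape
    dwell = np.zeros((h, w))
    rows, cols = np.mgrid[0:h, 0:w]
    for x, y, ms in samples:                   # x = column, y = row
        inside = (cols - x) ** 2 + (rows - y) ** 2 <= radius ** 2
        dwell[inside] += ms                    # accumulate time spent here
    if dwell.max() > 0:
        dwell /= dwell.max()                   # normalize to [0, 1]
    alpha = floor + (1.0 - floor) * dwell      # never fully black
    return image * alpha

# Toy usage: a flat gray "print" with two fixated spots
img = np.full((100, 100), 0.5)
samples = [(25, 25, 300), (70, 60, 900)]       # (x, y, dwell in ms)
masked = visitation_mask(img, samples)
```

The same dwell map, thresholded at zero, yields the black-and-clear masks described above.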
    As a first pass, the images in Figure 5 reveal where the experts believe the task-relevant

information resides. However, lost in such a representation is the order in which these sites were

visited. In addition, this information is very specific to a particular set of prints. Ultimately we

will produce a more general representation that characterizes both the fundamental set of features

(often described as the basis set) that experts rely on, as well as how they process these features.

We have begun to explore an information-theoretic approach to this problem that seeks to find a

set of visual features that is common to a number of experts and fingerprint pairs. This approach

is related to many of the dimensionality reduction techniques that have been applied to natural

images (e.g. Olshausen & Field, 1996). Later projects will extend this approach to incorporate






 Figure 5. Examples of masked images revealing where experts choose to acquire information in order to make
 an identification. The black versions show only regions where the expert spent any time, and the mask is
 clearer for regions in which the expert spent more time. The right-hand images show the same information, but
 allow some of the uninspected information to show through. These images reveal that experts pay relatively
 little attention to much of the image and focus only on regions they deem relevant for the identification. We
 suggest that this element of expertise, learning to attend to relevant locations, is something that could benefit
 quantitative analyses of fingerprints.

elements of configural processing or context-specific models. In the present proposal we discuss

several different ways we plan to analyze what is a very rich dataset.

    Our experts report relatively little hindrance when using the mouse to move the window. The

latent and inked prints have their own window (only one is visible at any one time) and users

press a key to flip back and forth between the two prints. This flip is actually faster than an
eye movement and automatically serves as a landmark pointer for each print, making this

procedure almost as easy to use as free viewing of the two prints (which is often done under a

loupe with its own movement complexities). In addition, we also give users brief views of the

entire image to allow configural processes to work to establish the basic layout.


C.3.b. Measuring the role of configural processing in latent print examinations

    behavioral experiment- blurred vs. very low contrast- qualitative changes across experts?

    complete this section





C.3.c. Verification with eye movement recording

   complete this section


C.4. Extracting the fundamental features used when matching prints

   Because latent and inked prints are rarely direct copies of each other, an expert must extract

invariants from each image that survive the degradations due to noise, smearing, and other

transformations. Once these invariants are extracted, the possibility of a match can be assessed.

This is similar in principle to the type of categorical perception observed in speech recognition,

in which invariant speech categories are extracted from the voices of different talkers. This

suggests that there exists a set of fundamental building blocks, or basis functions, that experts

use to represent and even clean up degraded prints. The nature and existence of these features are

quite relevant for visual expertise, since in some sense these are the direct outcomes of any

perceptual system that tunes itself to the visual diet it experiences.

   We propose to perform data reduction techniques on the output of the moving window

paradigm. These techniques have successfully been applied to derive the statistics of natural

images (Hyvarinen & Hoyer, 2000). The results provided individual features that are localized in

space and resemble the response profiles of simple cells in primary visual cortex. Many of these

studies are performed on random sampling of images and visual sequences, but the moving

window application provides an opportunity to use these techniques to recover the dimensions of
only the inspected regions, and to compare the recovered dimensions from experts and

representations based on random window locations.

   The specifics of this technique are straightforward. For each position of the moving window,

we extract (say) a 12 x 12 patch of pixels. This is repeated at each location that was inspected by

the subject, with each patch weighted by the amount of time spent at that location. The moving

window experiment produces tens of thousands of patches of pixels, which are submitted to a data

reduction technique (independent component analysis, or ICA), which is similar to principal

components analysis, with the exception that the components are independent, not just







 Figure 6. ICA components from expert data.

uncorrelated. The linear decomposition generated by ICA has the property of sparseness, which

has been shown to be important for representational systems (Field, 1994; Olshausen & Field,

1996) and implies that a random variable (the basis function) is active only very rarely. In

practice, this sparse representation creates basis functions that are more localized in space than

those captured by PCA and are more representative of the receptive fields found in the early

areas of the visual system.
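A minimal sketch of this decomposition, assuming scikit-learn's FastICA and substituting a synthetic inspection record for the real moving-window data; the patch size matches the 12 x 12 example above, while the image, the visit list, and the dwell-weighting-by-repetition scheme are illustrative assumptions.

```python
import numpy as np
from sklearn.decomposition import FastICA

rng = np.random.default_rng(0)

# Synthetic stand-in for a moving-window record: a textured "print" plus
# a list of (row, col, dwell_ms) inspection points (all invented here).
print_img = rng.standard_normal((200, 200))
visits = [(int(rng.integers(6, 194)), int(rng.integers(6, 194)),
           int(rng.integers(50, 500))) for _ in range(400)]

PATCH = 12
half = PATCH // 2
patches = []
for r, c, ms in visits:
    patch = print_img[r - half:r + half, c - half:c + half].ravel()
    # crude dwell weighting: repeat a patch in proportion to time spent there
    patches.extend([patch] * max(1, ms // 100))
X = np.asarray(patches)
X = X - X.mean(axis=0)

# Each row of ica.components_ is one candidate basis function (12 x 12)
ica = FastICA(n_components=16, random_state=0, max_iter=500)
sources = ica.fit_transform(X)
components = ica.components_.reshape(16, PATCH, PATCH)
```

On real expert data, each reshaped component would be inspected for ridge endings, Y-branchings, and similar features, as in Figure 6.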

Huge corpora of samples are required to extract invariants from noisy images, and at present

we have only pilot data from several experts. However, the results of this preliminary analysis

can be found in Figure 6. This figure shows features discovered using the ICA algorithm (Hurri

& Hyvarinen, 2003; Hyvarinen, Hoyer & Hurri, 2003). Each image represents a basis function
that, when linearly combined with the others, will reproduce the windows examined by experts.

Inspection of Figure 6 reveals that features such as ridge endings, Y-branchings, and islands are

beginning to be represented. This analysis takes on greater value when applied to the entire database we

will gather, since it will combine across individual features to derive the invariant stimulus

features that provide the basis for fingerprint examinations done by human experts.

   The ICA analysis is very sensitive to spatial location, and while cells in V1 are likely also

highly position sensitive, the measured basis functions are properties of the entire visual stream,

not just the early stages. More recent advances in ICA techniques have addressed this issue in a

similar way to how the visual system has solved the problem. In addition to performing data





  Figure 7. ICA components from expert data, grouped by energy. This analysis allows the basis functions
  to have partial spatial independence, at a slight cost to image quality. This latter issue is less relevant for larger
  corpora, in which many similar features are combined by individual basis function groups.

reduction techniques to extract the fundamental basis sets, these extended ICA algorithms group

the recovered components based on their energy (squared outputs). This grouping has been shown to

produce classes of basis functions that are position invariant by virtue of the fact that they

include many different positions for each fundamental feature type. The examples shown in

Figure 7 were generated by this technique, which reduces the reliance on spatial location. This

groups the recovered features by class and accounts for the fact that rectangles have similar

properties to nearby rectangles. Note that the features in Figure 7 are less localized than those

typically found with ICA decompositions, which may be due to the large correlational structure

inherent in fingerprints, although this remains an open question addressed by this proposal.

   The development of ICA approaches is an ongoing field, and we anticipate that the results of

the proposed research will help extend these models as we develop our own extensions based on

the applications to fingerprint experts. There are several ways in which the recovered

components can be used to evaluate the choice of positions by experts (which ultimately

determine, along with the image, the basis functions). First, one can visually inspect the sets of

basis functions recovered from datasets produced by experts, and compare this with one

generated from random window locations.

   A second technique can be used to demonstrate that experts do indeed possess a feature set

that differs from a random set. The data from random windows and experts can be combined to



produce a common set of components (basis functions). ICA is a linear technique, and thus the

original data for both experts and random windows can be recovered through weighted sums of

the components, with some error if only some of the components are saved. If experts share a

common set of features that is estimated by ICA, then their data should be recovered with less

error than that of the random windows. This would demonstrate that an important component of

expertise is the ability to take a high-dimensional dataset (as produced by noisy images) and

reduce it down to fundamental features. From this perspective, visual expertise is data reduction.
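This comparison can be sketched as follows. The sketch uses a truncated linear basis obtained by SVD rather than ICA proper (the linear-reconstruction logic is the same), and the "expert" data are synthetic, with the low-dimensional structure built in by construction purely to illustrate the proposed test.

```python
import numpy as np

rng = np.random.default_rng(1)
dim, k = 144, 12                         # patch dimensionality; components kept

# Hypothetical data: expert patches concentrate near a low-dimensional
# feature set; random-window patches do not (built in here by construction).
feature_set = rng.standard_normal((k, dim))
expert = (rng.standard_normal((500, k)) @ feature_set
          + 0.1 * rng.standard_normal((500, dim)))
random_w = rng.standard_normal((500, dim))

# Fit one shared set of linear components on the pooled data, as proposed
pooled = np.vstack([expert, random_w])
mean = pooled.mean(axis=0)
_, _, Vt = np.linalg.svd(pooled - mean, full_matrices=False)
V = Vt[:k]                               # keep only the top-k components

def recon_error(X):
    """Mean squared error after reconstructing from the truncated basis."""
    Xc = X - mean
    return float(np.mean((Xc - (Xc @ V.T) @ V) ** 2))

err_expert = recon_error(expert)
err_random = recon_error(random_w)       # experts should come out lower
```

The prediction in the text corresponds to err_expert being reliably smaller than err_random when the components are estimated from real expert data.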

   These kinds of data reduction techniques serve a separate purpose. Many of the experiments

described in other sections of this proposal depend on specifying particular features. While initial

estimates of the relevant features can be made on the basis of discussions with fingerprint

experts, we anticipate that the results of the ICA analysis will help refine our view of what

constitutes an important feature within the context of fingerprint matching.

   The moving window procedure has the disadvantage of being a very localized procedure, due

to the nature of the small moving window. There is a fundamental tradeoff between the size of

the window and the spatial acuity of the procedure. If the window is made too large, we know

less about the regions from which the user is attempting to acquire information. To offset this,

we have provided the user the opportunity to view quick flashes of the full image, enough to

provide an overview of the prints, but not enough to allow matches of specific regions. We will
also conduct the studies using large and small windows to see whether the nature of the

recovered components changes with window size.


C.4. Starting Small: Guiding feature extraction with expert knowledge

   We need to ask whether this is compelling, and cut it if it is not.

   Feature extraction procedures attempt to take a high dimensional space and use the

redundancies in this space to derive a lower-dimensional representation that combines across the

redundancies to provide a basis set. This basis set can be thought of as the fundamental feature

set, and the development of this set can be thought of as one mechanism underlying human

expertise. The difficulty with these high-dimensional spaces is that algorithms that attempt to


uncover the feature set through iterative procedures like Independent Component Analysis or

neural networks may fall into local minima and fail to converge upon a global solution. One

solution that has been proposed in the human developmental literature is one of starting small

(Elman, 1993). In this technique, programmers initially restrict the inputs to statistical models to

provide general kinds of information rather than specific information that would lead to learning

of specific instances. As a network matures, more specific information is added, which allows

the network to avoid falling into local minima that represent non-learned states. While the exact

nature of these effects is still being worked out (Rohde & Plaut, 1999), recent work has

provided empirical support in the visual domain (Conway, Ellefson & Christiansen, ref). This

suggests that we might use the temporal component of the data from experts in the moving

window paradigm to help guide the training of our networks.

   As an expert views a print, they initially are likely to focus on broad, overall types of

information that give the need to finish if necessary


C.5. Automatic detection of regions of interest using expert knowledge

   In both fingerprint classification (e.g. Dass & Jain, 2004; Jain, Prabhakar & Hong, 1999;

Cappelli, Lumini, Maio & Maltoni, 1999) and fingerprint identification (e.g. Pankanti, Prabhakar

& Jain, 2002; Jain, Prabhakar & Pankanti, 2002) applications, there are two main components for

an automatic system: (1) feature extraction and (2) a matching algorithm to compare (or classify)
fingerprints based on feature representations. Feature extraction is the first step, converting

raw images into feature representations. The goal is to find robust and invariant features to deal

with various conditions in real-world applications, such as illumination, orientation and

occlusion. Given a whole fingerprint image, most fingerprint recognition systems utilize the

location and direction of minutiae as features for pattern matching. In our preliminary study of

human expert behaviors, we observe that human experts focus on just parts of images (regions of

interest – ROIs) as shown in Figure XX, suggesting that it is not necessary for a human expert to

check through all minutiae in a fingerprint. A small subset of minutiae seems to be sufficient for

the human expert to make a judgment. What regions are useful for matching among all the





   Figure X. An overview of automatic detection of regions of interest. The red regions in the fingerprints
   indicate where human experts focus in the pattern matching task.

minutiae in a fingerprint? Is it possible to build an automatic ROI detection system that can

achieve performance similar to a human expert? We attempt to answer these questions by

building a classification system based on the training data captured from human experts. Given a

new image, the detection system is able to automatically detect and label regions of interest for

the matching purpose. We want to note that we expect that most regions selected by our system

will be minutiae, but we also expect that the system will potentially discover structural

regularities from non-minutia regions that are overlooked in previous studies. Different from

previous studies of minutiae detection (e.g. Maio & Maltoni, 1997), our automatic detection
system will not simply detect minutiae in a fingerprint but will focus on detecting both a small set of

minutiae and other useful regions for the matching task. Considering the difficulties in

fingerprint recognition, building this automatic detection system is challenging. However, we are

confident that this proposed research will take the first steps toward success and make important

contributions. This confidence lies in two important factors that make our work different from

other studies: (1) we will record detailed behaviors of human experts (e.g. where they look in a

matching task) and recruit the knowledge extracted from human experts to build a pattern

recognition system; and (2) we will apply state-of-the-art machine learning techniques in this study

to efficiently encode both expert knowledge and regularities in fingerprint data. The combination




of these two factors will enable us to carry out this research plan.

   To build this kind of system, we need to develop a machine learning algorithm and estimate

the parameters based on the training data. Using the moving window paradigm (described in

C.3), we collect moment-to-moment information about where a human expert examines the prints

while performing a matching task. Hence, the expert's visual attention and behaviors (moving

the windows) can be utilized as labels of regions of interest – providing the teaching signals for a

machine learning algorithm. In the proposed research, we will build an automatic detection

system that captures the expert’s knowledge to guide the detection of useful regions in a

fingerprint for pattern matching.

   We will use the data collected from C.X. Each circular area examined by the expert is filtered

by a bank of Gabor filters. Specifically, the Gabor filters with three scales and five orientations

are applied to the segmented image. It is assumed that the local texture regions are spatially

homogeneous, and the mean and the standard deviation of the magnitude of the transform

coefficients are used to represent an object in a 48-dimensional feature vector. We reduce the

high-dimensional feature vectors to 10 dimensions by principal component analysis (PCA),

which represents the data in a lower-dimensional subspace by pruning away those dimensions

with the least variance. We also randomly sample areas to which the expert does not attend and

assign them a Non-ROI label, paired with the feature vectors extracted from those areas. In total,
the training data consist of two groups of labeled

features – ROI and Non-ROI.
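A sketch of this feature pipeline follows. The kernel parameters and window sizes are illustrative assumptions; note also that two statistics per filter with three scales and five orientations give 30 numbers, so the 48-dimensional figure above corresponds to a somewhat larger filter bank (e.g. four scales and six orientations).

```python
import numpy as np

def gabor_kernel(freq, theta, sigma=3.0, size=21):
    """Complex Gabor kernel: an oriented sinusoid under a Gaussian envelope."""
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1]
    rotated = x * np.cos(theta) + y * np.sin(theta)
    envelope = np.exp(-(x ** 2 + y ** 2) / (2.0 * sigma ** 2))
    return envelope * np.exp(2j * np.pi * freq * rotated)

def gabor_features(patch, freqs=(0.1, 0.2, 0.3), n_orient=5):
    """Mean and std of response magnitudes: 2 values per filter in the bank."""
    feats = []
    for f in freqs:
        for k in range(n_orient):
            kern = gabor_kernel(f, np.pi * k / n_orient)
            # frequency-domain (circular) convolution, kept at patch size
            resp = np.fft.ifft2(np.fft.fft2(patch) *
                                np.fft.fft2(kern, s=patch.shape))
            mag = np.abs(resp)
            feats += [mag.mean(), mag.std()]
    return np.asarray(feats)

def pca_reduce(X, n_components=10):
    """Project rows of X onto the top principal components."""
    Xc = X - X.mean(axis=0)
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:n_components].T

rng = np.random.default_rng(2)
windows = rng.standard_normal((50, 40, 40))   # stand-ins for examined areas
X = np.stack([gabor_features(w) for w in windows])
X10 = pca_reduce(X)                            # 30-dim -> 10-dim vectors
```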

   Next, we will build a binary classifier based on Support vector machines (SVMs). SVMs

have been successfully applied to many classification tasks (Vapnik, 1995; Burges, 1998). An

SVM learns a linear separating hyperplane by maximizing the margin between two parallel

planes on either side of it. The central idea is to nonlinearly map the input vector

into a high-dimensional feature space and then construct an optimal hyperplane for separating

the features. This decision hyperplane depends on only a subset of the training data called

support vectors.



    For a set of n-dimensional training examples X = {x_i}, i = 1, ..., m, labeled by the expert's

visual attention {y_i}, i = 1, ..., m, and a mapping of the data into q-dimensional vectors

φ(X) = {φ(x_i)}, i = 1, ..., m, by a kernel function, where q >> n, an SVM can be built on the

mapped training data by solving the following optimization problem:

      Minimize over (w, b, ξ_1, ..., ξ_m) the cost function:  (1/2) w^T w + C Σ_{i=1}^{m} ξ_i

      Subject to:  y_i (w^T φ(x_i) + b) ≥ 1 − ξ_i  and  ξ_i ≥ 0  for all i = 1, ..., m

where C is a user-specified constant controlling the penalty on the violation terms ξ_i. The ξ_i

are called slack variables; they measure the deviation of a data point from the ideal condition of

pattern separability. After training, w and b constitute the classifier:

                                      y = sign(w^T φ(x) + b)

    Compared with other approaches used in fingerprint recognition, such as neural networks and

k-nearest neighbors, SVMs have proven more effective in many classification tasks. In

addition, we first transform original features into a lower-dimensional space based on PCA. The

purpose of this first step is to deal with the curse of dimensionality. We then map the data points

into another higher-dimensional space so that they are linearly separable. By doing so, we

convert the original pattern recognition problem into a simpler one. This idea is quite in line with

kernel-based nonlinear PCA (Scholkopf, Smola & Muller, 1998), which has been successfully used
in several fields (e.g. Wu, Su & Carpuat 2004).

    Given a new testing fingerprint, we will shift a 40x40 window over the image and classify all

the patches at each location and scale. The system will first extract Gabor-based features from

local patches which will be the input to the detector. The detector will label all the regions as

either ROI or Non-ROI. We expect that most ROIs are minutiae. Different from the methods

based on minutiae matching, we also expect that only a small subset of minutiae is utilized by human

experts. Moreover, we expect the system to detect some areas that are not defined as minutiae

but human experts also pay attention to during the matching task. Thus, the ROI detector we

develop will go beyond the standard approach in fingerprint recognition (minutiae extraction and



matching). By efficiently encoding the knowledge of human experts, the proposed system will

have opportunities to discover the statistical regularities in fingerprints that have been

overlooked in previous studies.
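The detection stage might look as follows; this scikit-learn sketch substitutes trivial mean/contrast statistics for the Gabor-based features described above, and the training patches are synthetic stand-ins for the expert-labeled data.

```python
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(3)

def patch_features(patch):
    # trivial stand-in for the Gabor-based features described in the text
    return np.array([patch.mean(), patch.std()])

# Synthetic training data: "ROI" patches are higher-contrast than background
roi = [rng.standard_normal((40, 40)) * 2.0 for _ in range(60)]
non_roi = [rng.standard_normal((40, 40)) * 0.5 for _ in range(60)]
X = np.stack([patch_features(p) for p in roi + non_roi])
y = np.array([1] * 60 + [0] * 60)              # 1 = ROI, 0 = Non-ROI

clf = SVC(kernel="rbf").fit(X, y)

def detect_rois(image, step=20, size=40):
    """Shift a size x size window over the image; label each patch."""
    labels = {}
    for r in range(0, image.shape[0] - size + 1, step):
        for c in range(0, image.shape[1] - size + 1, step):
            feats = patch_features(image[r:r + size, c:c + size])
            labels[(r, c)] = int(clf.predict(feats.reshape(1, -1))[0])
    return labels

test_img = rng.standard_normal((120, 120)) * 0.5
test_img[40:80, 40:80] *= 4.0                  # one high-contrast region
labels = detect_rois(test_img)                 # {(row, col): 1 or 0}
```

In the proposed system, the same loop would run over the expert-derived feature vectors rather than these toy statistics.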


C.6. Using expert-identified correspondences to extract environmental models

   In our moving window paradigm, a human expert moves the window back and forth between

inked and latent fingerprints to perform pattern matching. We propose that the dynamic

behaviors of the expert provide additional signals indicating one-to-one correspondences

between two images. In light of this, our hypothesis is that an expert’s decision is based on the

comparison of these one-to-one patches. Therefore, we propose that these expert-identified

correspondences can serve as additional information to find the regularities in fingerprints and

build the automatic detection system.

   We propose to use this knowledge as a prior for the training data. We observe that not all the

focused regions in the latent print have the corresponding regions in the inked print. Thus, it is

more likely that those one-to-one pairs play a more important role in pattern matching than other

regions of interest. Based on this observation, we propose to maintain a set of weights over the

training data. More specifically, for each ROI in the latent image, we find the most likely pairing

patch in the inked image. Two constraints guide the search for the matching pair. The temporal

constraint is based on the expert's behaviors: the patch in the inked image that the expert

examines immediately after looking at an ROI in the latent image is more likely to be associated

with that ROI. The spatial constraint is to find the highest similarity between the patch in the

latent image and any patch in the inked image. In this way, each ROI in the

latent image can be assigned with a weight indicating the probability to map this region to a

region in the other image. With a set of weighted training data, we will apply a SVM-based

algorithm (briefly described in C.5) which will focus on the paired samples (with high weights)

in the training data. More specifically, we replace the constant C in the standard SVM with a set
of variables ci , each of which corresponds to the weight of a data point. Accordingly, the new



objective function is (1/2) w^T w + Σ_{i=1}^{m} c_i ξ_i. Thus, the matching regions receive larger

penalties if they are nonseparable points, while other regions receive less attention because they

are more likely to be irrelevant to the expert's decision. In this way, the parameters of the SVM

are tuned to favor the regions in which human experts are especially interested. By encoding this

knowledge in a machine learning algorithm, we expect this method to achieve better performance

by closely imitating the expert's decision.

  Figure X. An overview of automatic detection of regions of interest. The red regions in the fingerprints
  indicate where human experts focus in the pattern matching task.
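In scikit-learn, for example, per-sample costs of this kind can be supplied through the sample_weight argument of SVC.fit, which scales the penalty C for each training point; the features and weights below are invented for illustration.

```python
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(4)

# Invented ROI feature vectors; the label depends on the first feature only
X = rng.standard_normal((200, 10))
y = (X[:, 0] > 0).astype(int)

# Hypothetical pairing weights c_i: ROIs with a clear one-to-one partner
# in the other print count five times as much as unpaired ones
pair_weight = np.where(rng.random(200) < 0.3, 5.0, 1.0)

# sample_weight multiplies C per point, realizing the per-sample costs c_i
clf = SVC(kernel="rbf", C=1.0)
clf.fit(X, y, sample_weight=pair_weight)
```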


C.7. Dependencies between global and local information: The role of gist information

   Fingerprints are categorized into several classes, such as whorl, right loop, left loop, arch,

and tented arch in the Henry classification system (Henry, 1900). In the literature, researchers

use only 4-7 classes in an automatic classification system. This is because the task of

determining a fingerprint class can be difficult. For example, it is hard to find robust features

from raw images that can aid classification as well as exhibit low variations within each class. In

C.5 and C.6, we discuss how to use expert knowledge to find useful features for pattern
matching. Taking a broader view of feature detection and fingerprint classification in this

section, we find that we need to deal with a chicken-and-egg problem: (1) useful local features

can predict fingerprint classes; and (2) a specific fingerprint class can predict what kinds of local

regions are likely to occur in that type of fingerprint. In contrast, stand-alone feature detection

algorithms (e.g. in C.5 and C.6) usually look at local pieces of the image in isolation when

deciding whether the patch is a region of interest. In machine learning, Murphy, Torralba and

Freeman (2003) proposed a conditional random field for jointly solving the tasks of object

detection and scene classification. In light of this, we propose to use the whole image context as

an extra source of global information to guide the search for ROIs. In addition, a better set of



ROIs will also potentially make the classification of the whole fingerprint more accurate. Thus,

the chicken-and-egg problem is tackled by a bootstrapping procedure in which local and global

pattern recognition systems interact with and boost each other.

    We propose a machine learning system based on graphical models (Jordan 1999) as shown in

Figure XX. We define the gist of an image as a feature vector extracted from the whole image by
treating it as a single patch. The gist is denoted by vG . Then we introduce a latent variable T

describing the type of fingerprint. The central idea in our graphical model is that ROI presence is

conditionally independent given the type, and the type is determined by the gist of the image. Thus,

our approach encodes the contextual information on a per image basis instead of extracting

detailed correlations between different kinds of ROIs (e.g. a fixed prior such as that patch A always

occurs to the left of patch B) because of the complexity and variability of such detailed

descriptions. Next we need to classify fingerprint types. We will simply train a one-vs-all binary

SVM classifier for recognizing each fingerprint type based on the gist. We will then normalize
the results:  p(T = t | v_G) = p(T_t = 1 | v_G) / Σ_{t'} p(T_{t'} = 1 | v_G),  where p(T_t = 1 | v_G)

is the output of the t-th one-vs-all classifier.

   Once the fingerprint type is known, we can use this information to facilitate ROI

detection. As shown in the tree-structured graphical model in Figure XX, the conditional joint

density can be expressed as:

    p(T, R_1, ..., R_N | v) = (1/z) p(T | v_G) ∏_i p(R_i | T, v_i)

so that the marginal probability of each local class is p(R_i | v) = Σ_t p(T = t | v_G) p(R_i | T = t, v_i).





where vG and vi are the global and local features, respectively, and Ri is the class of a local

patch. In the proposed research, we will investigate two types of R. One classification defines

ROI and Non-ROI types, the same as in C.5 and C.6. The other classification defines several minutia

types (plus Non-ROI) such as termination minutia and bifurcation minutia. z is a normalizing

constant. Based on this graphical model, we will be able to use contextual knowledge to facilitate

the classification of a local image. We also plan to develop a more advanced model which will

use local information to facilitate the fingerprint type classification. We expect that this kind of

approach will lead to a more effective automatic system that can perform both top-down

inference (fingerprint types to minutia types) and bottom-up inference (minutia types to

fingerprint types).
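The top-down half of this computation reduces to a weighted sum, sketched below with invented numbers standing in for the gist-classifier outputs and the per-type local detectors.

```python
import numpy as np

# Invented outputs of five one-vs-all gist classifiers, p(T_t = 1 | v_G)
gist_scores = np.array([0.1, 0.05, 0.7, 0.1, 0.05])
p_type = gist_scores / gist_scores.sum()       # normalized p(T = t | v_G)

# Invented local detector outputs p(R_i = ROI | T = t, v_i):
# rows are three candidate patches, columns the five fingerprint types
p_roi_given_type = np.array([
    [0.9, 0.2, 0.8, 0.3, 0.1],
    [0.1, 0.1, 0.2, 0.1, 0.2],
    [0.5, 0.6, 0.9, 0.4, 0.3],
])

# Marginal ROI probability per patch: sum over types, weighted by the gist
p_roi = p_roi_given_type @ p_type              # p(R_i | v) for each patch
```

Because the gist here favors the third type, each patch's marginal is pulled toward its detector output under that type.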


C.8. Summary of quantitative approaches

   (Tom writes)

   General themes:

   Incorporate expert knowledge

   Links between global and local structure made possible by input from experts

   Specification of elemental basis or feature set

   Classifying informativeness of regions

   Defining an intermediate level between low-level feature extractors and high-level gist or
configural information


                          D. Implications for knowledge and practice

   The implications of the knowledge gained from these studies and analyses fall

into four broad categories, each of which is discussed below.


D.1. Implications for quantitative understanding of the information content of fingerprints





D.2. Implications for an understanding of the links between quantitative information content

and the latent print examination process




D.3. Implications for the classification and filtering of poor-quality latent prints




D.4. Implications for the development of software-based tools to assist human-based latent

print examinations and training




                        E. Management plan and organization




                        F. Dissemination plan for project deliverables

   scientific articles, presentations at machine learning conferences and fingerprint conferences,

proof-of-concept Java-based applets.



   (end of 30 pages)






G. Description of estimated costs

   Personnel

   The project will be co-directed by Thomas Busey and Chen Yu. We request 11 weeks of

summer support, during which time both will devote 100% of their efforts to the project.

Benefits are calculated at 19.81%. The salaries are incremented 3% per year.

   Many of the simulations will be conducted by a graduate student, who will be hired

specifically for the purposes of this project. This student, likely an advanced computer science

student with a background in cognitive science, requires a stipend, a fee remission and health

insurance. The health insurance is incremented at 5% per year.

   Subject coordination and database management will be handled by hourly students who

will work 20 hours/wk on the project. We will pay them $10/hr.

   Consultant

   John Vanderkolk, with whom Busey has worked for the past two years, has agreed to

serve as an unpaid consultant on this grant. He does require modest travel costs when he visits

Bloomington.

   Travel

   Money is requested to bring in four experts for testing using the eye-movement recording

equipment. These costs will total approximately $1500/yr.
   Money is requested for three conferences a year. These will enable the investigators to travel

to conferences such as Neural Information Processing Systems (NIPS) and forensic science conferences

such as the International Association for Identification (IAI) to interact with colleagues and share

the results of our analyses. These trips serve an important role in communicating the efforts of

this grant to a wider audience.

   Other Costs

   Equipment

   This research is very computer-intensive, and thus we require a large UNIX-based server to

run simulations in parallel. In addition, we require three PC-based workstations to run MATLAB and


other simulation programs. Finally, conferences such as IAI and local Society for Identification

meetings provide an ideal place to gather data from experts, and thus we require a portable

computer for such onsite data-gathering purposes. We anticipate that up to half of our data can

be collected using these on-site techniques, and this approach is preferable because we have

control over the monitor and software. Thus the laptop computer represents a good investment

in the success of the project.

   Other costs

   The graduate student line requires a fee remission each year. The fee remission is

incremented at 5% per year.

   The results of our studies must reach a wide audience, and thus we request dissemination

funds to cover the costs of publication and web-based dissemination.

   This project is highly image-intensive, and we require money to purchase image-processing

software and upgrades. These include software packages such as Adobe Photoshop, as well as

new image processing packages as they become available.

   We will test 80 subjects a year to obtain the necessary data for use in our statistical

applications. Each subject requires $20 for the approximately 90-minute testing period.

   The project will consume supplies of approximately $100/month, for items such as backups,

power supplies, etc.
   Indirect Costs

   The indirect rate negotiated between Indiana University and the federal government is set at

51.5%. This rate is assessed against all costs except the fee remission. This was negotiated with

DHHS on May 14, 2004.
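The escalation and indirect-cost rules above can be summarized in a short sketch; the dollar amounts are hypothetical placeholders used only to illustrate the arithmetic, not actual budget lines:

```python
# Illustrative budget arithmetic only. Rates come from the narrative above;
# all dollar figures are hypothetical placeholders.

BENEFIT_RATE = 0.1981    # benefits calculated on salary
SALARY_GROWTH = 0.03     # salaries incremented 3% per year
INDIRECT_RATE = 0.515    # IU federally negotiated indirect rate

def year_cost(base_salary, fee_remission, other_direct, year):
    """Total cost for a given project year (year 1 = no escalation)."""
    salary = base_salary * (1 + SALARY_GROWTH) ** (year - 1)
    benefits = salary * BENEFIT_RATE
    direct = salary + benefits + fee_remission + other_direct
    # Indirect is assessed against all costs except the fee remission.
    indirect = (direct - fee_remission) * INDIRECT_RATE
    return direct + indirect

y1 = year_cost(base_salary=20000, fee_remission=8000, other_direct=5000, year=1)
```

The key design point is the modified base: the fee remission is excluded before the 51.5% indirect rate is applied, so it enters the total at face value only.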


                                 H. Staffing plan and resources

   Both Busey and Chen maintain laboratories in the Department of Psychology at Indiana

University that each contain approximately 700 sq. feet of space. These have subject running

rooms, offices and spaces for servers. Chen's lab contains an eye-movement recording setup that

is sufficient for the eye-movement portion of the experiments. Both investigators have offices in


the Psychology department as well.

   We will recruit a graduate student from the Computer Science or Psychology programs at

Indiana University. This student must have experience with machine learning algorithms at a

theoretical level, and also be an expert programmer. They will work 20 hrs/wk. We will also

recruit two hourly undergraduate students to coordinate the subject running, data analysis and

server maintenance. They will also be responsible for managing the data repository site where

our data will be accessible by other researchers who wish to integrate human expert knowledge

into their networks.

   The bulk of the theoretical work will be handled by Chen and Busey, while the graduate

student will work on implementation and model testing.


                                             I. Timeline

   This is a multi-year project that is designed to alternate between acquiring human data and

using it to refine the quantitative analyses of latent and inked prints.



   Year 1: Acquire necessary fingerprint databases. Begin testing 80 experts on 72 different

latent/inked print pairs. Program Support Vector and Global Local models. Test 2 experts on the

eye-movement equipment using all 72 prints.

   Year 2: Test an additional 80 experts on 72 new latent/inked prints. Begin model fitting and
refinement. Test 2 experts on the eye-movement equipment using all 72 prints. Compare results

from the eye-movement studies and moving-window studies.

   Year 3: Test the final 80 experts on 72 new latent/inked prints. Develop new versions of

statistical models based on prior results. Put entire database online for use by other researchers.

Disseminate results to peer-reviewed journals.




                                               Page 30

More Related Content

What's hot

Inverse Modeling for Cognitive Science "in the Wild"
Inverse Modeling for Cognitive Science "in the Wild"Inverse Modeling for Cognitive Science "in the Wild"
Inverse Modeling for Cognitive Science "in the Wild"
Aalto University
 
ATTENTION-BASED DEEP LEARNING SYSTEM FOR NEGATION AND ASSERTION DETECTION IN ...
ATTENTION-BASED DEEP LEARNING SYSTEM FOR NEGATION AND ASSERTION DETECTION IN ...ATTENTION-BASED DEEP LEARNING SYSTEM FOR NEGATION AND ASSERTION DETECTION IN ...
ATTENTION-BASED DEEP LEARNING SYSTEM FOR NEGATION AND ASSERTION DETECTION IN ...
ijaia
 
IRJET- Opinion Targets and Opinion Words Extraction for Online Reviews wi...
IRJET-  	  Opinion Targets and Opinion Words Extraction for Online Reviews wi...IRJET-  	  Opinion Targets and Opinion Words Extraction for Online Reviews wi...
IRJET- Opinion Targets and Opinion Words Extraction for Online Reviews wi...
IRJET Journal
 
A Novel Voice Based Sentimental Analysis Technique to Mine the User Driven Re...
A Novel Voice Based Sentimental Analysis Technique to Mine the User Driven Re...A Novel Voice Based Sentimental Analysis Technique to Mine the User Driven Re...
A Novel Voice Based Sentimental Analysis Technique to Mine the User Driven Re...
IRJET Journal
 
Trading outlier detection machine learning approach
Trading outlier detection  machine learning approachTrading outlier detection  machine learning approach
Trading outlier detection machine learning approach
EditorIJAERD
 
Bayesian Networks and Association Analysis
Bayesian Networks and Association AnalysisBayesian Networks and Association Analysis
Bayesian Networks and Association Analysis
Adnan Masood
 
User Interfaces that Design Themselves: Talk given at Data-Driven Design Day ...
User Interfaces that Design Themselves: Talk given at Data-Driven Design Day ...User Interfaces that Design Themselves: Talk given at Data-Driven Design Day ...
User Interfaces that Design Themselves: Talk given at Data-Driven Design Day ...
Aalto University
 
Insights to Problems, Research Trend and Progress in Techniques of Sentiment ...
Insights to Problems, Research Trend and Progress in Techniques of Sentiment ...Insights to Problems, Research Trend and Progress in Techniques of Sentiment ...
Insights to Problems, Research Trend and Progress in Techniques of Sentiment ...
IJECEIAES
 
Report of Previous Project by Yifan Guo
Report of Previous Project by Yifan GuoReport of Previous Project by Yifan Guo
Report of Previous Project by Yifan GuoYifan Guo
 
Methods for Sentiment Analysis: A Literature Study
Methods for Sentiment Analysis: A Literature StudyMethods for Sentiment Analysis: A Literature Study
Methods for Sentiment Analysis: A Literature Study
vivatechijri
 
Computational Rationality I - a Lecture at Aalto University by Antti Oulasvirta
Computational Rationality I - a Lecture at Aalto University by Antti OulasvirtaComputational Rationality I - a Lecture at Aalto University by Antti Oulasvirta
Computational Rationality I - a Lecture at Aalto University by Antti Oulasvirta
Aalto University
 
Unsupervised Distance Based Detection of Outliers by using Anti-hubs
Unsupervised Distance Based Detection of Outliers by using Anti-hubsUnsupervised Distance Based Detection of Outliers by using Anti-hubs
Unsupervised Distance Based Detection of Outliers by using Anti-hubs
IRJET Journal
 
Pca seminar final report
Pca seminar final reportPca seminar final report
Probabilistic Interestingness Measures - An Introduction with Bayesian Belief...
Probabilistic Interestingness Measures - An Introduction with Bayesian Belief...Probabilistic Interestingness Measures - An Introduction with Bayesian Belief...
Probabilistic Interestingness Measures - An Introduction with Bayesian Belief...
Adnan Masood
 
Collnet _Conference_Turkey
Collnet _Conference_TurkeyCollnet _Conference_Turkey
Collnet _Conference_TurkeyGohar Feroz Khan
 
Collnet turkey feroz-core_scientific domain
Collnet turkey feroz-core_scientific domainCollnet turkey feroz-core_scientific domain
Collnet turkey feroz-core_scientific domainHan Woo PARK
 
Associative Regressive Decision Rule Mining for Predicting Customer Satisfact...
Associative Regressive Decision Rule Mining for Predicting Customer Satisfact...Associative Regressive Decision Rule Mining for Predicting Customer Satisfact...
Associative Regressive Decision Rule Mining for Predicting Customer Satisfact...
csandit
 

What's hot (17)

Inverse Modeling for Cognitive Science "in the Wild"
Inverse Modeling for Cognitive Science "in the Wild"Inverse Modeling for Cognitive Science "in the Wild"
Inverse Modeling for Cognitive Science "in the Wild"
 
ATTENTION-BASED DEEP LEARNING SYSTEM FOR NEGATION AND ASSERTION DETECTION IN ...
ATTENTION-BASED DEEP LEARNING SYSTEM FOR NEGATION AND ASSERTION DETECTION IN ...ATTENTION-BASED DEEP LEARNING SYSTEM FOR NEGATION AND ASSERTION DETECTION IN ...
ATTENTION-BASED DEEP LEARNING SYSTEM FOR NEGATION AND ASSERTION DETECTION IN ...
 
IRJET- Opinion Targets and Opinion Words Extraction for Online Reviews wi...
IRJET-  	  Opinion Targets and Opinion Words Extraction for Online Reviews wi...IRJET-  	  Opinion Targets and Opinion Words Extraction for Online Reviews wi...
IRJET- Opinion Targets and Opinion Words Extraction for Online Reviews wi...
 
A Novel Voice Based Sentimental Analysis Technique to Mine the User Driven Re...
A Novel Voice Based Sentimental Analysis Technique to Mine the User Driven Re...A Novel Voice Based Sentimental Analysis Technique to Mine the User Driven Re...
A Novel Voice Based Sentimental Analysis Technique to Mine the User Driven Re...
 
Trading outlier detection machine learning approach
Trading outlier detection  machine learning approachTrading outlier detection  machine learning approach
Trading outlier detection machine learning approach
 
Bayesian Networks and Association Analysis
Bayesian Networks and Association AnalysisBayesian Networks and Association Analysis
Bayesian Networks and Association Analysis
 
User Interfaces that Design Themselves: Talk given at Data-Driven Design Day ...
User Interfaces that Design Themselves: Talk given at Data-Driven Design Day ...User Interfaces that Design Themselves: Talk given at Data-Driven Design Day ...
User Interfaces that Design Themselves: Talk given at Data-Driven Design Day ...
 
Insights to Problems, Research Trend and Progress in Techniques of Sentiment ...
Insights to Problems, Research Trend and Progress in Techniques of Sentiment ...Insights to Problems, Research Trend and Progress in Techniques of Sentiment ...
Insights to Problems, Research Trend and Progress in Techniques of Sentiment ...
 
Report of Previous Project by Yifan Guo
Report of Previous Project by Yifan GuoReport of Previous Project by Yifan Guo
Report of Previous Project by Yifan Guo
 
Methods for Sentiment Analysis: A Literature Study
Methods for Sentiment Analysis: A Literature StudyMethods for Sentiment Analysis: A Literature Study
Methods for Sentiment Analysis: A Literature Study
 
Computational Rationality I - a Lecture at Aalto University by Antti Oulasvirta
Computational Rationality I - a Lecture at Aalto University by Antti OulasvirtaComputational Rationality I - a Lecture at Aalto University by Antti Oulasvirta
Computational Rationality I - a Lecture at Aalto University by Antti Oulasvirta
 
Unsupervised Distance Based Detection of Outliers by using Anti-hubs
Unsupervised Distance Based Detection of Outliers by using Anti-hubsUnsupervised Distance Based Detection of Outliers by using Anti-hubs
Unsupervised Distance Based Detection of Outliers by using Anti-hubs
 
Pca seminar final report
Pca seminar final reportPca seminar final report
Pca seminar final report
 
Probabilistic Interestingness Measures - An Introduction with Bayesian Belief...
Probabilistic Interestingness Measures - An Introduction with Bayesian Belief...Probabilistic Interestingness Measures - An Introduction with Bayesian Belief...
Probabilistic Interestingness Measures - An Introduction with Bayesian Belief...
 
Collnet _Conference_Turkey
Collnet _Conference_TurkeyCollnet _Conference_Turkey
Collnet _Conference_Turkey
 
Collnet turkey feroz-core_scientific domain
Collnet turkey feroz-core_scientific domainCollnet turkey feroz-core_scientific domain
Collnet turkey feroz-core_scientific domain
 
Associative Regressive Decision Rule Mining for Predicting Customer Satisfact...
Associative Regressive Decision Rule Mining for Predicting Customer Satisfact...Associative Regressive Decision Rule Mining for Predicting Customer Satisfact...
Associative Regressive Decision Rule Mining for Predicting Customer Satisfact...
 

Viewers also liked

Презентация УИС для учебных заведений
Презентация УИС для учебных заведенийПрезентация УИС для учебных заведений
Презентация УИС для учебных заведенийsoftmotions
 
Kehidupan anak kos tugas AGAMA ISLAM
Kehidupan anak kos tugas AGAMA ISLAMKehidupan anak kos tugas AGAMA ISLAM
Kehidupan anak kos tugas AGAMA ISLAM
MuchFahmi
 
Kozmetika ETANI akcia 25perc. zlava na vsetko http://www.etani.sk/sk/znizena...
Kozmetika ETANI  akcia 25perc. zlava na vsetko http://www.etani.sk/sk/znizena...Kozmetika ETANI  akcia 25perc. zlava na vsetko http://www.etani.sk/sk/znizena...
Kozmetika ETANI akcia 25perc. zlava na vsetko http://www.etani.sk/sk/znizena...
Libor Slosar
 
MAY DAY?... (“RAMANUJAM INDUSTRY”)
MAY DAY?... (“RAMANUJAM INDUSTRY”)MAY DAY?... (“RAMANUJAM INDUSTRY”)
MAY DAY?... (“RAMANUJAM INDUSTRY”)
IJERD Editor
 
Diplomna_prezentaciq
Diplomna_prezentaciqDiplomna_prezentaciq
Diplomna_prezentaciqDamyan Ganev
 
Ser Mujer Con Epilepsia
Ser Mujer Con EpilepsiaSer Mujer Con Epilepsia
Ser Mujer Con Epilepsiaguest59c58a4
 
день геолога 2010
день геолога 2010день геолога 2010
день геолога 2010
guest203e43
 
Carybé
CarybéCarybé
Carybé
erm3
 
孤獨與品味 (Rev1)
孤獨與品味 (Rev1)孤獨與品味 (Rev1)
孤獨與品味 (Rev1)花東宏宣
 
ครูไทยหัวใจไอที1
ครูไทยหัวใจไอที1ครูไทยหัวใจไอที1
ครูไทยหัวใจไอที1
guest3114116
 
農業委員會:「農藥管理法」部分條文修正草案
農業委員會:「農藥管理法」部分條文修正草案農業委員會:「農藥管理法」部分條文修正草案
農業委員會:「農藥管理法」部分條文修正草案
R.O.C.Executive Yuan
 
PMDC NEB Step-1 (Review of abdominal contents)-day-7
PMDC NEB Step-1 (Review of abdominal contents)-day-7PMDC NEB Step-1 (Review of abdominal contents)-day-7
PMDC NEB Step-1 (Review of abdominal contents)-day-7
DrSaeed Shafi
 

Viewers also liked (14)

Презентация УИС для учебных заведений
Презентация УИС для учебных заведенийПрезентация УИС для учебных заведений
Презентация УИС для учебных заведений
 
Kehidupan anak kos tugas AGAMA ISLAM
Kehidupan anak kos tugas AGAMA ISLAMKehidupan anak kos tugas AGAMA ISLAM
Kehidupan anak kos tugas AGAMA ISLAM
 
Kozmetika ETANI akcia 25perc. zlava na vsetko http://www.etani.sk/sk/znizena...
Kozmetika ETANI  akcia 25perc. zlava na vsetko http://www.etani.sk/sk/znizena...Kozmetika ETANI  akcia 25perc. zlava na vsetko http://www.etani.sk/sk/znizena...
Kozmetika ETANI akcia 25perc. zlava na vsetko http://www.etani.sk/sk/znizena...
 
MAY DAY?... (“RAMANUJAM INDUSTRY”)
MAY DAY?... (“RAMANUJAM INDUSTRY”)MAY DAY?... (“RAMANUJAM INDUSTRY”)
MAY DAY?... (“RAMANUJAM INDUSTRY”)
 
Diplomna_prezentaciq
Diplomna_prezentaciqDiplomna_prezentaciq
Diplomna_prezentaciq
 
Ser Mujer Con Epilepsia
Ser Mujer Con EpilepsiaSer Mujer Con Epilepsia
Ser Mujer Con Epilepsia
 
день геолога 2010
день геолога 2010день геолога 2010
день геолога 2010
 
Carybé
CarybéCarybé
Carybé
 
孤獨與品味 (Rev1)
孤獨與品味 (Rev1)孤獨與品味 (Rev1)
孤獨與品味 (Rev1)
 
ครูไทยหัวใจไอที1
ครูไทยหัวใจไอที1ครูไทยหัวใจไอที1
ครูไทยหัวใจไอที1
 
VSS_Analyst
VSS_AnalystVSS_Analyst
VSS_Analyst
 
INTS3350_SQ_CC
INTS3350_SQ_CCINTS3350_SQ_CC
INTS3350_SQ_CC
 
農業委員會:「農藥管理法」部分條文修正草案
農業委員會:「農藥管理法」部分條文修正草案農業委員會:「農藥管理法」部分條文修正草案
農業委員會:「農藥管理法」部分條文修正草案
 
PMDC NEB Step-1 (Review of abdominal contents)-day-7
PMDC NEB Step-1 (Review of abdominal contents)-day-7PMDC NEB Step-1 (Review of abdominal contents)-day-7
PMDC NEB Step-1 (Review of abdominal contents)-day-7
 

Similar to DOJProposal7.doc

DOJProposal7.doc
DOJProposal7.docDOJProposal7.doc
DOJProposal7.docbutest
 
Graph embedding approach to analyze sentiments on cryptocurrency
Graph embedding approach to analyze sentiments on cryptocurrencyGraph embedding approach to analyze sentiments on cryptocurrency
Graph embedding approach to analyze sentiments on cryptocurrency
IJECEIAES
 
Deep_Learning_Innovations_In_Facial_Analysis
Deep_Learning_Innovations_In_Facial_AnalysisDeep_Learning_Innovations_In_Facial_Analysis
Deep_Learning_Innovations_In_Facial_Analysis
KrishnaMargaliGopara
 
Texture Analysis As An Aid In CAD And Computational Logic
Texture Analysis As An Aid In CAD And Computational LogicTexture Analysis As An Aid In CAD And Computational Logic
Texture Analysis As An Aid In CAD And Computational Logic
iosrjce
 
A017350106
A017350106A017350106
A017350106
IOSR Journals
 
Cyber bullying detection and analysis.ppt.pdf
Cyber bullying detection and analysis.ppt.pdfCyber bullying detection and analysis.ppt.pdf
Cyber bullying detection and analysis.ppt.pdf
Hunais Abdul Nafi
 
DALL-E 2 - OpenAI imagery automation first developed by Vishal Coodye in 2021...
DALL-E 2 - OpenAI imagery automation first developed by Vishal Coodye in 2021...DALL-E 2 - OpenAI imagery automation first developed by Vishal Coodye in 2021...
DALL-E 2 - OpenAI imagery automation first developed by Vishal Coodye in 2021...
MITAILibrary
 
Developing cognitive applications v1
Developing cognitive applications v1Developing cognitive applications v1
Developing cognitive applications v1
Harsha Srivatsa
 
A Comparative Study of Various Data Mining Techniques: Statistics, Decision T...
A Comparative Study of Various Data Mining Techniques: Statistics, Decision T...A Comparative Study of Various Data Mining Techniques: Statistics, Decision T...
A Comparative Study of Various Data Mining Techniques: Statistics, Decision T...
Editor IJCATR
 
Prospects of Deep Learning in Medical Imaging
Prospects of Deep Learning in Medical ImagingProspects of Deep Learning in Medical Imaging
Prospects of Deep Learning in Medical Imaging
Godswll Egegwu
 
A Study on Face Expression Observation Systems
A Study on Face Expression Observation SystemsA Study on Face Expression Observation Systems
A Study on Face Expression Observation Systems
ijtsrd
 
top journals
top journalstop journals
top journals
rikaseorika
 
Phan cl-data scientist-1 july-2016
Phan cl-data scientist-1 july-2016Phan cl-data scientist-1 july-2016
Phan cl-data scientist-1 july-2016
eknowledgediscovery
 
Phan cl-data scientist-1 july-2016
Phan cl-data scientist-1 july-2016Phan cl-data scientist-1 july-2016
Phan cl-data scientist-1 july-2016
eknowledgediscovery
 
Pattern recognition
Pattern recognitionPattern recognition
Pattern recognition
mohmedahmed23
 
Product Analyst Advisor
Product Analyst AdvisorProduct Analyst Advisor
Product Analyst Advisor
IRJET Journal
 
Human Activity Recognition (HAR) Using Opencv
Human Activity Recognition (HAR) Using OpencvHuman Activity Recognition (HAR) Using Opencv
Human Activity Recognition (HAR) Using Opencv
IRJET Journal
 
Understanding The Pattern Of Recognition
Understanding The Pattern Of RecognitionUnderstanding The Pattern Of Recognition
Understanding The Pattern Of Recognition
Rahul Bedi
 
Study on Different Human Emotions Using Back Propagation Method
Study on Different Human Emotions Using Back Propagation MethodStudy on Different Human Emotions Using Back Propagation Method
Study on Different Human Emotions Using Back Propagation Method
ijiert bestjournal
 
Data Science Demystified_ Journeying Through Insights and Innovations
Data Science Demystified_ Journeying Through Insights and InnovationsData Science Demystified_ Journeying Through Insights and Innovations
Data Science Demystified_ Journeying Through Insights and Innovations
Vaishali Pal
 

Similar to DOJProposal7.doc (20)

DOJProposal7.doc
DOJProposal7.docDOJProposal7.doc
DOJProposal7.doc
 
Graph embedding approach to analyze sentiments on cryptocurrency
Graph embedding approach to analyze sentiments on cryptocurrencyGraph embedding approach to analyze sentiments on cryptocurrency
Graph embedding approach to analyze sentiments on cryptocurrency
 
Deep_Learning_Innovations_In_Facial_Analysis
Deep_Learning_Innovations_In_Facial_AnalysisDeep_Learning_Innovations_In_Facial_Analysis
Deep_Learning_Innovations_In_Facial_Analysis
 
Texture Analysis As An Aid In CAD And Computational Logic
Texture Analysis As An Aid In CAD And Computational LogicTexture Analysis As An Aid In CAD And Computational Logic
Texture Analysis As An Aid In CAD And Computational Logic
 
A017350106
A017350106A017350106
A017350106
 
Cyber bullying detection and analysis.ppt.pdf
Cyber bullying detection and analysis.ppt.pdfCyber bullying detection and analysis.ppt.pdf
Cyber bullying detection and analysis.ppt.pdf
 
DALL-E 2 - OpenAI imagery automation first developed by Vishal Coodye in 2021...
DALL-E 2 - OpenAI imagery automation first developed by Vishal Coodye in 2021...DALL-E 2 - OpenAI imagery automation first developed by Vishal Coodye in 2021...
DALL-E 2 - OpenAI imagery automation first developed by Vishal Coodye in 2021...
 
Developing cognitive applications v1
Developing cognitive applications v1Developing cognitive applications v1
Developing cognitive applications v1
 
A Comparative Study of Various Data Mining Techniques: Statistics, Decision T...
A Comparative Study of Various Data Mining Techniques: Statistics, Decision T...A Comparative Study of Various Data Mining Techniques: Statistics, Decision T...
A Comparative Study of Various Data Mining Techniques: Statistics, Decision T...
 
Prospects of Deep Learning in Medical Imaging
Prospects of Deep Learning in Medical ImagingProspects of Deep Learning in Medical Imaging
Prospects of Deep Learning in Medical Imaging
 
A Study on Face Expression Observation Systems
A Study on Face Expression Observation SystemsA Study on Face Expression Observation Systems
A Study on Face Expression Observation Systems
 
top journals
top journalstop journals
top journals
 
Phan cl-data scientist-1 july-2016
Phan cl-data scientist-1 july-2016Phan cl-data scientist-1 july-2016
Phan cl-data scientist-1 july-2016
 
Phan cl-data scientist-1 july-2016
Phan cl-data scientist-1 july-2016Phan cl-data scientist-1 july-2016
Phan cl-data scientist-1 july-2016
 
Pattern recognition
Pattern recognitionPattern recognition
Pattern recognition
 
Product Analyst Advisor
Product Analyst AdvisorProduct Analyst Advisor
Product Analyst Advisor
 
Human Activity Recognition (HAR) Using Opencv
Human Activity Recognition (HAR) Using OpencvHuman Activity Recognition (HAR) Using Opencv
Human Activity Recognition (HAR) Using Opencv
 
Understanding The Pattern Of Recognition
Understanding The Pattern Of RecognitionUnderstanding The Pattern Of Recognition
Understanding The Pattern Of Recognition
 
Study on Different Human Emotions Using Back Propagation Method
Study on Different Human Emotions Using Back Propagation MethodStudy on Different Human Emotions Using Back Propagation Method
Study on Different Human Emotions Using Back Propagation Method
 
Data Science Demystified_ Journeying Through Insights and Innovations
Data Science Demystified_ Journeying Through Insights and InnovationsData Science Demystified_ Journeying Through Insights and Innovations
Data Science Demystified_ Journeying Through Insights and Innovations
 

More from butest

EL MODELO DE NEGOCIO DE YOUTUBE
EL MODELO DE NEGOCIO DE YOUTUBEEL MODELO DE NEGOCIO DE YOUTUBE
EL MODELO DE NEGOCIO DE YOUTUBEbutest
 
1. MPEG I.B.P frame之不同
1. MPEG I.B.P frame之不同1. MPEG I.B.P frame之不同
1. MPEG I.B.P frame之不同butest
 
LESSONS FROM THE MICHAEL JACKSON TRIAL
LESSONS FROM THE MICHAEL JACKSON TRIALLESSONS FROM THE MICHAEL JACKSON TRIAL
LESSONS FROM THE MICHAEL JACKSON TRIALbutest
 
Timeline: The Life of Michael Jackson
Timeline: The Life of Michael JacksonTimeline: The Life of Michael Jackson
Timeline: The Life of Michael Jacksonbutest
 
Popular Reading Last Updated April 1, 2010 Adams, Lorraine The ...
Popular Reading Last Updated April 1, 2010 Adams, Lorraine The ...Popular Reading Last Updated April 1, 2010 Adams, Lorraine The ...
Popular Reading Last Updated April 1, 2010 Adams, Lorraine The ...butest
 
LESSONS FROM THE MICHAEL JACKSON TRIAL
LESSONS FROM THE MICHAEL JACKSON TRIALLESSONS FROM THE MICHAEL JACKSON TRIAL
LESSONS FROM THE MICHAEL JACKSON TRIALbutest
 
Com 380, Summer II
Com 380, Summer IICom 380, Summer II
Com 380, Summer IIbutest
 
The MYnstrel Free Press Volume 2: Economic Struggles, Meet Jazz
The MYnstrel Free Press Volume 2: Economic Struggles, Meet JazzThe MYnstrel Free Press Volume 2: Economic Struggles, Meet Jazz
The MYnstrel Free Press Volume 2: Economic Struggles, Meet Jazzbutest
 
MICHAEL JACKSON.doc
MICHAEL JACKSON.docMICHAEL JACKSON.doc
MICHAEL JACKSON.docbutest
 
Social Networks: Twitter Facebook SL - Slide 1
Social Networks: Twitter Facebook SL - Slide 1Social Networks: Twitter Facebook SL - Slide 1
Social Networks: Twitter Facebook SL - Slide 1butest
 
Facebook
Facebook Facebook
Facebook butest
 
Executive Summary Hare Chevrolet is a General Motors dealership ...
Executive Summary Hare Chevrolet is a General Motors dealership ...Executive Summary Hare Chevrolet is a General Motors dealership ...
Executive Summary Hare Chevrolet is a General Motors dealership ...butest
 
Welcome to the Dougherty County Public Library's Facebook and ...
Welcome to the Dougherty County Public Library's Facebook and ...Welcome to the Dougherty County Public Library's Facebook and ...
Welcome to the Dougherty County Public Library's Facebook and ...butest
 
NEWS ANNOUNCEMENT
NEWS ANNOUNCEMENTNEWS ANNOUNCEMENT
NEWS ANNOUNCEMENTbutest
 
C-2100 Ultra Zoom.doc
C-2100 Ultra Zoom.docC-2100 Ultra Zoom.doc
C-2100 Ultra Zoom.docbutest
 
MAC Printing on ITS Printers.doc.doc
MAC Printing on ITS Printers.doc.docMAC Printing on ITS Printers.doc.doc
MAC Printing on ITS Printers.doc.docbutest
 
Mac OS X Guide.doc
Mac OS X Guide.docMac OS X Guide.doc
Mac OS X Guide.docbutest
 
WEB DESIGN!
WEB DESIGN!WEB DESIGN!
WEB DESIGN!butest
 

More from butest (20)

EL MODELO DE NEGOCIO DE YOUTUBE
EL MODELO DE NEGOCIO DE YOUTUBEEL MODELO DE NEGOCIO DE YOUTUBE
EL MODELO DE NEGOCIO DE YOUTUBE
 
1. MPEG I.B.P frame之不同
1. MPEG I.B.P frame之不同1. MPEG I.B.P frame之不同
1. MPEG I.B.P frame之不同
 
LESSONS FROM THE MICHAEL JACKSON TRIAL
LESSONS FROM THE MICHAEL JACKSON TRIALLESSONS FROM THE MICHAEL JACKSON TRIAL
LESSONS FROM THE MICHAEL JACKSON TRIAL
 
Timeline: The Life of Michael Jackson
Timeline: The Life of Michael JacksonTimeline: The Life of Michael Jackson
Timeline: The Life of Michael Jackson
 
Popular Reading Last Updated April 1, 2010 Adams, Lorraine The ...
Popular Reading Last Updated April 1, 2010 Adams, Lorraine The ...Popular Reading Last Updated April 1, 2010 Adams, Lorraine The ...
Popular Reading Last Updated April 1, 2010 Adams, Lorraine The ...
 
LESSONS FROM THE MICHAEL JACKSON TRIAL
LESSONS FROM THE MICHAEL JACKSON TRIALLESSONS FROM THE MICHAEL JACKSON TRIAL
LESSONS FROM THE MICHAEL JACKSON TRIAL
 
Com 380, Summer II
Com 380, Summer IICom 380, Summer II
Com 380, Summer II
 
PPT
PPTPPT
PPT
 
The MYnstrel Free Press Volume 2: Economic Struggles, Meet Jazz
The MYnstrel Free Press Volume 2: Economic Struggles, Meet JazzThe MYnstrel Free Press Volume 2: Economic Struggles, Meet Jazz
The MYnstrel Free Press Volume 2: Economic Struggles, Meet Jazz
 
MICHAEL JACKSON.doc
MICHAEL JACKSON.docMICHAEL JACKSON.doc
MICHAEL JACKSON.doc
 
Social Networks: Twitter Facebook SL - Slide 1
Social Networks: Twitter Facebook SL - Slide 1Social Networks: Twitter Facebook SL - Slide 1
Social Networks: Twitter Facebook SL - Slide 1
 
Facebook
Facebook Facebook
Facebook
 
Executive Summary Hare Chevrolet is a General Motors dealership ...
Executive Summary Hare Chevrolet is a General Motors dealership ...Executive Summary Hare Chevrolet is a General Motors dealership ...
Executive Summary Hare Chevrolet is a General Motors dealership ...
 
Welcome to the Dougherty County Public Library's Facebook and ...
Welcome to the Dougherty County Public Library's Facebook and ...Welcome to the Dougherty County Public Library's Facebook and ...
Welcome to the Dougherty County Public Library's Facebook and ...
 
NEWS ANNOUNCEMENT
NEWS ANNOUNCEMENTNEWS ANNOUNCEMENT
NEWS ANNOUNCEMENT
 
C-2100 Ultra Zoom.doc
C-2100 Ultra Zoom.docC-2100 Ultra Zoom.doc
C-2100 Ultra Zoom.doc
 
MAC Printing on ITS Printers.doc.doc
MAC Printing on ITS Printers.doc.docMAC Printing on ITS Printers.doc.doc
MAC Printing on ITS Printers.doc.doc
 
Mac OS X Guide.doc
Mac OS X Guide.docMac OS X Guide.doc
Mac OS X Guide.doc
 
hier
hierhier
hier
 
WEB DESIGN!
WEB DESIGN!WEB DESIGN!
WEB DESIGN!
 

DOJProposal7.doc

  • 1. Adding human expertise to the quantitative analysis of fingerprints Busey and Chen PROGRAM NARRATIVE A. Research Question Machine learning algorithms take a number of approaches to the quantitative analysis of fingerprints. These include identifying and matching minutiae (refs), matching patterns of local orientation based on dynamic masks (refs), and neural network approaches that attempt to learn the structure of fingerprints (refs). While these techniques provide good results in biometric applications and serve a screening role in forensic cases, they are less useful when applied to severely degraded fingerprints, which must be matched by human experts. Indeed, statistical approaches and human experts have different strengths. Despite the enormous computational power available today for use by computer analysis systems, the human visual system remains unequaled in its flexibility and pattern recognition abilities. Three possible reasons for this success come from the experts knowledge of where the most important regions are located on a particular set of prints, the ability to tune their visual systems to specific features, and the integration of information across different features. In the present project, we propose to integrate the knowledge of experts into the quantitative analysis of fingerprints to a degree not achieved by other approaches. There is much that fingerprint examiners can add to machine learning algorithms and, as we describe below, many ways in which statistical learning algorithms can assist human experts. Thus the central research question of this proposal is: How can the integration of information derived from experts improve the quantitative analysis of fingerprints? B. Research goals and objectives The goal of the present proposal is to integrate data from human experts with statistical learning algorithms to improve the quantitative analysis of inked and latent prints. 
We introduce a novel procedure developed by one investigator (Tom Busey) and use it to guide the input to statistical learning algorithms developed and extended by our other investigator (Chen Yu). The fundamental idea behind our approach is that the quantitative evaluation of the information contained in latent and inked prints can be vastly improved by using elements of human expertise to assist the statistical modeling, as well as by introducing a new dimension, time, that is not contained in a static latent print analysis. The main benefit, as we discuss in sections C.x.x, is that the format of the data extracted from experts allows the application of novel quantitative models adapted from related areas.

To apply this knowledge derived from experts, we will use our backgrounds in vision, perception, machine learning, and behavioral testing to design experiments that extract relevant information from experts, and we will then improve the quantitative analysis of fingerprints by integrating the two sources of information. Our research interests differ somewhat from existing approaches, reflecting the adaptations necessary to incorporate human expert knowledge. Existing statistical algorithms developed to match fingerprints fall into several classes. Some extract minutiae and other robust sources of information, such as the number of ridges between minutiae (refs). Others rely on the computation of local ridge curvature, which is then partitioned into different classes (MASK refs). Virtually all approaches make reasoned, and reasonable, guesses as to what the important sources of information might be, such as minutiae, local ridge orientation, or local ridge width (dgs paper). The present approach is more agnostic about which sources of information in fingerprints are important, and we will develop statistical models that take advantage of the data derived from experts. However, a major goal of the grant is to demonstrate how expert knowledge can be applied to any extant model, and to suggest how this might be accomplished.
Thus we will spend substantial time documenting the application of expert knowledge to our statistical models. In addition, we will make all of our expert data available to other researchers and practitioners. The data are likely to have implications for training, although training is not the focus of the present proposal.

                                C. Research design and methods

    At the heart of our approach is the idea that human expertise, properly represented, can improve the quantitative analysis of fingerprints. In a later section we describe how we apply human expert knowledge to various statistical analyses, but first we need to answer the question of whether human experts can add something to the quantitative analysis of prints. The answer breaks down into two parts. First, do human visual systems in general possess attributes not captured by current statistical approaches? Second, do human experts have additional capacities not shared by novices, capacities that could further inform statistical approaches? Below we briefly summarize what the vision science literature tells us about how humans recognize patterns, and then describe our own work on the differences between experts and novices. As we will show, human experts have much to add to quantitative approaches.

    We should stress that while we will gather data from human experts to improve our quantitative analyses of fingerprints, the goal of this grant is not to study human experts in order to determine whether or how they differ from novices, nor are we interested in questions about the reliability or accuracy of human experts. Instead, we will generalize our previous results demonstrating strong differences in experts' visual processing of fingerprints, and apply this expertise to our own statistical analyses. As a result, we will gather data only from human experts (latent print examiners with at least 5 years of post-apprentice work in the field), under the assumption that this will provide the maximum improvement to our statistical methods. We can demonstrate the effectiveness of this knowledge simply by re-running the statistical analyses without the benefit of the knowledge from experts.
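The expert-ablation comparison just described can be scored with the recognition tradeoff metrics discussed next. As an illustrative sketch only (the match scores below are hypothetical, not project data), a correct-recognition/false-recognition tradeoff and its summary area can be computed as:

```python
# Sketch: scoring a matcher (with vs. without expert input) by its
# correct-recognition / false-recognition tradeoff. Scores are hypothetical.

def roc_points(genuine_scores, impostor_scores):
    """(false-recognition rate, correct-recognition rate) at each threshold."""
    thresholds = sorted(set(genuine_scores) | set(impostor_scores))
    points = []
    for t in thresholds:
        hits = sum(s >= t for s in genuine_scores) / len(genuine_scores)
        fas = sum(s >= t for s in impostor_scores) / len(impostor_scores)
        points.append((fas, hits))
    return points

def auc(genuine_scores, impostor_scores):
    """Probability that a random genuine pair outscores a random impostor pair."""
    wins = sum((g > i) + 0.5 * (g == i)
               for g in genuine_scores for i in impostor_scores)
    return wins / (len(genuine_scores) * len(impostor_scores))

genuine  = [0.91, 0.84, 0.77, 0.69, 0.55]   # same-source comparisons (toy data)
impostor = [0.48, 0.41, 0.33, 0.30, 0.22]   # different-source comparisons
print(auc(genuine, impostor))  # 1.0: perfectly separated in this toy data
```

Running the same scoring on analyses with and without the expert data then quantifies the benefit as a difference in tradeoff curves or summary areas.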
There are various metrics attached to each analysis technique that can demonstrate the superiority of expert-enhanced analyses, such as correct-recognition/false-recognition tradeoff graphs, or the dimensionality-reduction/reconstruction success of data reduction techniques.

    We will also apply novel approaches adapted from the related domain of language analysis. It might seem odd to apply techniques developed for linguistic analysis to a visual domain such as pattern recognition, but the principles underlying the two domains are very similar. Both involve large numbers of features that stand in complex statistical relations. In the case of language, the features are often words, phonemes, or other acoustic signals. Fingerprints are defined by a complex but very regular dictionary of features that likewise share a complex and meaningful correlational structure. One of us (Chen) is a highly published expert in the field of machine learning algorithms applied to multimodal data, and several papers included as appendices detail this expertise. His work on multimodal mappings between the visual and auditory domains makes him well suited to address the relation between human data and machine learning algorithms. Both linguistic and visual information consist of highly structured regularities that perceivers extract, not unlike the temporal sequence that experts go through when they perform a latent print examination, as we describe in a later section. First, however, we address how we might document the principles of human expertise.

    Can we use elements of the human visual system to improve our statistical analyses? The answer to this question is straightforward, in part because of the overwhelming evidence that human recognition systems contain processes not captured by current statistical approaches. One of us (Busey) has published many articles addressing different aspects of human sensation, perception, and cognition, and thus is well suited to manage the acquisition and application of human expertise to statistical approaches. Below we briefly summarize the properties of the human visual system; in a later section we describe how we plan to extract fundamental principles from this design in order to improve our statistical analyses of fingerprints. Analyses of the human visual system by vision scientists demonstrate that recognition proceeds via a hierarchical series of stages, each with important non-linearities (nature ref), producing areas that respond to objects of greater and greater complexity.
This process also provides increasing spatial independence, allowing brain areas to integrate over larger and larger regions. This will become important for holistic or configural processing, as discussed in a later section. (Also discuss feature-based attention.) A second benefit of this hierarchical approach is that objects achieve limited scale and contrast invariance. Statistical approaches often deal with this through local contrast or brightness normalization, but as a separate process. Scale invariance is often achieved by explicitly measuring the width of ridges (grayscale ref), again as a separate process. A third strength of the human visual system is its apparent ability to form new feature templates through an analysis of the statistical information contained in fingerprints. This process, called unitization, tends to improve feature detection in noisy environments such as those typical of latent prints.

    Do forensic scientists have visual capabilities not shared by novices? The prior summary of the elements of the human visual system suggests that current statistical approaches can be improved by adopting some of the principles underlying the human visual system. There are, however, other processes specifically developed by latent print examiners that may also be profitably applied to statistical models. Below we summarize the results of two empirical studies recently published in the highly respected journal Vision Research (Busey & Vanderkolk, 2005). The results demonstrate not only that experts are better than novices, but also suggest the nature of the processes that produce this superior performance.

    Visual expertise takes many forms. It may differ across different parts of the identification process, and may not even be verbalizable by the expert, since many elements of perceptual expertise remain cognitively impenetrable (refs). A major focus of our research is to capture elements of this expertise and use them as a training signal for our statistical learning algorithms. What is novel to our approach is our ability to capture the expertise at a very deep and rich level.
In the next section we describe our prior work documenting the nature of the processes that enable experts to perform at levels far superior to novices; in Section C.2 we describe how we capture this expertise in a form that can be used to improve our statistical learning algorithms.

    C.1. Documenting expertise in human latent print examiners

    Initially, experts tend to focus on the entire print, which leads to benefits that we have previously identified as configural processing (Busey & Vanderkolk, 2005). Configural processing takes several forms, but the basic idea is that instead of focusing on individual features or minutiae, the observer integrates information over a large region to identify important relations, such as the relative locations of features or the curvature of ridge flow. Fingerprint examiners often talk about 'viewing the image in its totality', which is different language for the same process. While configural processing reveals the overall structure of an image and selects important regions for further inspection, the real work comes in comparing small regions in one print to regions in the other. These regions may be selected on the basis of minutiae identified in the print, or of high-quality Level 3 detail.

    We know from related work on perceptual learning in the visual system that one of the processes by which expertise develops is the formation of new feature detectors. Experts spend a great deal of time viewing prints, and this has the potential to produce profound changes in how their visual systems process fingerprints (config processing refs). One process by which experts could improve how they extract latent print information from noisy prints is termed unitization, in which novel feature detectors are created through experience (unitization refs). Fingerprints contain remarkable regularities and the human visual system

    C.1.a. Do experts have information valuable for training networks or documenting the quantitative nature of fingerprints?

    Fingerprint examiners have received almost no attention in the perceptual learning or expertise literatures, and thus the PI began a series of studies in consultation with John Vanderkolk of the Indiana State Police Forensic Sciences Laboratory in Fort Wayne, Indiana.
Our first study addressed the nature of the expertise effects in a behavioral experiment, and we then followed up the evidence for configural processing with an electrophysiological study. The discussion below describes the experiments in some detail, in part because extensions of this work are proposed in Section D, and because a complete description here illustrates the technical rigor and converging methods of our approach.
C.1.b. Behavioral evidence for configural processing

    In our first experiment, we abstracted what we felt were the essential elements of the fingerprint examination process into an X-AB task that could be accomplished in relatively short order. This work is described in Busey and Vanderkolk (2005), but we briefly describe the methods here, since they illustrate how our approach seeks a paradigm that is less time-consuming than fully realistic forensic examinations (which can take hours to days to complete) yet still maintains enough ecological validity to tap the expertise of the examiners.

    Figure 1. Sequence of events in a behavioral experiment with fingerprint experts and novices (study image for 1 second, mask for 200 or 5200 milliseconds, test display until response). Note that the study image has a different orientation and is slightly brighter to reduce reliance on low-level cues.

    Figure 1 shows the stimuli used in the experiment as well as the timeline of one trial. We cropped fingerprint fragments out of inked prints, grouped them into pairs, and briefly presented one of the two for 1 second. This was followed by a mask for either 200 or 5200 ms, after which the expert or novice subject made a forced-choice response indicating which of the two test prints they believed was shown at study. We introduced orientation and brightness jitter at study, and the pairs were constructed to reduce reliance on idiosyncratic features such as lint or blotches. At test, we introduced two manipulations that we thought captured aspects of latent prints, as shown in Figure 2. First, latent prints are often embedded in visual noise from the texture of the surface, dust, and other sources. One expert, in describing how he approached latent prints, stated that his job was to 'see through the noise.' To simulate at least elements of this noise, we embedded half of our test prints in white visual noise.
While this noise may have a spatial distribution that differs from the noise typically encountered by experts, we hoped that it would tap whatever facilities experts may have developed for dealing with noise.

    The second manipulation was motivated by the observation that latent prints are rarely complete copies of their inked counterparts. They often appear patchy if made on an irregular surface, and sections may be partially masked out. To simulate this, we created partially-masked fingerprint fragments as shown in the upper-right panel of Figure 2. Note that a partially-masked print and its complement each contain exactly half of the information of the full print, and the full print can be recovered by summing the two partial prints pixel by pixel. We use this property to test for configural effects, as described in a later section.

    Figure 2. Four types of test trials (clear or noise-added fragments, presented in full or partially masked).

    All three manipulations (delay between study and test, added noise, and partial masking) were fully crossed to create 8 conditions. The data are shown in Figure 3 (percent correct by image type, plotted separately for experts and novices at the short and long delays, with and without added noise), which shows main effects of all three factors for novices. Somewhat surprising is the finding that while experts show effects of added noise and partial masking, they show no effect of delay, which suggests that they are able to re-code their visual information into a more durable store resistant to decay, or have better visual memories.
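To anticipate the configural analysis described next, full-image performance can be compared against an independence benchmark: the accuracy predicted if experts used the two halves of a full image independently. A minimal sketch under a simple high-threshold 2AFC model (an illustrative simplification; our actual analysis uses a more elaborate multinomial model):

```python
# Sketch: probability-summation benchmark for full-image performance,
# assuming a high-threshold two-alternative forced-choice model. This is an
# illustrative simplification, not the multinomial model used in the study.

def full_image_prediction(p_half, p_guess=0.5):
    """Predicted full-image accuracy if the two halves contribute independently."""
    d = (p_half - p_guess) / (1.0 - p_guess)   # per-half 'detect' probability
    d_full = 1.0 - (1.0 - d) ** 2              # detect via either half
    return d_full + (1.0 - d_full) * p_guess   # guess when neither half detected

pred = full_image_prediction(0.65)  # approximately 0.755; observed expert
                                    # accuracy (~0.90) exceeds this benchmark
```

Performance above this prediction is the signature of configural processing: the whole yields more than the independent sum of its halves.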
Experts also show an interaction between added noise and partial masking, but novices do not. This interaction may stem from experts' very strong performance for full images embedded in noise, and may reflect configural processes. To test this in a scale-invariant manner, we developed a multinomial model that predicts full-image performance from partial-image performance, using principles similar to probability summation. The complete results are found in Busey & Vanderkolk (2005), but to summarize: when partial-image performance is around 65%, the model predicts full-image performance of about 75%, whereas observed expert performance is almost 90%, significantly above the probability-summation prediction. Thus it appears that when both halves of an image are present (as in the full image), experts are much more efficient at extracting information from each half.

    Figure 3. Behavioral experiment data. Error bars represent one standard error of the mean (SEM).

    The results of this experiment lay the groundwork for a more complete investigation of perceptual expertise in fingerprint examiners. From this work we have evidence that:

    1) Experts perform much better than novices overall, despite testing conditions that were time-limited and somewhat different from those of a traditional latent print examination.

    2) Experts appear immune to longer delays between study and test images, suggesting better information re-coding strategies and/or better visual memories.

    3) Experts may have developed configural processing abilities over the course of their training and practice. All observers have similar facilities for faces, as a consequence of the ecological importance of faces and our quotidian exposure to them through social interaction. Experts may have extended this ability to the domain of fingerprints, since configural processing is seen as one mechanism underlying expertise (e.g. Gauthier & Tarr, 1997).

    C.1.c.
Electrophysiological evidence for configural processing

    To provide converging evidence that fingerprint experts process full fingerprints configurally, we turned to an electrophysiological paradigm based on work from the face recognition literature. This experiment is described more fully in Busey and Vanderkolk (2005), which is included as an appendix. These results support the conclusions described above, and demonstrate that the configural processing observed in fingerprint examiners is the result of profound, qualitative changes in the very earliest stages of their perceptual processing of fingerprints.

    C.2. Elements of human expertise that could improve quantitative analyses

    The two studies described above are important because they illustrate that configural processing is one mechanism that could be adapted for use in the quantitative analysis of fingerprints. Existing quantitative models of fingerprints incorporate some elements of the expertise seen above, but many elements could be added that would improve the recognition accuracy of existing programs. The two major approaches to fingerprint matching rely either on local features such as detected minutiae (refs), or on more global descriptions such as dynamic masks applied to orientations computed at many locations on a grid overlaying the print (refs). Of the two, the dynamic-mask approach comes closer to the idea of configural processing, although it does not compute minutiae directly. (Strengthen this intro.) Neither approach takes advantage of the temporal information that expresses elements of expertise in the human matching process.

    Quantitative information such as fingerprint data, when represented in pixel form, has a high-dimensional structure. The two techniques described above reduce this dimensionality either by extracting salient points such as minutiae, or by computing orientation only at discrete locations. Both approaches throw out a great deal of information that could otherwise be used to train a statistical model on the elemental features that support matches. Part of the reason this is necessary is that the high-dimensional space is difficult to work in: without dimensionality reduction, all prints are more or less equally similar, and reducing the dimensionality makes computations such as similarity tractable.
The key, then, is to reduce the dimensionality while preserving the essential features that allow discrimination among prints. One technique that has been explored in language acquisition is the concept of "starting small" (Elman ref). In this procedure, machine learning systems such as neural networks are given very coarse information at first, which helps the network find an appropriate starting point. Gradually, more and more detailed information is added, allowing the network to make finer and finer discriminations.
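A "starting small" schedule can be sketched concretely. In the illustrative fragment below (our assumption: coarseness is simulated by block-averaging; a Gaussian blur schedule would serve equally well), each training epoch sees a progressively finer version of the same print:

```python
# Sketch: a "starting small" training schedule (after Elman) in which the
# learner first sees coarse versions of each print, then finer detail.
import numpy as np

def coarsen(image, block):
    """Average-pool the image over block x block tiles, then re-expand."""
    h, w = image.shape
    h2, w2 = h - h % block, w - w % block        # trim to a multiple of block
    tiles = image[:h2, :w2].reshape(h2 // block, block, w2 // block, block)
    means = tiles.mean(axis=(1, 3))
    return np.repeat(np.repeat(means, block, axis=0), block, axis=1)

rng = np.random.default_rng(0)
print_img = rng.random((64, 64))                 # stand-in for a print image
schedule = [16, 8, 4, 2, 1]                      # detail restored over epochs
training_inputs = [coarsen(print_img, b) for b in schedule]
```

The final schedule entry (block size 1) restores the original image, so the network ends its training on full-resolution prints.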
We discuss these ideas more fully in section X.Xx, but we mention them here to motivate the empirical methods described next. Experts likely select the information they choose to examine first based on the need to organize their search processes. Thus they likely acquire information that may not immediately lead to a definitive conclusion of confirmation or rejection, but that guides the later acquisition process. In the scene perception literature this process is known as 'gist acquisition' (refs), and it suggests that the order in which a system (machine or human) learns information matters. In the section below we describe how we acquire both spatial and temporal information from experts, and then describe how this knowledge can be incorporated into quantitative models.

    C.3. Capturing the information acquisition process: The moving window paradigm

    To identify the nature of the information used by experts, and the order in which it is gathered, we have begun to use a technique called the moving window procedure. In the sections below we describe this procedure and how it can be extended to address the role of configural or gist information in human experts.

    C.3.a. The moving window paradigm

    The moving window paradigm is a software tool that simulates the relative acuity of the foveal and peripheral visual systems. As we look around the world, there is a region of high acuity at the location where our eyes are currently pointing; regions outside this foveal viewing cone are represented less well. In the moving window paradigm we reproduce this state by slightly blurring the image and reducing its contrast. A demonstration is available at http://cognitrn.psych.indiana.edu/busey/FingerprintExample/
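The display logic amounts to compositing two versions of the print around the current mouse position. A minimal sketch (parameter values and the stand-in blur are illustrative assumptions, not the tool's actual settings):

```python
# Sketch: compositing one frame of the moving-window display. Outside the
# clear circle the print is blurred and reduced in contrast; inside it the
# original pixels show through.
import numpy as np

def render_frame(image, blurred, cx, cy, radius=30, contrast=0.6):
    """Blend original and degraded images around the current mouse position."""
    h, w = image.shape
    yy, xx = np.mgrid[0:h, 0:w]
    inside = (xx - cx) ** 2 + (yy - cy) ** 2 <= radius ** 2
    degraded = contrast * (blurred - blurred.mean()) + blurred.mean()
    return np.where(inside, image, degraded)

rng = np.random.default_rng(1)
img = rng.random((100, 100))    # stand-in for a print image
blur = img                      # stand-in; any blurring operation goes here
frame = render_frame(img, blur, cx=50, cy=50)
```

Re-rendering this frame on every mouse movement, while logging (x, y, time), yields both the display and the data stream described below.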
Figure 4. The moving window paradigm allows the user to move the circle of interest to different locations on the two prints. This circle provides high-quality information, and gives the expert the opportunity to demonstrate, in a procedure very similar to an actual latent print examination, which sections of the prints they believe are most informative. The procedure also records the order in which different sites are visited.

    Figure 4 shows several frames of the moving window program, captured at different points in time. The two images have been degraded by a blurring operation that roughly mimics the reduced representation of peripheral vision. The exception is a clear circle that responds in real time to the movement of the mouse. This dynamic display forces the user to move the clear window to regions of the display that warrant special interest, while the blurred portions provide some context for where to move the window next. By recording the position of the mouse each time it moves, we can reconstruct a complete record of the manner in which the user examined the prints. The method has some drawbacks, in that the eyes move faster than the mouse. However, we find that with practice the experts report very few limitations with the procedure, and it has the benefit of precise spatial localization. A major benefit of this procedure is that it can be run over the web, reaching dozens of experts and producing a massive dataset. Many related information-theoretic approaches, such as latent semantic analysis, find that a large corpus of data is necessary to reveal the underlying structure of the representation, and a web-based approach provides sufficient data.

    The data produced by this paradigm are vast: x/y coordinates for the clear window at each millisecond. We have begun to analyze these data using several different techniques. The first analysis creates a mask that is black for regions the observer never visited and clear for the areas visited most often; areas visited less often are somewhat darkened. Figure 5 shows an example of this kind of analysis. The left panels of Figure 5 show two masked images, which reveal not only where the experts visited but how long they spent inspecting each location. The mask thus provides a window onto the regions the experts believed informative. The right panels give a slightly different view, in which unvisited areas are shown in red. This illustrates that experts actually spend most of their time in relatively small regions of the prints. As a first pass, the images in Figure 5 reveal where the experts believe the task-relevant information resides. However, such a representation loses the order in which the sites were visited. In addition, this information is very specific to a particular set of prints.
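The mask construction just described can be sketched directly from the logged data. The fragment below is an illustrative sketch (window radius and the sample log are hypothetical): dwell time is accumulated under the clear window at each logged position, then normalized so that multiplying the mask into the print darkens unvisited regions.

```python
# Sketch: turning the recorded (x, y, dwell-time) stream into a Figure 5-style
# inspection mask. Radius and sample values are illustrative only.
import numpy as np

def dwell_mask(shape, samples, radius=15):
    """Accumulate dwell time under the clear window, normalized to [0, 1]."""
    h, w = shape
    acc = np.zeros((h, w))
    yy, xx = np.mgrid[0:h, 0:w]
    for x, y, ms in samples:                     # one entry per mouse sample
        acc[(xx - x) ** 2 + (yy - y) ** 2 <= radius ** 2] += ms
    return acc / acc.max() if acc.max() > 0 else acc

samples = [(40, 40, 120), (42, 41, 250), (90, 20, 80)]   # hypothetical log
mask = dwell_mask((128, 128), samples)
# masked_print = mask * print_image  would darken the unvisited regions
```

The same accumulated map, thresholded rather than normalized, produces the black-and-clear version shown in the left panels of Figure 5.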
Ultimately we will produce a more general representation that characterizes both the fundamental set of features (often described as the basis set) that experts rely on, and how they process these features. We have begun to explore an information-theoretic approach to this problem that seeks a set of visual features common to a number of experts and fingerprint pairs. This approach is related to many of the dimensionality reduction techniques that have been applied to natural images (e.g. Olshausen & Field, 1996). Later projects extend this approach to incorporate elements of configural processing or context-specific models. In the present proposal we discuss several different ways we plan to analyze what is a very rich dataset.

    Figure 5. Examples of masked images revealing where experts choose to acquire information in order to make an identification. The black versions show only the regions where the expert spent any time, and the mask is clearer in regions where the expert spent more time. The right-hand images show the same information, but allow some of the uninspected information to show through. These images reveal that experts pay relatively little attention to much of the image and focus only on regions they deem relevant for the identification. We suggest that this element of expertise, learning to attend to relevant locations, could benefit quantitative analyses of fingerprints.

    Our experts report relatively little hindrance when using the mouse to move the window. The latent and inked prints each have their own window (only one is visible at any one time), and users press a key to flip back and forth between the two prints. This flip is actually faster than an eye movement, and it automatically serves as a landmark pointer for each print, making the procedure almost as easy to use as free viewing of the two prints (which is often done under a loupe, with its own movement complexities). In addition, we give users brief views of the entire image to allow configural processes to establish the basic layout.

    C.3.b. Measuring the role of configural processing in latent print examinations

    (Behavioral experiment: blurred vs. very low contrast; qualitative changes across experts? Complete this section.)
C.3.c. Verification with eye movement recording

    (Complete this section.)

    C.4. Extracting the fundamental features used when matching prints

    Because latent and inked prints are rarely direct copies of each other, an expert must extract from each image invariants that survive the degradations due to noise, smearing, and other transformations. Once these invariants are extracted, the possibility of a match can be assessed. This is similar in principle to the type of categorical perception observed in speech recognition, in which the invariant properties of parts of speech are extracted from the voices of different talkers. This suggests that there exists a set of fundamental building blocks, or basis functions, that experts use to represent and even clean up degraded prints. The nature and existence of these features are quite relevant to visual expertise, since in some sense they are the direct outcome of any perceptual system that tunes itself to the visual diet it experiences.

    We propose to perform data reduction techniques on the output of the moving window paradigm. Such techniques have been applied successfully to derive the statistics of natural images (Hyvarinen & Hoyer, 2000); the results yield individual features that are localized in space and resemble the response profiles of simple cells in primary visual cortex. Many of these studies are performed on random samplings of images and visual sequences, but the moving window application provides an opportunity to use these techniques to recover the dimensions of only the inspected regions, and to compare the dimensions recovered from experts with representations based on random window locations. The specifics of this technique are straightforward. For each position of the moving window, we extract out (say) a 12 x 12 patch of pixels.
This is repeated at each location that was inspected by the subject, with each patch weighted by the amount of time spent at each location. The moving window experiment tens of thousands of patches of pixels, which are submitted to a data reduction technique (independent component analysis, or ICA), which is similar to principle components analysis, with the exception that the components are independent, not just Page 15
  • 16. Adding human expertise to the quantitative analysis of fingerprints Busey and Chen Figure 6. ICA components from expert data. uncorrelated. The linear decomposition generated by ICA has the property of sparseness, which has been shown to be important for representational systems (Field, 1994; Olshausen & Field, 1996) and implies that a random variable (the basis function) is active only very rarely. In practice, this sparse representation creates basis functions that are more localized in space than those captured by PCA and are more representative of the receptive fields found in the early areas of the visual system. Huge copra of samples are required to extract invariants from noisy images, and at present we have only pilot data from several experts. However, the results of this preliminary analysis can be found in Figure 6. This figure shows features discovered using the ICA algorithm (Hurri & Hyvarinen, 2003; Hyvarinen, Hoyer & Hurri, 2003). Each image represents a basis function that when linearly combined will reproduce the windows examined by experts. Inspection of Figure 6 reveals that features such as ridge endings, y-brachings and islands are beginging to become represented. This analysis takes on greater value when applied to the entire database we will gather, since it will combine across individual features to derive the invariant stimulus features that provide the basis for fingerprint examinations done by human experts. The ICA analysis is very sensitive to spatial location, and while cells in V1 are likely also highly position sensitive, the measured basis functions are properties of the entire visual stream, not just the early stages. More recent advances in ICA techniques have addressed this issue in a similar way that the visual system has chosen to solve the problem. In addition performing data Page 16
reduction techniques to extract the fundamental basis sets, these extended ICA algorithms group the recovered components based on their energy (squared outputs).

Figure 7. ICA components from expert data, grouped by energy. This analysis allows the basis functions to have partial spatial independence, at a slight cost to image quality. This issue is less relevant for larger corpora, in which many similar features are combined within individual basis function groups.

This grouping has been shown to produce classes of basis functions that are position invariant by virtue of the fact that they include many different positions for each fundamental feature type. The examples shown in Figure 7 were generated by this technique, which reduces the reliance on spatial location. It groups the recovered features by class and accounts for the fact that a feature at one position has properties similar to the same feature at nearby positions. Note that the features in Figure 7 are less localized than those typically found with ICA decompositions, which may be due to the large correlational structure inherent in fingerprints, although this remains an open question addressed by this proposal. The development of ICA approaches is an ongoing field, and we anticipate that the results of the proposed research will help extend these models as we develop our own extensions based on their application to fingerprint expertise. There are several ways in which the recovered components can be used to evaluate the choice of positions by experts (which, along with the image, ultimately determine the basis functions). First, one can visually inspect the sets of basis functions recovered from datasets produced by experts and compare them with a set generated from random window locations. A second technique can be used to demonstrate that experts do indeed possess a feature set that differs from a random set.
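To make the decomposition concrete, the patch-to-basis-function step can be sketched with scikit-learn's FastICA. Everything here is a stand-in: the random patches replace the moving-window corpus, and the patch size and component count are illustrative, not the values the project will use.

```python
import numpy as np
from sklearn.decomposition import FastICA

rng = np.random.default_rng(0)

# Stand-in for the moving-window corpus: each row is one flattened
# 8x8 pixel patch centered on a location the expert inspected.
# The real corpus would contain tens of thousands of such rows.
patches = rng.standard_normal((500, 64))

# Estimate 16 independent components; each column of mixing_,
# reshaped to 8x8, is one basis function like those in Figure 6.
ica = FastICA(n_components=16, random_state=0, max_iter=1000)
activations = ica.fit_transform(patches)   # per-patch coefficients
basis = ica.mixing_.T.reshape(16, 8, 8)    # basis functions

# ICA is linear, so each patch is approximately a weighted sum of
# the basis functions; the residual measures reconstruction error.
recon = activations @ ica.mixing_.T + ica.mean_
err = float(np.mean((patches - recon) ** 2))
```

The same reconstruction error, computed under a basis fit jointly to expert and random-window patches, supports the expert-versus-random comparison developed in the next paragraph: if expert windows share a common feature set, their residual should be smaller.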
The data from random windows and experts can be combined to
produce a common set of components (basis functions). ICA is a linear technique, and thus the original data for both experts and random windows can be recovered through weighted sums of the components, with some error if only some of the components are retained. If experts share a common set of features that is estimated by ICA, then their data should be recovered with less error than that of the random windows. This would demonstrate that an important component of expertise is the ability to take a high-dimensional dataset (as produced by noisy images) and reduce it to fundamental features. From this perspective, visual expertise is data reduction. These data reduction techniques also serve a separate purpose. Many of the experiments described in other sections of this proposal depend on specifying particular features. While initial estimates of the relevant features can be made on the basis of discussions with fingerprint experts, we anticipate that the results of the ICA analysis will help refine our view of what constitutes an important feature in the context of fingerprint matching. The moving window procedure has the disadvantage of being very localized, due to the nature of the small moving window. There is a fundamental tradeoff between the size of the window and the spatial acuity of the procedure: if the window is made too large, we know less about the regions from which the user is attempting to acquire information. To offset this, we give the user the opportunity to view quick flashes of the full image, enough to provide an overview of the prints but not enough to allow matches of specific regions. We will also conduct the studies using large and small windows to see whether the nature of the recovered components changes with window size.

C.4.
Starting Small: Guiding feature extraction with expert knowledge

Feature extraction procedures attempt to take a high-dimensional space and use the redundancies in that space to derive a lower-dimensional representation that combines across the redundancies to provide a basis set. This basis set can be thought of as the fundamental feature set, and the development of this set can be thought of as one mechanism underlying human expertise. The difficulty with these high-dimensional spaces is that algorithms that attempt to
uncover the feature set through iterative procedures, such as independent component analysis or neural networks, may fall into local minima and fail to converge on a global solution. One solution that has been proposed in the human developmental literature is that of starting small (Elman, 1993). In this technique, programmers initially restrict the inputs to statistical models to provide general kinds of information rather than the specific information that would lead to learning of specific instances. As a network matures, more specific information is added, which allows the network to avoid falling into local minima that represent non-learned states. While the exact nature of these effects is still being worked out (Rohde & Plaut, 1999), recent work has provided empirical support in the visual domain (Conway, Ellefson & Christiansen, ref). This suggests that we might use the temporal component of the data from experts in the moving window paradigm to help guide the training of our networks. As an expert views a print, they are initially likely to focus on broad, overall types of information.

C.5. Automatic detection of regions of interest using expert knowledge

In both fingerprint classification (e.g. Dass & Jain, 2004; Jain, Prabhakar & Hong 1999; Cappelli, Lumini, Maio & Maltoni, 1999) and fingerprint identification (e.g. Pankanti, Prabhakar & Jain, 2002; Jain, Prabhakar & Pankanti, 2002) applications, an automatic system has two main components: (1) feature extraction and (2) a matching algorithm that compares (or classifies) fingerprints based on the feature representation. Feature extraction is the first step, converting raw images into feature representations. The goal is to find robust, invariant features that can cope with the varied conditions of real-world applications, such as illumination, orientation and occlusion.
Given a whole fingerprint image, most fingerprint recognition systems use the location and direction of minutiae as features for pattern matching. In our preliminary study of human expert behavior, we observe that experts focus on just parts of the images (regions of interest, or ROIs), as shown in Figure XX, suggesting that it is not necessary for a human expert to check every minutia in a fingerprint. A small subset of minutiae seems to be sufficient for the human expert to make a judgment. What regions are useful for matching among all the
minutiae in a fingerprint? Is it possible to build an automatic ROI detection system that achieves performance similar to a human expert's?

Figure X. Overview of automatic detection of regions of interest. The red regions in the fingerprints indicate where the human expert focused during the pattern matching task.

We attempt to answer this question by building a classification system based on training data captured from human experts. Given a new image, the detection system automatically detects and labels regions of interest for matching purposes. We expect that most regions selected by our system will be minutiae, but we also expect the system to discover structural regularities in non-minutia regions that have been overlooked in previous studies. Unlike previous studies of minutiae detection (e.g. Maio & Maltoni, 1997), our automatic detection system will not simply detect minutiae in a fingerprint but will focus on detecting both a small set of minutiae and other regions useful for the matching task. Given the difficulties in fingerprint recognition, building this automatic detection system is challenging. However, we are confident that the proposed research will take the first steps toward success and make important contributions. This confidence rests on two factors that distinguish our work from other studies: (1) we will record the detailed behaviors of human experts (e.g. where they look during a matching task) and recruit the knowledge extracted from these experts to build a pattern recognition system; and (2) we will apply state-of-the-art machine learning techniques to efficiently encode both expert knowledge and the regularities in fingerprint data. The combination
of these two factors will allow us to achieve this research plan. To build this kind of system, we need to develop a machine learning algorithm and estimate its parameters from training data. Using the moving window paradigm (described in C.3), we collect information about where a human expert examines from moment to moment while performing a matching task. Hence, the expert's visual attention and behavior (moving the window) can be used as labels for regions of interest, providing the teaching signals for a machine learning algorithm. In the proposed research, we will build an automatic detection system that captures the expert's knowledge to guide the detection of regions in a fingerprint that are useful for pattern matching. We will use the data collected from C.X. Each circular area examined by the expert is filtered by a bank of Gabor filters. Specifically, Gabor filters with three scales and five orientations are applied to the segmented image. It is assumed that the local texture regions are spatially homogeneous, and the mean and standard deviation of the magnitudes of the transform coefficients are used to represent an object as a 48-dimensional feature vector. We reduce these high-dimensional feature vectors to vectors of dimensionality 10 by principal component analysis (PCA), which represents the data in a lower-dimensional subspace by pruning away the dimensions with the least variance. We also randomly sample areas to which the expert does not attend and code them with a Non-ROI label, paired with the feature vectors extracted from those areas. In total, the training data consist of two groups of labeled features: ROI and Non-ROI. Next, we will build a binary classifier based on support vector machines (SVMs). SVMs have been successfully applied to many classification tasks (Vapnik 1995; Burges 1998).
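As a concrete sketch of the feature-extraction stage, the following code builds a small Gabor filter bank and records the mean and standard deviation of the response magnitudes, followed by PCA to 10 dimensions. The kernel parameters and random 40x40 patches are illustrative assumptions; with three frequencies, five orientations and two statistics per filter this sketch yields 30-dimensional vectors, whereas the bank described above produces 48.

```python
import numpy as np
from sklearn.decomposition import PCA

def gabor_kernel(freq, theta, size=15, sigma=3.0):
    """Complex Gabor kernel: an oriented sinusoid under a Gaussian envelope."""
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1]
    x_rot = x * np.cos(theta) + y * np.sin(theta)
    envelope = np.exp(-(x**2 + y**2) / (2.0 * sigma**2))
    return envelope * np.exp(2j * np.pi * freq * x_rot)

def gabor_features(patch, freqs=(0.1, 0.2, 0.3), n_orient=5):
    """Mean and std of response magnitude for each scale/orientation pair."""
    feats = []
    for f in freqs:
        for k in range(n_orient):
            kern = gabor_kernel(f, k * np.pi / n_orient)
            # filter in the frequency domain (kernel zero-padded to patch size)
            resp = np.abs(np.fft.ifft2(np.fft.fft2(patch) *
                                       np.fft.fft2(kern, s=patch.shape)))
            feats += [resp.mean(), resp.std()]
    return np.array(feats)

rng = np.random.default_rng(1)
windows = rng.random((60, 40, 40))                   # stand-in 40x40 windows
X = np.stack([gabor_features(w) for w in windows])   # 30-dim here

# PCA to 10 dimensions, pruning the directions with the least variance
X_reduced = PCA(n_components=10, random_state=0).fit_transform(X)
```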
An SVM trains a linear separating plane for classifying data by maximizing the margins of two parallel planes on either side of the separating one. The central idea is to nonlinearly map the input vector into a high-dimensional feature space and then construct an optimal hyperplane for separating the features. This decision hyperplane depends on only a subset of the training data, called the support vectors.
For a set of $n$-dimensional training examples $X = \{x_i\}_{i=1}^{m}$, labeled by the expert's visual attention as $\{y_i\}_{i=1}^{m}$, and a mapping of the data into $q$-dimensional vectors $\phi(X) = \{\phi(x_i)\}_{i=1}^{m}$ by a kernel function, where $q \gg n$, an SVM can be built on the mapped training data by solving the following optimization problem:

Minimize over $(w, b, \xi_1, \ldots, \xi_m)$ the cost function $\frac{1}{2} w^T w + C \sum_{i=1}^{m} \xi_i$

subject to $y_i (w^T \phi(x_i) + b) \ge 1 - \xi_i$ and $\xi_i \ge 0$ for all $i = 1, \ldots, m$,

where $C$ is a user-specified constant controlling the penalty on the violation terms denoted by each $\xi_i$. The $\xi_i$ are called slack variables; they measure the deviation of a data point from the ideal condition of pattern separability. After training, $w$ and $b$ constitute the classifier $y = \mathrm{sign}(w^T \phi(x) + b)$.

Compared with other approaches used in fingerprint recognition, such as neural networks and k-nearest neighbors, SVMs have proved more effective in many classification tasks. In addition, we first transform the original features into a lower-dimensional space using PCA; the purpose of this first step is to deal with the curse of dimensionality. We then map the data points into another, higher-dimensional space so that they become linearly separable. By doing so, we convert the original pattern recognition problem into a simpler one. This idea is in line with kernel-based nonlinear PCA (Scholkopf, Smola & Muller 1998), which has been successfully used in several fields (e.g. Wu, Su & Carpuat 2004). Given a new test fingerprint, we will shift a 40x40 window over the image and classify the patches at each location and scale. The system will first extract Gabor-based features from local patches, which are the input to the detector. The detector will label each region as either ROI or Non-ROI. We expect that most ROIs will be minutiae.
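A minimal sketch of the training and sliding-window labeling steps, assuming scikit-learn; the synthetic 10-dimensional vectors stand in for the PCA-reduced Gabor features, and the RBF kernel plays the role of the nonlinear map $\phi$:

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

rng = np.random.default_rng(2)

# Stand-in training data: feature vectors for windows the expert
# inspected (ROI, label 1) and randomly sampled windows (Non-ROI,
# label 0), drawn from shifted distributions so the sketch is learnable.
n = 200
X = np.vstack([rng.standard_normal((n, 10)) + 1.5,
               rng.standard_normal((n, 10)) - 1.5])
y = np.array([1] * n + [0] * n)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

clf = SVC(kernel="rbf", C=1.0).fit(X_tr, y_tr)   # soft-margin SVM
accuracy = clf.score(X_te, y_te)

# Detection on a new print: extract features at each 40x40 window
# position (placeholder vectors here) and label each window.
window_features = rng.standard_normal((5, 10)) + 1.5
roi_labels = clf.predict(window_features)        # 1 = ROI, 0 = Non-ROI
```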
Unlike methods based on minutiae matching, we expect that only a small subset of minutiae is utilized by human experts. Moreover, we expect the system to detect some areas that are not defined as minutiae but to which human experts nevertheless attend during the matching task. Thus, the ROI detector we develop will go beyond the standard approach in fingerprint recognition (minutiae extraction and
matching). By efficiently encoding the knowledge of human experts, the proposed system will have the opportunity to discover statistical regularities in fingerprints that have been overlooked in previous studies.

C.6. Using expert-identified correspondences to extract environmental models

In our moving window paradigm, a human expert moves the window back and forth between inked and latent fingerprints to perform pattern matching. We propose that the dynamic behavior of the expert provides additional signals indicating one-to-one correspondences between the two images. In light of this, our hypothesis is that an expert's decision is based on the comparison of these one-to-one patches. Therefore, these expert-identified correspondences can serve as additional information for finding the regularities in fingerprints and building the automatic detection system. We propose to use this knowledge as a prior over the training data. We observe that not all focused regions in the latent print have corresponding regions in the inked print; thus, the one-to-one pairs likely play a more important role in pattern matching than other regions of interest. Based on this observation, we propose to maintain a set of weights over the training data. More specifically, for each ROI in the latent image, we find the most likely pairing patch in the inked image. Two constraints guide the search for the matching pair. The temporal constraint is based on the expert's behavior: for instance, the patch in the inked print that the expert examines immediately after looking at an ROI in the latent image is more likely to be associated with that ROI. The spatial constraint selects the patch in the inked image with the highest similarity to the patch in the latent image.
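One way to turn the two constraints into per-ROI weights is sketched below; the equal blend of the temporal and spatial terms and the normalized-correlation similarity are our illustrative assumptions, not a committed design:

```python
import numpy as np

def patch_similarity(a, b):
    """Normalized correlation between two equally sized patches."""
    a = (a - a.mean()) / (a.std() + 1e-9)
    b = (b - b.mean()) / (b.std() + 1e-9)
    return float((a * b).mean())

def correspondence_weights(latent_rois, inked_patches, next_visited):
    """
    Weight each latent ROI by its best match in the inked print.
    Spatial term: highest similarity over all inked patches.
    Temporal term: bonus when the expert examined that inked patch
    immediately after looking at the latent ROI.
    """
    weights = []
    for i, roi in enumerate(latent_rois):
        sims = [patch_similarity(roi, p) for p in inked_patches]
        best = int(np.argmax(sims))
        temporal = 1.0 if next_visited.get(i) == best else 0.0
        weights.append(0.5 * sims[best] + 0.5 * temporal)
    return np.array(weights)

rng = np.random.default_rng(3)
latent = [rng.random((8, 8)) for _ in range(4)]
# Inked patch 0 is a noisy copy of latent ROI 0, so they should pair up.
inked = [latent[0] + 0.05 * rng.random((8, 8))] + \
        [rng.random((8, 8)) for _ in range(3)]
order = {0: 0}   # expert viewed inked patch 0 right after latent ROI 0
w = correspondence_weights(latent, inked, order)
```

Such weights can then be handed to an SVM implementation that supports per-example penalties, e.g. the `sample_weight` argument of scikit-learn's `SVC.fit`.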
In this way, each ROI in the latent image can be assigned a weight indicating the probability of mapping that region to a region in the other image. With this set of weighted training data, we will apply an SVM-based algorithm (briefly described in C.5) that focuses on the paired samples (those with high weights) in the training data. More specifically, we replace the constant $C$ in the standard SVM with a set of variables $c_i$, each of which corresponds to the weight of a data point. Accordingly, the new
objective function is $\frac{1}{2} w^T w + \sum_{i=1}^{m} c_i \xi_i$. Thus, the matching regions receive larger penalties if they are nonseparable points, while other regions receive less attention because they are more likely to be irrelevant to the expert's decision. In this way, the parameters of the SVM are tuned to favor the regions in which human experts are especially interested. By encoding this knowledge in a machine learning algorithm, we expect this method to achieve better performance by closely imitating the expert's decisions.

C.7. Dependencies between global and local information: The role of gist information

Fingerprints are categorized into several classes, such as whorl, right loop, left loop, arch, and tented arch in the Henry classification system (Henry 1900). In the literature, researchers use only 4-7 classes in automatic classification systems, because the task of determining a fingerprint's class can be difficult. For example, it is hard to find robust features from raw images that aid classification while exhibiting low variation within each class. In C.5 and C.6, we discussed how to use expert knowledge to find useful features for pattern matching. Taking a bigger-picture view of feature detection and fingerprint classification in this section, we find that we must deal with a chicken-and-egg problem: (1) useful local features can predict fingerprint classes; and (2) a specific fingerprint class can predict what kinds of local regions are likely to occur in that type of fingerprint. In contrast, standalone feature detection algorithms (e.g. those in C.5 and C.6) usually look at local pieces of the image in isolation when deciding whether a patch is a region of interest.
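The global half of this loop, predicting the fingerprint type from whole-image information and normalizing one-vs-all classifier outputs into type probabilities, can be sketched as follows. The three synthetic "gist" clusters and the use of Platt-scaled SVC scores as per-type scores are illustrative assumptions:

```python
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(4)

# Synthetic gist vectors: one global feature vector per fingerprint,
# clustered by type (say whorl, loop, arch) so the sketch is learnable.
centers = 4.0 * np.eye(3)
X_gist = np.vstack([rng.standard_normal((50, 3)) + centers[t]
                    for t in range(3)])
y_type = np.repeat(np.arange(3), 50)

# One one-vs-all SVM per type; Platt scaling (probability=True) turns
# each classifier's output into a per-type score.
scores = []
for t in range(3):
    clf = SVC(kernel="rbf", probability=True, random_state=0)
    clf.fit(X_gist, (y_type == t).astype(int))
    scores.append(clf.predict_proba(X_gist)[:, 1])   # score for "is type t"
scores = np.column_stack(scores)

# Normalize across types so each row is a distribution over types.
p_type = scores / scores.sum(axis=1, keepdims=True)
```

The resulting type distribution can then gate the per-region classifiers, completing the local-global loop developed in the remainder of this section.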
In machine learning, Murphy, Torralba and Freeman (2003) proposed a conditional random field for jointly solving the tasks of object detection and scene classification. In light of this, we propose to use the whole-image context as an extra source of global information to guide the search for ROIs. In addition, a better set of
ROIs will also potentially make the classification of the whole fingerprint more accurate. Thus, the chicken-and-egg problem is tackled by a bootstrapping procedure in which local and global pattern recognition systems interact with and boost each other. We propose a machine learning system based on graphical models (Jordan 1999), as shown in Figure XX. We define the gist of an image as a feature vector extracted from the whole image, treating it as a single patch. The gist is denoted by $v_G$. We then introduce a latent variable $T$ describing the type of fingerprint. The central idea of our graphical model is that ROI presence is conditionally independent given the type, and the type is determined by the gist of the image. Thus, our approach encodes contextual information on a per-image basis instead of extracting detailed correlations between different kinds of ROIs (e.g. a fixed prior such as patch A always occurring to the left of patch B), because of the complexity and variation of such detailed descriptions. Next, we need to classify fingerprint types. We will simply train a one-vs-all binary SVM classifier to recognize each fingerprint type from the gist, and then normalize the results:

$p(T = t \mid v_G) = \dfrac{p(T_t = 1 \mid v_G)}{\sum_{t'} p(T_{t'} = 1 \mid v_G)}$

where $p(T_t = 1 \mid v_G)$ is the output of the $t$-th one-vs-all classifier. Once the fingerprint type is known, we can use this information to facilitate ROI detection. As shown in the tree-structured graphical model in Figure XX, the conditional joint density can be expressed as follows:

$p(T, R_1, \ldots, R_N \mid v) = \frac{1}{z}\, p(T \mid v_G) \prod_i p(R_i \mid T, v_i) = \frac{1}{z}\, p(T \mid v_G) \prod_i \sum_t p(T_t)\, p(R_i \mid T_t, v_i)$
where $v_G$ and $v_i$ are global and local features, respectively, $R_i$ is the class of a local patch, and $z$ is a normalizing constant. In the proposed research we will investigate two types of $R$. One classification defines ROI and Non-ROI types, the same as in C.5 and C.6; the other defines several minutia types (plus Non-ROI), such as termination minutiae and bifurcation minutiae. Based on this graphical model, we will be able to use contextual knowledge to facilitate the classification of a local image. We also plan to develop a more advanced model that will use local information to facilitate fingerprint type classification. We expect this kind of approach to lead to a more effective automatic system that can perform both top-down inference (fingerprint types to minutia types) and bottom-up inference (minutia types to fingerprint types).

C.8. Summary of quantitative approaches

General themes:
Incorporate expert knowledge
Links between global and local structure made possible by input from experts
Specification of an elemental basis or feature set
Classifying the informativeness of regions
Defining an intermediate level between low-level feature extractors and high-level gist or configural information

D. Implications for knowledge and practice

The implications of the knowledge gained from the results of these studies and analyses fall into four broad categories, each of which is discussed below.

D.1. Implications for a quantitative understanding of the information content of fingerprints
D.2. Implications for an understanding of the links between quantitative information content and the latent print examination process

D.3. Implications for the classification and filtering of poor-quality latent prints

D.4. Implications for the development of software-based tools to assist human-based latent print examinations and training

E. Management plan and organization

F. Dissemination plan for project deliverables

Scientific articles, presentations at machine learning conferences and fingerprint conferences, and proof-of-concept Java-based applets.
G. Description of estimated costs

Personnel
The project will be co-directed by Thomas Busey and Chen Yu. We request 11 weeks of summer support, during which time both will devote 100% of their effort to the project. Benefits are calculated at 19.81%. Salaries are incremented 3% per year. Many of the simulations will be conducted by a graduate student who will be hired specifically for this project. This student, likely an advanced computer science student with a background in cognitive science, requires a stipend, a fee remission and health insurance. The health insurance is incremented at 5% per year. Subject coordination and database management will be coordinated by hourly students who will work 20 hours/wk on the project at $10/hr.

Consultant
John Vanderkolk, with whom Busey has worked for the past two years, has agreed to serve as an unpaid consultant on this grant. He does require modest travel costs when he visits Bloomington.

Travel
Money is requested to bring in four experts for testing using the eye movement recording equipment. These costs will total approximately $1500/yr. Money is also requested for three conferences a year. These will enable the investigators to travel to conferences such as Neural Information Processing Systems (NIPS) and forensic science conferences such as the International Association for Identification (IAI) to interact with colleagues and share the results of our analyses. These trips serve an important role in communicating the efforts of this grant to a wider audience.

Other Costs

Equipment
This research is very computer-intensive, and thus we require a large UNIX-based server to run simulations in parallel. In addition, we require three PC-based workstations to run Matlab and
other simulation programs. Finally, conferences such as IAI and local Society for Identification meetings provide an ideal place to gather data from experts, and thus we require a portable computer for onsite data gathering. We anticipate that up to half of our data can be collected using these on-site techniques, which are preferable because we have control over the monitor and software. Thus the laptop computer represents a good investment in the success of the project.

Other costs
The graduate student line requires a fee remission each year, incremented at 5% per year. The results of our studies require resources to reach a wide audience, and thus we request dissemination funds to cover the costs of publication and web-based dissemination. This project is highly image-intensive, and we require money to purchase image-processing software and upgrades, including packages such as Adobe Photoshop as well as new image-processing packages as they become available. We will test 80 subjects a year to obtain the necessary data for use in our statistical applications. Each subject requires $20 for the approximately 90-minute testing period. The project will consume supplies of approximately $100/month, for items such as backups, power supplies, etc.

Indirect Costs
The indirect rate negotiated between Indiana University and the federal government is set at 51.5%. This rate is assessed against all costs except the fee remission. It was negotiated with DHHS on 5/14/04.

G. Staffing plan and Resources
Both Busey and Chen maintain laboratories in the Department of Psychology at Indiana University, each containing approximately 700 sq. feet of space. These have subject running rooms, offices and space for servers.
Chen's lab contains an eye movement recording setup that is sufficient for the eye movement portion of the experiments. Both investigators have offices in
the Psychology department as well. We will recruit a graduate student from the Computer Science or Psychology programs at Indiana University. This student must have experience with machine learning algorithms at a theoretical level and must also be an expert programmer. They will work 20 hrs/wk. We will also recruit two hourly undergraduate students to coordinate subject running, data analysis and server maintenance. They will also be responsible for managing the data repository site where our data will be accessible to other researchers who wish to integrate human expert knowledge into their networks. The bulk of the theoretical work will be handled by Chen and Busey, while the graduate student will work on implementation and model testing.

H. Timeline
This is a multi-year project designed to alternate between acquiring human data and using it to refine the quantitative analyses of latent and inked prints.

Year 1: Acquire the necessary fingerprint databases. Begin testing 80 experts on 72 different latent/inked print pairs. Program the Support Vector and global-local models. Test 2 experts on the eye movement equipment using all 72 prints.

Year 2: Test an additional 80 experts on 72 new latent/inked prints. Begin model fitting and refinement. Test 2 experts on the eye movement equipment using all 72 prints. Compare results from the eye movement studies and the moving window studies.

Year 3: Test the final 80 experts on 72 new latent/inked prints. Develop new versions of the statistical models based on prior results. Put the entire database online for use by other researchers. Disseminate results to peer-reviewed journals.