Model-Guided Discovery for Intelligence Analysis

Model-Guided Information Discovery
for Intelligence Analysis
Rafael Alonso, Hua Li
Sarnoff Corporation
201 Washington Road
Princeton, NJ 08543
+1 (609) 734-2172
{ralonso, hli}@sarnoff.com
ABSTRACT
Intelligence analysis can be aided and guided by models of the
analysts’ interests and priorities. This paper describes our ap-
proach to analyst modeling as part of the Ant CAFÉ project, in
which analyst models are used to guide the searching behavior of
a swarm of intelligent agents. Structural elements of our analyst
model include concepts and relations, both of which help to capture
the analyst’s current interests and concerns. In addition, the
concepts and relationships have associated scalar parameters to
provide a quantitative measure of the user’s level of interest. We
have developed algorithms for dynamically adapting the weights
and evolving the elements of the model itself. To evaluate these
algorithms we have built an Analyst Modeling Environment
workbench. We have tested our approach on this workbench using
traces generated by human analysts, and have demonstrated
improvements over current state-of-the-art search engines.
Categories and Subject Descriptors
I.2.6 [Learning]: Knowledge Acquisition; H.3.3 [Information
Search and Retrieval]: Search Process, Clustering, Information
Filtering.
General Terms
Algorithms, Human Factors, Performance, Experimentation.
Keywords
Adaptive Search, User Modeling, IR, Intelligence Analysis.
1. INTRODUCTION
When searching for information in massive data collections,
intelligence analysts face a number of obstacles, including
inadequate examination of evidence [5] and the limitations of human
cognitive capacity [6]. Conventional information retrieval (IR)
systems offer limited help because they disregard personalization.
In contrast, our approach to alleviating analysts’ problems is to
dynamically model analysts’ interests and use the models to search
and filter existing information. As a result, only relevant
information is brought to analysts’ attention, and analysts can
invest more of their valuable time in what they do best, i.e.,
thinking about the meaning and implications of the facts.
This work describes the Analyst Modeling Environment (AME),
which forms the front-end of the Ant CAFÉ (Composite Adaptive
Fitness Evaluation) system. In this system, the user models built
by AME guide a search backend based on swarming mechanisms
being built by the Altarum Institute [8].
2. USER MODELING IN AME
The analyst model automatically captures a composite view of an
analyst's interests and preferences related to some tasking. It is
derived based on prior and tacit information and on analyst ac-
tions. After initialization, the model adapts automatically based
on the analyst's interactions with the system.
The analyst model is visually represented as a concept map, a
structure commonly employed for knowledge organization and
representation [3]. In this paper, we define a concept map as a
diagram consisting of concepts, relations, and their associated
weights (Figure 1). A concept denotes an idea or a general notion
regarding the task and can usually be mapped to a node in an
ontology. A relation denotes the relationship between two concepts.
The weight (between 0 and 1) associated with a concept or relation
denotes the user’s interest level, with a higher weight indicating
higher importance and relevance to the analyst.
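As a rough illustration of the structure just described, the sketch below stores concepts and relations with weights in [0, 1]. The class name `ConceptMap`, its methods, the default weight of 0.5, and the example concepts are all assumptions made for illustration; the paper does not specify AME's internal representation.

```python
from dataclasses import dataclass, field


def _check_weight(weight: float) -> float:
    """Interest levels are scalars between 0 and 1."""
    if not 0.0 <= weight <= 1.0:
        raise ValueError(f"weight must be in [0, 1], got {weight}")
    return weight


@dataclass
class ConceptMap:
    """Hypothetical container for an analyst model (not AME's actual class)."""
    # concept name -> interest level
    concepts: dict[str, float] = field(default_factory=dict)
    # (source concept, relation label, target concept) -> interest level
    relations: dict[tuple[str, str, str], float] = field(default_factory=dict)

    def add_concept(self, name: str, weight: float = 0.5) -> None:
        self.concepts[name] = _check_weight(weight)

    def add_relation(self, source: str, label: str, target: str,
                     weight: float = 0.5) -> None:
        # A relation links two concepts that are already in the map.
        for endpoint in (source, target):
            if endpoint not in self.concepts:
                raise KeyError(f"unknown concept: {endpoint}")
        self.relations[(source, label, target)] = _check_weight(weight)

    def top_concepts(self, k: int) -> list[str]:
        """Return the k concepts with the highest interest level."""
        ranked = sorted(self.concepts.items(),
                        key=lambda item: (-item[1], item[0]))
        return [name for name, _ in ranked[:k]]


# Made-up example tasking.
model = ConceptMap()
model.add_concept("smuggling", 0.9)
model.add_concept("port", 0.6)
model.add_concept("weather", 0.1)
model.add_relation("smuggling", "occurs_at", "port", 0.7)
```

Keeping relations keyed by their two endpoint concepts makes it straightforward to reject a relation whose concepts are not yet part of the model.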
The analyst model forms the context for the current task and
provides a source for automatic query enrichment in the process of
query contextualization, even before the query is sent to a search
engine [2]. This process addresses the problem of context mismatch,
in which the contexts that IR users have in mind for their queries
do not match those of content authors. It thus differs from query
expansion, which addresses the problem of word mismatch [9].
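One simple way to picture query enrichment from the analyst model is to append the highest-weighted model terms that the query does not already contain. The sketch below does only that; the function name `contextualize_query`, the `max_terms` and `min_weight` parameters, and the sample weights are assumptions, and the actual contextualization procedure is the one described in [2].

```python
def contextualize_query(query: str, interests: dict[str, float],
                        max_terms: int = 2, min_weight: float = 0.5) -> str:
    """Append the strongest model terms that are missing from the query.

    `interests` maps a concept name to its interest level in [0, 1].
    """
    query_terms = query.lower().split()
    candidates = [
        (term, weight) for term, weight in interests.items()
        if weight >= min_weight and term.lower() not in query_terms
    ]
    # Highest interest first; ties broken alphabetically for determinism.
    candidates.sort(key=lambda item: (-item[1], item[0]))
    extra = [term for term, _ in candidates[:max_terms]]
    return " ".join([query] + extra)


# Made-up analyst model.
interests = {"smuggling": 0.9, "port": 0.6, "weather": 0.1, "cargo": 0.8}
enriched = contextualize_query("container ship", interests)
print(enriched)  # container ship smuggling cargo
```

The enriched string is what would be sent to the search engine in place of the analyst's original query.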
The analyst model can be manually initialized by the user. Alter-
natively, AME can automatically extract keywords and relation-
ships from the text description of analyst taskings and insert them
with default interest levels into the model. In this case, the analyst
can override any elements as appropriate.
Figure 1. A concept map example.
Copyright is held by the author/owner(s).
CIKM’05, October 31–November 5, 2005, Bremen, Germany.
ACM 1-59593-140-6/05/0010.
AME adapts the initialized user model with user relevance feedback,
either implicit (e.g., document selection and browsing) or explicit
(e.g., rating a document). The interest levels in the model are
adapted using an extended version of our Bubble Up algorithm [1].
The revised algorithm works with concept map models and textual
data.
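The details of the extended Bubble Up algorithm are given in [1] and are not reproduced here, so the following is not that algorithm. It is only a generic relevance-feedback update, included to make concrete what "adapting interest levels" can mean: weights of concepts that occur in a document judged relevant move toward 1, and toward 0 for a document judged irrelevant. The function name, the learning rate `rate`, and the whitespace tokenization are assumptions.

```python
def adapt_interest(interests: dict[str, float], document_text: str,
                   relevant: bool = True, rate: float = 0.2) -> dict[str, float]:
    """Generic feedback update (an illustration, not Bubble Up).

    Concepts mentioned in the document are nudged toward 1.0 when the
    document is relevant and toward 0.0 otherwise; others are unchanged.
    """
    tokens = set(document_text.lower().split())
    target = 1.0 if relevant else 0.0
    updated = {}
    for concept, weight in interests.items():
        if concept.lower() in tokens:
            weight = weight + rate * (target - weight)
        # Keep every interest level inside [0, 1].
        updated[concept] = min(1.0, max(0.0, weight))
    return updated


interests = {"port": 0.5, "weather": 0.5}
# Implicit feedback: the analyst followed a link to this document.
updated = adapt_interest(interests, "Cargo seized at the port")
```

With these made-up values, the weight of "port" rises from 0.5 to 0.6 while "weather" stays at 0.5.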
AME uses the Implicit Ontology-based Model Expansion algorithm to
adapt the structure of the model. Concepts and relations are
automatically expanded with WordNet [7] hyponyms and hypernyms.
Tentative additions whose interest levels exceed a threshold in
subsequent adaptation are formally admitted to the model.
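The expand-then-admit cycle can be sketched as follows. To keep the example self-contained, a tiny hand-written dictionary stands in for WordNet lookups; that dictionary, the function names, the initial tentative weight, and the threshold value are all assumptions rather than details of the Implicit Ontology-based Model Expansion algorithm.

```python
# Hypothetical stand-in for WordNet: term -> hypernyms and hyponyms.
TOY_LEXICON = {
    "ship": ["vessel", "freighter", "tanker"],
}

ADMIT_THRESHOLD = 0.5   # assumed admission threshold
TENTATIVE_WEIGHT = 0.1  # assumed starting interest for tentative concepts


def expand(model: dict[str, float], tentative: dict[str, float],
           lexicon: dict[str, list[str]]) -> None:
    """Add lexicon neighbours of admitted concepts as tentative concepts."""
    for concept in list(model):
        for neighbour in lexicon.get(concept, []):
            if neighbour not in model and neighbour not in tentative:
                tentative[neighbour] = TENTATIVE_WEIGHT


def admit(model: dict[str, float], tentative: dict[str, float],
          threshold: float = ADMIT_THRESHOLD) -> None:
    """Move tentative concepts whose weight exceeds the threshold into the model."""
    for concept in [c for c, w in tentative.items() if w > threshold]:
        model[concept] = tentative.pop(concept)


model = {"ship": 0.8}
tentative: dict[str, float] = {}
expand(model, tentative, TOY_LEXICON)
# Suppose later adaptation raised interest in "freighter" only.
tentative["freighter"] = 0.7
admit(model, tentative)
```

After this run, "freighter" has been admitted to the model, while "vessel" and "tanker" remain tentative until feedback raises their weights.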
3. AME EVALUATION
AME was implemented in Java as a client-server Web application. The
application takes the appearance of an IR system and is employed to
evaluate model adaptation algorithms. The evaluation compared the
performance of Google search with and without the aid of AME. The
data came from ARDA’s NIMD program and consisted of the analysis
activities of volunteer intelligence analysts [4]. We were
interested in search sagas, specific sequences of Web search events
present in the data. A search saga consists of a series of chapters.
A chapter is composed of the following sequential events: a search
query, a results page from the search engine, and a sequence of link
follow-ups. For the evaluation, we extracted 131 usable sagas of
Google search activities.
The performance metrics are precision and recall. Precision is
defined as the number of relevant results relative to the total
number of results in a given chapter. Recall is the number of
relevant results found in a given chapter relative to the total
number of relevant results in the saga from which the chapter came.
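The two definitions translate directly into code. In the sketch below the function names and the document identifiers are made up; results are assumed to be unique within a chapter.

```python
def precision(chapter_results: list[str], saga_relevant: set[str]) -> float:
    """Relevant results in the chapter divided by all results in the chapter."""
    if not chapter_results:
        return 0.0
    hits = sum(1 for doc in chapter_results if doc in saga_relevant)
    return hits / len(chapter_results)


def recall(chapter_results: list[str], saga_relevant: set[str]) -> float:
    """Relevant results in the chapter divided by all relevant results in the saga."""
    if not saga_relevant:
        return 0.0
    hits = len(set(chapter_results) & saga_relevant)
    return hits / len(saga_relevant)


# One chapter returned four documents; the whole saga has five relevant ones.
chapter_results = ["d1", "d2", "d3", "d4"]
saga_relevant = {"d1", "d3", "d7", "d8", "d9"}

print(precision(chapter_results, saga_relevant))  # 0.5
print(recall(chapter_results, saga_relevant))     # 0.4
```

Note that the recall denominator is taken over the whole saga, so a chapter can have high precision and still low recall.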
We have two experimental conditions: “Google” alone and
“AME+Google”. Calculating the metrics for each chapter in a saga
for Google alone is straightforward because the data are readily
obtainable from the saga. For AME+Google, an analyst model was
initialized with the terms extracted from the tasking description.
Whenever a link follow-up occurred, AME assumed the linked document
was relevant and adapted the model. In addition, query terms were
automatically added to the model. The analyst model was then used
to rank all documents found in the given saga. The top-ranked
documents were returned as the “AME result page”, which is the
equivalent of the “query result page” generated by Google.
Performance metrics can thus be computed as usual for every chapter
in the saga.
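The re-ranking step in the AME+Google condition can be pictured with the sketch below. The scoring rule (the sum of the weights of model terms that occur in a document) is an assumption chosen for simplicity, as are the function names, the page size, and the sample documents; the paper does not state the ranking function AME uses.

```python
def score(document_text: str, interests: dict[str, float]) -> float:
    """Assumed score: total weight of model terms present in the document."""
    tokens = set(document_text.lower().split())
    return sum(weight for term, weight in interests.items()
               if term.lower() in tokens)


def ame_result_page(saga_documents: dict[str, str],
                    interests: dict[str, float],
                    page_size: int = 10) -> list[str]:
    """Rank every document in the saga and return the top-ranked ids."""
    ranked = sorted(
        saga_documents,
        key=lambda doc_id: (-score(saga_documents[doc_id], interests), doc_id),
    )
    return ranked[:page_size]


# Made-up saga documents and analyst model.
saga_documents = {
    "d1": "port authority reports cargo smuggling",
    "d2": "weather forecast for the weekend",
    "d3": "new cargo terminal opens at the port",
}
interests = {"smuggling": 0.9, "cargo": 0.8, "port": 0.6, "weather": 0.1}

page = ame_result_page(saga_documents, interests, page_size=2)
print(page)  # ['d1', 'd3']
```

The returned list plays the role of the “AME result page”, so precision and recall can be computed on it exactly as on a Google results page.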
AME+Google outperforms Google on precision and recall
(Figure 2) with statistical significance. In addition, the precision
for AME+Google seems to improve over time (i.e., as chapter
number increases).
While the work presented here addresses the problem of helping the
analyst find additional information similar to that which he has
found useful, the central goal of our work is to help him locate
novel information that, while related to previous interests, also
contains new insights. We are currently exploring ways to avoid
cognitive biases by morphing the user model and query in a manner
similar to hypothesis generation [2].
4. ACKNOWLEDGMENTS
This study was supported and monitored by the Advanced Research and
Development Activity (ARDA) and the National Geospatial-Intelligence
Agency (NGA) under Contract Number NMA401-02-C-0020. The views,
opinions, and findings contained in this report are those of the
authors and should not be construed as an official Department of
Defense position, policy, or decision, unless so designated by other
official documentation.
5. REFERENCES
[1] Alonso, R., Bloom, J., Li, H., and Basu, C., An adaptive
nearest neighbor search for a parts acquisition ePortal, The
Proceedings of the SIGKDD'03, Washington, 2003.
[2] Alonso, R. and Li, H. Combating Cognitive Biases in In-
formation Retrieval, First International Conference on In-
telligence Analysis Methods and Tools, McLean, VA,
2005.
[3] Ausubel, D. P. The Psychology of Meaningful Verbal
Learning. New York: Grune and Stratton, 1963.
[4] Glass Box web site is at: http://glassbox.labworks.org/.
[5] Grabo, C.M. Anticipating Surprise: Analysis for Strategic
Warning. Joint Military Intelligence College, 2002.
[6] Heuer, Jr., R. J. Psychology of Intelligence Analysis. Center
for the Study of Intelligence, Central Intelligence
Agency, 1999.
[7] Miller, G.A. WordNet: an on-line lexical database. Interna-
tional Journal of Lexicography, 3(4):235-312, 1990.
[8] Weinstein, P., Parunak, H.V.D., Chiusano, P., Brueckner, S.
Agents Swarming in Semantic Spaces to Corroborate Hy-
potheses. Autonomous Agents and Multi-Agent Systems
(AAMAS 2004), Columbia University, NY, 2004.
[9] Xu, J. and Croft, W. B. 2000. Improving the effectiveness
of information retrieval with local context analysis. ACM
Trans. Inf. Syst. 18, 1 (2000), 79-112.
Figure 2. Precision, recall averaged across 131 sagas.
Google, red squares; AME+Google, blue circles.