Current Approaches to Automated Information Evaluation and their Applicability to Priority Intelligence Requirement Answering

Doctrinally, Priority Intelligence Requirements (PIRs) represent information that the commander needs to know in order to make a decision or achieve a desired effect. Networked warfare provides the intelligence officer with access to multitudes of sensor outputs and reports, often from unfamiliar sources. Counterinsurgency requires evaluating information across all PMESII-PT categories: Political, Military, Economic, Social, Infrastructure, Information, Physical Environment, and Time. How should analysts evaluate this information? NATO's STANAG (Standardization Agreement) 2022 requires that every piece of information in intelligence reports used to answer PIRs be evaluated along two independent dimensions: the reliability of its source and the credibility of the information. Recent developments in information retrieval technologies, including social search technologies, incorporate metrics of information evaluation, reliability, and credibility, such as Google's PageRank. In this paper, we survey current approaches to automatic information evaluation and explore their applicability to the information evaluation and PIR answering tasks. (Presented at Fusion 2010)

Slide Notes

  • Wolfram Alpha identifies Tupelo as the place of Elvis' birth (Elvis disambiguated as Elvis Presley) and provides additional information on the city. Reference sources are cited by title only, so they cannot easily be checked.
  • Elvis disambiguated as "Elvis Presley" by PageRank. The consensus answer is apparent by inspection, although the highest-ranking document does not contain the answer in its snippet.
  • IBM Watson trivia demo (New York Times interactive): http://www.nytimes.com/interactive/2010/06/16/magazine/watson-trivia-game.html?scp=3&sq=ibm%20watson&st=cse
  • A (partial) geographic overlay tied to a trailing-month archive of news articles. Matches are documents with a location that contain "Elvis" and "born". At least one document contains the correct answer, but there are many false hits, including an article about an Elvis-loving Episcopal priest in Alaska, and hits in France, Spain, Haiti, etc. Hits are somewhat denser around Tupelo, but not enough to indicate the answer clearly.
  • Elvis disambiguated to several Elvises. Birthplaces highlighted in each by Powerset. Uses Wikipedia data only.
  • The question was routed to a self-identified Elvis expert (who assumed Elvis = Elvis Presley), and the correct answer was returned in less than a minute. Feedback can be provided: "Was Gregory's answer helpful?" (Yes / Kind of, but not for me / No). The question was phrased this way because questions have to be over a certain length.
  • Color legend: green means an automated solution exists; yellow means the solution is partial or not wholly automated (requires human judgment); red means no automated solution.

Presentation Transcript

  • Fusion 2010: 13th International Conference on Information Fusion, EICC, Edinburgh, UK, Thursday, 29 July 2010. Current Approaches to Automated Information Evaluation and their Applicability to Priority Intelligence Requirement Answering
  • Outline
    • Overview
    • Priority Intelligence Requirements
    • Doctrine: Reliability/Credibility
    • Question-Answering Technologies
    • Conclusion/Research Gaps
    • Disclaimer
  • Overview
    • Priority Intelligence Requirement (PIR) answering requires STANAG 2022 assessments of information reliability, credibility, and independence. Each element of information used to answer a PIR should carry an assessment of the accuracy of the information provided, how credible it is, and how reliable its source is. STANAG 2022 is explicitly adopted in US and NATO doctrine.
    • There are currently no real tools for making these assessments, or for reasoning with STANAG-assessed data.
    • Contemporary commercial question-answering tools partially address some of the necessary reliability/independence/credibility issues.
    • Here we survey the state-of-the-art technologies and identify the research gaps.
  • Priority Intelligence Requirements: Doctrine
    • Priority Intelligence Requirements (PIRs) are "those intelligence requirements for which a commander has an anticipated and stated priority in his task of planning and decision making" (FM 2-0, Intelligence, section 1-32). PIRs:
    • Ask a single question.
    • Are ranked in importance.
    • Are specific: Focus on a specific event, fact or activity.
    • Are tied to a single decision or planning task the commander has to make.
    • Provide a latest time information is of value (LTIOV).
    • Are answerable using available assets and capabilities. (A minimal PIR record capturing these attributes is sketched below.)
      • McDonough, LTC W. G., and Conway, LTC J. A., "Understanding Priority Intelligence Requirements," Military Intelligence Professional Bulletin, April-June 2009.
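A minimal sketch of how the doctrinal PIR attributes above might be captured as a data record; the field names, types, and example values are illustrative assumptions, not doctrine:

```python
from dataclasses import dataclass
from datetime import datetime

@dataclass
class PIR:
    """One Priority Intelligence Requirement. Field names and the
    example below are illustrative assumptions, not doctrine."""
    question: str      # asks a single, specific question
    priority: int      # rank relative to other PIRs (1 = highest)
    decision: str      # the single decision/planning task it supports
    ltiov: datetime    # latest time information is of value

pir = PIR(
    question="Will insurgent group X attack route Y this week?",
    priority=1,
    decision="Commit the reserve to secure route Y",
    ltiov=datetime(2010, 7, 29, 18, 0),
)
```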
  • NATO STANAG 2022: each item of information is rated on two independent scales, source reliability (A-F) and information credibility (1-6).
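A minimal sketch of the two STANAG 2022 scales as they are commonly stated, and of a combined rating such as "B2"; the helper function is an illustrative assumption, not part of the standard:

```python
# The two independent STANAG 2022 rating scales, as commonly stated.
# A report element carries one code from each scale, e.g. "B2".
SOURCE_RELIABILITY = {
    "A": "Completely reliable",
    "B": "Usually reliable",
    "C": "Fairly reliable",
    "D": "Not usually reliable",
    "E": "Unreliable",
    "F": "Reliability cannot be judged",
}
INFORMATION_CREDIBILITY = {
    "1": "Confirmed by other sources",
    "2": "Probably true",
    "3": "Possibly true",
    "4": "Doubtful",
    "5": "Improbable",
    "6": "Truth cannot be judged",
}

def describe(rating: str) -> str:
    """Expand a combined rating such as 'B2' into plain language."""
    return (f"{SOURCE_RELIABILITY[rating[0]]} source; "
            f"{INFORMATION_CREDIBILITY[rating[1]]}")

print(describe("B2"))  # Usually reliable source; Probably true
```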
  • Question-Answering Technologies by Source Data Format
    • Tables (relational DBs, spreadsheets): familiar: Structured Query Language (SQL); advanced: Wolfram Alpha (Mathematica)
    • Text: familiar: web search engines (Google, Yahoo!, Ask); advanced: systems from the AQUAINT (IC) competition; IBM Watson
    • Tagged text: familiar: Google Patent Search; advanced: Metacarta; Palantir
    • Logic statements: familiar: Prolog; advanced: Powerset (acquired by MS Bing); Cyc
    • Trusted teammates: familiar: personal communication; advanced: Yahoo! Answers; Vark (acquired by Google); US Army Intelligence Knowledge Network Shoutbox
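A minimal sketch of the "familiar application" for tables: translating a question into SQL over structured data. The schema and rows are hypothetical illustration only, not any product's data:

```python
import sqlite3

# "Where was Elvis born?" becomes a SQL query over a table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE people (name TEXT, birthplace TEXT)")
conn.executemany(
    "INSERT INTO people VALUES (?, ?)",
    [("Elvis Presley", "Tupelo, Mississippi"),
     ("Elvis Costello", "London, England")],
)
# Natural-language disambiguation (Elvis -> Elvis Presley) must
# happen before the query; SQL itself only does exact retrieval.
row = conn.execute(
    "SELECT birthplace FROM people WHERE name = ?", ("Elvis Presley",)
).fetchone()
print(row[0])  # Tupelo, Mississippi
```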
  • Structured Data Q-A: Wolfram Alpha. Wolfram Alpha identifies Tupelo as where Elvis was born (Elvis disambiguated as Elvis Presley) and provides a map overlay and additional information, such as the current city population. Reference sources are listed by title on another screen, with no access to the source data. The query "Where was Elvis born?" is automatically translated to the Mathematica query: Elvis Presley, place of birth.
  • Text: Google. Google disambiguates the query (Elvis = Elvis Presley) via PageRank. Top-ranked snippets can easily be scanned for a consensus answer from independent sources: Tupelo, MS. PageRank is less useful in the MI context because intelligence reports are not hyperlinked.
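A minimal sketch of the eigenvector-centrality computation underlying PageRank; the toy link graph, damping factor, and iteration count are illustrative assumptions, not Google's production values:

```python
# Minimal PageRank power iteration over a toy link graph.
links = {
    "a": ["b", "c"],
    "b": ["c"],
    "c": ["a"],
}
damping = 0.85
n = len(links)
rank = {page: 1.0 / n for page in links}

for _ in range(50):
    new_rank = {page: (1.0 - damping) / n for page in links}
    for page, outlinks in links.items():
        # Each page spreads its damped rank evenly over its outlinks.
        share = damping * rank[page] / len(outlinks)
        for target in outlinks:
            new_rank[target] += share
    rank = new_rank

for page, score in sorted(rank.items(), key=lambda kv: -kv[1]):
    print(page, round(score, 3))
```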
  • Text-Based Q-A: IBM Watson. IBM's text-based algorithms identified these phrases as the top potential Jeopardy! answers, with confidence scores displayed. In Jeopardy!, the answer is given in the form of a question. The query is in Jeopardy! format (including the category "Musical Pastiche").
  • Tagged Text: Metacarta. The query "Where was Elvis born?" identifies documents that contain "elvis", "born", and a location. Answers are literally all over the map; the consensus answer is not obvious from the location clusters. The documents are recent news articles.
  • Logic-Based Q-A: Powerset. Answers involve multiple "Elvises". Source data is Wikipedia only.
  • Social Question-Answering: Vark. The question was routed to an unknown user in my "network" computed as likely to provide the answer; the answer was returned in less than a minute. Vark is optimized for the mobile environment and supports feedback on answers. Vark queries need to be over a certain length, hence the phrasing of the question.
  • Comparison by Technology: how each technology class addresses the STANAG requirements
    • Source:
      • Tables (Wolfram Alpha): reference document title (no URL).
      • Text (Google, IBM Watson): URL of the document in which the info appears (usually; not Watson). No further attempt to match info to a source within the document, i.e. not "1000 demonstrators, according to police".
      • Teammates (Vark, Y! Answers): teammate known; may not say where the info originates.
    • Source Reliability:
      • Tables: curated data (reference works, government data).
      • Text: centrality measures: Google PageRank (eigenvector centrality); Technorati Authority (inlink centrality); VIStology blogger authority (centrality plus engagement).
      • Logic statements (Powerset): curated data (Wikipedia); Wikipedia itself has a PageRank of 9 out of 10 (reliable).
      • Teammates: track record and reputation; votes on answers; longevity; number of answers.
    • Source Independence:
      • Tables: no; one unified datastore.
      • Text: duplicate document detection; explicit source tracking (href; bit.ly); Leskovec meme tracking.
      • Logic statements: no; single data source.
      • Teammates: SNA metrics of independence; user authentication.
    • Information Credibility:
      • Tables: partial; integrity constraints; can't easily verify info.
      • Text: consensus answers (same answer identified in multiple distinct sources).
      • Logic statements: could check integrity constraints, though URI co-reference is a problem; contradictions halt inference.
      • Teammates: demonstrated area of expertise.
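A minimal sketch of the consensus-answer check from the credibility row above: counting how many distinct sources support each candidate answer. The snippet data and the normalization step are assumed; a real system would extract candidate answers with NLP:

```python
from collections import Counter

# Count how many *distinct* sources support each candidate answer;
# duplicate hits from the same source count only once.
hits = [
    ("nytimes.com", "Tupelo, Mississippi"),
    ("bbc.co.uk", "Tupelo, Mississippi"),
    ("nytimes.com", "Tupelo, Mississippi"),  # duplicate source
    ("example-blog.net", "Memphis, Tennessee"),
]

support = Counter()
for source, answer in {(src, ans.lower()) for src, ans in hits}:
    support[answer] += 1

consensus, n_sources = support.most_common(1)[0]
print(f"{consensus!r} supported by {n_sources} independent sources")
```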
  • Research Gaps
    • How best to map network-based reliability metrics to STANAG 2022 reliability codes? (A toy mapping is sketched after this list.)
    • How to make reliability metrics derived from networks of different scales comparable with directly assessed (non-estimated) reliability metrics?
    • How to automatically reason with information that has been assigned STANAG 2022 evaluation codes?
    • How to efficiently identify independent confirmation of reports in social media and other networked sources?
    • How to tractably identify inconsistent new reports?
    • How to adjudicate inconsistencies among reports automatically?
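A toy illustration of the first gap above; the thresholds are arbitrary assumptions, and choosing and validating such a mapping is precisely the open research question:

```python
from typing import Optional

# Toy mapping from a normalized network-based reliability score in
# [0, 1] to STANAG 2022 source-reliability codes. The thresholds are
# arbitrary assumptions; this sketch does not settle the question.
def stanag_reliability(score: Optional[float]) -> str:
    if score is None:
        return "F"  # Reliability cannot be judged
    for cutoff, code in [(0.9, "A"), (0.7, "B"), (0.5, "C"), (0.3, "D")]:
        if score >= cutoff:
            return code
    return "E"  # Unreliable

print(stanag_reliability(0.95))  # A
print(stanag_reliability(0.42))  # D
print(stanag_reliability(None))  # F
```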
  • Conclusions
    • In contemporary environments, direct evaluation of source reliability may be impossible, given the proliferation of OSINT and other sources relevant to the COIN fight across all PMESII-PT categories.
    • Networked sources make judging independence of sources and identifying influence more difficult.
    • Analysts may have to rely on correlated network-based metrics of reliability, credibility, and independence rather than evaluate many sources/reports as "Reliability cannot be judged"/"Truth cannot be judged".
  • Thank You
    • Note: This paper does not represent an endorsement by the Army Research Laboratory of any of the commercial products discussed.