Effective Counterterrorism and the Limited Role of Predictive ...


No. 584 December 11, 2006

Effective Counterterrorism and the Limited Role of Predictive Data Mining

by Jeff Jonas and Jim Harper

Executive Summary

The terrorist attacks on September 11, 2001, spurred extraordinary efforts intended to protect America from the newly highlighted scourge of international terrorism. Among the efforts was the consideration and possible use of “data mining” as a way to discover planning and preparation for terrorism. Data mining is the process of searching data for previously unknown patterns and using those patterns to predict future outcomes.

Information about key members of the 9/11 plot was available to the U.S. government prior to the attacks, and the 9/11 terrorists were closely connected to one another in a multitude of ways. The National Commission on Terrorist Attacks upon the United States concluded that, by pursuing the leads available to it at the time, the government might have derailed the plan.

Though data mining has many valuable uses, it is not well suited to the terrorist discovery problem. It would be unfortunate if data mining for terrorism discovery had currency within national security, law enforcement, and technology circles because pursuing this use of data mining would waste taxpayer dollars, needlessly infringe on privacy and civil liberties, and misdirect the valuable time and energy of the men and women in the national security community.

What the 9/11 story most clearly calls for is a sharper focus on the part of our national security agencies (their focus had undoubtedly sharpened by the end of the day on September 11, 2001) along with the ability to efficiently locate, access, and aggregate information about specific suspects.

Jeff Jonas is distinguished engineer and chief scientist with IBM’s Entity Analytic Solutions Group. Jim Harper is director of information policy studies at the Cato Institute and author of Identity Crisis: How Identification Is Overused and Misunderstood.
Introduction

The terrorist attacks on September 11, 2001, spurred extraordinary efforts intended to protect America from the newly highlighted scourge of international terrorism. Congress and the president reacted quickly to the attacks, passing the USA-PATRIOT Act,[1] which made substantial changes to laws that govern criminal and national security investigations. In 2004 the report of the National Commission on Terrorist Attacks upon the United States (also known as the 9/11 Commission) provided enormous insight into the lead-up to 9/11 and the events of that day. The report spawned a further round of policy changes, most notably enactment of the Intelligence Reform and Terrorism Prevention Act of 2004.[2]

Information about key members of the 9/11 plot was available to the U.S. government prior to the attacks, and the 9/11 terrorists were closely connected to one another in a multitude of ways. The 9/11 Commission concluded that, by pursuing the leads available to it at the time, the government might have derailed the plan.

What the 9/11 story most clearly calls for is sharper focus on the part of our national security agencies and the ability to efficiently locate, access, and aggregate information about specific suspects. Investigators should use intelligence to identify subjects of interest and then follow specific leads to detect and preempt terrorism. But a significant reaction to 9/11 beyond Congress’s amendments to federal law was the consideration and possible use of “data mining” as a way to discover planning and preparation for terrorism.

Data mining is not an effective way to discover incipient terrorism. Though data mining has many valuable uses, it is not well suited to the terrorist discovery problem. It would be unfortunate if data mining for terrorism discovery had currency within national security, law enforcement, and technology circles because pursuing this use of data mining would waste taxpayer dollars, needlessly infringe on privacy and civil liberties, and misdirect the valuable time and energy of the men and women in the national security community.

We must continue to study and analyze the events surrounding the 9/11 attacks so that the most appropriate policies can be used to suppress terror, safeguard Americans, and protect American values. This is all the more important in light of recent controversies about the monitoring of telephone calls and the collection of telephone traffic data by the U.S. National Security Agency, as well as surveillance of international financial transactions by the U.S. Department of the Treasury.

While hindsight is 20/20, the details of the 9/11 story reveal that federal authorities had significant opportunities to unravel the 9/11 terrorist plot and potentially avert that day’s tragedies. Two of the terrorists who ultimately hijacked and destroyed American Airlines flight 77 were already considered suspects by federal authorities and known to be in the United States. One of them was known to have associated with what a CIA official called a “major league killer.”[3] Finding them and connecting them to other September 11 hijackers would have been possible, indeed quite feasible, using the legal authority and investigative systems that existed before the attacks.

In the days and months before 9/11, new laws and technologies like predictive data mining were not necessary to connect the dots. What was needed to reveal the remaining 9/11 conspirators was better communication, collaboration, a heightened focus on the two known terrorists, and traditional investigative processes.

This paper is not intended to attack the hard-working and well-intentioned members of our law enforcement and intelligence communities. Rather, it seeks to illustrate that predictive data mining, while well suited to certain endeavors, is problematic and generally counterproductive in national security settings where its use is intended to ferret out the next terrorist.

The Story behind 9/11

Details of the run-up to 9/11 provide tremendous insight into what could have
been done to hamper or even entirely avert the 9/11 attacks. Failing to recognize these details and learn from them could compound the tragedy either by permitting future attacks or by encouraging acquiescence to measures that erode civil liberties without protecting the country.

In early January 2000 covert surveillance revealed a terrorist planning meeting in Kuala Lumpur that included Nawaf al-Hazmi, Khalid al-Mihdhar, and others.[4] In March 2000 the CIA was informed that Nawaf al-Hazmi departed Malaysia on a United Airlines flight for Los Angeles. (Although unreported at the time, al-Mihdhar was on the same flight.) The CIA did not notify the State Department and the FBI.[5] Later to join the 9/11 hijackings, both were known to be linked with al-Qaeda and specifically with the 1998 embassy bombings in Tanzania and Kenya.[6] As the 9/11 Commission reported, the trail was lost without a clear realization that it had been lost, and without much effort to pick it up again.[7]

In January 2001, almost one year after being lost in Bangkok, al-Mihdhar was on the radar screen again after being identified by a joint CIA-FBI investigation of the bombing of the USS Cole, the October 2000 attack on a U.S. guided missile destroyer in Yemen’s Aden Harbor that killed 17 crew members and injured 39.[8] Even with this new knowledge the CIA did not renew its search for al-Mihdhar and did not make his identity known to the State Department (which presumably would have interfered with his plans to re-enter the United States).[9] Al-Mihdhar flew to New York City on July 4, 2001, on a new visa. As the 9/11 Commission reported, “No one was looking for him.”[10]

On August 21, 2001, an FBI analyst who had been detailed to the CIA’s Bin Laden unit finally made the connection and “grasped the significance” of Nawaf al-Hazmi and al-Mihdhar’s visits to the United States. The Immigration and Naturalization Service was immediately notified. On August 22, 2001, the INS responded with information that caused the FBI analyst to conclude that al-Mihdhar might still be in the country.[11]

With the knowledge that the associate of a “major league killer” was possibly roaming free in the United States, the hunt by the FBI should have been on. The FBI certainly had a valid reason to open a case against these two individuals as they were connected to the ongoing USS Cole bombing investigation, the 1998 embassy bombing, and al-Qaeda.[12] On August 24, 2001, Nawaf al-Hazmi and al-Mihdhar were added to the State Department’s TIPOFF[13] watchlist.[14]

Efforts to locate Nawaf al-Hazmi and al-Mihdhar initially foundered on confusion within the FBI about the sharing and use of data collected through intelligence versus criminal channels.[15] The search for al-Mihdhar was assigned to one FBI agent, his first ever counterterrorism lead.[16] Because the lead was “routine,” he was given 30 days to open an intelligence case and make some effort to locate al-Mihdhar.[17] If more attention had been paid to these subjects, the location and detention of al-Mihdhar and Nawaf al-Hazmi could have derailed the 9/11 attack.[18]

Hiding in Plain Sight

The 9/11 terrorists did not take significant steps to mask their identities or obscure their activities. They were hiding in plain sight. They had P.O. boxes, e-mail accounts, drivers’ licenses, bank accounts, and ATM cards.[19] For example, Nawaf al-Hazmi and al-Mihdhar used their true names to obtain California drivers’ licenses and to open New Jersey bank accounts.[20] Nawaf al-Hazmi had a car registered, and his name appeared in the San Diego white pages with an address of 6401 Mount Ada Road, San Diego, California.[21] Mohamed Atta registered his red Pontiac Grand Prix car in Florida with the address 4890 Pompano Road, Venice.[22] Ziad Jarrah registered his red 1990 Mitsubishi Eclipse as well.[23] Fourteen of the terrorists got drivers’ licenses or ID cards from either Florida or Virginia.[24]

The terrorists not only operated in plain sight, they were interconnected. They lived together, shared P.O. boxes and frequent flyer numbers, used the same credit card
numbers to make airline travel reservations, and made reservations using common addresses and contact phone numbers. For example, al-Mihdhar and Nawaf al-Hazmi lived together in San Diego.[25] Hamza al-Ghamdi and Mohand al-Shehri rented Box 260 at a Mail Boxes Etc. for a year in Delray Beach, Florida.[26] Hani Hanjour and Majed Moqed rented an apartment together at 486 Union Avenue, Paterson, New Jersey.[27] Atta stayed with Marwan al-Shehhi at the Hamlet Country Club in Delray Beach, Florida. Later, they checked into the Panther Inn in Deerfield Beach together.[28]

When Ahmed al-Nami applied for his Florida ID card he provided the same address that was used by Nawaf al-Hazmi and Saeed al-Ghamdi.[29] Wail al-Shehri purchased plane tickets using the same address and phone number as Waleed al-Shehri.[30] Nawaf al-Hazmi and Salem al-Hazmi booked tickets through Travelocity.com using the same Fort Lee, New Jersey, address and the same Visa card.[31] Abdulaziz al-Omari purchased his ticket via the American Airlines website and used Atta’s frequent flyer number and the same Visa card and address as Atta (the same address used by Marwan al-Shehhi).[32] The phone number al-Omari used on his plane reservation was also the same as that of Atta and Wail and Waleed al-Shehri.[33] Hani Hanjour and Majed Moqed rented room 343 at the Valencia Hotel on Route 1 in Laurel, Maryland; they were joined by al-Mihdhar, Nawaf al-Hazmi, and Salem al-Hazmi.[34] While these are plentiful examples of the 9/11 terrorists’ interconnectedness, even more connections existed.

Finding a Few Bad Guys

In late August 2001 the FBI began to search for al-Mihdhar and Nawaf al-Hazmi.[35] The two might have been located easily even by a private investigator (PI). A PI would have performed a public records search using a service such as those provided by ChoicePoint or LexisNexis, perhaps both. These organizations aggregate public record data, assembling them into reports that simplify basic background investigations done by PIs, potential employers, potential landlords, and others. These databases include phone book data, driver’s license data, vehicle registration data, credit header data, voter registration, property ownership, felony convictions, and the like. Such a search could have unearthed the driver’s license, the car registration, and the telephone listing of Nawaf al-Hazmi and al-Mihdhar.[36]

Given the connections of Nawaf al-Hazmi and al-Mihdhar to terrorist activities overseas, the FBI, of course, could have sought subpoenas for credit card and banking information, travel information, and other business records. It could have conducted intensive surveillance under FISA, the Foreign Intelligence Surveillance Act, because the case involved a foreign power or an agent of a foreign power.[37] The FBI could not only have located these subjects but could have started to unravel their highly interconnected network, had it been pursuing available leads.

It is Monday morning quarterbacking, of course, to suggest that all 19 of the 9/11 hijackers could have been rolled up by the proper investigation. But interference with and detention of the right subset of the 9/11 terrorists might have “derailed the plan,” as the 9/11 Commission concluded in its report.[38]

If our federal law enforcement and intelligence agencies needed anything, it was neither new technology nor more laws but simply a sharper focus and perhaps the ability to more efficiently locate, access, and aggregate information about specific suspects. They lacked this focus and capability, with tragic results.

Data Analysis and Data Mining

As we have seen, authorities could have and should have more aggressively hunted some of the 9/11 terrorists. If they had been hunted, they could have been found. Their web of connections would have led sufficiently motivated investigators to information that could have confounded the 9/11 plot. Better interagency information sharing,[39] investigatory legwork in pursuit of genuine leads, and better training are what the 9/11 story most clearly calls for.

A number of policy changes followed the 9/11 attacks. The Intelligence Reform and Terrorism Prevention Act of 2004 revamped the nation’s intelligence operations, and the USA-PATRIOT Act eased information sharing between investigators pursuing criminal and national security cases.

Data mining also gained some currency in national security and technology circles as a potential anti-terrorism tool,[40] though whether and to what extent it has been used are unclear. The Total Information Awareness program within the Department of Defense is widely believed to have contemplated using data mining, though the program’s documentation is unclear.[41] The documentation discusses research on privacy-protecting technologies,[42] but Congress defunded the program in 2003 because of privacy concerns. However, the National Journal reported in February 2006 that research on “predict[ing] terrorist attacks by mining government databases and the personal records of people in the United States” has been moved from the Department of Defense to another group linked to the National Security Agency.[43]

In May 2004 the Government Accountability Office reported the existence of 14 data-mining programs, planned or operational, dedicated to analyzing intelligence and detecting terrorist activity, in the Departments of Defense, Education, Health and Human Services, Homeland Security, and Justice.[44] Ten of them were reported to use personal information. Of those, half use information acquired from the private sector, other agencies, or both.

“Data mining” is a broad and fairly loaded term that means different things to different people. Up to this point, discussions of data mining have probably been hampered by lack of clarity about its meaning. Indeed, collective failure to get to the root of the term “data mining” may have preserved disagreements among people who may be in substantial agreement.

Several authorities have offered definitions or discussions of data mining that are important touchstones, though they still may not be sufficiently precise. In its May 2004 report, for example, the Government Accountability Office surveyed the literature and produced the following definition of data mining: “the application of database technology and techniques, such as statistical analysis and modeling, to uncover hidden patterns and subtle relationships in data and to infer rules that allow for the prediction of future results.”[45] In a January 2006 report, the Congressional Research Service said:

“Data mining involves the use of sophisticated data analysis tools to discover previously unknown, valid patterns and relationships in large data sets. These tools can include statistical models, mathematical algorithms, and machine learning methods (algorithms that improve their performance automatically through experience, such as neural networks or decision trees). Consequently, data mining consists of more than collecting and managing data, it also includes analysis and prediction.”[46]

Data mining is best understood as a subset of the broader practice of data analysis. Data analysis adds to the investigatory arsenal of national security and law enforcement by bringing together more information from more diverse sources and correlating the data. Finding previously unknown financial or communications links between criminal gangs, for example, can give investigators more insight into their activities and culture, strengthening the hand of law enforcement.

The key goal, and challenge, is to produce not just more information but more useful information. “Useful information” is information that puts the analyst in a position to act appropriately in a given context. It is the usefulness of the result, the fact that it
can be used effectively for a given purpose, that establishes the value of any given algorithm. The ultimate goal of data analysis is to discover knowledge.[47]

The term “predicate” is often used in law enforcement to refer to a piece of information that warrants further investigation or action. When a police officer sees a person attacking another with a knife, that is a sound basis, or predicate, for intervening by drawing his or her weapon and calling for a stop to the attack. When a police officer observes people appearing to “case” a store, that may be a predicate for making a display of authority or briefly questioning the people about their purposes. In Fourth Amendment law, probable cause to believe that information about a crime can be found in a particular place is a predicate for the issuance of a warrant to search that place.

Here is an example of a potential terrorism-related predicate: The combined facts that a particular person has been identified by an informant as having visited Afghanistan during June 2001 and participated in scuba training some years later, and that al-Qaeda plans to have divers mine cruise ships, may form a predicate for investigating the person or monitoring his or her communications.

In the first two examples discussed above (the knife attack and thieves casing a store) all the observations needed to establish a predicate for action were collected at once. Those are simple cases. Other than judging whether the response is proportional to the predicate, there is little need to parse them. But in the terror-suspect example, several observations made by different people at different times are combined to create the predicate. The fact that the person visited Afghan training camps might have come from an informant in Europe. The fact that he took scuba training might have come from business records in Corpus Christi, Texas. And the fact that al-Qaeda contemplated using scuba divers may have come from a computer captured in Pakistan. Because multiple observations are combined, this predicate can be said to result from data analysis. Data analysis brought information from diverse sources together to create new knowledge.

There are two loose categories of data analysis that are relevant to this discussion: subject based and pattern based.[48] Subject-based data analysis seeks to trace links from known individuals or things to others. The example just cited and the opportunities to disrupt the 9/11 plot described further above would have used subject-based data analysis because each of them starts with information about specific suspects, combined with general knowledge.

In pattern-based analysis, investigators use statistical probabilities to seek predicates in large data sets. This type of analysis seeks to find new knowledge, not from the investigative and deductive process of following specific leads, but from statistical, inductive processes. Because it is more characterized by prediction than by the traditional notion of suspicion, we refer to it as “predictive data mining.”

The question in predictive data mining is whether and when it comes up with actionable information, with knowledge: suitable predicates for subsequent action. As we will discuss below, there are many instances when it does. But terrorism is not one. Attempting to use predictive data mining to ferret out terrorists before they strike would be a subtle but important misdirection of national security resources.

The possible benefits of predictive data mining for finding planning or preparation for terrorism are minimal. The financial costs, wasted effort, and threats to privacy and civil liberties are potentially vast. Those costs outstrip any conceivable benefits of using predictive data mining for this purpose.

Predictive Data Mining in Action

Predictive data mining has been applied most heavily in the area of consumer direct marketing. Companies have spent hundreds of millions if not billions of dollars implementing and perfecting their direct marketing data-mining initiatives. Data mining certainly gives a “lift” to efforts to find people with certain propensities. In marketing, data mining is used to reduce the expense (to companies) and annoyance (to consumers) of unwanted advertising. And that is valuable to companies despite the fact that response rates to bulk mailings tuned by data mining improve by only single-digit percentages.

Consider how a large retailer such as Acme Discount Retail (“Acme Discount”), a fictional retailer trying to compete with Wal-Mart and Target, might use data mining: Acme Discount wants to promote its new store that just opened in a suburb of Chicago. It has many other stores and thousands of customers. Starting with the names and addresses of the top 1,000 Acme Discount customers, it contracts with a data broker to enhance what it knows about those customers. (This is known in database marketing as an “append” process.) Acme Discount may purchase magazine subscription and warranty card information (just to name a couple of likely data sources). Those sources augment what Acme Discount knows about its customers with such data points as income levels, presence of children, purchasing power, home value, and personal interests, such as a subscription to Golf Digest.

Thus, Acme Discount develops a demographic profile of what makes a good Acme Discount customer. For example, the ideal customer might be a family that subscribes to magazines of the Vanity Fair genre, that has two to four children, that owns two or fewer cars, and that lives in a home worth $150,000–$225,000. Acme Discount’s next objective is to locate noncustomers near its new Chicago store that fit this pattern and market to them in the hope they will do business at the newly opened store. The goal is to predict as accurately as possible who might be swayed to shop at Acme Discount.

Despite all of this information collection and statistical analysis, the percent chance that Acme Discount will target someone willing to transact is in the low to mid single digits.[49] This means that false positives in marketers’ searches for new customers are typically in excess of 90 percent.

The “damage” done by an imperfectly aimed direct-mail piece may be a dollar lost to the marketer and a moment’s time wasted by the consumer. That is an acceptable loss to most people. The same results in a terror investigation would not be acceptable. Civil liberties violations would be routine and person-years of investigators’ precious time would be wasted if investigations, surveillance, or the commitment of people to screening lists were based on algorithms that were wrong the overwhelming majority of the time.

Perhaps, though, more assiduous work by government authorities and contractors, using a great deal more data, could overcome the low precision of data mining and bring false positives from 90+ percent to the low single digits. For at least two related reasons, predictive data mining is not useful for counterterrorism: First, the absence of terrorism patterns means that it would be impossible to develop useful algorithms. Second, the corresponding statistical likelihood of false positives is so high that predictive data mining will inevitably waste resources and threaten civil liberties.

The Absence of Terrorism Patterns

One of the fundamental underpinnings of predictive data mining in the commercial sector is the use of training patterns. Corporations that study consumer behavior have millions of patterns that they can draw upon to profile their typical or ideal consumer. Even when data mining is used to seek out instances of identity and credit card fraud, this relies on models constructed using many thousands of known examples of fraud per year.

Terrorism has no similar indicia. With a relatively small number of attempts every year and only one or two major terrorist incidents every few years, each one distinct in terms of planning and execution, there are no meaningful patterns that show what
behavior indicates planning or preparation for terrorism.

Unlike consumers’ shopping habits and financial fraud, terrorism does not occur with enough frequency to enable the creation of valid predictive models. Predictive data mining for the purpose of turning up terrorist planning using all available demographic and transactional data points will produce no better results than the highly sophisticated commercial data mining done today. The one thing predictable about predictive data mining for terrorism is that it would be consistently wrong.

Without patterns to use, one fallback for terrorism data mining is the idea that any anomaly may provide the basis for investigation of terrorism planning. Given a “typical” American pattern of Internet use, phone calling, doctor visits, purchases, travel, reading, and so on, perhaps all outliers merit some level of investigation. This theory is offensive to traditional American freedom, because in the United States everyone can and should be an “outlier” in some sense. More concretely, though, using data mining in this way could be worse than searching at random; terrorists could defeat it by acting as normally as possible.

Treating “anomalous” behavior as suspicious may appear scientific, but, without patterns to look for, the design of a search algorithm based on anomaly is no more likely to turn up terrorists than twisting the end of a kaleidoscope is likely to draw an image of the Mona Lisa.

Without well-constructed algorithms based on extensive historical patterns, predictive data mining for terrorism will fail. The result would be to flood the national security system with false positives: suspects who are truly innocent.

False Positives

The concepts of false positive and false negative come from probability theory. They have a great deal of use in health care, where tests for disease have known inaccuracy rates. A false positive, or Type I error, is when a test wrongly reports the presence of disease. A false negative, or Type II error, is when a test wrongly reports the absence of disease. Study of the false positive and false negative rates in particular tests, combined with the incidence of the disease in the population, helps determine when the test should be administered and how test results are used.

Even a test with very high accuracy (low false positives and false negatives) may be inappropriate to use widely if a disease is not terribly common. Suppose, for example, that a test for a particular disease accurately detects the disease (reports a true positive) 99 percent of the time and inaccurately reports the presence of the disease (false positive) 1 percent of the time. Suppose also that only one in a thousand, or 0.1 percent of the population, has that disease. Finally, suppose that if the test indicates the presence of disease the way to confirm it is with a biopsy, or the taking of a tissue sample from the potential victim’s body.

It would seem that a test this good should be used on everyone. After all, in a population of 300 million people, 300,000 people have the disease, and running the test on the entire population would reveal the disease in 297,000 of the victims. But it would cause 10 times that number, nearly three million people, to undergo an unnecessary biopsy. If the test were run annually, every 5 years, or every 10 years, the number of people unnecessarily affected would rise accordingly.

In his book The Naked Crowd, George Washington University law professor Jeffrey Rosen discusses false positive rates in a system that might have been designed to identify the 19 hijackers involved in the 9/11 attacks.[50] Assuming a 99 percent accuracy rate, searching our population of nearly 300,000,000, some 3,000,000 people would be identified as potential terrorists.

Costs of Predictive Data Mining

Given the assumption that the devastation of the 9/11 attacks can be replicated,
some people may consider the investigation of 1 percent of the population (or whatever the false positive rate) acceptable, just as some might consider it acceptable for 10 people to undergo unnecessary surgery for every 1 person diagnosed with a certain disease. Fewer would consider a 5 percent error rate (or 15,000,000 people) acceptable. And even fewer would consider a 10 percent error rate (or 30,000,000 people) acceptable.

The question is not simply one of medical ethics or Fourth Amendment law but one of resources. The expenditure of resources needed to investigate 3,000,000, 15,000,000, or 30,000,000 fellow citizens is not practical from a budgetary point of view, to say nothing of the risk that millions of innocent people would likely be under the microscope of progressively more invasive surveillance as they were added to suspect lists by successive data-mining operations.

As we have shown, the unfocused, false-positive-laden results of predictive data mining in the terrorism context would waste national resources. Worse yet, the resources expended following those “leads” would detract directly from pursuing genuine leads that have been developed by genuine intelligence.

The corollary would be to threaten the civil liberties of the many Americans deemed suspects by predictive data mining. As many Supreme Court precedents show, the agar in which reasonable suspicion grows is a mixture of specific facts and rational inferences. Thus, in Terry v. Ohio, the Supreme Court approved a brief interrogation and pat-down of men who appeared to have been “casing” a store for robbery.51 An experienced officer observed their repeated, furtive passes by a store window; that gave him sufficient cause to approach the men, ask their business, and pat them down for weapons, which he found. The behavior exhibited by the men he frisked fit a pattern of robbery planning and did not fit any common pattern of lawful and innocent behavior. Any less correlation between their behavior and inchoate crime and the Court would likely have struck down the stop-and-frisk as a violation of the Fourth Amendment.

If predictive data mining is used as the basis for investigating specific people, it must meet this test: there must be a pattern that fits terrorism planning (a pattern that is exceedingly unlikely ever to exist), and the actions of investigated persons must fit that pattern while not fitting any common pattern of lawful behavior. Predictive data mining premised on insufficient pattern information could not possibly meet this test. Unless investigators can winnow their investigations down to data sets already known to reflect a high incidence of actual terrorist information, the high number of false positives will render any results essentially useless.

Predictive data mining requires lots of data. Bringing all the data, either physically or logically, into a central system poses a number of challenging problems, including the difficulty of keeping the data current and the difficulty of protecting so much sensitive data from misuse. Large aggregations of data create additional security risks from both insiders and outsiders because such aggregates are so valuable and attractive.

Many Americans already chafe at the large amount and variety of information about them available to marketers and data aggregators. Those data are collected from their commercial transactions and from public records. Most data-mining efforts would rely on even more collections of transactional and behavioral information, and on centralization of that data, all to examine Americans for criminality or disloyalty to the United States or Western society. That level of surveillance, aimed at the entire citizenry, would be inconsistent with American values.

The Deceptiveness of Predictive Data Mining

Experience with a program that used predictive data mining shows that it is not very helpful in finding terrorists, even when abundant information is available. Using predictive analysis, even in hindsight, the universe of “suspects” generated contains so many irrelevant entries that such analysis is essentially useless.

In his book No Place to Hide, Washington Post reporter Robert O’Harrow tells the story of how Hank Asher, owner of an information service called Seisint, concocted a way to fight back against terrorists in the days after September 11, 2001.

    Using artificial intelligence software and insights from profiling programs he’d created for marketers over the years, he told Seisint’s computers to look for people in America who had certain characteristics that he thought might suggest ties to terrorists. Key elements included ethnicity and religion. In other words, he was using the data to look for certain Muslims. “Boom,” he said, “32,000 people came up that looked pretty interesting.”. . .

    In his darkened bedroom that night, he put the system through its paces over a swift connection to Seisint. “I got down to a list of 419 through an artificial intelligence algorithm that I had written,” he recalled later. The list contained names of Muslims with odd ties or living in suspicious-seeming circumstances, at least according to Asher’s analysis.52

Ultimately, Asher produced a list of 1,200 people he deemed the biggest threats. Of those, five were hijackers on the planes that crashed September 11, 2001.

What seems like a remarkable feat of predictive analysis is more an example of how deceptive hindsight can be. Asher produced a list of 9/11 terror suspects with a greater than 99 percent false positive rate, even after the attack, its perpetrators, and their modus operandi were known.

The proof provided by the Seisint experience is not that there is a viable method in predictive analysis for finding incipient terrorism but that data mining of this type is almost certain to fail when information about attackers and their plans, associates, and methods is not known.

Conclusion

So how should one find bad guys? The most efficient, effective approach, and the one that protects civil liberties, is the one suggested by 9/11: pulling the strings that connect bad guys to other plotters.

Searching for terrorists must begin with actionable information, and it must follow logically through the available data toward greater knowledge. Predictive data mining always provides “information,” but useful knowledge comes from context and from inferences drawn from known facts about known people and events.

The Fourth Amendment is a help, not a hindrance: It guides the investigator toward specific facts and rational inferences. When they focus on following leads, investigators can avoid the mistaken goal of attempting to “predict” terrorist attacks, an effort certain to flood investigators with false positives, to waste resources, and to open the door to infringements of civil liberties. That approach focuses our national security effort on developing information about terrorism plotters, their plans, and associates. It offers no panacea or technological quick fix to the security dilemmas created by terrorism. But there is no quick fix. Predictive data mining is not a sharp enough sword, and it will never replace traditional investigation and intelligence, because it cannot predict precisely enough who will be the next bad guy.

Since 9/11 there has been a great deal of discussion about whether data mining can prevent acts of terrorism. In fact, the most efficient means of detecting and preempting terrorism have been within our grasp all along. Protecting America requires no predictive-data-mining technologies.

Indeed, if there is a lesson to be learned from 9/11, it is not very groundbreaking. It is this: Enable investigators to efficiently discover, access, and aggregate relevant information related to actionable suspects. Period. Sufficient dedication of national resources to more precisely “pull the strings” offers the best chance of detecting and preempting future acts of terrorism.

Notes

1. Uniting and Strengthening America by Providing Appropriate Tools Required to Intercept and Obstruct Terrorism Act of 2001 (USA PATRIOT Act), Pub. L. No. 107-56 (Oct. 12, 2001).
2. Intelligence Reform and Terrorism Prevention Act of 2004, Pub. L. No. 108-458 (Dec. 17, 2004).
3. National Commission on Terrorist Attacks upon the United States, The 9/11 Commission Report, 2004, p. 268 (hereinafter 9/11 Commission Report).
4. Ibid., p. 181.
5. Ibid., pp. 181–82.
6. Ibid., p. 181.
7. Ibid., p. 266.
8. Ibid., p. 266.
9. Ibid., pp. 266–67.
10. Ibid., p. 269.
11. Ibid., p. 270.
12. Ibid., p. 271.
13. The TIPOFF database contains a list of foreigners who will be denied a U.S. visa.
14. 9/11 Commission Report, p. 270.
15. Ibid., p. 271.
16. Ibid., p. 288.
17. Ibid., p. 271.
18. Ibid., p. 272.
19. Tim Golden et al., “A Nation Challenged: The Plot,” New York Times, September 23, 2001.
20. 9/11 Commission Report, p. 539, n 85.
21. Ibid.; and Jane Black, “Don’t Make Privacy the Next Victim of Terror,” BusinessWeek Online, http://www.businessweek.com/bwdaily/dnflash/oct2001/nf2001104_7412.htm.
22. Dan Eggen et al., “The Plot: A Web of Connections,” WashingtonPost.com, October 4, 2001, http://www.washingtonpost.com/wp-srv/nation/graphics/attack/investigation_24.html (hereinafter “Web of Connections”).
23. “Web of Connections.”
24. Ibid.
25. Ibid.
26. Ibid.
27. Ibid.
28. Ibid.
29. Ibid.
30. Ibid.
31. Ibid.
32. Ibid.
33. Ibid.
34. Ibid.
35. 9/11 Commission Report, pp. 271–72.
36. Ibid., p. 539, n 85.
37. 50 U.S.C. § 1804.
38. 9/11 Commission Report, p. 272.
39. Ibid., p. 271.
40. Arshad Mohammed and Sara Kehaulani Goo, “Government Increasingly Turning to Data Mining,” Washington Post, June 15, 2006, http://www.washingtonpost.com/wp-dyn/content/article/2006/06/14/AR2006061402063.html.
41. See Defense Advanced Research Projects Agency, Information Awareness Office, “Report to Congress Regarding the Terrorism Information Awareness Program,” May 30, 2003, pp. 7–8, 17, A-4, A-14, A-15 (referring variously to “discovery of . . . patterns of activity”; “ability to automatically learn patterns”; “training software algorithms to recognize patterns”; and “developing technology to . . . suggest previously unknown but potentially significant patterns”), http://foi.missouri.edu/totalinfoaware/tia2.pdf.
42. Ibid., pp. 6–7.
43. Shane Harris, “TIA Lives On,” National Journal, February 23, 2006, http://nationaljournal.com/about/njweekly/stories/2006/0223nj1.htm.
44. Government Accountability Office, “Data Mining: Federal Efforts Cover a Wide Range of Uses,” GAO-04-548, May 2004.
45. Ibid., p. 1.
46. Jeffrey W. Seifert, “Data Mining and Homeland Security: An Overview,” Congressional Research Service, updated January 27, 2006 (Order Code RL31798). See also K. A. Taipale, “Data Mining and Domestic Security: Connecting the Dots to Make Sense of Data,” Columbia Science and Technology Law Review 22–23 (2003), http://papers..ssrn.can/abstract=5467827.
47. The major annual conference on data mining is called “KDD,” for Knowledge Discovery and Data Mining. See http://www.acm.org/sigs/sigkdd/kdd2006.
48. See Martha Baer et al., SAFE: The Race to Protect Ourselves in a Newly Dangerous World (New York: Harper Collins 2005), p. 331; and Mary DeRosa, “Data Mining and Data Analysis for Counterterrorism,” Center for Strategic and International Studies, March 2004.
49. Direct marketing results are dependent upon many factors such as industry and offer. For example, offering a consumer a loss-leader discount raises response rates. Despite billions invested and unprecedented access to U.S. consumer behavior data, current direct marketing response rates industrywide range from 5.78 percent for telephone solicitation to 0.04 percent for direct response television. Direct Marketing Association, “DMA Releases New Response Rate Report,” news release, October 17, 2004, http://www.the-dma.org/cgi/dispnewsstand?article=2891.
50. Jeffrey Rosen, The Naked Crowd (New York: Random House, 2004), pp. 104–7.
51. 392 U.S. 1 (1968).
52. Robert O’Harrow Jr., No Place to Hide (New York: Free Press, 2005), pp. 98, 102.
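
The false positive figures discussed in the text can be checked with a few lines of arithmetic. The Python sketch below is illustrative only and is not part of the original analysis. The function name `screening_outcomes` is hypothetical. The 1 percent false positive rate for the disease test is an assumption inferred from the paper's figures (nearly three million unnecessary biopsies among roughly 299.7 million healthy people), and Rosen's "99 percent accuracy rate" is assumed to mean that 1 percent of innocent people are wrongly flagged.

```python
from fractions import Fraction


def screening_outcomes(population, affected, sensitivity, false_positive_rate):
    """Return (true_positives, false_positives, precision) as exact Fractions.

    Hypothetical helper for illustration; the rates passed in are assumptions.
    """
    unaffected = population - affected
    true_positives = affected * sensitivity
    false_positives = unaffected * false_positive_rate
    # Precision: the share of flagged people who are actually affected.
    precision = true_positives / (true_positives + false_positives)
    return true_positives, false_positives, precision


ONE_PERCENT = Fraction(1, 100)      # assumed false positive rate
NINETY_NINE = Fraction(99, 100)     # assumed detection rate

# Disease example: 300,000 affected people in a population of 300 million.
tp, fp, precision = screening_outcomes(300_000_000, 300_000, NINETY_NINE, ONE_PERCENT)
print(f"disease detected in:   {int(tp):,}")    # 297,000
print(f"unnecessary biopsies:  {int(fp):,}")    # 2,997,000
print(f"flagged and affected:  {float(precision):.1%}")

# Rosen's example: 19 hijackers in a population of nearly 300 million.
tp19, fp19, precision19 = screening_outcomes(300_000_000, 19, NINETY_NINE, ONE_PERCENT)
print(f"innocent people flagged: about {round(fp19):,}")

# Seisint list: 1,200 names, of which 5 were hijackers.
seisint_false_share = Fraction(1200 - 5, 1200)
print(f"Seisint list, share not hijackers: {float(seisint_false_share):.1%}")
```

Under these assumptions, fewer than one in ten people flagged by the disease test would actually have the disease, and the Seisint list's share of names that were not hijackers exceeds 99 percent, which is consistent with the figures cited in the text.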