Running head: DARPA MEMEX PROJECT ERODES INTERNET PRIVACY 1
DARPA Memex Project Erodes Internet Privacy
Christopher Furton
Syracuse University
Abstract
In February of 2014, the Defense Advanced Research Projects Agency (DARPA)
announced the Memex Project, which is currently being used by federal agencies, law enforcement,
and non-governmental organizations (NGOs). The Memex project deploys technology that
crawls, indexes, analyzes, extracts, and provides search functionality across the entire Internet,
including the criminal underground referred to as the Dark Net. Despite DARPA’s good intentions,
the Memex tool raises several privacy concerns, including scope of use, oversight and
transparency, data retention, and information security. With a big data capability this powerful,
precautions must be taken to protect citizens’ privacy rights.
DARPA Memex Project Further Erodes Internet Privacy
What is Memex?
The Defense Advanced Research Projects Agency (DARPA) announced plans to create a
project known as Memex on February 9, 2014 (DARPA, 2014, p. 1). A further look into the
Broad Agency Announcement (BAA) shows that DARPA is looking for proposals from industry
to “maintain technological superiority in the area of content indexing and web search on the
Internet” (DARPA: Information Innovation Office, 2014, p. 4). DARPA identifies a problem
with current web search functionality, stating that it is limited in what gets indexed and in the
richness of available details. For government researchers and law enforcement personnel,
current methods involve manual searching by entering exact information one entry at a time.
Further analysis is then needed to organize or aggregate results beyond a list of links (DARPA:
Information Innovation Office, 2014, p. 4).
DARPA plans to solve this problem with the Memex Project by developing technologies
that “provide the mechanisms for content discovery, information extraction, information
retrieval, user collaboration, and other areas needed to address distributed aggregation, analysis,
and presentation of web content” (DARPA: Information Innovation Office, 2014, p. 5). To
accomplish this, DARPA has divided the work into three technical areas: domain-specific
indexing, domain-specific search, and applications. DARPA specifies the need for technology to
reach beyond traditional content, specifically naming the Dark Web as a target. The Dark Web
refers to the large mass of Internet content that is not readily accessible through search engines and
often requires special software to access (Chandler, n.d.).
The first technical area DARPA is interested in is domain-specific indexing. This
technical area focuses on developing a highly scalable web crawling capability with both content
discovery and information extraction. This crawling process will provide automated link
discovery including obfuscated links, discovery of deep and dark web content, and hidden
services – the function of providing web services such as chat or web page hosting on the Dark
Web. Additionally, this capability must overcome counter-crawling measures such as paywalls,
member-only areas, crawler bans, and even human detection. Lastly, this capability must also be
able to extract information and include normalization of heterogeneous data, natural language
processing for translation, image analysis, extraction of multimedia, and several other functions
(DARPA: Information Innovation Office, 2014, p. 7).
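The link-discovery step described above can be illustrated with a minimal sketch. It is not drawn from the BAA or from any Memex component; the class name `LinkExtractor`, the helper `extract_links`, and the sample page are all hypothetical, and the sketch uses only the Python standard library to show the general idea of pulling link targets out of one page so that a crawler could visit them next.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkExtractor(HTMLParser):
    """Collect absolute link targets from the anchor tags of one HTML page."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        for name, value in attrs:
            if name == "href" and value:
                # Resolve relative links against the URL of the page itself.
                self.links.append(urljoin(self.base_url, value))


def extract_links(base_url, html):
    """Return every link found in `html`, in document order."""
    parser = LinkExtractor(base_url)
    parser.feed(html)
    return parser.links


# Hypothetical page content, used only for illustration.
SAMPLE_PAGE = (
    '<p><a href="/listing/1">One</a> '
    '<a href="http://example.org/x">Two</a></p>'
)
found = extract_links("http://example.com/index.html", SAMPLE_PAGE)
print(found)
```

A real crawler of the kind the BAA describes would also need queueing, de-duplication, and handling of obfuscated links, none of which this sketch attempts.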
The second technical area DARPA is interested in is domain-specific searching. This
capability is not the same as current commercial web searching; instead, it will have configurable
interfaces into web content indexed by the first technical area. The interfaces, as outlined in the
BAA, may include conceptually aggregated results, conceptually connected content, task
relevant facets, implicit collaboration for enriched content, explicit collaboration with shared
tags, and several other capabilities. Lastly, this technical area will include a query language so
that DARPA personnel may modify instructions for the crawlers and information extraction
algorithms (DARPA: Information Innovation Office, 2014, p. 7).
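To make the idea of aggregated results concrete, the following is a minimal sketch of grouping extracted records by a shared attribute instead of returning a flat list of links. The record fields (`url`, `contact`), the sample values, and the function `aggregate_by` are assumptions made for illustration; the BAA does not specify how aggregation is implemented.

```python
from collections import defaultdict

# Hypothetical extracted records; the field names are illustrative only.
RECORDS = [
    {"url": "http://example.com/a", "contact": "555-0100"},
    {"url": "http://example.org/b", "contact": "555-0100"},
    {"url": "http://example.net/c", "contact": "555-0199"},
]


def aggregate_by(records, field):
    """Group page URLs by the value they share in `field`."""
    groups = defaultdict(list)
    for record in records:
        groups[record[field]].append(record["url"])
    return dict(groups)


grouped = aggregate_by(RECORDS, "contact")
for value, urls in grouped.items():
    print(value, urls)
```

Grouping in this way is also what makes the privacy questions raised later in this paper sharper: pages that are individually unremarkable become linked once they are indexed by a common attribute.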
The last technical area DARPA is interested in is generically referred to as
“applications.” This technical area is where system-level concepts of operation and use cases are
developed. The utility of the system must be able to evolve over time based on the needs of the
Department of Defense and other agencies, and it involves the development of possible new
content domains including missing persons, found data, and counterfeit goods. This
technical area also specifies that integration, testing, and evaluation are to be performed on the open public
Internet (DARPA: Information Innovation Office, 2014, pp. 8-9).
Stated Purpose
According to Wired Magazine (2015), Memex’s purpose is to uncover patterns and
relationships in online data for law enforcement and others who track illegal activity (para. 1).
Memex uses automated methods to analyze content in order to uncover hidden relationships
between data points. Additionally, it helps researchers determine how much of the dark web’s
traffic is related to hidden services where content could be indexed. Furthermore, Memex can
help investigators understand the turnover of sites, specifically, the relationship between sites
when one shuts down and a seemingly unrelated site opens up (Zetter, 2015).
DARPA has gone to great lengths to specify that the Memex Project is aimed at
indexing “domain-specific” content and gives the example of human trafficking, both labor
and sex, in the BAA (DARPA: Information Innovation Office, 2014, p. 4). At least one
instance of Memex in action has been documented: the New York District Attorney’s Office
claims that an experimental set of Internet search tools is part of the prosecutor’s arsenal that
helped secure a sex trafficking conviction. In this instance, Memex was used to scour the
Internet looking for advertisements used to lure victims into servitude and to promote their
sexual exploitation (Greenemeier, 2015, para. 2-3).
DARPA is an agency under the Department of Defense; however, the stated purpose of
Memex relates directly to law enforcement. DARPA has confirmed that in August of
2014, several beta testers were approved to use Memex, including two district attorneys’ offices,
a law enforcement group, and a nongovernmental organization (NGO). The next set of tests is
expected to begin in early 2015 and to include federal and district prosecutors, regional and
national law enforcement, and multiple NGOs. Every quarter, DARPA wants to expand user
testing until it is comfortable handing the tool over to law enforcement agencies and
prosecutors. Eventually, the plan is to have the Memex capability installed locally at law
enforcement agencies and to ensure police can access the software from anywhere
(Greenemeier, 2015).
Hypothetical Privacy Abuse
The Dark Web provides an avenue for cyber criminals, human traffickers, child
pornographers, and other criminals to conduct business (Bradbury, 2014). It also
provides an avenue for planning and coordinating terrorist activities that pose a threat to
national security (Sachan, 2012). It is only logical that the United States Government would be
interested in gaining better insight and surveillance into the Dark Web for both national security
and law enforcement reasons; however, with that comes the need to protect citizens’ privacy.
Memex, as an international “Big Data” tool, provides the capability for content discovery, indexing,
search, aggregation, and extraction on a very large scale, introducing a swarm of information
privacy concerns.
Scope of Use
One major privacy concern involves the breadth of data collected by Memex’s web
crawling indexers. Since the web crawlers are designed to ignore the content owner’s explicit
crawler prohibitions (often expressed in robots.txt files) and to penetrate paywalls and
membership areas, it is highly feasible that content intended to be private will get vacuumed into
the Memex system. Therefore, Memex can be seen as another attempt by the United States
Government to increase the size of the figurative information haystack in order to find more
needles. Despite the use of “domain-specific” collection techniques, it is inevitable that
non-domain data will be introduced into Memex.
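For readers unfamiliar with robots.txt, the sketch below shows what honoring a site owner’s crawler prohibitions looks like in practice, using Python’s standard `urllib.robotparser` module. The robots.txt content, the user-agent string `example-crawler`, and the helper `may_crawl` are hypothetical; the sketch illustrates the convention a well-behaved crawler follows, which is the check that a crawler built to ignore such prohibitions would skip.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt published by a site owner.
ROBOTS_TXT = """\
User-agent: *
Disallow: /members/
"""

parser = RobotFileParser()
# parse() accepts the file's lines directly, so no network access is needed.
parser.parse(ROBOTS_TXT.splitlines())


def may_crawl(url, user_agent="example-crawler"):
    """Return True if the site's robots.txt permits fetching `url`."""
    return parser.can_fetch(user_agent, url)


public_ok = may_crawl("http://example.com/index.html")
members_ok = may_crawl("http://example.com/members/profile")
print(public_ok, members_ok)
```

Compliance with robots.txt is voluntary; nothing in the protocol prevents a crawler from fetching a disallowed path, which is why the design choice to disregard it matters for privacy.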
Another privacy concern involves the range of users allowed access to Memex-siphoned
data. As mentioned earlier in this paper, DARPA has already given access to one
private non-governmental organization (NGO), which raises concerns, as noted by Mueller
(2010) with regard to private organizations and content regulation, about governance arrangements
that may “avoid substantive and due process rights” (p. 213). With DARPA planning to extend
access to many more NGOs, citizens should be concerned about who has access to this
potentially massive trove of well-organized and readily accessible private information.
Transparency and Oversight
By design, Memex provides for dynamic on-demand content domain generation that
enables Federal agencies, local law enforcement, and other organizations to select surveillance
zones without judicial oversight. The content domain example given by DARPA is for human
trafficking; however, as noted earlier in this paper, web crawlers have the ability to take
commands from controllers to add or modify content domains. This functionality is powerful, as
it potentially allows investigators and private organizations access to personal information
collected without oversight and may compromise citizens’ due process protections. Although
currently unknown, it seems feasible that a law-abiding citizen who posts a request online for a
consensual intimate meeting may get her information indexed under Memex’s human trafficking
content domain. This functionality, without oversight and transparency, should raise concerns
among privacy advocates.
Data Safeguards and Retention
As with any “big data” system, the details of how data are collected, stored, and transmitted
are a significant concern. Hackers, identity thieves, and foreign governments may have the
potential and desire to target Memex’s web crawlers or central repositories. These systems must
be protected from possible exploitation or hijacking, considering the amount of sensitive
information indexed. The data collected, stored, and transmitted, as well as the web crawlers
themselves, must have safeguards protecting them from compromise by malicious actors intent
on invading citizens’ privacy or committing crimes.
Along with concerns over safeguarding data comes the question of how long collected
information will be retained. In a related situation where governmental authorities built databases
of citizens’ vehicle location information via Automated License Plate Recognition (ALPR)
technology, privacy advocates raised concerns about data retention (Lynch, 2013). Advocates
have already filed lawsuits attempting to make usage of this technology more transparent
(Electronic Frontier Foundation, 2013). Similarly, privacy advocates will likely be concerned
with Memex’s data retention policies.
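One way a retention limit could be enforced is sketched below. Memex’s actual retention policy is not described in the published sources; the 90-day window, the record layout, and the function `purge_expired` are assumptions chosen purely to illustrate what a time-based purge might look like.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical retention window; no published Memex policy states this value.
RETENTION = timedelta(days=90)


def purge_expired(records, now, retention=RETENTION):
    """Keep only the records collected within the retention window."""
    cutoff = now - retention
    return [r for r in records if r["collected_at"] >= cutoff]


# Hypothetical collected records, used only for illustration.
now = datetime(2015, 6, 1, tzinfo=timezone.utc)
records = [
    {"id": 1, "collected_at": datetime(2015, 5, 20, tzinfo=timezone.utc)},
    {"id": 2, "collected_at": datetime(2015, 1, 5, tzinfo=timezone.utc)},
]
kept = purge_expired(records, now)
print([r["id"] for r in kept])
```

A rule this simple is easy to audit, which is part of its appeal: an overseer can verify that nothing older than the stated window remains in the repository.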
Conclusion and the Author’s Personal Reflections
Based on currently published public information, DARPA’s Memex Program is a
powerful tool built with the best of intentions: to protect national security and to combat
cybercrime. As a technical feat, its ability to crawl the vast scope of the Internet, evade
paywalls and member-only areas, perform in-depth analytical functions, and extract information
is impressive. The breadth of deployment of this tool to include law enforcement, district
attorneys, and NGOs introduces the possibility of abuse. Big Data tools that collect citizens’
information will inherently encroach on privacy rights and Constitutional protections. Memex
invites discussion over Internet privacy and whether any and all information posted online is
considered public and open for collection or analysis. Regardless, use of a tool like Memex
should be governed by strict rules that define its scope and provide additional oversight and transparency, and
the tool itself must be properly secured. Additionally, information collected by Memex about citizens should
be accessible under current Freedom of Information Act laws. Memex is not the first, and will
not be the last, information tool that challenges citizens’ privacy rights.
References
Bradbury, D. (2014). Unveiling the dark web. Network Security, 14-17.
Chandler, N. (n.d.). How the deep web works. Retrieved from How Stuff Works: Computer:
http://computer.howstuffworks.com/internet/basics/how-the-deep-web-works.htm
DARPA. (2014, February 9). Memex aims to create a new paradigm for domain-specific
search. DARPA: News Events. Retrieved from
http://www.darpa.mil/newsevents/releases/2014/02/09.aspx
DARPA: Information Innovation Office. (2014). Broad Agency Announcement: Memex.
Retrieved from http://go.usa.gov/BBc5
Electronic Frontier Foundation. (2013, May 6). EFF and ACLU sue LA law-enforcement
agencies over license-plate reader records. EFF Press Release.
Greenemeier, L. (2015, February 8). Human traffickers caught on hidden Internet. Scientific
American. Retrieved from
http://www.scientificamerican.com/article/human-traffickers-caught-on-hidden-internet/
Lynch, J. (2013, May 6). Automated license plate readers threaten our privacy. EFF
DeepLinks.
Mueller, M. (2010). Networks and States: The Global Politics of Internet Governance.
Massachusetts Institute of Technology.
Sachan, A. (2012). Countering terrorism through Dark Web analysis. IEEE.
Zetter, K. (2015, February 15). Darpa is developing a search engine for the dark web. Wired. Retrieved
from http://www.wired.com/2015/02/darpa-memex-dark-web/
About the author
Author: Christopher Furton
Website: Http://christopher.furton.net
Certified professional with over 12 years of Information Technology experience and 8 years of
hands-on leadership. An expert in cyber security with both managerial and technical skills
proven throughout a career with increasing responsibility and performance expectations. Known
ability to translate complex information for universal understanding. Detail-driven, results-
focused leader with superior analytical, multitasking, and communication skills. Well-versed in
industry best practices including Project Management and IT Service Management. Currently
holding active CISSP, CEH, ITIL Foundations, Security+, and Network+ certifications.
Visit the author’s blog:
IT Management Perspectives - https://christopherfurton.wordpress.com/

More Related Content

What's hot

A study of index poisoning in peer topeer
A study of index poisoning in peer topeerA study of index poisoning in peer topeer
A study of index poisoning in peer topeerIJCI JOURNAL
 
Online text data for machine learning, data science, and research - Who can p...
Online text data for machine learning, data science, and research - Who can p...Online text data for machine learning, data science, and research - Who can p...
Online text data for machine learning, data science, and research - Who can p...Fredrik Olsson
 
Final Next Generation Content Management
Final    Next  Generation  Content  ManagementFinal    Next  Generation  Content  Management
Final Next Generation Content ManagementScott Abel
 
Electronic Surveillance Of Communications 100225
Electronic Surveillance Of Communications 100225Electronic Surveillance Of Communications 100225
Electronic Surveillance Of Communications 100225Klamberg
 
Electronic Surveillance of Communications 100225
Electronic Surveillance of Communications 100225Electronic Surveillance of Communications 100225
Electronic Surveillance of Communications 100225Klamberg
 
Instructions please write a 5 page paper answering the question con
Instructions please write a 5 page paper answering the question conInstructions please write a 5 page paper answering the question con
Instructions please write a 5 page paper answering the question consimba35
 
Computer Forensic: A Reactive Strategy for Fighting Computer Crime
Computer Forensic: A Reactive Strategy for Fighting Computer CrimeComputer Forensic: A Reactive Strategy for Fighting Computer Crime
Computer Forensic: A Reactive Strategy for Fighting Computer CrimeCSCJournals
 
What Your Tweets Tell Us About You, Speaker Notes
What Your Tweets Tell Us About You, Speaker NotesWhat Your Tweets Tell Us About You, Speaker Notes
What Your Tweets Tell Us About You, Speaker NotesKrisKasianovitz
 
Hello dr. aguiar and classmates,for this week’s forum we were as
Hello dr. aguiar and classmates,for this week’s forum we were asHello dr. aguiar and classmates,for this week’s forum we were as
Hello dr. aguiar and classmates,for this week’s forum we were assimba35
 
Findability Primer by Information Architected - the IA Primer Series
Findability Primer by Information Architected - the IA Primer SeriesFindability Primer by Information Architected - the IA Primer Series
Findability Primer by Information Architected - the IA Primer SeriesDan Keldsen
 
Challenges and emerging practices for knowledge organization in the electron...
Challenges and emerging practices for knowledge  organization in the electron...Challenges and emerging practices for knowledge  organization in the electron...
Challenges and emerging practices for knowledge organization in the electron...Anil Mishra
 
Retrieving Hidden Friends a Collusion Privacy Attack against Online Friend Se...
Retrieving Hidden Friends a Collusion Privacy Attack against Online Friend Se...Retrieving Hidden Friends a Collusion Privacy Attack against Online Friend Se...
Retrieving Hidden Friends a Collusion Privacy Attack against Online Friend Se...ijtsrd
 
Privacidad: La Tensión entre las Capacidades Tecnológicas y las Expectativas ...
Privacidad: La Tensión entre las Capacidades Tecnológicas y las Expectativas ...Privacidad: La Tensión entre las Capacidades Tecnológicas y las Expectativas ...
Privacidad: La Tensión entre las Capacidades Tecnológicas y las Expectativas ...Facultad de Informática UCM
 

What's hot (20)

A study of index poisoning in peer topeer
A study of index poisoning in peer topeerA study of index poisoning in peer topeer
A study of index poisoning in peer topeer
 
Online text data for machine learning, data science, and research - Who can p...
Online text data for machine learning, data science, and research - Who can p...Online text data for machine learning, data science, and research - Who can p...
Online text data for machine learning, data science, and research - Who can p...
 
Final Next Generation Content Management
Final    Next  Generation  Content  ManagementFinal    Next  Generation  Content  Management
Final Next Generation Content Management
 
Polinter09
Polinter09Polinter09
Polinter09
 
Electronic Surveillance Of Communications 100225
Electronic Surveillance Of Communications 100225Electronic Surveillance Of Communications 100225
Electronic Surveillance Of Communications 100225
 
Electronic Surveillance of Communications 100225
Electronic Surveillance of Communications 100225Electronic Surveillance of Communications 100225
Electronic Surveillance of Communications 100225
 
A42020106
A42020106A42020106
A42020106
 
nm
nmnm
nm
 
Instructions please write a 5 page paper answering the question con
Instructions please write a 5 page paper answering the question conInstructions please write a 5 page paper answering the question con
Instructions please write a 5 page paper answering the question con
 
Computer forencis
Computer forencisComputer forencis
Computer forencis
 
Computer Forensic: A Reactive Strategy for Fighting Computer Crime
Computer Forensic: A Reactive Strategy for Fighting Computer CrimeComputer Forensic: A Reactive Strategy for Fighting Computer Crime
Computer Forensic: A Reactive Strategy for Fighting Computer Crime
 
What Your Tweets Tell Us About You, Speaker Notes
What Your Tweets Tell Us About You, Speaker NotesWhat Your Tweets Tell Us About You, Speaker Notes
What Your Tweets Tell Us About You, Speaker Notes
 
Hello dr. aguiar and classmates,for this week’s forum we were as
Hello dr. aguiar and classmates,for this week’s forum we were asHello dr. aguiar and classmates,for this week’s forum we were as
Hello dr. aguiar and classmates,for this week’s forum we were as
 
Findability Primer by Information Architected - the IA Primer Series
Findability Primer by Information Architected - the IA Primer SeriesFindability Primer by Information Architected - the IA Primer Series
Findability Primer by Information Architected - the IA Primer Series
 
Challenges and emerging practices for knowledge organization in the electron...
Challenges and emerging practices for knowledge  organization in the electron...Challenges and emerging practices for knowledge  organization in the electron...
Challenges and emerging practices for knowledge organization in the electron...
 
Digital forensics
Digital forensicsDigital forensics
Digital forensics
 
digital stega
digital stegadigital stega
digital stega
 
Retrieving Hidden Friends a Collusion Privacy Attack against Online Friend Se...
Retrieving Hidden Friends a Collusion Privacy Attack against Online Friend Se...Retrieving Hidden Friends a Collusion Privacy Attack against Online Friend Se...
Retrieving Hidden Friends a Collusion Privacy Attack against Online Friend Se...
 
Paper24
Paper24Paper24
Paper24
 
Privacidad: La Tensión entre las Capacidades Tecnológicas y las Expectativas ...
Privacidad: La Tensión entre las Capacidades Tecnológicas y las Expectativas ...Privacidad: La Tensión entre las Capacidades Tecnológicas y las Expectativas ...
Privacidad: La Tensión entre las Capacidades Tecnológicas y las Expectativas ...
 

Similar to Christopher furton-darpa-project-memex-erodes-internet-privacy

DARPA Project Memex Erodes Privacy
DARPA Project Memex Erodes PrivacyDARPA Project Memex Erodes Privacy
DARPA Project Memex Erodes PrivacyChris Furton
 
Terrorism Analysis through Social Media using Data Mining
Terrorism Analysis through Social Media using Data MiningTerrorism Analysis through Social Media using Data Mining
Terrorism Analysis through Social Media using Data MiningIRJET Journal
 
Mining in Ontology with Multi Agent System in Semantic Web : A Novel Approach
Mining in Ontology with Multi Agent System in Semantic Web : A Novel ApproachMining in Ontology with Multi Agent System in Semantic Web : A Novel Approach
Mining in Ontology with Multi Agent System in Semantic Web : A Novel Approachijma
 
Semantic Web & Information Brokering: Opportunities, Commercialization and Ch...
Semantic Web & Information Brokering: Opportunities, Commercialization and Ch...Semantic Web & Information Brokering: Opportunities, Commercialization and Ch...
Semantic Web & Information Brokering: Opportunities, Commercialization and Ch...Amit Sheth
 
DATA, TEXT, AND WEB MINING FOR BUSINESS INTELLIGENCE: A SURVEY
DATA, TEXT, AND WEB MINING FOR BUSINESS INTELLIGENCE: A SURVEYDATA, TEXT, AND WEB MINING FOR BUSINESS INTELLIGENCE: A SURVEY
DATA, TEXT, AND WEB MINING FOR BUSINESS INTELLIGENCE: A SURVEYijdkp
 
Ijeee 7-11-privacy preserving distributed data mining with anonymous id assig...
Ijeee 7-11-privacy preserving distributed data mining with anonymous id assig...Ijeee 7-11-privacy preserving distributed data mining with anonymous id assig...
Ijeee 7-11-privacy preserving distributed data mining with anonymous id assig...Kumar Goud
 
High Accuracy Location Information Extraction From Social Network Texts Using...
High Accuracy Location Information Extraction From Social Network Texts Using...High Accuracy Location Information Extraction From Social Network Texts Using...
High Accuracy Location Information Extraction From Social Network Texts Using...kevig
 
High Accuracy Location Information Extraction From Social Network Texts Using...
High Accuracy Location Information Extraction From Social Network Texts Using...High Accuracy Location Information Extraction From Social Network Texts Using...
High Accuracy Location Information Extraction From Social Network Texts Using...kevig
 
Deeplight Intelliagg
Deeplight IntelliaggDeeplight Intelliagg
Deeplight IntelliaggGavin O'Toole
 
Fundamentals of data mining and its applications
Fundamentals of data mining and its applicationsFundamentals of data mining and its applications
Fundamentals of data mining and its applicationsSubrat Swain
 
Exploratory Data Analysis and Feature Selection for Social Media Hackers Pred...
Exploratory Data Analysis and Feature Selection for Social Media Hackers Pred...Exploratory Data Analysis and Feature Selection for Social Media Hackers Pred...
Exploratory Data Analysis and Feature Selection for Social Media Hackers Pred...CSEIJJournal
 
EXPLORATORY DATA ANALYSIS AND FEATURE SELECTION FOR SOCIAL MEDIA HACKERS PRED...
EXPLORATORY DATA ANALYSIS AND FEATURE SELECTION FOR SOCIAL MEDIA HACKERS PRED...EXPLORATORY DATA ANALYSIS AND FEATURE SELECTION FOR SOCIAL MEDIA HACKERS PRED...
EXPLORATORY DATA ANALYSIS AND FEATURE SELECTION FOR SOCIAL MEDIA HACKERS PRED...CSEIJJournal
 
On How the Darknet and its Access to SCADA is a Threat to National Critical I...
On How the Darknet and its Access to SCADA is a Threat to National Critical I...On How the Darknet and its Access to SCADA is a Threat to National Critical I...
On How the Darknet and its Access to SCADA is a Threat to National Critical I...Matthew Kurnava
 
On line footprint @upc
On line footprint @upcOn line footprint @upc
On line footprint @upcSilvia Puglisi
 

Similar to Christopher furton-darpa-project-memex-erodes-internet-privacy (20)

DARPA Project Memex Erodes Privacy
DARPA Project Memex Erodes PrivacyDARPA Project Memex Erodes Privacy
DARPA Project Memex Erodes Privacy
 
Introduction abstract
Introduction abstractIntroduction abstract
Introduction abstract
 
Terrorism Analysis through Social Media using Data Mining
Terrorism Analysis through Social Media using Data MiningTerrorism Analysis through Social Media using Data Mining
Terrorism Analysis through Social Media using Data Mining
 
Mining in Ontology with Multi Agent System in Semantic Web : A Novel Approach
Mining in Ontology with Multi Agent System in Semantic Web : A Novel ApproachMining in Ontology with Multi Agent System in Semantic Web : A Novel Approach
Mining in Ontology with Multi Agent System in Semantic Web : A Novel Approach
 
Data Models and the DMCA
Data Models and the DMCAData Models and the DMCA
Data Models and the DMCA
 
Semantic Web & Information Brokering: Opportunities, Commercialization and Ch...
Semantic Web & Information Brokering: Opportunities, Commercialization and Ch...Semantic Web & Information Brokering: Opportunities, Commercialization and Ch...
Semantic Web & Information Brokering: Opportunities, Commercialization and Ch...
 
DATA, TEXT, AND WEB MINING FOR BUSINESS INTELLIGENCE: A SURVEY
DATA, TEXT, AND WEB MINING FOR BUSINESS INTELLIGENCE: A SURVEYDATA, TEXT, AND WEB MINING FOR BUSINESS INTELLIGENCE: A SURVEY
DATA, TEXT, AND WEB MINING FOR BUSINESS INTELLIGENCE: A SURVEY
 
Ijeee 7-11-privacy preserving distributed data mining with anonymous id assig...
Ijeee 7-11-privacy preserving distributed data mining with anonymous id assig...Ijeee 7-11-privacy preserving distributed data mining with anonymous id assig...
Ijeee 7-11-privacy preserving distributed data mining with anonymous id assig...
 
High Accuracy Location Information Extraction From Social Network Texts Using...
High Accuracy Location Information Extraction From Social Network Texts Using...High Accuracy Location Information Extraction From Social Network Texts Using...
High Accuracy Location Information Extraction From Social Network Texts Using...
 
High Accuracy Location Information Extraction From Social Network Texts Using...
High Accuracy Location Information Extraction From Social Network Texts Using...High Accuracy Location Information Extraction From Social Network Texts Using...
High Accuracy Location Information Extraction From Social Network Texts Using...
 
Deeplight Intelliagg
Deeplight IntelliaggDeeplight Intelliagg
Deeplight Intelliagg
 
Fundamentals of data mining and its applications
Fundamentals of data mining and its applicationsFundamentals of data mining and its applications
Fundamentals of data mining and its applications
 
Exploratory Data Analysis and Feature Selection for Social Media Hackers Pred...
Exploratory Data Analysis and Feature Selection for Social Media Hackers Pred...Exploratory Data Analysis and Feature Selection for Social Media Hackers Pred...
Exploratory Data Analysis and Feature Selection for Social Media Hackers Pred...
 
EXPLORATORY DATA ANALYSIS AND FEATURE SELECTION FOR SOCIAL MEDIA HACKERS PRED...
EXPLORATORY DATA ANALYSIS AND FEATURE SELECTION FOR SOCIAL MEDIA HACKERS PRED...EXPLORATORY DATA ANALYSIS AND FEATURE SELECTION FOR SOCIAL MEDIA HACKERS PRED...
EXPLORATORY DATA ANALYSIS AND FEATURE SELECTION FOR SOCIAL MEDIA HACKERS PRED...
 
On How the Darknet and its Access to SCADA is a Threat to National Critical I...
On How the Darknet and its Access to SCADA is a Threat to National Critical I...On How the Darknet and its Access to SCADA is a Threat to National Critical I...
On How the Darknet and its Access to SCADA is a Threat to National Critical I...
 
Web Mining
Web MiningWeb Mining
Web Mining
 
Edu.03 assignment
Edu.03 assignment Edu.03 assignment
Edu.03 assignment
 
Edu.03
Edu.03 Edu.03
Edu.03
 
On line footprint @upc
On line footprint @upcOn line footprint @upc
On line footprint @upc
 
PanamaPapers
PanamaPapersPanamaPapers
PanamaPapers
 
