Running head: DARPA MEMEX PROJECT ERODES INTERNET PRIVACY 1
DARPA Memex Project Erodes Internet Privacy
Christopher Furton
Syracuse University
Abstract
In February of 2014, the Defense Advanced Research Projects Agency (DARPA)
announced the Memex Project, which is currently being used by federal agencies, law enforcement,
and non-governmental organizations (NGOs). The Memex Project deploys technology that
crawls, indexes, analyzes, extracts, and provides search functionality across the entire Internet,
including the criminal underground referred to as the Dark Net. Despite DARPA’s good intentions,
the Memex tool raises several privacy concerns such as scope of use, oversight and
transparency, data retention, and information security. With this powerful big data capability,
precautions must be taken to protect citizens’ privacy rights.
DARPA Memex Project Further Erodes Internet Privacy
What is Memex?
The Defense Advanced Research Projects Agency (DARPA) announced plans to create a
project known as Memex on February 9, 2014 (DARPA, 2014, p. 1). A further look into the
Broad Agency Announcement (BAA) shows that DARPA is looking for proposals from industry
to “maintain technological superiority in the area of content indexing and web search on the
Internet” (DARPA: Information Innovation Office, 2014, p. 4). DARPA identifies a problem
with current web search functionality, stating that it is limited in what gets indexed and in the
richness of available details. For government researchers and law enforcement personnel,
current methods involve manual searching by entering exact information one entry at a time.
Further analysis must then be done to organize or aggregate results beyond a list of links (DARPA:
Information Innovation Office, 2014, p. 4).
DARPA plans to solve this problem with the Memex Project by developing technologies
that “provide the mechanisms for content discovery, information extraction, information
retrieval, user collaboration, and other areas need to address distributed aggregation, analysis,
and presentation of web content” (DARPA: Information Innovation Office, 2014, p. 5). To
accomplish this, DARPA has divided the work into three technical areas: domain-specific
indexing, domain-specific search, and applications. DARPA specifies the need for technology to
reach beyond traditional content, specifically naming the Dark Web as a target. The Dark Web
refers to the large mass of Internet content not readily accessible through search engines, often
requiring special encryption software to access (Chandler, n.d.).
The first technical area DARPA is interested in is domain-specific indexing. This
technical area focuses on developing a highly scalable web crawling capability with both content
discovery and information extraction. This crawling process will provide automated link
discovery including obfuscated links, discovery of deep and dark web content, and hidden
services – the function of providing web services such as chat or web page hosting on the Dark
Web. Additionally, this capability must cope with counter-crawling measures such as paywalls,
member-only areas, crawler bans, and even human detection. Lastly, this capability must also be
able to extract information and include normalization of heterogeneous data, natural language
processing for translation, image analysis, extraction of multimedia, and several other functions
(DARPA: Information Innovation Office, 2014, p. 7).
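To make the link-discovery step concrete, the following is a minimal sketch in Python of how a crawler might collect and normalize links from a single page. It is an illustration only and not DARPA’s implementation: the class LinkCollector, the function discover_links, and the example.com addresses are assumptions introduced here, and the sketch handles only plain anchor tags, not the obfuscated links or hidden services described in the BAA.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urldefrag


class LinkCollector(HTMLParser):
    """Collects href targets from anchor tags in one HTML page (hypothetical helper)."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = set()

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        for name, value in attrs:
            if name == "href" and value:
                # Resolve relative links against the page address.
                absolute = urljoin(self.base_url, value)
                # Drop the #fragment so the same page is not queued twice.
                self.links.add(urldefrag(absolute).url)


def discover_links(base_url, html):
    """Return the sorted, de-duplicated absolute links found in html."""
    collector = LinkCollector(base_url)
    collector.feed(html)
    return sorted(collector.links)


page = (
    '<a href="/ads/1">one</a> '
    '<a href="/ads/1#top">duplicate</a> '
    '<a href="http://example.org/x">external</a>'
)
print(discover_links("http://example.com/index.html", page))
```

Normalizing each link before queuing it is one simple way to treat heterogeneous inputs uniformly; a production crawler would need far more than this.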
The second technical area DARPA is interested in is domain-specific searching. This
capability is not the same as current commercial web searching; instead, it will have configurable
interfaces into web content indexed by the first technical area. The interfaces, as outlined in the
BAA, may include conceptually aggregated results, conceptually connected content, task
relevant facets, implicit collaboration for enriched content, explicit collaboration with shared
tags, and several other capabilities. Lastly, this technical area will include a query language so
that DARPA personnel may modify instructions for the crawlers and information extraction
algorithms (DARPA: Information Innovation Office, 2014, p. 7).
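The idea of domain-specific search with task-relevant facets can be sketched in a few lines of Python. This is a hedged illustration under stated assumptions, not the Memex query language: the records in INDEX, the field names, and the functions search and facet_counts are all hypothetical stand-ins for whatever the indexing area would actually produce.

```python
from collections import Counter

# Hypothetical indexed records standing in for output of the first technical area.
INDEX = [
    {"domain": "counterfeit_goods", "site": "shop-one.example", "language": "en",
     "text": "Discount listing for watches"},
    {"domain": "counterfeit_goods", "site": "shop-two.example", "language": "es",
     "text": "Listing with new handbags"},
    {"domain": "counterfeit_goods", "site": "shop-one.example", "language": "en",
     "text": "Weekend listing, limited stock"},
    {"domain": "missing_persons", "site": "forum.example", "language": "en",
     "text": "Listing of recent reports"},
]


def search(index, domain, term):
    """Return records in one content domain whose text contains the term."""
    term = term.lower()
    return [r for r in index if r["domain"] == domain and term in r["text"].lower()]


def facet_counts(results, field):
    """Aggregate results by one field instead of returning a flat list of links."""
    return Counter(r[field] for r in results)


results = search(INDEX, "counterfeit_goods", "listing")
print(len(results))
print(facet_counts(results, "site"))
```

Grouping results by a field such as site or language is what distinguishes this kind of interface from a plain list of links.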
The last technical area DARPA is interested in is generically referred to as
“applications.” This technical area is where system-level concepts of operation and use cases are
developed. The utility of the system must be able to evolve over time based on the needs of the
Department of Defense and other agencies, and it involves the development of possible new
content domains including missing persons, found data, and counterfeit goods. This
technical area also specifies that integration, testing, and evaluation are to be performed on the open public
Internet (DARPA: Information Innovation Office, 2014, pp. 8-9).
Stated Purpose
According to Wired Magazine (2015), Memex’s purpose is to uncover patterns and
relationships in online data for law enforcement and others who track illegal activity (para. 1).
Memex uses automated methods to analyze content in order to uncover hidden relationships
between data points. Additionally, it helps researchers determine how much of the dark web’s
traffic is related to hidden services where content could be indexed. Furthermore, Memex can
help investigators understand the turnover of sites, specifically, the relationship between sites
when one shuts down and a seemingly unrelated site opens up (Zetter, 2015).
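One simple way such relationships between seemingly unrelated sites could surface is by grouping pages on shared identifiers. The Python sketch below assumes that each crawled page has already been reduced to a site name and a set of extracted identifiers; the function related_sites, the field names, and the sample data are hypothetical, and the published sources do not describe Memex’s actual analysis methods.

```python
from collections import defaultdict
from itertools import combinations


def related_sites(pages):
    """Map each pair of sites to the identifiers they have in common."""
    sites_by_identifier = defaultdict(set)
    for page in pages:
        for identifier in page["identifiers"]:
            sites_by_identifier[identifier].add(page["site"])

    shared = defaultdict(set)
    for identifier, sites in sites_by_identifier.items():
        # Every pair of sites using the same identifier is a candidate link.
        for pair in combinations(sorted(sites), 2):
            shared[pair].add(identifier)
    return dict(shared)


# Hypothetical sample: one site closes and another reuses a contact handle.
pages = [
    {"site": "old-market.example", "identifiers": {"contact:alpha", "wallet:123"}},
    {"site": "new-market.example", "identifiers": {"contact:alpha"}},
    {"site": "unrelated.example", "identifiers": {"contact:beta"}},
]
print(related_sites(pages))
```

The same grouping also illustrates the privacy concern raised later in this paper: any page that happens to share an identifier is swept into the result, whether or not it is relevant to an investigation.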
DARPA has gone to great lengths to specify that the Memex Project is aimed at
indexing “domain-specific” content and gives the example of human trafficking – both labor
and sex – in the BAA (DARPA: Information Innovation Office, 2014, p. 4). At least one
instance of Memex in action has been documented: the New York District Attorney’s Office
claims that an experimental set of Internet search tools is part of the prosecutor’s arsenal that
helped secure a sex trafficking conviction. In this instance, Memex was used to scour the
Internet looking for advertisements used to lure victims into servitude and to promote their
sexual exploitation (Greenemeier, 2015, para. 2-3).
DARPA is an agency under the Department of Defense; however, the stated purpose of
Memex is closely tied to law enforcement. DARPA has confirmed that in August of
2014, several beta testers were approved to use Memex, including two district attorneys’ offices,
a law enforcement group, and a nongovernmental organization (NGO). The next set of tests is
expected to begin in early 2015 and include federal and district prosecutors, regional and
national law enforcement, and multiple NGOs. Every quarter, DARPA wants to expand user
testing until it is comfortable handing the tool over to law enforcement agencies and
prosecutors. Eventually, the plan is to have the Memex capability installed locally at law
enforcement agencies and to ensure police can access the software from anywhere
(Greenemeier, 2015).
Hypothetical Privacy Abuse
The Dark Web provides an avenue for cyber criminals, human traffickers, child
pornographers, and other criminals to conduct business (Bradbury, 2014). It also
provides an avenue for planning and coordinating terrorist activities that pose a threat to
national security (Sachan, 2012). It is only logical that the United States Government would be
interested in gaining better insight into, and surveillance of, the Dark Web for both national security
and law enforcement reasons; however, with that comes the need to protect citizens’ privacy.
Memex, as an international “Big Data” tool, provides the capability for content discovery, indexing,
search, aggregation, and extraction on a very large scale, introducing a host of information
privacy concerns.
Scope of Use
One major privacy concern involves the breadth of data collected by Memex’s web
crawling indexers. Since the web crawlers are designed to ignore the content owner’s explicit
crawler prohibitions – often expressed in robots.txt files – and to penetrate paywalls and
membership areas, it is quite plausible that content intended to be private will get vacuumed into
the Memex system. Therefore, Memex can be seen as another attempt by the United States
Government to increase the size of the figurative information haystack in order to find more
needles. Despite the use of “domain-specific” collection techniques, it is inevitable that
non-domain data will get introduced into Memex.
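For readers unfamiliar with robots.txt, the sketch below shows the check that a conventional, compliant crawler performs before fetching a page, using Python’s standard urllib.robotparser module. The robots.txt contents, the user agent name, and the example.com addresses are made up for illustration. A crawler that ignores crawler prohibitions, as described above, would simply skip this check.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt a site owner might publish to keep crawlers out.
ROBOTS_TXT = """\
User-agent: *
Disallow: /members/
Disallow: /private/
"""


def allowed_by_robots(robots_txt, user_agent, url):
    """Return True if the robots.txt rules permit user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


print(allowed_by_robots(ROBOTS_TXT, "example-crawler", "http://example.com/members/profile"))
print(allowed_by_robots(ROBOTS_TXT, "example-crawler", "http://example.com/index.html"))
```

Because robots.txt is a voluntary convention rather than an access control, nothing technical prevents a crawler from disregarding it, which is why the owner’s stated preference offers no real protection here.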
Another privacy concern involves the scope of users being allowed access to
Memex-siphoned data. As mentioned earlier in this paper, DARPA has already given access to one
private non-governmental organization (NGO), which introduces concerns, as noted by Mueller
(2010) with regard to private organizations and content regulation, about governance arrangements
that may “avoid substantive and due process rights” (p. 213). With DARPA planning to extend
access to many more NGOs, citizens should be concerned about who has access to this
potentially massive trove of well-organized and readily accessible private information.
Transparency and Oversight
By design, Memex provides for dynamic on-demand content domain generation that
enables federal agencies, local law enforcement, and other organizations to select surveillance
zones without judicial oversight. The content domain example given by DARPA is human
trafficking; however, as noted earlier in this paper, the web crawlers can take
commands from controllers to add or modify content domains. This functionality is powerful, as
it potentially gives investigators and private organizations access to personal information
collected without oversight and may compromise citizens’ due process protections. Although
this is currently unconfirmed, it seems feasible that a law-abiding citizen who posts a request online for a
consensual intimate meeting may get her information indexed under Memex’s human trafficking
content domain. Such functionality, without oversight and transparency, should raise concerns
among privacy advocates.
Data Safeguards and Retention
As with any “big data” system, the details of how data is collected, stored, and transmitted
are a significant concern. Hackers, identity thieves, and foreign governments may have the
capability and desire to target Memex’s web crawlers or central repositories. These systems must
be protected from possible exploitation or hijacking considering the amount of sensitive
information indexed. The data collected, stored, and transmitted – as well as the web crawlers
themselves – must have safeguards protecting them from compromise by malicious actors intent
on invading citizens’ privacy or committing crimes.
Along with concerns over safeguarding data comes the question of how long collected
information will be retained. In a related situation where governmental authorities built databases
of citizens’ vehicle location information via Automated License Plate Recognition (ALPR)
technology, privacy advocates raised concerns about data retention (Lynch, 2013). Advocates
have already filed lawsuits attempting to make usage of this technology more transparent
(Electronic Frontier Foundation, 2013). Similarly, privacy advocates will likely be concerned
with Memex’s data retention policies.
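What a retention policy means in practice can be shown with a short Python sketch that discards records older than a fixed window. No retention period for Memex has been published in the sources cited here, so the 90-day window, the record layout, and the function purge_expired are purely hypothetical.

```python
from datetime import datetime, timedelta, timezone


def purge_expired(records, retention_days, now):
    """Return only the records collected within the retention window."""
    cutoff = now - timedelta(days=retention_days)
    return [r for r in records if r["collected_at"] >= cutoff]


# Hypothetical collected records with their collection timestamps.
now = datetime(2015, 3, 1, tzinfo=timezone.utc)
records = [
    {"url": "http://example.com/a",
     "collected_at": datetime(2015, 2, 20, tzinfo=timezone.utc)},
    {"url": "http://example.com/b",
     "collected_at": datetime(2014, 9, 1, tzinfo=timezone.utc)},
]

kept = purge_expired(records, retention_days=90, now=now)
print([r["url"] for r in kept])
```

The policy question is not whether such a purge is technically possible, since it plainly is, but whether one is required, how long the window is, and who verifies that it runs.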
Conclusion and the Author’s Personal Reflections
Based on currently published public information, DARPA’s Memex Program is a
powerful tool built with the best of intentions: to protect national security and to combat
cybercrime. As a technical feat, its ability to crawl the vast scope of the Internet, evade
paywalls and member-only areas, perform in-depth analytical functions, and extract information
is impressive. The breadth of deployment of this tool to include law enforcement, district
attorneys, and NGOs introduces the possibility of abuse. Big Data tools that collect citizens’
information will inherently encroach on privacy rights and Constitutional protections. Memex
invites discussion over Internet privacy and whether any and all information posted online is
considered public and open for collection or analysis. Regardless, a tool like Memex
should be governed by strict usage rules that outline scope, be subject to additional oversight and
transparency, and be properly secured. Additionally, information collected by Memex about citizens should
be accessible under current Freedom of Information Act (FOIA) laws. Memex is not the first – and will
not be the last – information tool that challenges citizens’ privacy rights.
References
Bradbury, D. (2014). Unveiling the dark web. Network Security, 14-17.
Chandler, N. (n.d.). How the deep web works. Retrieved from How Stuff Works: Computer:
http://computer.howstuffworks.com/internet/basics/how-the-deep-web-works.htm
DARPA. (2014, 02 09). Memex aims to create a new paradigm for domain-specific search.
Retrieved from DARPA: News Events:
http://www.darpa.mil/newsevents/releases/2014/02/09.aspx
DARPA: Information Innovation Office. (2014). Broad Agency Announcement: Memex.
Retrieved from http://go.usa.gov/BBc5
Electronic Frontier Foundation. (2013, 05 06). EFF and ACLU Sue LA Law-Enforcement
Agencies Over License-Plate Reader Records. EFF Press Release.
Greenemeier, L. (2015, 02 08). Human traffickers caught on hidden Internet. Scientific
American. Retrieved from
http://www.scientificamerican.com/article/human-traffickers-caught-on-hidden-internet/
Lynch, J. (2013, 05 06). Automated License Plate Readers Threaten Our Privacy. EFF
DeepLinks.
Mueller, M. (2010). Networks and States: The Global Politics of Internet Governance.
Massachusetts Institute of Technology.
Sachan, A. (2012). Countering terrorism through Dark Web analysis. IEEE.
Zetter, K. (2015, 02 15). Darpa is developing a search engine for the dark web. Wired. Retrieved
from http://www.wired.com/2015/02/darpa-memex-dark-web/
About the author
Author: Christopher Furton
Website: Http://christopher.furton.net
Certified professional with over 12 years of Information Technology experience and 8 years of
hands-on leadership. An expert in cyber security with both managerial and technical skills
proven throughout a career with increasing responsibility and performance expectations. Known
ability to translate complex information for universal understanding. Detail-driven, results-
focused leader with superior analytical, multitasking, and communication skills. Well-versed in
industry best practices including Project Management and IT Service Management. Currently
holding active CISSP, CEH, ITIL Foundations, Security+, and Network+ certifications.
Visit the author’s blog:
IT Management Perspectives - https://christopherfurton.wordpress.com/

More Related Content

What's hot

A study of index poisoning in peer topeer
A study of index poisoning in peer topeerA study of index poisoning in peer topeer
A study of index poisoning in peer topeer
IJCI JOURNAL
 
Online text data for machine learning, data science, and research - Who can p...
Online text data for machine learning, data science, and research - Who can p...Online text data for machine learning, data science, and research - Who can p...
Online text data for machine learning, data science, and research - Who can p...
Fredrik Olsson
 
Final Next Generation Content Management
Final    Next  Generation  Content  ManagementFinal    Next  Generation  Content  Management
Final Next Generation Content Management
Scott Abel
 
Electronic Surveillance Of Communications 100225
Electronic Surveillance Of Communications 100225Electronic Surveillance Of Communications 100225
Electronic Surveillance Of Communications 100225
Klamberg
 
Electronic Surveillance of Communications 100225
Electronic Surveillance of Communications 100225Electronic Surveillance of Communications 100225
Electronic Surveillance of Communications 100225
Klamberg
 
A42020106
A42020106A42020106
A42020106
IJERA Editor
 
Instructions please write a 5 page paper answering the question con
Instructions please write a 5 page paper answering the question conInstructions please write a 5 page paper answering the question con
Instructions please write a 5 page paper answering the question con
simba35
 
Computer Forensic: A Reactive Strategy for Fighting Computer Crime
Computer Forensic: A Reactive Strategy for Fighting Computer CrimeComputer Forensic: A Reactive Strategy for Fighting Computer Crime
Computer Forensic: A Reactive Strategy for Fighting Computer Crime
CSCJournals
 
What Your Tweets Tell Us About You, Speaker Notes
What Your Tweets Tell Us About You, Speaker NotesWhat Your Tweets Tell Us About You, Speaker Notes
What Your Tweets Tell Us About You, Speaker NotesKrisKasianovitz
 
Hello dr. aguiar and classmates,for this week’s forum we were as
Hello dr. aguiar and classmates,for this week’s forum we were asHello dr. aguiar and classmates,for this week’s forum we were as
Hello dr. aguiar and classmates,for this week’s forum we were as
simba35
 
Findability Primer by Information Architected - the IA Primer Series
Findability Primer by Information Architected - the IA Primer SeriesFindability Primer by Information Architected - the IA Primer Series
Findability Primer by Information Architected - the IA Primer Series
Dan Keldsen
 
Challenges and emerging practices for knowledge organization in the electron...
Challenges and emerging practices for knowledge  organization in the electron...Challenges and emerging practices for knowledge  organization in the electron...
Challenges and emerging practices for knowledge organization in the electron...Anil Mishra
 
Retrieving Hidden Friends a Collusion Privacy Attack against Online Friend Se...
Retrieving Hidden Friends a Collusion Privacy Attack against Online Friend Se...Retrieving Hidden Friends a Collusion Privacy Attack against Online Friend Se...
Retrieving Hidden Friends a Collusion Privacy Attack against Online Friend Se...
ijtsrd
 
Privacidad: La Tensión entre las Capacidades Tecnológicas y las Expectativas ...
Privacidad: La Tensión entre las Capacidades Tecnológicas y las Expectativas ...Privacidad: La Tensión entre las Capacidades Tecnológicas y las Expectativas ...
Privacidad: La Tensión entre las Capacidades Tecnológicas y las Expectativas ...
Facultad de Informática UCM
 

What's hot (20)

A study of index poisoning in peer topeer
A study of index poisoning in peer topeerA study of index poisoning in peer topeer
A study of index poisoning in peer topeer
 
Online text data for machine learning, data science, and research - Who can p...
Online text data for machine learning, data science, and research - Who can p...Online text data for machine learning, data science, and research - Who can p...
Online text data for machine learning, data science, and research - Who can p...
 
Final Next Generation Content Management
Final    Next  Generation  Content  ManagementFinal    Next  Generation  Content  Management
Final Next Generation Content Management
 
Polinter09
Polinter09Polinter09
Polinter09
 
Electronic Surveillance Of Communications 100225
Electronic Surveillance Of Communications 100225Electronic Surveillance Of Communications 100225
Electronic Surveillance Of Communications 100225
 
Electronic Surveillance of Communications 100225
Electronic Surveillance of Communications 100225Electronic Surveillance of Communications 100225
Electronic Surveillance of Communications 100225
 
A42020106
A42020106A42020106
A42020106
 
nm
nmnm
nm
 
Instructions please write a 5 page paper answering the question con
Instructions please write a 5 page paper answering the question conInstructions please write a 5 page paper answering the question con
Instructions please write a 5 page paper answering the question con
 
Computer forencis
Computer forencisComputer forencis
Computer forencis
 
Computer Forensic: A Reactive Strategy for Fighting Computer Crime
Computer Forensic: A Reactive Strategy for Fighting Computer CrimeComputer Forensic: A Reactive Strategy for Fighting Computer Crime
Computer Forensic: A Reactive Strategy for Fighting Computer Crime
 
What Your Tweets Tell Us About You, Speaker Notes
What Your Tweets Tell Us About You, Speaker NotesWhat Your Tweets Tell Us About You, Speaker Notes
What Your Tweets Tell Us About You, Speaker Notes
 
Hello dr. aguiar and classmates,for this week’s forum we were as
Hello dr. aguiar and classmates,for this week’s forum we were asHello dr. aguiar and classmates,for this week’s forum we were as
Hello dr. aguiar and classmates,for this week’s forum we were as
 
Findability Primer by Information Architected - the IA Primer Series
Findability Primer by Information Architected - the IA Primer SeriesFindability Primer by Information Architected - the IA Primer Series
Findability Primer by Information Architected - the IA Primer Series
 
Challenges and emerging practices for knowledge organization in the electron...
Challenges and emerging practices for knowledge  organization in the electron...Challenges and emerging practices for knowledge  organization in the electron...
Challenges and emerging practices for knowledge organization in the electron...
 
Digital forensics
Digital forensicsDigital forensics
Digital forensics
 
digital stega
digital stegadigital stega
digital stega
 
Retrieving Hidden Friends a Collusion Privacy Attack against Online Friend Se...
Retrieving Hidden Friends a Collusion Privacy Attack against Online Friend Se...Retrieving Hidden Friends a Collusion Privacy Attack against Online Friend Se...
Retrieving Hidden Friends a Collusion Privacy Attack against Online Friend Se...
 
Paper24
Paper24Paper24
Paper24
 
Privacidad: La Tensión entre las Capacidades Tecnológicas y las Expectativas ...
Privacidad: La Tensión entre las Capacidades Tecnológicas y las Expectativas ...Privacidad: La Tensión entre las Capacidades Tecnológicas y las Expectativas ...
Privacidad: La Tensión entre las Capacidades Tecnológicas y las Expectativas ...
 

Similar to Christopher furton-darpa-project-memex-erodes-internet-privacy

DARPA Project Memex Erodes Privacy
DARPA Project Memex Erodes PrivacyDARPA Project Memex Erodes Privacy
DARPA Project Memex Erodes Privacy
Chris Furton
 
Terrorism Analysis through Social Media using Data Mining
Terrorism Analysis through Social Media using Data MiningTerrorism Analysis through Social Media using Data Mining
Terrorism Analysis through Social Media using Data Mining
IRJET Journal
 
Mining in Ontology with Multi Agent System in Semantic Web : A Novel Approach
Mining in Ontology with Multi Agent System in Semantic Web : A Novel ApproachMining in Ontology with Multi Agent System in Semantic Web : A Novel Approach
Mining in Ontology with Multi Agent System in Semantic Web : A Novel Approach
ijma
 
Cybercrimes in the Darknet and Their Detections: A Comprehensive Analysis and...
Cybercrimes in the Darknet and Their Detections: A Comprehensive Analysis and...Cybercrimes in the Darknet and Their Detections: A Comprehensive Analysis and...
Cybercrimes in the Darknet and Their Detections: A Comprehensive Analysis and...
dannyijwest
 
Data Models and the DMCA
Data Models and the DMCAData Models and the DMCA
Data Models and the DMCA
professormadison
 
Semantic Web & Information Brokering: Opportunities, Commercialization and Ch...
Semantic Web & Information Brokering: Opportunities, Commercialization and Ch...Semantic Web & Information Brokering: Opportunities, Commercialization and Ch...
Semantic Web & Information Brokering: Opportunities, Commercialization and Ch...
Amit Sheth
 
DATA, TEXT, AND WEB MINING FOR BUSINESS INTELLIGENCE: A SURVEY
DATA, TEXT, AND WEB MINING FOR BUSINESS INTELLIGENCE: A SURVEYDATA, TEXT, AND WEB MINING FOR BUSINESS INTELLIGENCE: A SURVEY
DATA, TEXT, AND WEB MINING FOR BUSINESS INTELLIGENCE: A SURVEY
ijdkp
 
Ijeee 7-11-privacy preserving distributed data mining with anonymous id assig...
Ijeee 7-11-privacy preserving distributed data mining with anonymous id assig...Ijeee 7-11-privacy preserving distributed data mining with anonymous id assig...
Ijeee 7-11-privacy preserving distributed data mining with anonymous id assig...
Kumar Goud
 
High Accuracy Location Information Extraction From Social Network Texts Using...
High Accuracy Location Information Extraction From Social Network Texts Using...High Accuracy Location Information Extraction From Social Network Texts Using...
High Accuracy Location Information Extraction From Social Network Texts Using...
kevig
 
High Accuracy Location Information Extraction From Social Network Texts Using...
High Accuracy Location Information Extraction From Social Network Texts Using...High Accuracy Location Information Extraction From Social Network Texts Using...
High Accuracy Location Information Extraction From Social Network Texts Using...
kevig
 
Deeplight Intelliagg
Deeplight IntelliaggDeeplight Intelliagg
Deeplight IntelliaggGavin O'Toole
 
Fundamentals of data mining and its applications
Fundamentals of data mining and its applicationsFundamentals of data mining and its applications
Fundamentals of data mining and its applicationsSubrat Swain
 
Exploratory Data Analysis and Feature Selection for Social Media Hackers Pred...
Exploratory Data Analysis and Feature Selection for Social Media Hackers Pred...Exploratory Data Analysis and Feature Selection for Social Media Hackers Pred...
Exploratory Data Analysis and Feature Selection for Social Media Hackers Pred...
CSEIJJournal
 
EXPLORATORY DATA ANALYSIS AND FEATURE SELECTION FOR SOCIAL MEDIA HACKERS PRED...
EXPLORATORY DATA ANALYSIS AND FEATURE SELECTION FOR SOCIAL MEDIA HACKERS PRED...EXPLORATORY DATA ANALYSIS AND FEATURE SELECTION FOR SOCIAL MEDIA HACKERS PRED...
EXPLORATORY DATA ANALYSIS AND FEATURE SELECTION FOR SOCIAL MEDIA HACKERS PRED...
CSEIJJournal
 
On How the Darknet and its Access to SCADA is a Threat to National Critical I...
On How the Darknet and its Access to SCADA is a Threat to National Critical I...On How the Darknet and its Access to SCADA is a Threat to National Critical I...
On How the Darknet and its Access to SCADA is a Threat to National Critical I...Matthew Kurnava
 
Web Mining
Web MiningWeb Mining
Web Mining
Shobha Rani
 
Edu.03
Edu.03 Edu.03
Edu.03 assignment
Edu.03 assignment Edu.03 assignment
Edu.03 assignment
LudiyaStanlySG
 
On line footprint @upc
On line footprint @upcOn line footprint @upc
On line footprint @upcSilvia Puglisi
 

Similar to Christopher furton-darpa-project-memex-erodes-internet-privacy (20)

DARPA Project Memex Erodes Privacy
DARPA Project Memex Erodes PrivacyDARPA Project Memex Erodes Privacy
DARPA Project Memex Erodes Privacy
 
Introduction abstract
Introduction abstractIntroduction abstract
Introduction abstract
 
Terrorism Analysis through Social Media using Data Mining
Terrorism Analysis through Social Media using Data MiningTerrorism Analysis through Social Media using Data Mining
Terrorism Analysis through Social Media using Data Mining
 
Mining in Ontology with Multi Agent System in Semantic Web : A Novel Approach
Mining in Ontology with Multi Agent System in Semantic Web : A Novel ApproachMining in Ontology with Multi Agent System in Semantic Web : A Novel Approach
Mining in Ontology with Multi Agent System in Semantic Web : A Novel Approach
 
Cybercrimes in the Darknet and Their Detections: A Comprehensive Analysis and...
Cybercrimes in the Darknet and Their Detections: A Comprehensive Analysis and...Cybercrimes in the Darknet and Their Detections: A Comprehensive Analysis and...
Cybercrimes in the Darknet and Their Detections: A Comprehensive Analysis and...
 
Data Models and the DMCA
Data Models and the DMCAData Models and the DMCA
Data Models and the DMCA
 
Semantic Web & Information Brokering: Opportunities, Commercialization and Ch...
Semantic Web & Information Brokering: Opportunities, Commercialization and Ch...Semantic Web & Information Brokering: Opportunities, Commercialization and Ch...
Semantic Web & Information Brokering: Opportunities, Commercialization and Ch...
 
DATA, TEXT, AND WEB MINING FOR BUSINESS INTELLIGENCE: A SURVEY
DATA, TEXT, AND WEB MINING FOR BUSINESS INTELLIGENCE: A SURVEYDATA, TEXT, AND WEB MINING FOR BUSINESS INTELLIGENCE: A SURVEY
DATA, TEXT, AND WEB MINING FOR BUSINESS INTELLIGENCE: A SURVEY
 
Ijeee 7-11-privacy preserving distributed data mining with anonymous id assig...
Ijeee 7-11-privacy preserving distributed data mining with anonymous id assig...Ijeee 7-11-privacy preserving distributed data mining with anonymous id assig...
Ijeee 7-11-privacy preserving distributed data mining with anonymous id assig...
 
High Accuracy Location Information Extraction From Social Network Texts Using...
High Accuracy Location Information Extraction From Social Network Texts Using...High Accuracy Location Information Extraction From Social Network Texts Using...
High Accuracy Location Information Extraction From Social Network Texts Using...
 
High Accuracy Location Information Extraction From Social Network Texts Using...
High Accuracy Location Information Extraction From Social Network Texts Using...High Accuracy Location Information Extraction From Social Network Texts Using...
High Accuracy Location Information Extraction From Social Network Texts Using...
 
Deeplight Intelliagg
Deeplight IntelliaggDeeplight Intelliagg
Deeplight Intelliagg
 
Fundamentals of data mining and its applications
Fundamentals of data mining and its applicationsFundamentals of data mining and its applications
Fundamentals of data mining and its applications
 
Exploratory Data Analysis and Feature Selection for Social Media Hackers Pred...
Exploratory Data Analysis and Feature Selection for Social Media Hackers Pred...Exploratory Data Analysis and Feature Selection for Social Media Hackers Pred...
Exploratory Data Analysis and Feature Selection for Social Media Hackers Pred...
 
EXPLORATORY DATA ANALYSIS AND FEATURE SELECTION FOR SOCIAL MEDIA HACKERS PRED...
EXPLORATORY DATA ANALYSIS AND FEATURE SELECTION FOR SOCIAL MEDIA HACKERS PRED...EXPLORATORY DATA ANALYSIS AND FEATURE SELECTION FOR SOCIAL MEDIA HACKERS PRED...
EXPLORATORY DATA ANALYSIS AND FEATURE SELECTION FOR SOCIAL MEDIA HACKERS PRED...
 
On How the Darknet and its Access to SCADA is a Threat to National Critical I...
On How the Darknet and its Access to SCADA is a Threat to National Critical I...On How the Darknet and its Access to SCADA is a Threat to National Critical I...
On How the Darknet and its Access to SCADA is a Threat to National Critical I...
 
Web Mining
Web MiningWeb Mining
Web Mining
 
Edu.03
Edu.03 Edu.03
Edu.03
 
Edu.03 assignment
Edu.03 assignment Edu.03 assignment
Edu.03 assignment
 
On line footprint @upc
On line footprint @upcOn line footprint @upc
On line footprint @upc
 

Running head: DARPA MEMEX PROJECT ERODES INTERNET PRIVACY
DARPA Memex Project Erodes Internet Privacy
Christopher Furton
Syracuse University
Abstract
In February of 2014, the Defense Advanced Research Projects Agency (DARPA) announced the Memex Project, which is currently being used by federal agencies, law enforcement, and non-governmental organizations (NGOs). The Memex Project deploys technology that crawls, indexes, analyzes, extracts, and provides search functionality across the entire Internet, including the criminal underground referred to as the Dark Net. Despite DARPA’s good intentions, the Memex tool raises several privacy concerns, such as scope of use, oversight and transparency, data retention, and information security. With this powerful big data capability, precautions must be taken to protect citizens’ privacy rights.
DARPA Memex Project Further Erodes Internet Privacy
What is Memex?
The Defense Advanced Research Projects Agency (DARPA) announced plans to create a project known as Memex on February 9, 2014 (DARPA, 2014, p. 1). A further look into the Broad Agency Announcement (BAA) shows that DARPA is looking for proposals from industry to “maintain technological superiority in the area of content indexing and web search on the Internet” (DARPA: Information Innovation Office, 2014, p. 4). DARPA identifies a problem with current web search functionality, stating that it has limitations on what gets indexed and on the richness of available details. For government researchers and law enforcement personnel, current methods involve manual searching by entering exact information one entry at a time. Further analysis must then be done to organize or aggregate results beyond a list of links (DARPA: Information Innovation Office, 2014, p. 4).
DARPA plans to solve this problem with the Memex Project by developing technologies that “provide the mechanisms for content discovery, information extraction, information retrieval, user collaboration, and other areas need to address distributed aggregation, analysis, and presentation of web content” (DARPA: Information Innovation Office, 2014, p. 5). To accomplish this, DARPA has divided the work into three technical areas: domain-specific indexing, domain-specific search, and applications. DARPA specifies the need for technology to reach beyond traditional content, specifically naming the Dark Web as a target. The Dark Web refers to the large mass of Internet content that is not readily accessible through search engines and often requires special encryption software to access (Chandler, n.d.).
The first technical area DARPA is interested in is domain-specific indexing. This technical area focuses on developing a highly scalable web crawling capability with both content
discovery and information extraction. This crawling process will provide automated link discovery, including obfuscated links, discovery of deep and dark web content, and hidden services (the function of providing web services such as chat or web page hosting on the Dark Web). Additionally, this capability will include ways to overcome counter-crawling measures such as paywalls, member-only areas, crawler bans, and even human detection. Lastly, this capability must also be able to extract information, including normalization of heterogeneous data, natural language processing for translation, image analysis, extraction of multimedia, and several other functions (DARPA: Information Innovation Office, 2014, p. 7).
The second technical area DARPA is interested in is domain-specific searching. This capability is not the same as current commercial web searching; instead, it will have configurable interfaces into the web content indexed by the first technical area. The interfaces, as outlined in the BAA, may include conceptually aggregated results, conceptually connected content, task-relevant facets, implicit collaboration for enriched content, explicit collaboration with shared tags, and several other capabilities. Lastly, this technical area will include a query language so that DARPA personnel may modify instructions for the crawlers and information extraction algorithms (DARPA: Information Innovation Office, 2014, p. 7).
The last technical area DARPA is interested in is generically referred to as “applications.” This technical area is where system-level concepts of operation and use cases are developed. The utility of the system must be able to evolve over time based on the needs of the Department of Defense and other agencies, and it involves the development of possible new content domains, including missing persons, found data, and counterfeit goods. This technical area also specifies that integration, testing, and evaluation are to be performed on the open public Internet (DARPA: Information Innovation Office, 2014, pp. 8-9).
Stated Purpose
According to Wired magazine (2015), Memex’s purpose is to uncover patterns and relationships in online data for law enforcement and others who track illegal activity (para. 1). Memex uses automated methods to analyze content in order to uncover hidden relationships between data points. Additionally, it helps researchers determine how much of the dark web’s traffic is related to hidden services where content could be indexed. Furthermore, Memex can help investigators understand the turnover of sites, specifically the relationship between sites when one shuts down and a seemingly unrelated site opens up (Zetter, 2015).
DARPA has gone to great lengths to specify that the Memex Project is aimed at indexing “domain-specific” content and gives the example of human trafficking (both labor and sex) in the BAA (DARPA: Information Innovation Office, 2014, p. 4). At least one instance of Memex in action has been documented: the New York District Attorney’s Office claims that an experimental set of Internet search tools is part of the prosecutor’s arsenal that helped secure a sex trafficking conviction. In this instance, Memex was used to scour the Internet for advertisements used to lure victims into servitude and to promote their sexual exploitation (Greenemeier, 2015, para. 2-3).
DARPA is an agency under the Department of Defense; however, the stated purpose of Memex relates directly to law enforcement. DARPA has confirmed that in August of 2014, several beta testers were approved to use Memex, including two district attorneys’ offices, a law enforcement group, and a non-governmental organization (NGO). The next set of tests is expected to begin in early 2015 and to include federal and district prosecutors, regional and national law enforcement, and multiple NGOs. Every quarter, DARPA wants to expand user testing until it is comfortable handing the tool over to law enforcement agencies and
prosecutors. Eventually, the plan is to have the Memex capability installed locally at law enforcement agencies and to ensure police can access the software from anywhere (Greenemeier, 2015).
Hypothetical Privacy Abuse
The Dark Web provides an avenue for cyber criminals, human traffickers, child pornographers, and other criminals to conduct business (Bradbury, 2014). It also provides an avenue for planning and coordinating terrorist activities that pose a threat to national security (Sachan, 2012). It is only logical that the United States Government would be interested in gaining better insight into, and surveillance of, the Dark Web for both national security and law enforcement reasons; however, with that comes the need to protect citizens’ privacy. Memex, as an international “Big Data” tool, provides the capability for content discovery, indexing, search, aggregation, and extraction on a very large scale, introducing a host of information privacy concerns.
Scope of Use
One major privacy concern involves the breadth of data collected by Memex’s web crawling indexers. Since the web crawlers are designed to ignore content owners’ explicit crawler prohibitions (often expressed in robots.txt files) and to penetrate paywalls and membership areas, it is highly feasible that content intended to be private will get vacuumed into the Memex system. Therefore, Memex can be seen as another attempt by the United States Government to increase the size of the figurative information haystack in order to find more needles. Despite the use of “domain-specific” collection techniques, it is inevitable that non-domain data will be introduced into Memex.
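To make the robots.txt mechanism concrete, the following is a minimal Python sketch of what a conventional crawler does when it honors a site owner's prohibitions, which is the step the paragraph above says Memex's crawlers are designed to skip. It is not Memex code, and nothing in it is drawn from DARPA's implementation: the robots.txt contents, the `ExampleBot` user agent, the example.com URLs, and the helper names `build_parser` and `allowed_urls` are all assumptions made for illustration. Only `urllib.robotparser` from the Python standard library is used.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content (an assumption for illustration).
# A real crawler would download this file from the target site.
ROBOTS_TXT = """\
User-agent: *
Disallow: /members/
Disallow: /private/
"""


def build_parser(robots_txt: str) -> RobotFileParser:
    """Parse robots.txt text held in memory, without any network access."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


def allowed_urls(parser: RobotFileParser, user_agent: str, urls: list[str]) -> list[str]:
    """Keep only the URLs that the site's robots.txt permits for this user agent."""
    return [url for url in urls if parser.can_fetch(user_agent, url)]


if __name__ == "__main__":
    parser = build_parser(ROBOTS_TXT)
    # Hypothetical candidate URLs discovered during a crawl.
    candidates = [
        "http://example.com/index.html",
        "http://example.com/members/profile",
        "http://example.com/private/notes.txt",
    ]
    for url in allowed_urls(parser, "ExampleBot", candidates):
        print("would fetch:", url)
```

Under these assumed rules, only the public index page passes the filter, and the member and private paths are excluded. A crawler that omits this check would collect all three URLs, which is how content intended to be private can end up in an index.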
Another privacy concern involves the scope of users being allowed access to Memex-siphoned data. As mentioned earlier in this paper, DARPA has already given access to one private non-governmental organization (NGO), which raises concerns, noted by Mueller (2010) with regard to private organizations and content regulation, about governance arrangements that may “avoid substantive and due process rights” (p. 213). With DARPA planning to extend access to many more NGOs, citizens should be concerned about who has access to this potentially massive trove of well-organized and readily accessible private information.
Transparency and Oversight
By design, Memex provides for dynamic, on-demand content domain generation that enables federal agencies, local law enforcement, and other organizations to select surveillance zones without judicial oversight. The content domain example given by DARPA is human trafficking; however, as noted earlier in this paper, the web crawlers can take commands from controllers to add or modify content domains. This functionality is powerful, as it potentially allows investigators and private organizations access to personal information collected without oversight, and it may compromise citizens’ due process protections. Although currently unknown, it seems feasible that a law-abiding citizen who posts a request online for a consensual intimate meeting may get her information indexed under Memex’s human trafficking content domain. This functionality, operating without oversight and transparency, should raise concerns among privacy advocates.
Data Safeguards and Retention
As with any “big data” system, the details of how data are collected, stored, and transmitted are a significant concern. Hackers, identity thieves, and foreign governments may have the potential and desire to target Memex’s web crawlers or central repositories. These systems must
be protected from possible exploitation or hijacking, considering the amount of sensitive information indexed. The data collected, stored, and transmitted, as well as the web crawlers themselves, must have safeguards protecting them from compromise by malicious actors intent on invading citizens’ privacy or committing crimes.
Along with concerns over safeguarding data is the length of time that collected information will be retained. In a related situation, where governmental authorities built databases of citizens’ vehicle location information via Automated License Plate Recognition (ALPR) technology, privacy advocates raised concerns about data retention (Lynch, 2013). Advocates have already filed lawsuits attempting to make use of this technology more transparent (Electronic Frontier Foundation, 2013). Similarly, privacy advocates will likely be concerned with Memex’s data retention policies.
Conclusion and the Author’s Personal Reflections
Based on currently published public information, DARPA’s Memex Program is a powerful tool built with the best of intentions: to protect national security and to combat cybercrime. As a technical feat, its ability to crawl the vast scope of the Internet, evade paywalls and member-only areas, perform in-depth analytical functions, and extract information is impressive. The breadth of deployment of this tool, which includes law enforcement, district attorneys, and NGOs, introduces the possibility of abuse. Big Data tools that collect citizens’ information will inherently encroach on privacy rights and Constitutional protections. Memex invites discussion of Internet privacy and of whether any and all information posted online is considered public and open for collection or analysis. Regardless, use of a tool like Memex should be governed by strict rules that outline its scope, be subject to additional oversight and transparency, and be properly secured. Additionally, information collected by Memex about citizens should
be accessible under current Freedom of Information Act laws. Memex is not the first, and will not be the last, information tool that challenges citizens’ privacy rights.
References
Bradbury, D. (2014). Unveiling the dark web. Network Security, 14-17.
Chandler, N. (n.d.). How the deep web works. How Stuff Works: Computer. Retrieved from http://computer.howstuffworks.com/internet/basics/how-the-deep-web-works.htm
DARPA. (2014, February 9). Memex aims to create a new paradigm for domain-specific search. DARPA: News Events. Retrieved from http://www.darpa.mil/newsevents/releases/2014/02/09.aspx
DARPA: Information Innovation Office. (2014). Broad Agency Announcement: Memex. Retrieved from http://go.usa.gov/BBc5
Electronic Frontier Foundation. (2013, May 6). EFF and ACLU sue LA law-enforcement agencies over license-plate reader records. EFF Press Release.
Greenemeier, L. (2015, February 8). Human traffickers caught on hidden Internet. Scientific American. Retrieved from http://www.scientificamerican.com/article/human-traffickers-caught-on-hidden-internet/
Lynch, J. (2013, May 6). Automated license plate readers threaten our privacy. EFF DeepLinks.
Mueller, M. (2010). Networks and States: The Global Politics of Internet Governance. Massachusetts Institute of Technology.
Sachan, A. (2012). Countering terrorism through Dark Web analysis. IEEE.
Zetter, K. (2015, February 15). Darpa is developing a search engine for the dark web. Wired. Retrieved from http://www.wired.com/2015/02/darpa-memex-dark-web/
About the author
Author: Christopher Furton
Website: Http://christopher.furton.net
Certified professional with over 12 years of Information Technology experience and 8 years of hands-on leadership. An expert in cyber security with both managerial and technical skills proven throughout a career with increasing responsibility and performance expectations. Known ability to translate complex information for universal understanding. Detail-driven, results-focused leader with superior analytical, multitasking, and communication skills. Well-versed in industry best practices including Project Management and IT Service Management. Currently holding active CISSP, CEH, ITIL Foundations, Security+, and Network+ certifications.
Visit the author’s blog: IT Management Perspectives - https://christopherfurton.wordpress.com/
Social Sphere: LinkedIn, Twitter, Google+, Quora, Wordpress, Flavors.me, Slide Share, Tumblr, YouTube, Pinterest, About.me, Vimeo