University of Glamorgan
Prifysgol Morgannwg
Faculty of Advanced Technology
M.Sc. Project
Careless e-Talk Costs Money; The Risk of Open Source
Intelligence based attacks
Student Name: John Dunne
Student Enrolment Number: 11058374
Primary Project Supervisor: Konstantinos Xynos
Secondary Project Supervisor: Iain Sutherland
Academic Year: 2012/2013
Scheme: Part-time
Faculty of Advanced Technology
STATEMENT OF ORIGINALITY
This is to certify that, except where specific reference is made, the work described
in this project is the result of the investigation carried out by the student, and that
neither this project nor any part of it has been presented, or is currently being
submitted in candidature for any award other than in part for the M.Sc. award,
Faculty of Advanced Technology from the University of Glamorgan.
Signed
Student
Abstract
Deep Web and Open Source Intelligence (OSINT) based attacks are a growing
problem within the UK and worldwide. One of the fastest emerging technological
threats being faced by organisations today is that of losing sensitive or confidential
data, through illegal activities such as data exfiltration and intellectual property
theft.
The purpose of this dissertation is to examine the level of understanding within the
Information Systems (IS) Security community about the threat posed by Deep
Web and OSINT-based attacks, and to consider whether the IS controls that are
currently in place are sufficient to counter the threat.
It also describes, by means of case studies, the potential threat to organisations
and nation states and proposes a method of control through the adoption of an
audit program to measure the exposure to OSINT-based attacks.
Disclaimer
A number of the websites and resources identified in this paper are part of the
Internet that is barely regulated, in places governed by nothing more than the
user's own common sense. Any reader who seeks out these resources
must use all sensible precautions when viewing or downloading content. The
author of this paper bears no responsibility for the actions or consequences of any
reader who utilises the resources identified herein.
Acknowledgements
I would like to thank my employer Grant Thornton UK LLP, and in particular Greg
Swift - National Director of Information Systems, for their support and
encouragement during my studies.
I would also like to thank all of my colleagues in the IS Security profession for
sharing their knowledge and experience in completing the survey that informed this
dissertation. I would especially like to thank Pete Wood of First Base Technologies
and James Chappell of Digital Shadows for their invaluable guidance and
knowledge.
In addition I would also like to thank my tutors and my project supervisors
Konstantinos Xynos and Iain Sutherland for their help and guidance.
And finally I would like to thank my beautiful wife Helen and my children Zachary
and Grace for their patience, love and support and my sister Clare for her priceless
advice.
Contents
Glossary 7
Introduction 8
Aims and Objectives 8
Section 1 – The Deep Web 10
1.1 What is the Deep Web? 10
1.2 How big is the Deep Web? 12
1.3 What kind of information is available in the Deep Web? 14
1.4 Accessing the information on the Deep Web 17
1.5 The Deep Web and Hacktivism 19
1.6 Extraction of data from the Deep Web 19
Section 2 – Open Source Intelligence 26
2.1 What is Open Source Intelligence? (OSINT) 26
2.2 How can OSINT Information be aggregated? 29
2.3 OSINT versus Big Data 30
Section 3 – Gaining Unauthorised Access 31
3.1 The use of "Spear Phishing" and other hacking techniques 31
3.2 Case Studies 34
Section 4 – Survey Design, Data Collection and Analysis 40
4.1 Selection and justification of the data collection tool 40
4.2 Survey Design 43
4.3 Use of the ISO 27001 standard 45
4.4 Analysis of survey results 47
Section 5 – Conclusions and Recommendations 59
5.1 Conclusions 59
5.2 Recommendations 61
5.3 Application of Recommendations to Case Studies 63
5.4 Summation 66
5.5 Suggestions for Future Research 67
References 68
Appendix 1 – Example Audit Program for Evaluating and Organising Controls
to prevent OSINT–type attacks on an Organisation 71
Appendix 2 – Copy of MSc proposal 74
Appendix 3 – Survey Questionnaire 77
Appendix 4 – Survey Results 90
Appendix 5 – MSc Project Logbook 123
Appendix 6 – Electronic copy of Dissertation 132
Table 1: Top 10 search engines according to Alexa Internet analytics ..................15
Table 2: QFD chart for selection of data collection tool..........................................43
Figure 1: Screenshot of the front screen of TOR Web browser .............................17
Figure 2: Screenshot of the login page for the Silk Road .......................................18
Figure 3: Screenshot of the results returned from the Maltego software................21
Figure 4: Screenshot of Copernic Desktop Agent (CDA) software.........................22
Figure 5: Screenshot of the results returned from the FOCA application ...............23
Figure 6: Screenshot of the Shodan Internet engine..............................................24
Figure 7: Return on Security Investment equation ..................................................64
Glossary
APT
Advanced Persistent Threat – A complex, determined
and well-resourced attack on the infrastructure of large
organisations or nation states, typically orchestrated by
another nation state.
Controls Framework
A holistic term used in this paper to describe the
collection of logical, physical and managerial IS
Security controls that protect an organisation.
Deep Web
The databases and collected resources that make up
the vast, hidden part of the Internet not normally
accessed by "everyday" users.
Footprint(ing)
The practice of researching a target to establish attack
vectors or weaknesses within the infrastructure or
online presence that can be exploited. Differs from
OSINT in that it incorporates techniques such as port
scans to identify technical weaknesses.
Grey Literature
Published material that is not indexed and often lacks
data about the publisher.
IPR
Intellectual Property Rights. A legal term referring to
creations of the mind for which ownership is recognised.
Keylogger
Software program that, when installed on a machine
(often maliciously), records all of the keys struck on a
keyboard. The information can then be overtly or
covertly collected by the person who installed the
software.
OSINT
Open Source Intelligence – The practice of searching
for, collating and analysing information on an intended
target.
Spear Phishing
The practice of sending emails that contain malicious
Trojans and viruses to high profile individuals in order to
gain unauthorised access to their network. The emails
usually contain some sort of enticement or incentive.
Surface Web
The webpages and other internet resources used by
the majority of users.
Whaling
A specific form of 'phishing' or 'spear phishing' that
targets upper managers in private companies and
usually contains some sort of pseudo-legal or corporate
instruction designed to galvanise the recipient into
action.
Introduction
Within the Internet there is a flood of information on individuals and
organisations that is available to those with the knowledge of how to access it.
This data goes beyond what is visible on the "Surface Web" and can be extracted,
aggregated and analysed to provide extremely comprehensive profiles for research
or informing sophisticated cyber attacks. The data is held in what is commonly
referred to as the "Deep Web" and the practice of extracting data from it is
known as Open Source Intelligence gathering, or OSINT.
Aims and Objectives
The aim of this dissertation is to evaluate whether the current level of IS Security
controls is sufficient to deter Deep Web and OSINT-based attacks and to make
recommendations for additional controls as required.
The objectives of this dissertation are:
1. To determine the current level of understanding about Deep Web and
OSINT-based attacks within the IS Security community;
2. To determine the effectiveness of the IS Security controls currently in place
in deterring OSINT-based attacks;
3. To make recommendations as appropriate for additional logical, physical or
managerial controls to guard against such attacks.
In order to effectively meet the aims and objectives of this paper it has been
structured as follows:
The first section describes the Deep Web, its size and the type of data stored
within its databases. It also describes some of the common tools for accessing
and extracting the data.
The second section provides an overview of the techniques of OSINT and
describes how the aggregation of such data can be used to build a picture of the
target for subsequent use.
The third section outlines some of the techniques that are used in conjunction with
the OSINT data to attack organisations and individuals (e.g. spear phishing). This
section also contains two case studies that demonstrate how Deep Web and
OSINT-based attacks are perpetrated.
The fourth section describes the research methodology applied to the collection of
data that informs this dissertation, describes the survey design and presents an
analysis of the collated results.
The final section contains the conclusions drawn from the survey results, the
recommendations arising from the conclusions and suggestions for subsequent
study to build on the work undertaken in this dissertation.
Section 1 – The Deep Web
1.1 What is the Deep Web?
Personal and corporate data has never been more available than it is today. With
the exponential growth in corporate websites, personal web spaces, social media
sites (e.g. Facebook, Bebo, Twitter, Myspace) and image sharing sites (e.g.
Picasa, Flickr) there is a wide range of personal information available on the
Internet. Add to this the innumerable other data sources such as business forums
(e.g. LinkedIn), special interest forums, blogs, videos, news feeds and databases,
and the Internet is bloated with a plethora of information. However, this is barely
the tip of the iceberg.
Behind this "Surface Web" of websites and search engines there lies a multitude of
databases that dwarf the part of the internet that the typical user will browse. This
is the Deep Web.
The term "Deep Web" was initially coined by Mike Bergman, the founder of the
internet research company Brightplanet™. The term refers to the myriad of
databases and other data sources that sit behind the websites that make up the
"Surface Web", that is the websites typically accessed by the majority of users.
In order to differentiate between the two entities this paper uses the term "Surface
Web" to define the internet resources referenced by proprietary search engines
and used by the majority of internet users, and "Deep Web" to refer to the unlinked
databases that are not normally included within surface searches.
By searching the Deep Web for items of information on an organisation or
individual it is possible to build up a comprehensive picture of them and utilise it for
a number of purposes, such as corporate intelligence gathering and targeted
marketing. This practice is known as Open Source Intelligence, or OSINT,
gathering. However, the data can also be used for more nefarious purposes such
as committing identity fraud, informing "spear phishing" attacks and stealing or
contravening Intellectual Property Rights (IPR).
In their seminal paper "Accessing the Deep Web", the authors describe the Deep
Web thus:
"The Web has been rapidly 'deepened' by massive databases online and
current search engines do not reach most of the data on the Internet. While
the surface Web has linked billions of static HTML pages, a far more
significant amount of information is believed to be 'hidden' in the deep Web,
behind the query forms of searchable databases…. Such information may
not be accessible through static URL links because they are assembled into
Web pages as responses to queries submitted through the query interface
of an underlying database. Because current search engines cannot
effectively crawl databases, such data remains largely hidden from users
(thus often also referred to as the invisible or hidden Web)." (He, Patel,
Zhang, & Chang, 2007)
On the Surface Web, applications known as "Web Crawlers"1 operate behind
search engines to find the information that is being searched for. When a term or
search string is entered into the search engine, the web crawler software scans the
internet for relevant hyperlinks and metadata within websites' code and content.
The software maps the structure of the website, records and describes all the links,
and the search engine presents the results for the user to see. However, websites
that do not have an index or system of visible links, or that contain data sets
outside of normal Hyper Text Mark-up Language (HTML) code, are unlikely to be
scanned or detected by a web crawler and are therefore considered un-indexed.
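The indexing behaviour described above can be sketched in a few lines of Python: a crawler's parser harvests hyperlinks from a page's HTML and can follow them, but a query form gives it nothing to follow, which is precisely how form-backed databases stay "deep". The page below is an invented example, not a real site.

```python
from html.parser import HTMLParser

class LinkExtractor(HTMLParser):
    """Collects hyperlinks the way a crawler's HTML parser would."""
    def __init__(self):
        super().__init__()
        self.links = []
        self.forms = 0

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)  # followable by the crawler
        elif tag == "form":
            # Content behind a query form is invisible to the crawler:
            # it sees the form but cannot enumerate its result pages.
            self.forms += 1

page = """
<html><body>
  <a href="/about.html">About</a>
  <a href="https://example.org/news">News</a>
  <form action="/search"><input name="q"></form>
</body></html>
"""

parser = LinkExtractor()
parser.feed(page)
print(parser.links)  # pages the crawler can reach and index
print(parser.forms)  # database entry points it cannot see behind
```

Everything reachable through the `href` attributes ends up in the index; the database behind the `/search` form does not, matching the definition of un-indexed content above.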
This is the structure of information within the Deep Web. In essence it is
comprised of Internet resources that are not displayed via searches using
conventional search engines. Even though the vast majority of information is
publicly accessible it does not appear on search engines or, if it does, the
relevance rating is so low that the result is unlikely to be viewed by the enquirer. In
the 2008 paper "Google's Deep Web crawl", the authors recognise the difficulty of
extracting data from the Deep Web thus:
"The Deep Web, i.e., content hidden behind HTML forms, has long been
acknowledged as a significant gap in search engine coverage. Since it
represents a large portion of the structured data on the Web, accessing
Deep-Web content has been a long-standing challenge for the database
community." (Madhavan, Ko, Kot, Ganapathy, Rasmussen, & Halevy, 2008)
1 Web Crawler – An automated piece of software that browses the World Wide Web, typically for the
purpose of Web indexing
1.2 How big is the Deep Web?
Estimates on the size of the Deep Web vary. In 2001 the paper "The Deep Web,
Surfacing Hidden Value" made the following assertions about the size of the Deep
Web:

• Public information on the deep Web is currently 400 to 550 times larger than
the commonly defined World Wide Web.
• The deep Web contains 7,500 terabytes of information compared to
nineteen terabytes of information in the surface Web.
• The deep Web contains nearly 550 billion individual documents compared to
the one billion of the surface Web.
• More than 200,000 deep Web sites presently exist.
• Sixty of the largest deep-Web sites collectively contain about 750 terabytes
of information — sufficient by themselves to exceed the size of the surface
Web forty times.
• On average, deep Web sites receive fifty per cent greater monthly traffic
than surface sites and are more highly linked to than surface sites; however,
the typical (median) deep Web site is not well known to the Internet-
searching public.
• The deep Web is the largest growing category of new information on the
Internet.
• Deep Web sites tend to be narrower, with deeper content, than conventional
surface sites.
• Total quality content of the deep Web is 1,000 to 2,000 times greater than
that of the surface Web.
• Deep Web content is highly relevant to every information need, market, and
domain.
• More than half of the deep Web content resides in topic-specific databases.
• A full ninety-five per cent of the deep Web is publicly accessible information
— not subject to fees or subscriptions. (Bergman, 2001)
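Bergman's headline multiples can be sanity-checked against his own raw figures; the short calculation below uses only the numbers quoted in the list above.

```python
# Raw figures quoted from Bergman (2001)
deep_tb, surface_tb = 7_500, 19          # terabytes of information
deep_docs, surface_docs = 550e9, 1e9     # individual documents

print(round(deep_tb / surface_tb))       # ~395, near the lower "400x" bound
print(round(deep_docs / surface_docs))   # 550, the upper "550x" bound
```

The terabyte figures alone give a multiple of roughly 395, while the document counts give exactly 550, which is broadly consistent with the "400 to 550 times larger" claim.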
In 2007 a more accurate analysis was used to calculate the extent of the Deep
Web. The calculation confirmed the previous estimate that the Deep Web is
considerably larger than the Surface Web:
Using overlap analysis between pairs of search engines, it was estimated
that 43,000–96,000 "deep Web sites" and an informal estimate of 7,500
terabytes of data exist—500 times larger than the surface Web. (He, Patel,
Zhang, & Chang, 2007)
More recent papers suggest that, now that the size of the visible Internet has
expanded, the gap between the two realms has reduced. However, in its present
state, the Deep Web still dwarfs the Surface Web:

Today’s internet is significantly bigger with an estimated 555 million
domains, each containing thousands or millions of unique web pages. As
the web continues to grow, so too will the Deep Web and the value attained
from Deep Web content. (Pederson, 2013)
1.3 What kind of information is available in the Deep Web?
There is a vast repository of information on the Deep Web. It is comprised of data
from a number of sources, such as:

• Research databases such as Wikipedia, EBSCOhost2 and JSTOR3.
Databases such as these have their own search engines that can extract
data from the database tables. However, the user must be able to access
the website and use its search function, rather than conducting a
string-specific search using tools from outside the website. Alternatively, the
database tables may contain data in formats not supported by the web
browser software. Additionally, the majority of website search engines
maintain archive databases of popular searches and analytical information
within their cache that is available online. According to the Alexa Internet
analytics website, the top ten Internet search engines are shown below
(position in world rankings according to Alexa monitoring software shown in
brackets):

#    Site            Rank according to Alexa    Host Country
1    Google          2                          USA
2    Yahoo           4                          USA
3    Baidu           5                          China
4    QQ              8                          China
5    Windows Live    9                          USA
6    Google India    12                         USA
7    Yahoo! Japan    15                         USA
8    Bing            16                         USA
9    Sina.com.cn     17                         China
10   Yandex.ru       18                         Russia

Table 1: Top 10 search engines according to Alexa Internet analytics (Top Sites, 2013)

2 EBSCOhost is a powerful online reference system accessible via the Internet. It offers a variety of
proprietary full text databases and popular databases from leading information providers.
(EBSCOhost, 2013)
3 JSTOR is a digital library of more than 1,500 academic journals, books, and primary sources.
(JSTOR, 2013)

• Dynamic content – results generated by the use of search boxes and forms
on websites (e.g. the results of a search on Google, Yahoo, Bing or
Facebook that are archived by the service providers);

• Unlinked content – content that has no public link pointing to it. For
example, an individual may set up their own internet site and upload
documents such as marketing materials or their resume. Whilst this content
is available to all, it will remain hidden until the person sends the requester a
link to the website;

• Private Web – websites that require authentication via a username and
password, such as web-based mail accounts or data storage (e.g. cloud
data storage);

• Real-time content – live streams and feeds that by their nature cannot be
indexed, although they may be archived;

• Contextual web – web resources that are restricted by IP address, limiting
access based on the user's location; for example, YouTube limiting content
by region or the Great Firewall of China;

• Limited access content – web resources which are invisible to internet users
until the link is sent to them. For example, private business webs such as
the UK Business Forums website (UK Business Forums, 2013) or mail
groups such as Yahoo mail or Googlemail;

• Non-HTML content – textual content encoded in multimedia formats or
specific file formats not handled by search engines.

Some of the most revealing information cannot be found using proprietary search
engines, but can be readily retrieved with freely available web tools.
1.4 Accessing the information on the Deep Web
In recent years Google™, the Internet's largest search engine provider [according
to the Internet analytics website Alexa (Top Sites, 2013)], has recognised the need
for accessing and indexing the content of the Deep Web. They have published a
series of papers identifying different mechanisms by which the Deep Web can be
accessed through the Google search engine. However, there are still inherent
issues with this type of search because it relies on searching for specific
elements, such as entities on shopping sites, rather than an all-encompassing
search that identifies all of the items that relate to the search string.
The most widely known software for accessing the Deep Web is The Onion Router,
also known as TOR (Internet Defense League, 2013). TOR is an open source
browser (fig.1) that protects the user's anonymity by relaying the internet packets
through a private network of routers, avoiding the proprietary search engine
routers. Part of the TOR network are so called "onion" sites, that is, sites that
have the suffix .onion as the (pseudo) Top-Level Domain4
(TLD) hostname. Such
addresses are not actual Domain Name System (DNS) names, and the .onion TLD
is not in the Internet DNS root, but with the appropriate proxy software installed,
Internet programs such as the TOR web browser can access sites with .onion
addresses by sending the request through the network of TOR servers.
Figure 1: Screenshot of the front screen of TOR Web browser
4 Top Level Domain – the highest level of architecture in the hierarchical Domain Name System of
the Internet, typically reserved for countries or sectors (e.g. .com, .co.uk, .gov, .ac.uk)
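As a sketch of this proxy mechanism, the snippet below configures a requests session to route traffic through a locally running TOR daemon. It assumes TOR's default SOCKS port (9050) and the third-party requests library with its SOCKS extra installed; the "socks5h" scheme (rather than "socks5") delegates hostname resolution to TOR itself, which is what allows .onion names to resolve at all.

```python
# Assumed: a local TOR daemon on its default SOCKS port (9050) and
# 'requests[socks]' installed. 'socks5h' makes TOR resolve hostnames,
# which is required for .onion addresses.
TOR_PROXIES = {
    "http": "socks5h://127.0.0.1:9050",
    "https": "socks5h://127.0.0.1:9050",
}

def tor_session():
    """Return a requests Session whose traffic is relayed through TOR."""
    import requests  # imported here so the sketch loads without the dependency
    session = requests.Session()
    session.proxies = dict(TOR_PROXIES)
    return session

# Usage (needs TOR running; the onion address would be supplied by the user):
# response = tor_session().get("http://<some-onion-address>.onion/")
```

An ordinary DNS lookup of a .onion name fails by design; only because the proxy forwards the unresolved hostname into the TOR network can the request ever reach the hidden service.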
One of the more infamous areas of the Deep Web that can be accessed using
TOR is that of the Silk Road. The Silk Road (fig.2) is an e-marketplace that has
proliferated in the last few years, known primarily for the buying and selling of
illegal drugs. The website administrators claim to be trustworthy and only allow
genuine (if illegal in some countries) transactions. In order to facilitate a
transaction on the Silk Road the user must first create an account on the website
and be in possession of a bitcoin5
account to enable the transfer of funds to the
appropriate vendor. Other Deep Web sites include the Hidden Wiki, a website that
offers resources such as hacking, forgery, weaponry and other nefarious activities.
Figure 2: Screenshot of the login page for the Silk Road
5 Bitcoin – a decentralised digital currency used to make and receive payments electronically.
1.5 The Deep Web and Hacktivism
One of the primary users of the Deep Web is the hacktivist6, who uses its
resources and anonymity to promote a political ideology. Hacktivists use the Deep
Web to attack systems and architectures with legal and illegal tools in order to
manifest their dissent. Techniques include activities such as denial-of-service
attacks, data breaches and website defacement, as well as other methods of
digital sabotage. Hacktivists undertake their operations in the belief that they can
effect the kind of change that traditional forms of protest (e.g. civil disobedience)
produce. On the IS Security Forum website the author identifies two types of
hacktivism:
"[There are] two different participative approaches to the Dark Web. The
hacktivist, in fact, could surf in the hidden space for information gathering
purposes, the 'passive mode', and also in 'active mode' conducting cyber
operations similar to ones promoted in the ordinary web." (Paganini, 2012)
Typically the activities of the "passive" hacktivist outweigh those of the "active"
hacktivist. However, both draw from the same information source, which is rich in
information, as the author continues:
"The deep web is an ocean of information, and to find the right way in this
world on the first approach may seem very complicated, but with good will
and some evidence from the earliest voyages it is possible to obtain
satisfactory results" (Paganini, 2012)
1.6 Extraction of data from the Deep Web
Whilst it is possible to use TOR and other tools to find data within the Deep Web,
the user must utilise other pieces of software to extract and aggregate it. In the
article "How to Hack a Nation's Infrastructure" the author commented:

"Security experts are finding lots of holes in the software they run that, in the
hands of a skilled attacker, can be exploited to grant unauthorised access."
(Ward, 2013)

Tools such as Maltego, Copernic, FOCA and Shodan can be used to extract and
aggregate information from the Internet to build a comprehensive picture of the
target. Whilst they can extract HTML, XML and other web-based data, it is
6 Hacktivist – a person who makes use of hacking tools and techniques to promote a political agenda.
Examples of this practice are organisations such as Lulzsec and Anonymous.
also possible to use resources such as media feeds, geo-location and archive data
to complete the picture of the target.
Maltego
Maltego is an open source intelligence and forensics application, developed and
sold by Paterva. It is capable of the in-depth mining and collation of information
which can be presented in several different formats. Maltego is also capable of
identifying and displaying the complex relationships between the identified target
and its subsidiary connections. For example, in figure 3 the software has
identified all the subdomains of the URL7 http://www.glam.ac.uk.
This was a relatively simple exercise for the software but it can also undertake
more complex tasks, such as identifying connections to an email address, IP
address or network entity. It can also cross over media platforms and find
connections to other media streams, such as Twitter or Facebook.
Figure 3: Screenshot of the results returned from the Maltego software
In addition Maltego allows the creation of custom entities, allowing it to represent
any type of information as well as the basic entity types which are part of the
software. The basic focus of the application is analysing real-world relationships
between people, groups, websites, domains, networks, internet infrastructure, and
affiliations with online services such as Twitter and Facebook.
7 URL – Uniform Resource Locator: A specific character string that represents a reference to the
location of Internet resources such as web pages.
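One of Maltego's simpler "transforms", discovering the hosts beneath a domain, can be approximated by brute-forcing a wordlist of common host names against DNS. The sketch below is a toy version of that idea, not Maltego's actual implementation: the resolver is injectable so the example runs against stubbed data rather than live DNS, and the host names and addresses shown are invented.

```python
import socket

def find_subdomains(domain, candidates, resolve=None):
    """Return the candidate subdomains of `domain` that resolve in DNS."""
    if resolve is None:
        # socket.gaierror (an OSError subclass) is raised on lookup misses
        resolve = socket.gethostbyname
    found = []
    for name in candidates:
        host = f"{name}.{domain}"
        try:
            resolve(host)
        except OSError:
            continue  # host does not exist
        found.append(host)
    return found

# Stub resolver standing in for live DNS (invented data):
def fake_resolve(host):
    known = {"www.glam.ac.uk": "1.2.3.4", "mail.glam.ac.uk": "1.2.3.5"}
    if host not in known:
        raise OSError("no such host")
    return known[host]

print(find_subdomains("glam.ac.uk", ["www", "mail", "vpn"], fake_resolve))
```

Maltego goes much further, chaining many such transforms and linking the results across email, social media and network entities, but each individual transform is conceptually as simple as this.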
Copernic
Copernic is a commercial metasearch utility that operates on the Windows family of
operating systems. The company offers a range of products that have been
designed to operate in different environments. For example, once installed the
Copernic Desktop Agent (CDA) software can search for all files relating to a
specific search string on either a host machine or network and display them by
category. (See figure 4 for an example of the search string glam.ac.uk on the
author's computer.)
Figure 4: Screenshot of Copernic Desktop Agent (CDA) software
Copernic Agent is the web-based version of the software, which can search for
Internet content and display the results by category and relevance. The software
also includes filters in order to clarify the results returned and, combined with
Copernic Tracker (a utility that checks for changes of content on defined pages and
alerts the user) can be used to build and maintain comprehensive target profiles.
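The category grouping that CDA performs over its search results can be illustrated with a minimal sketch. The category map and file names below are invented for the example and are not Copernic's actual taxonomy.

```python
from pathlib import PurePath

# Hypothetical extension-to-category map, in the spirit of CDA's result view
CATEGORIES = {
    ".doc": "Documents", ".docx": "Documents", ".pdf": "Documents",
    ".eml": "Emails", ".msg": "Emails",
    ".jpg": "Pictures", ".png": "Pictures",
}

def categorise(paths, term):
    """Group file paths whose names contain `term`, grouped by category."""
    results = {}
    for path in paths:
        name = PurePath(path)
        if term.lower() in name.name.lower():
            category = CATEGORIES.get(name.suffix.lower(), "Other")
            results.setdefault(category, []).append(path)
    return results

hits = categorise(
    ["notes/glam.ac.uk-survey.docx", "mail/glam-results.eml", "todo.txt"],
    "glam",
)
print(hits)
```

A real desktop search tool indexes file contents as well as names, but the categorised presentation of matches is the same idea.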
FOCA
FOCA is a freeware tool capable of undertaking foot-printing (the defining of the
security profile of an organisation such as user accounts, corporate officers and
email addresses) and fingerprinting (the defining of an organisation's operating
system and other technical information) in order to inform a security audit or attack.
Foot-printing and fingerprinting are methodical stages of a security sweep that also
includes scanning and enumeration (defining which ports and services are open
and available).
The freeware version performs searches on servers, domains, URLs and
documents published on the web. In figure 5 it is possible to see elements of the
search conducted on the same glam.ac.uk domain as used in the Maltego
example.
Figure 5: Screenshot of the results returned from the FOCA application
When used as part of a Deep Web or OSINT-based attack, the information
gathered using this tool would inform a cyber attack such as Network or Technical
Intrusion that is described in the next chapter.
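The kind of leakage FOCA exploits can be seen in the metadata that office documents carry. A .docx file is a ZIP archive whose docProps/core.xml part records, among other things, who created and who last saved the file. The XML fragment below is invented sample data; a real document would first be opened with zipfile to reach this part.

```python
import xml.etree.ElementTree as ET

# Invented docProps/core.xml fragment of the kind FOCA mines for user names
CORE_XML = """<?xml version="1.0"?>
<cp:coreProperties
    xmlns:cp="http://schemas.openxmlformats.org/package/2006/metadata/core-properties"
    xmlns:dc="http://purl.org/dc/elements/1.1/">
  <dc:creator>j.smith</dc:creator>
  <cp:lastModifiedBy>GLAM\\j.smith</cp:lastModifiedBy>
</cp:coreProperties>"""

NS = {
    "cp": "http://schemas.openxmlformats.org/package/2006/metadata/core-properties",
    "dc": "http://purl.org/dc/elements/1.1/",
}

root = ET.fromstring(CORE_XML)
creator = root.findtext("dc:creator", namespaces=NS)
modifier = root.findtext("cp:lastModifiedBy", namespaces=NS)
print(creator, modifier)  # usernames leaked by an innocuous published file
```

A single document leaks one username; harvested across every file an organisation has published, this metadata yields the account names and internal naming conventions that foot-printing seeks.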
Shodan
Shodan is a search engine designed to find devices and computer systems that
are connected to the World Wide Web. It was launched in 2009 by a computer
programmer who had conceived the idea of searching for devices linked to the
Internet. Shodan users are able to find control systems for large organisations
such as amusement parks, petrol stations and utility plants, as well as low level
systems such as security cameras, home heating systems and traffic lights. It is
possible to access these systems because, in many cases, the default password
has not been changed and the only software required to connect to them is a web
browser. The website searches the Internet for publicly accessible devices,
concentrating on Supervisory Control And Data Acquisition (SCADA) systems, like
those used to operate major installations. It is possible to use Shodan without
creating an account, but the search results are limited.
The website suggests that its primary users are cyber security professionals,
researchers and law enforcement agencies. Whilst it has been argued that
cybercriminals could also use the website, in reality many would typically have
access to botnets that could accomplish the same task without detection.
Figure 6: Screenshot of the Shodan Internet engine
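Shodan's query language combines free text with filter:value pairs such as port:, country: and org:. The helper below simply composes such a query string; the filter names are real Shodan syntax, but the values and the commented-out API call are illustrative, and a live search requires an account and API key.

```python
def build_query(text="", **filters):
    """Compose a Shodan-style query from free text and filter:value pairs."""
    parts = [text] if text else []
    parts += [f"{key}:{value}" for key, value in sorted(filters.items())]
    return " ".join(parts)

# port 502 is the conventional Modbus port used by many SCADA devices
query = build_query("scada", country="GB", port=502)
print(query)  # scada country:GB port:502

# With the official client (needs an API key and network access):
# import shodan
# results = shodan.Shodan(API_KEY).search(query)
```

A query like this returns the exposed, often default-credentialed, devices described above, which is exactly why the same search serves both the auditor and the attacker.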
Attackers will also use social media sites and other media platforms such as
Picasa™ or Flickr™ to add a geolocation dimension to their attack vectors. In the
2011 article "In Plain View - Open Source Intelligence" the author details an
example where a fellow journalist utilised available web information to identify and
research a complete stranger;
He [the journalist] saw her taking pictures with an iPhone 3G in a San
Francisco Park. Searching on Flickr that night, he found the picture that she
had taken, and was quickly able to work out where she lived and what her
apartment looked like, simply by examining her photo stream. (Bradbury,
April 2011)
Commercial Deep Web harvesting
Companies such as Bright Planet and Digital Shadows and independent IS
Security professionals use these applications, and their own specially developed
tools and techniques, to develop an internet profile of their client which can then be
used to inform the risk register.
Now that the location of the information is known and the tools are available to
collect it, the information can be harvested, aggregated and analysed using OSINT
techniques, in order to develop a profile of the organisation. From there, the most
effective method of attack can be determined.
Section 2 – Open Source Intelligence
2.1 What is Open Source Intelligence? (OSINT)
Now that the scope of the data sources has been identified as wider than the
snippets available on the Surface Web, the next stage is to extract and develop
them into a comprehensive picture of the target. An individual data item has an
intrinsic value but, once it is combined with other data items and their
relationships analysed, that value increases exponentially.
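The exponential gain from combining items can be shown with a toy aggregation: three fragments, each of little value alone, are linked by their shared fields into a single profile. All of the data below is invented for the example.

```python
from collections import defaultdict

# Three low-value fragments from different (invented) sources
sources = [
    {"source": "forum post", "handle": "jd_74", "email": "jd@example.org"},
    {"source": "photo site", "handle": "jd_74", "location": "Cardiff"},
    {"source": "CV site", "email": "jd@example.org", "employer": "Acme Ltd"},
]

# Merge the fragments: shared fields (handle, email) link the records
profile = defaultdict(set)
for record in sources:
    for field, value in record.items():
        if field != "source":
            profile[field].add(value)

print({field: sorted(values) for field, values in profile.items()})
```

No single source revealed a name, location and employer together, yet the merged profile contains all three: the aggregation, not any individual datum, creates the intelligence value.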
OSINT is the practice of searching for and aggregating information on a target
organisation or individual in order to develop a profile of them. In the 2008 paper
"Using open source data in developing competitive and marketing intelligence" the
author suggests the following definition:
Open source intelligence, more commonly known as OSINT, is an
information processing discipline. More specifically for the purposes of this
paper, it is defined as the scanning, finding, gathering, exploitation,
validation, analysis, and sharing with intelligence-seeking clients of publicly
available print and digital/electronic data from unclassified, non-secret, and
"grey literature" sources. Grey literature is published material that is not
indexed and often lacks data about the publisher. (Fleisher, 2008)
True OSINT is an intelligence-level activity conducted by nation states in order to
inform the threat analysis against themselves or their enemies. In the paper "Can
Open Source Intelligence emerge as an indispensable discipline for the Intelligence
community in the 21st Century?" presented to the Research Institute for European
and American Studies (RIEAS), the author argues:
OSINT offer governments what they cannot get from their close networks
and they have the capability of choosing what they need from the variety of
products available. This indicates that OSINT has opened a new window to
government agencies and departments since they can fill the gaps that
emerge in their analyses by including the added value of open sources.
(Minas, 2010)
However, OSINT is being used by a wider audience than just the intelligence
services of nation states; it is also being used by people every day to resolve
problems.
In the paper "Intelligence in the internet age: The emergence and evolution of
Open Source Intelligence" the authors argue that the Internet is changing the way
problems are resolved by changing the dynamic between "crystallised"⁸ and "fluid"⁹
intelligence. By sharing problems, solutions and information in ever greater
quantities, people are populating the Internet with more and more information
(Glassman & Kang, 2012). Whilst this undoubtedly has benefits for the growth and
development of human understanding, it is also creating a huge database of
targets and their vulnerabilities for those of less noble intent to exploit. Criminals
could potentially search for targets with specific vulnerabilities and target them with
"false flag" solutions (e.g. a piece of software purporting to be a patch but is in
reality a Trojan or malware).
In the commercial world the use of OSINT has been practised for a number of
years and is referred to as Competitive and Marketing Intelligence (C/MI). C/MI
can be described as the:
systematic, targeted, timely and ethical effort to collect, synthesize, and
analyse competition, markets and the external environment in order to
produce actionable insights for decision-makers. (Fleisher, 2008)
The rationale and benefit for conducting C/MI is that it should better inform the
practitioner with regard to the current market and their competitors' abilities, which
in turn should underpin better decision making and lead to enhanced
economic/financial performance (Fahey, 2007).
The difference between C/MI and OSINT is that C/MI is restricted to published
information such as public accounts and marketing material, or contacts within
competitors prepared to share information. OSINT, however, makes use of the
"grey literature" resources available to provide the researcher with a greater
volume of raw material.
8 Crystallised intelligence – human behaviour that has been developed through experience and is
part of normal behaviour.
9 Fluid Intelligence – experience that is gained as the individual meets new challenges and resolves
problems. It is updated to reflect changes in the situation.
Technical examples of grey literature include technical reports, working papers or
patents. Social or personal grey literature could be from blogs, social media posts,
online biographies or website entries of business forums. OSINT also includes
information from those resources that may not be immediately visible to the
average user, e.g. the Deep Web.
2.2 How can OSINT Information be aggregated?
The easiest way to answer this question is through example. An attacker identifies
a target (either for commercial or political gain) but only knows limited information
about them, perhaps their email address, but not much else. Using Maltego they
discover that the target has entered their email address on a number of classic car
web sites as part of their mailing lists. Using the geo-location facility they identify a
number of tweets and social media entries that have been posted from various
classic car rallies, parties, adverts selling cars and other events.
Using image recognition software in combination with Copernic, they initiate a
search on the target's face, which returns a number of locations and events where
the target is positively identified. The grouping and rationalisation utilities allow the
attacker to establish the favoured locations of the target, which may lead to the
identification of their home address. The software may also identify posts or
websites that display pieces of personal information such as friends, pets, schools
attended, possibly even parents' names – all information that is typically used as
answers to security verification questions, such as those used by banks or other
institutions when a password change request is received. From this example it is
clear that a large amount of information can be aggregated in a very short space of
time. The cybercriminal will then use the information gleaned to inform an attack on
the target using one, several or all of the techniques identified in the next
chapter.
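The aggregation step described above can be sketched in a few lines of code. The following is a minimal, hypothetical illustration only: the records, sources and locations are invented, and no real tool's API (Maltego, Copernic) is used. Each finding is keyed by location, and locations corroborated by several independent sources rise to the top of the profile.

```python
from collections import Counter

# Invented findings for illustration: (source, location, detail) records
# that different tools might recover about the same target.
findings = [
    ("mailing-list", "Oxford", "email registered on classic-car forum"),
    ("geo-tweet",    "Oxford", "posted from classic car rally"),
    ("geo-tweet",    "Brighton", "posted from a party"),
    ("image-match",  "Oxford", "face identified in rally photo"),
    ("image-match",  "Oxford", "face identified outside a house"),
]

def build_profile(findings):
    """Aggregate per-location evidence and rank the target's favoured locations."""
    by_location = {}
    for source, location, detail in findings:
        by_location.setdefault(location, []).append((source, detail))
    # A location seen across several independent findings is the strongest lead.
    ranking = Counter({loc: len(items) for loc, items in by_location.items()})
    return by_location, ranking.most_common()

profile, ranked = build_profile(findings)
print(ranked[0][0])  # the most frequently corroborated location
```

The point of the sketch is the relationship analysis: no single record reveals the home address, but the frequency with which independent sources converge on one location does.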
2.3 OSINT versus Big Data
During the research phase of this paper it was noted that some authors had
confused OSINT with Big Data. For clarity, the practice of Big Data has been
described here and compared to OSINT.
Gartner describes the term "big data" as
"…high-volume, high-velocity and high-variety information assets that
demand cost-effective, innovative forms of information processing for
enhanced insight and decision making." (Gartner, 2013)
Some fields of study, e.g. meteorology or the biomedical study of genomics, can
involve the collection of Petabytes¹⁰ or Exabytes¹¹ of data. As more and more
data is collected, the tools and processes that have traditionally been used to
analyse it have become ineffective, thus requiring more innovative solutions.
Typically Big Data is processed using parallel processing techniques whereby the
data is divided between several hundred computers all analysing a portion of the
whole data set. The results from each are collated and amalgamated for analysis.
This is not the same as OSINT as it involves wholesale data analysis of a collated
set of results, rather than the targeted search for information, its amalgamation and
analysis.
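The divide-and-collate pattern described above can be illustrated with a toy sketch. The data set and worker count here are invented; a real Big Data system would distribute the portions across hundreds of machines rather than within a single process, but the shape is the same: divide, analyse each portion independently, then amalgamate the partial results.

```python
def chunk(data, n_workers):
    """Divide the data set into roughly equal portions, one per worker."""
    size = (len(data) + n_workers - 1) // n_workers
    return [data[i:i + size] for i in range(0, len(data), size)]

def analyse_portion(portion):
    """Each worker analyses only its portion (here: a toy sum and count)."""
    return sum(portion), len(portion)

def collate(partials):
    """Amalgamate the partial results for final analysis (here: a mean)."""
    total = sum(s for s, _ in partials)
    count = sum(c for _, c in partials)
    return total / count

readings = list(range(1, 101))          # stand-in for a huge data set
partials = [analyse_portion(p) for p in chunk(readings, 4)]
mean = collate(partials)
```

Note that `analyse_portion` never sees the whole data set, which is what makes the pattern scale when the data no longer fits on one machine.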
10 Petabyte = a multiple of the unit byte for digital information. 1 petabyte = 1000 terabytes
11 Exabyte = similarly a unit of measurement for digital information. 1 Exabyte = 1000 petabytes
Section 3 – Gaining Unauthorised Access
3.1 The use of "Spear Phishing" and other hacking techniques
In order to make use of the information gleaned from the Deep Web and OSINT
activities, the cybercriminal needs to combine it with hacking techniques, such as
the ones identified below, in order to gain access to the target organisation. Such
techniques were traditionally the preserve of those persons who had sufficient
technical knowledge to develop the software used to trick their intended victims.
However, with the increase of videos and tutorials on how to develop these
techniques on the Internet, the threat has become much more widespread.
The techniques below are not ranked in any particular order, and their
effectiveness is reliant on the ability of the person using them.
Spear Phishing
A targeted communication, usually an email with a malware-infected attachment or
hyperlink, is sent to a specific set of user accounts at the target organisation (e.g.
the sales team, marketing or senior management). The email usually contains a
special offer or other prize enticing the user to download an attachment or click on
a URL within the body of the email. When the payload is activated the attacker
uses it to gain access to the target network. An example of this is the attacks
perpetrated on Yahoo and Hotmail accounts in 2011 (McMillan, 2011), whereby
thousands of account holders were duped into either downloading a virus or
redirecting their browsers to a malicious website. A variation on this type of attack
is "Whaling".
Whaling
Whaling is a more specific form of spear phishing that targets upper managers in
private companies. The objective is to deceive them into divulging confidential
company information held on their machines. The content of a whaling attack email
is often written as a legal subpoena, customer complaint, or executive issue and
usually involves some kind of falsified company-wide concern. The goal is to
provoke the manager into an action similar to that of a phishing attack, whereby
they download a document or click on a link.
In the case of the 2008 FBI subpoena whaling scam, 20,000 corporate CEOs were
targeted. Approximately 10% clicked on the whaling link, believing it
would download a "special" browser add-on to view the entire subpoena document.
In truth, the linked software was a key logger that secretly recorded the CEOs'
passwords and forwarded them to the attackers. As a result, each
of the 2,000 compromised companies was further hacked in some way.
Network intrusion / Technical Intrusion / Procedural Compromise
Network intrusion is an attack where a hacker has successfully foot-printed an
organisation and determined what its security measures are (e.g. identifying the
exact versions of the web server, antivirus and firewall software in use)
through the use of tools such as FOCA. The hacker can then craft a virus or Trojan
to take advantage of holes in the security architecture and exploit the "lag" time
between the virus being created and a patch being released to attack the
organisation. Such attacks are often known as "zero-day" attacks, an example of
which is the Conficker worm (Microsoft, 2013).
A more technical type of attack uses reconnaissance to discover the
layout of an IS infrastructure, which then helps focus efforts on the weakest points
of the organisation's defences. This may involve attacking a lower-priority
organisation that may have lesser security controls and, once access has been
gained, escalating the privileges of the account to gain access to better protected
organisations that are connected to it; for example, the Gary McKinnon attack on
the US government infrastructure.
Once he was inside a network, especially a military network, McKinnon
found that other computer systems considered him a trusted user. This was
how he was able to get into the Pentagon's network. "It was really by
accident," he says. (Computer Weekly, 2008)
Having used the information gathered to understand the organisation's security
processes in detail, the attacker is able to bypass them.
Data Exfiltration
It is possible to construct viruses that, once installed on the host machine, will steal
data in small, piecemeal chunks and pass it out through the network firewall
controls as part of the normal network traffic. Infection is typically through a user
opening an email attachment and, once the payload is executed, the data is
exfiltrated. In addition, the data packets are typically encrypted using a custom
coding or encryption algorithm to further obfuscate the real intentions of the
software from the security controls (CodeInjector, 2013). This is a very difficult
attack to guard against as normal Intrusion Detection / Prevention Systems (IDPS)
are unable to distinguish between the malware traffic and normal network traffic. It
requires a Stateful IDPS¹² that can determine the destination of the packets and
either block the traffic or alert the IS Security officer to any suspicious activity. In
the 2003 paper "A Stateful Intrusion Detection System for World-Wide Web
Servers" the authors describe a system for detecting and preventing such
malicious traffic. (Vigna, Robertson, Kher, & Kemmerer, 2003)
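The stateful idea can be sketched as follows. This is an illustrative toy, not a real IDPS: the threshold, addresses and packet sizes are assumed values, and a production system would also track time windows, direction, and known-good destinations. The point is that individually each packet looks like normal traffic, but a per-destination running total reveals the slow, piecemeal exfiltration.

```python
from collections import defaultdict

THRESHOLD_BYTES = 10_000   # assumed policy limit per destination

class StatefulMonitor:
    """Toy stateful monitor: keeps a running outbound byte count per destination."""

    def __init__(self, threshold=THRESHOLD_BYTES):
        self.threshold = threshold
        self.per_destination = defaultdict(int)
        self.alerts = []

    def observe(self, dst, size):
        """Record an outbound packet; alert once the cumulative total crosses the threshold."""
        self.per_destination[dst] += size
        if self.per_destination[dst] > self.threshold and dst not in self.alerts:
            self.alerts.append(dst)

monitor = StatefulMonitor()
# 200 small 64-byte packets to one host: each is innocuous, the total is not.
for _ in range(200):
    monitor.observe("203.0.113.9", 64)
monitor.observe("198.51.100.1", 512)   # ordinary traffic stays below the threshold
```

A stateless filter inspecting one packet at a time would pass every one of those 64-byte packets; only the accumulated state exposes the pattern.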
Targeted personal attacks / Social Media Impersonation
Another method for attacking an organisation is to use hostile reconnaissance to
identify the target's personal profile and acquaintances, and then masquerade as a
friend to gain access to their personal data. As previously stated, a large
proportion of this is now online and can be used to inform an attack. In a paper
from the SANS Institute the following comment is made about social media:
"Social media provides an attack vector which can enable an attack on the
organization. But that is not the only risk. Social media is a tool, and there
may be consequences if that tool is not used properly. A firearm is a tool
that someone can use to provide protection, but improper use of the firearm
could lead to shooting oneself in the foot. Social media can work in the
same way. Social media can also be used in a positive way as a tool to
make money, enhance the business, or reduce business costs" (Shullich,
2011)
The attacker compromises the social media account of an associate of a targeted
individual or impersonates an apparently legitimate entity that the target would be
interested in. The attacker then posts infected hyperlinks that, when the targeted
individual follows the links, allows the attacker access to their network.
Other attacks similar to the techniques described above include helpdesk
coercion, where the attacker uses the information gained to pose as a legitimate
member of staff and dupe IS staff into granting access. Similarly, attackers can
impersonate IS staff and target users to obtain sensitive information.
12 IDPS software that is capable of analysing the traffic of a network and identifying whether
amalgamated blocks of data would contravene the security thresholds that have been set.
3.2 Case Studies
As a means of demonstrating some of the risks to organisations from Deep Web
and OSINT-based attacks, two case studies that incorporate some of the
techniques highlighted previously have been included in this chapter. The first
case study is based on a fictional organisation that could be representative of any
Small or Medium-sized Enterprise (SME) and describes some of the Deep Web
research and phishing techniques used to bypass the security controls of the
network. The second examines the international dimension of Deep Web and
OSINT-based attacks and the global threat posed by Advanced Persistent
Threats (APTs)¹³.
Case Study 1
The first example concerns a medium sized manufacturing firm based in the UK
that employs approximately 1000 staff. It has a fairly flat management structure
and administration is split between the development, operations, sales, Human
Resources (HR), IS and facilities departments.
The IS department supports the internal network and utilises a national provider for
access to the WAN and the Internet. Services requiring larger resources or skill
sets, such as SAN provision or SaaS¹⁴ for technical applications, are outsourced to
service providers and the contract is managed through the IS Manager.
controls include password and data access profiles for all staff. Management
controls include an IS Security policy that all staff are required to read and confirm
their agreement with during their induction phase. The network is not physically
segregated, but logical controls delimit access for departments to separate shares.
There is no on-going staff training programme in IS Security or data handling.
Staff are permitted to use the email system for limited personal use and, whilst
access to the internet is monitored for inappropriate use, there is no restriction on
the viewing and connection to personal sites such as social media and media
sharing. In addition there is no Social Media policy, nor is there any reference to
what staff can post on their own account. Staff are trusted to exercise good
judgement.
13 APTs are usually nation states that have the resources and capability to conduct sophisticated,
high-volume attacks on an opposing nation's infrastructure.
14 Software as a Service – the provision of applications or network services to the organisation by a
third party agency, e.g. Microsoft 365
The company holds the patent on a number of small but successful inventions and
is active in the development of new products. As part of an annual staff message
the Chief Executive announces the development of a number of new products that
will significantly increase the profitability of the company, but confirms that it will
take a number of weeks to get the products to market because of operational
issues. He also mentions that they are planning a wide ranging marketing
campaign that will guarantee the success of the company but asks that nobody
makes any announcements, so that the full impact of the campaign is not
diminished.
Despite the warning, several of the staff make oblique (and sometimes direct)
references to the new products in their profiles on social media sites, in emails to
friends and via other media streams. Departments in the company also make
discreet references to a new product in emails to suppliers and clients. Both
activities draw the attention of cyber criminals, who recognise that the theft of the
IPR would be extremely valuable to the company's competitors.
The criminals begin by "foot-printing" the organisation and identifying the names
and email addresses of the management team of the company. By using Maltego,
Copernic and the other tools and techniques described previously to search the
Surface Web and Deep Web, the cyber criminals discover that the Chief Executive
and several of his senior staff have more than a passing interest in golf. In
addition, the company has previously sponsored competitions at the local golf club.
They craft an innocuous-looking email that has an attractive golfing membership
deal and brochure included as an attachment and send it to the Senior
Management Team (SMT) of the company. The members of the SMT receive the
email and, believing that it is both genuine and safe since it has passed through
the network security, click on the attachment.
When the attachment is activated the brochure opens but, at the same time, a
malicious Trojan programme also installs an enterprise level account on the host
machine. Once installed it sends an activation message back to the cyber criminal.
This is an example of spear phishing as described in the previous chapter.
Once the activation message is received, the cyber criminal logs onto the host
machine and installs a rootkit¹⁵ to collect keystrokes and capture screens from any
sites that have a Hypertext Transfer Protocol Secure (HTTPS) designation.
The company uses a proprietary antivirus package that, whilst it is automatically
updated with the latest virus definition files, does not detect the Trojan as it was
installed in the "lag time" between the Trojan being propagated and the patch being
released. However, once the virus definition files update, the Trojan is detected
and the IS Manager instigates a clean-up routine that removes it. He does not,
however, notice that a number of the machines have an additional access account
on them, or that a rootkit has been installed. Using the enterprise-level account
the cyber criminals are now free to search the data directories of the company for
documentation relating to as-yet unpatented or unreleased inventions. They
identify a number of documents, including the technical drawings for several
designs, which they either FTP¹⁶ to their own server or email to a dummy account
under their control.
The cyber criminal now has a choice of how to maximise payment from the
company; they can either:
 Sell the designs they have stolen to a competitor, typically in another
geographical region to the original manufacturer so that legal arguments
over IPR are complex and protracted, or
 Attempt to blackmail the company for the return of the data. Rather than
use an off-shore bank to collect payment, the criminal can demand
payment in bitcoins, which can be instantly transferred across several
nation states and collected anonymously.
Typically, most criminals will attempt extortion and then, whether payment is
received or not, sell the designs anyway.
15 A rootkit is a type of software, often malicious, that has been designed to hide itself or the
existence of certain processes or programs from the normal methods of detection and enable
continued privileged access to a computer.
16 FTP – File Transfer Protocol, a standard network protocol used to transfer files from one
host to another over a TCP-based network, such as the Internet.
Case Study 2
The second case study focuses on recent industry and media reports of alleged,
highly organised Advanced Persistent Threat (APT) campaigns undertaken by
nation states. Ever since the publication of a story about the creation of the
W32.Stuxnet worm by the US and Israeli security forces, APTs have been in the
public forum. However, the creation of such complex malware is highly technical
and only targets specific systems. In the Symantec dossier on the W32.Stuxnet
worm, the authors comment that:
"Stuxnet is a threat targeting a specific industrial control system likely in
Iran, such as a gas pipeline or power plant. The ultimate goal of Stuxnet is
to sabotage that facility by reprogramming programmable logic controllers
(PLCs) to operate as the attackers intend them to, most likely out of their
specified boundaries." (Falliere, Murchu, & Chien, 2011)
More recently, Mandiant has published a report on the activities of Unit 61398 of
the Chinese army. Mandiant maintain that, for at least the past seven years, this
unit has been systematically attacking the infrastructures of the western world,
gaining unauthorised access and stealing the IPR of a broad range of industries.
The executive summary of the most recent report contains this damning
indictment:
"Our analysis has led us to conclude that APT1 is likely government-sponsored
and one of the most persistent of China's cyber threat actors. We
believe that APT1 is able to wage such a long-running and extensive cyber
espionage campaign in large part because it receives direct government
support. In seeking to identify the organization behind this activity, our
research found that People's Liberation Army (PLA's) Unit 61398 is similar
to APT1 in its mission, capabilities, and resources. PLA Unit 61398 is also
located in precisely the same area from which APT1 activity appears to
originate." (Mandiant, 2013)
Additionally, controversy rages within the UK and the western world about the
possibility that non-national companies supporting the IS infrastructure are spying
and stealing IPR. Huawei, one of the largest telecommunications companies in the
world, provides IS services to a number of western governments and companies
but has been accused of spying on and stealing IPR from its customers (Osawa,
2013).
The Internet carries several similar stories concerning suspected attacks and
exploits by APTs. However, rather than relying on the services of specialist
programmers and expert knowledge combined with trial runs in test environments,
this case study is based on the hackers having only generalised technical
knowledge and an ability to utilise the information contained within the Deep Web.
A rogue nation state, terrorists or hacktivists plot to disrupt the critical national
infrastructure of a western nation using Deep Web, OSINT and hacking
techniques. Through the use of foot-printing techniques the culprits identify a
number of lightly protected national assets as well as the logon details of multiple
social media user accounts. As described previously this information is readily
available on the Deep Web and can be readily extracted and aggregated.
At a specified point in time the hackers initiate phase 1 of the attack whereby they
attack and disable the networks of several public and corporate services. An
example of this could be the disruption of the management and logistics software
for high street supermarket chains. Without the ability to plan and execute deliveries
from the central warehouses to the retail outlets the shops would begin to run low
on goods. In addition the hackers also initiate Distributed Denial of Service attacks
on better protected national assets (e.g. the police, government websites and the
military) which would hamper their ability to effectively communicate with each
other and the public.
At the same time phase 2 is initiated whereby a number of false rumours are
circulated using the compromised social media accounts. Examples of the type of
rumour could be "Low crop yields cause supermarkets to run short of food" or
"Government hiding extent of food shortage to avoid panicking public". The
rumours are picked up and forwarded by other social media users and rapidly
spread in a "viral" fashion, which causes panic buying at the supermarkets,
compounding the problem of supply. A real-life example of this scenario is the
panic buying of fuel during the threat of a tanker driver strike in March 2012. (BBC,
2012)
The government is forced to use public resources (police, PCSOs, etc.) to guard
resources and maintain order, and may consider suspending internet services,
which would further inflame the situation as it would give credence to rumours of
government complicity in the crisis.
In this scenario it is easy to see how a country could quickly become paralysed
through the use of minimal technical resources and knowledge. Such actions
could be used by hostile nation states to divert attention from strategic military
actions (e.g. the invasion of another country) or by terrorists in order to blackmail
governments into releasing prisoners or using their influence at an international
level (e.g. the United Nations) to the benefit of the perpetrators.
Section 4 – Survey Design, Data Collection and Analysis
In order to determine the level of awareness of Deep Web and OSINT-based
attacks, and how effective the current IS Security controls framework is in
protecting organisations from them, a survey was used to poll a selection of IS
Security professionals. The term "controls framework" is used in this paper to
describe the logical, physical and managerial IS Security controls of an
organisation.
4.1 Selection and justification of the data collection tool
There are a number of data collection tools that could have been used to collect
the primary data used in this paper. Amongst the most appropriate are:
1. Structured Person to Person Interviews
2. Telephone Interviews
3. Postal Questionnaire
4. Electronic Survey
Description of Data collection tools
Person-to-Person Structured Interviews
Structured interviews are the most flexible, and perhaps the most prestigious, of all
research techniques. The respondent is contacted in advance to confirm their
participation and a date is arranged. On this date the researcher meets with the
respondent and uses a pre-defined questionnaire to interview them whilst their
answers are recorded (either transcribed or by using audio-visual equipment).
The same questionnaire is used for all of the respondents so that a direct
comparison of responses can be made and ambiguities kept to a minimum.
Interviews are an excellent form of primary research but can be expensive to
undertake and, due to the time limitations involved, it can be difficult to collate a
statistically relevant sample of respondents. Also, if the sample of respondents is
not sufficiently large, the views of one respondent may "skew" the results of the
group towards erroneous conclusions.
Telephone Interviews
A telephone questionnaire is similar to a person-to-person interview in that it
follows a structured format that is determined in advance. The respondent is again
contacted in advance, usually by letter or email, which is then followed up with a
telephone call to confirm their participation and to arrange a suitable time. The
respondent is then telephoned at the agreed time and the interview is conducted.
Telephone interviews have an advantage over postal questionnaires in that the
data is collected directly from the respondent, who can be asked to clarify
ambiguous points. However, this technique can carry a certain stigma from
overuse by telemarketing companies and may not necessarily communicate the
gravitas required of the research.
Postal Questionnaire
A postal questionnaire is an effective way of obtaining qualitative primary research.
It follows a number of stages from conception to collating the data, and includes
piloting the questionnaire (to eliminate possible mistakes such as vague or
misleading questions) and tabulating the results. For this option to be viable it
would have to reach a large number of respondents, which is time consuming and
costly. In addition, the response rate cannot be guaranteed and this method does
not have the flexibility afforded by electronic methods.
Electronic Survey
According to the book "Mail and Internet Surveys" by Don A. Dillman, the best
method is an electronic survey as it utilises the power of the Internet and
computers to disseminate and tailor the questionnaire (Dillman, 2000). In addition,
it adds gravitas to the study as all of the targeted respondents are technology
professionals.
In order to objectively ascertain the best method the options were assessed
against a set of criteria. These criteria are:
1. Cost: Which of the options offers the best chance of providing the greatest
number of quality responses, in the required amount and scope of
information, for the financial outlay?
2. Depth: Which option can cover all of the areas identified in the depth
required?
3. Flexibility: Which option is best at covering unforeseen circumstances such
as unpredicted or unexpected answers?
4. Time: Which option will guarantee the return of the information within the
time constraints to enable the author to make use of it? Also, what amount
of time will a respondent have to answer the survey? Will an electronic form
be easier to complete than a paper one?
It was decided to use an objective technique such as the Ishikawa Quality Function
Deployment (QFD) chart (Inwood & Hammond, 1993), adjusted to suit the purpose
of determining the best method to use. The decision matrix uses a similar scoring
system to the QFD chart and is shown below:
Key
Strong correlation of criteria against method = ++
Correlation between criteria and method = +
No correlation between criteria and method = 0
Negative correlation between criteria and method = -
Strong negative correlation between criteria and method = - -
              Person-to-person  Telephone   Postal         Electronic
              interviews        Interviews  Questionnaire  Survey
Cost          -                 +           +              ++
Depth         ++                +           -              +
Flexibility   +                 +           --             ++
Time          ++                ++          +              ++
Total         4                 5           -1             7
Table 2: QFD chart for selection of data collection tool
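The scoring behind Table 2 can be reproduced in a few lines of code. This sketch is illustrative only: the symbol weights follow the key above, and the Time-row scores are those that reproduce the stated column totals.

```python
# Symbol-to-weight mapping from the key above the matrix.
WEIGHTS = {"++": 2, "+": 1, "0": 0, "-": -1, "--": -2}

# Rows of scores in the order: Cost, Depth, Flexibility, Time.
matrix = {
    "Person-to-person interviews": ["-",  "++", "+",  "++"],
    "Telephone interviews":        ["+",  "+",  "+",  "++"],
    "Postal questionnaire":        ["+",  "-",  "--", "+"],
    "Electronic survey":           ["++", "+",  "++", "++"],
}

# Sum the weighted scores per method and pick the highest total.
totals = {method: sum(WEIGHTS[s] for s in scores) for method, scores in matrix.items()}
best = max(totals, key=totals.get)
```

Running this yields the same totals as the table (4, 5, -1 and 7), with the electronic survey scoring highest.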
From the chart it was determined that an electronic survey was the most suitable
research method to use and primary research was collected using this method.
4.2 Survey Design
Once the research method had been agreed upon the next step was to design the
questionnaire and select the best target group. It was decided that the survey
should be directed at IS Security practitioners from within the IS Security
community. The rationale for this was that awareness of the Deep Web and OSINT
is still very low, certainly among the general populace, and so sending an
electronic survey to a general audience may not generate a suitably informed
response. It must be emphasised that, whilst this was not intended to be a
representative sample, the results give a clear indication of the opinion and depth
of feeling within the chosen sector.
Consideration of the type of questions to be asked included:
 The size and type of organisation that the respondent works for;
 Awareness of the Deep Web and OSINT within the organisation;
 Activities that would make the organisation a target for attack;
 Awareness of the availability of publicly held information about the organisation;
 Attempts to analyse publicly held information;
 Current managerial, physical and logical IS Security controls;
 Additional organisational security controls.
It was also considered important that the respondents have an opportunity to voice
their opinions and to add any comments on what they consider to be important
measures for controlling and deterring Deep Web and OSINT-based attacks.
With these questions in mind the questionnaire was divided into 6 subject areas:
1. Introduction
2. The Organisation
3. Deep Web and Open Source Intelligence
4. Organisational Information Security Controls
5. Additional Organisational Controls
6. Concluding remarks
A sample of the questionnaire can be seen in Appendix 3.
Questions about the country in which the respondent is located were omitted from
the final survey as it was considered that Deep Web and OSINT-based attacks are
not geography specific. Once the final draft of the survey had been approved the
URL was finalised and included in a blanket email to the selected target audience.
4.3 Use of the ISO 27001 standard
In order to provide a measure of definition within the survey, the respondents were
asked to compare their organisation's security policy to a common IS Security
standard. For this purpose the standard selected was ISO 27001, published by the
ISO/IEC (International Standards Organisation / International Electrotechnical
Commission).
ISO 27001, or "ISO/IEC 27001:2005 – Information Technology – Security
techniques – Information Security Management Systems – Requirements" to give it
its full title, is a widely published and accepted standard that, according to the
ISO/IEC,
"..specifies the requirements for establishing, implementing, operating,
monitoring, reviewing, maintaining and improving a documented Information
Security Management System within the context of the organization's overall
business risks. It specifies requirements for the implementation of security
controls customized to the needs of individual organizations or parts
thereof." (ISO/IEC, 2012)
Organisations are able to certify themselves against the ISO 27001 standard as an
external validation of the measure of IS Security they have in place. The question
of whether any of the respondents had attained the standard was not included
within the survey, and none of the respondents within the survey confirmed that
they had attained certification.
The control objectives included within the standard are:
 Security Policy
 Organisation of Information Security
 Asset Management
 Human Resources Security
 Physical and Environmental Security
 Communications and Operations Management
 Access Control
 Information Systems Acquisition, Development and Maintenance
 Information Security Incident Management
 Business Continuity Management
 Compliance
Properly implemented, the controls detailed within the standard should give a large
measure of protection against IS Security threats. The standard is under review at
present and in 2013 the ISO issued a final draft which details a number of new
controls which address advances in technology and systems management (e.g.
the greater use of external service providers). However, it is understood that none
have been included that specifically address the threat of data exfiltration or
OSINT-based attacks.
4.4 Analysis of survey results
Over 400 IS Security professionals were emailed an explanatory note setting out
the reason for the research and the URL to the questionnaire. There were 74
respondents in total, although some chose to skip certain questions, either for
reasons of security or relevance to their particular sector.
The results were analysed and have been presented in the following paragraphs.
The sanitised raw data has been included in the appendices.
Question 1: How Many Full-time Employees currently work for your
organisation?
The largest group of respondents (41%) was the 5000+ bracket, followed by the 1
to 100 bracket (21%). This is in line with the supposition that the majority of
respondents will be employed in large organisations, or are security consultants on
contract to large organisations, and typically have the requirement and resources
to manage IT Security within their operation.
Question 2: What is the primary industry sector that your organisation
operates in?
The largest group of respondents are within Information Technology (27%),
followed by Financial Services (23%). Analysis of the third group (Other) indicated
that these were primarily self–employed IS Security consultants. It is suggested
that the rationale for so many responses from the Financial Services sector is that
they have a greater requirement for compliance and for external validation than the
majority of the other sectors. Additionally, there is a perception that these are a
greater target for hackers since there are greater rewards (e.g. financial or in terms
of data value) if the control framework can be compromised.
Question 3: At what level would you currently rate awareness of the Deep
Web and OSINT within your organisation?
Not surprisingly, the majority of the respondents rated awareness of the Deep Web
and OSINT-based attacks within their organisations as Low or Negligible (39% and
28% respectively). This may be because of the low media coverage of Deep Web
and OSINT-based attacks, relative to other attacks such as Technical Intrusion or
Social Media Impersonation.
Question 4: Is your organisation engaged in technological or controversial
activities that would make it a high profile target of criminals or hacktivists?
The results for this question were fairly evenly split with approximately half of the
respondents' organisations considering that they are involved in activities that
would make them a potential target for crime or hacktivism. Given that almost half
of the respondents (47%) answered Yes to this question and, in conjunction with
the result of the previous question (i.e. low awareness), there is a high probability
that Deep Web and OSINT-based attacks could proliferate in these "perfect storm"
conditions.
Question 5: How would you currently rate the threat of Deep Web attacks
(i.e. the aggregated data analysis of the publicly available information) to
your organisation?
The largest group of respondents to this question rated the threat as medium, with
low being the second highest group. This can be attributed to a number of factors:
1 Those organisations that responded have assessed the threat of Deep Web
and OSINT attacks and implemented appropriate controls to mitigate it;
2 The threat has not been properly assessed;
3 The belief that their organisation benefits from the "Security through Obscurity"
principle, namely that there are so many other organisations out there that they
are not high profile enough to warrant attacking.
This last point is particularly erroneous given the nature of Deep Web and OSINT-
based threats. If a criminal identifies the organisation as a potential target through
Deep Web harvesting they will attack regardless of how many other organisations
exist.
Question 6: Has your organisation ever carried out any analysis of the
publicly available information on your organisation?
Over half the respondents have carried out an analysis of the publicly available
information for their organisation. This is only to be expected in the media-centric
world in which businesses currently operate, where poor Public Relations (PR) can
adversely affect share price.
Question 7: Are you aware of any targeted attacks (e.g. phishing emails or
similar) on high profile individuals in your organisation, or on your
organisation as a whole?
Over half the respondents could confirm that there had been a targeted attack on
their organisation or on high profile individuals within it. This was an expected
result and can possibly be attributed to the proliferation of material available on
the internet, which allows cyber criminals to inform their attacks. Alternatively it
could be because the respondents to the survey are the security officers in large
organisations that are more likely to be the target of criminals or hacktivists due to
their profile or activities.
Question 8: Has your organisation implemented an Information Systems
Security policy?
Reassuringly the majority of respondents confirmed that their organisation had
implemented an IS Security policy. In the modern corporate business culture that
requires a high level of regulation and compliance, this is an expected control.
Question 9: Has it been designed to incorporate good practice over its
logical, physical and managerial controls? (For this purpose as defined by
ISO 27001)
Again, the majority of respondents confirmed that they had utilised a standard,
typically the ISO 27001 standard, when designing the organisation's IS Security
policy. This is because it is well established and is seen as a de-facto measure of
IS Security.
Question 10: Has the policy been approved and supported by senior
management?
Again the majority of respondents confirmed that the IS Security policy had been
approved by the organisation's Senior Management Team. It is suggested that this
is because the policy is a corporate-level document and requires the approval and
support of the organisation's executive in order to be effective.
Question 11: Has the policy been distributed to all staff?
78% of the respondents confirmed that the policy had been distributed to all staff.
Again, this is in keeping with the answers to the previous question in that the policy
is an organisation-wide document.
The circulation of this policy is important in the promotion of an IS Security aware
culture within the organisation and, whilst not explicit in the prevention of Deep
Web and OSINT-based attacks, is an important aspect in the security framework
for preventing the same.
Question 12: Has the policy been confirmed as read and understood by all
staff (either through return of a signed acceptance sheet or via electronic
confirmation)?
Confirmation that the IS Security policy has been received and understood is an
important aspect of its promotion. 53% of the respondents confirmed that the
policy had been confirmed as read and understood with a further 23% confirming
that it had been partially received. More worryingly 20% of the respondents said
that it had not been, or could not be, confirmed, which suggests that 1 in 5 of all
users have not read or do not understand the dictates and restrictions of the policy.
Education and confirmation of employees' understanding of their Information
Security responsibilities is an important aspect of the IS Security framework and
imperative to preventing Deep Web and OSINT-based attacks.
Question 13: Is the IS Security policy enforced by logical controls? For
example: processes that restrict access to data to authorised personnel,
database encryption, email monitoring or data-centric security applications
such as Boole Server
The majority of respondents (68%) confirmed that the IS Security policy is enforced
through logical controls (e.g. processes that restrict access to data to authorised
personnel, database encryption, email monitoring or data-centric security
applications such as Boole Server) which is essential in the provision of a mature
IS Security framework. 30% of the respondents confirmed that the policy was not
or only partially enforced by logical controls, which suggests a disconnect between
the managerial and operational controls.
In such organisations it is easier for staff to circumvent control procedures and
manipulate processes and data to their own ends.
Question 14: Is staff compliance with the policy regularly tested through
measures such as user access and internal network penetration testing,
reviews of audit and access logs?
Again over half the respondents (52%) confirmed that staff compliance with the IS
Security policy is regularly tested. More disturbingly, 40% confirmed that they did
not test, or only partially tested, compliance with the policy.
Compliance with the defined controls is crucial in all aspects of IS Security in order
to prevent cyber attacks, OSINT-based or otherwise.
Question 15: How often do staff receive training on IS Security?
From the graph it is possible to see that, by a large majority, staff in the
respondents' organisations receive annual training in IS Security (38%). The next
largest group receives training on induction (27%), although it was not possible to
determine from the results whether this includes staff who receive training at
induction and then none afterwards.
A large percentage (18%) receive no training which would leave the organisations
involved susceptible to all manner of IS attacks, and particularly those of a subtle
nature such as Deep Web or OSINT, as they may not be perceived as threatening.
[Figure: bar chart 'IS Security Training' – frequency: Monthly, Quarterly, 6 Months, Annually, On Induction, No Training]
Question 16: How often are IS Security messages published within the
organisation?
The majority of respondents confirmed that security messages are published in
response to a security event (43%) with the second largest group of respondents
having an on-going security awareness initiative of some description (27%).
One of the most powerful controls within the IS Security armoury is the fostering
and maintaining of an IS Security culture. This is often confused with paranoia
about security events that can obstruct the operation of the business, but a true IS
Security awareness program is embedded within the culture of the organisation,
reinforced through regular training programmes and enhances and protects the
business operation.
Question 17: Has the organisation implemented an Incident Management
policy and procedure to minimise the impact of security incidents?
The majority of respondents confirmed that they had an Incident Management
policy in place, with one quarter of the respondents having only partially
implemented a policy and a small percentage not having implemented a policy at all.
This is a fundamental control in regard to business resilience and is crucial in
recovering from any type of incident. Any organisation that does not have a clear
Incident Management policy, or has an incomplete or untested one, will have the
impact and effects of a cyber attack compounded, and will take longer and cost
more to recover from.
[Figure: bar chart 'Security Messages' – publication frequency: Continually, Monthly, Quarterly, Every 6 Months, As Required, Never]
Question 18: Has the policy been invoked within the last 12 months?
The majority of respondents (41%) confirmed that their Incident Management
policy had been invoked within the last 12 months. A further 13% confirmed that
there had been a partial invocation in the same period. This demonstrates the
necessity of having properly developed and implemented Incident management
procedures as over half of the respondents had need of their preparations.
Question 19: Was the reason for invoking the policy identified and controls
implemented to prevent re-occurrence?
The greatest number of respondents in this question chose not to answer (43%),
possibly because they did not wish to reveal the nature of the incident. The
second largest group confirmed that they had identified and resolved the issue
(37%).
A documented, approved and implemented Incident Management policy requires
effective post-event review to ensure that any issues are effectively mitigated to
prevent reoccurrence.
Question 20: What was the impact of the incident on the organisation?
From the chart it is possible to see that the largest group of events was classified
as minor (32%), with negligible as the second highest group (17%). It is impossible
to say whether the implementation of an Incident Management policy was the
deciding factor in preventing these incidents from having more severe
consequences.
Question 21: Has your organisation implemented a Social Media policy that
contains clear and concise guidance on what, how and where information
about the organisation should be published?
Approximately half of the respondents confirmed that their organisations had
implemented a Social Media policy describing how information on the organisation
should be published. In the modern business age all organisations need a
comprehensive approach to how information about them is published and
disseminated. With the growth in available media channels, both those over which
an organisation can exercise control (e.g. company website, press releases,
corporate social media sites, blogs) and those where it has less control (e.g. staff
personal social media pages, consumer forums) need to be managed and guided
to ensure that the overall internet presence is positive.
[Figure: bar chart 'Event Severity' – Negligible, Minor, Moderate, Significant, Not Applicable]
Those without a Social Media policy, or with only a partial one (approx. 52% of
respondents), risk having potentially sensitive information disseminated on the
Internet which could be used to formulate a Deep Web type of attack.
Question 22: Has your organisation implemented an Email policy?
Again the overwhelming majority of respondents confirmed that there is an
implemented policy that specifies the use of the facility within the organisation. This
is an important control in protecting against Deep Web and OSINT-based attacks
because the information sent in emails can be harvested and used to attack the
organisation.
Question 23: Have your staff been trained on good email security practice
such as not opening suspicious emails or emails from unsolicited sources?
Two thirds of the respondents confirmed that staff in their organisations have been
trained on good security practices. However, comparing this with the results from
question 15 raises the question: once the staff had been trained, was there any
follow-up training or a continuation training programme?
Clear and concise policies, appropriate technical controls and staff training are the
three most important elements in preventing Deep Web or OSINT-based attacks.
Question 24: Has your organisation implemented gateway controls to
quarantine suspicious emails and alert the intended recipient for release?
The majority of respondents have implemented this control which is important in
ensuring that suspicious emails and attachments are not imported onto the
organisation's network. Those organisations that did not have this control may
have implemented other compensatory controls, such as limiting incoming
attachment types (e.g. JavaScript).
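A compensatory control of this kind can be sketched as a simple deny-list check on attachment filenames. This is an illustrative sketch only; the extension list is a hypothetical policy choice, not a recommended configuration:

```python
# Hypothetical deny list of attachment extensions a gateway might hold.
BLOCKED_EXTENSIONS = {".js", ".exe", ".vbs", ".scr", ".bat"}

def should_quarantine(filename: str) -> bool:
    """Return True if the attachment should be held for manual release."""
    name = filename.lower()
    return any(name.endswith(ext) for ext in BLOCKED_EXTENSIONS)

print(should_quarantine("invoice.pdf.exe"))  # True  (double-extension trick)
print(should_quarantine("report.pdf"))       # False
```

In practice a gateway product would combine such a check with content inspection and a release workflow that alerts the intended recipient.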
Question 25: Does your organisation have a documented and published
disciplinary process for infringements of IS Security controls or policies?
Again, the majority of organisations responded that this control is in place,
but perhaps a more relevant question should have been "how many times have
staff been disciplined for security breaches?" Whilst it is important to have clear,
concise policies it is also important that those policies are enforced with appropriate
sanctions against the perpetrator, and for those sanctions to be clear to all staff.
Having this control in place will help ensure that staff do not inadvertently publish
information, as the threat of sanction will reinforce the mind-set.
Question 26: Do you regularly update your malware software in order to
ensure that you have the greatest possibility of detecting and stopping
malicious code or Trojan programmes designed to circumvent your security?
Unsurprisingly, the majority of responses to this question were positive (91%) with
a small minority answering no or partially. The minority answers may be due to
confusion over the wording of the question, as most software products have the
capability of automatically updating their definition files. Alternatively, in systems
requiring high availability, the infrastructure may be operated in a protected "shell"
restricting all updates (malicious or otherwise) to ensure continuity. Updates are
tested first on a mirrored system and then implemented at low demand times, or in
a phased process to ensure continuity of service. This control is important
because, as we have seen in case study 1, the network was infected by the Trojan
and rootkit during the "lag" time between the virus being released and the patch
being applied.
Question 27: Have you implemented any of the following Data Leakage
Prevention (DLP) controls within your organisation? (Please tick all that
apply)
Of the 45 respondents to this question, three quarters confirmed that they have
implemented additional DLP controls within their organisation. These were evenly
split across hardware, software and managerial controls, with two respondents
commenting further: one had implemented an audit control over flash memory
devices issued to its staff, and the other had blocked access to web mail and
file-sharing sites.
An area for further research might include probing what manner of additional
controls have been implemented and how effective they are.
Question 28: Do you regularly measure and report to the Senior
Management Team or Executive on the effectiveness of IS controls?
43% of the respondents regularly report on the effectiveness of the IS controls to
the organisation's executive body, but more disturbingly 57% do not. In order for
any control framework to be effective it must have (and be seen to have) senior
management support. It is possible that this figure is actually higher, with a greater
percentage of the respondents incorporating IS Security metrics within an overall
management report rather than a specific security report that is delivered
regularly.
Question 29: In your opinion, what control/s would be most effective in
securing your organisation against a Deep Web or OSINT type attack?
Of the 75 respondents to the survey, 23 offered additional comments regarding
Deep Web attacks. Of the 23 respondents:
 8 (34%) suggested that additional training is required to raise awareness of
the issue among the employees and senior management of the organisation;
 a similar number suggested the implementation of greater technical controls
(firewall technologies, SIEM systems, etc.) for limiting data leakage.
Other suggestions included the regular scanning of Deep Web sites and other
forums (e.g. pastebin) for compromised credentials or other confidential
information, regular pen testing and better techniques for eliciting senior
management understanding and support.
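The paste-site scanning suggested above can be reduced, at its simplest, to a pattern match for corporate email addresses in harvested text. This is a hedged sketch; the domain and sample dump are hypothetical, and a real monitoring service would also handle retrieval, rate limits and false positives:

```python
import re

def find_leaked_addresses(text: str, domain: str) -> list:
    """Return de-duplicated email addresses for the given domain found in text."""
    pattern = re.compile(r"[A-Za-z0-9._%+-]+@" + re.escape(domain), re.IGNORECASE)
    return sorted({match.lower() for match in pattern.findall(text)})

# Hypothetical harvested paste content.
dump = "creds: Alice@example.co.uk:hunter2\nbob@elsewhere.com\nalice@example.co.uk"
print(find_leaked_addresses(dump, "example.co.uk"))  # ['alice@example.co.uk']
```

A hit against the corporate domain would then trigger the organisation's incident management process (e.g. forced password resets for the affected accounts).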
Section 5 – Conclusions and Recommendations
After describing the Deep Web and OSINT-based attacks and the type of tools and
attacks that can be launched against organisations, illuminating them with case
studies and collecting and analysing the results of the survey, it is now possible to
draw conclusions on how effective the controls framework is and whether
additional controls are necessary.
5.1 Conclusions
From the survey results it is possible to draw the following conclusions:
1. Most of the organisations surveyed have implemented an IS Security controls
framework which incorporates logical, physical and managerial controls that
underpins the IS Security policy. Additionally, most organisations have
implemented supplementary controls such as Social Media policies, email
policies, DLP software, email monitoring or restriction of USB ports. This is
in response to the realisation that threats to data security extend wider than
the organisation's internal network. However, OSINT-based attacks target
resources that have already been "leaked" outside of the organisation and,
whilst the controls framework will assist in protecting the organisation from
a direct attack, it does not directly address the risk.
2. Whilst over half the respondents have carried out an analysis of the publicly
available information for their organisation, a large proportion have not.
This was expected as the proliferation of media streams continues and
more organisations struggle to control how information about themselves is
published. There is also greater appreciation of the risk of loss of
reputation that can adversely affect share price.
3. Awareness of the threat from Deep Web and OSINT-based attacks was
significant, but awareness was greatest in those organisations that are
either highly regulated (e.g. financial services) or are more technically
informed (e.g. Information Technology). As the survey was directed at IS
Security personnel the majority of respondents were aware of the risks but
that appreciation was not reflected in the senior management. In addition
almost half of the respondents confirmed that their organisations were
involved in activities that would make them a potential target for crime or
hacktivism. These two factors make for perfect conditions in fostering an
OSINT-based attack.
4. Most organisations undertake some form of regular training of their staff in
IS Security matters, but this may be in regard to required compliance with
applicable legislation rather than a realisation of the need to continually
update staff on the threat to IS Security. By far the most common training
programme is conducted annually (38%) or on induction into the
organisation (27%). A large percentage (18%) receive no training which
would leave the organisations involved susceptible to all manner of IS
attacks, and particularly those of a subtle nature such as Deep Web or
OSINT-based, as they may not be perceived as a threat.
5. As expected, most organisations had either fully or partially developed and
implemented an Incident Management policy. There were also a minority
that confirmed they had no such policy in place. Most respondents also
confirmed that they had experienced a security incident within the last 12
months.
This is a fundamental control in regard to business resilience and is crucial
in the successful recovery from any type of security or business continuity
incident. Any organisation that does not have a clear Incident
Management policy, or has an incomplete or untested one, will have the
impact and effects of a Deep Web type of attack compounded, and it will
take longer and cost more to recover from.
5.2 Recommendations
As mentioned previously a number of the respondents confirmed that they had
implemented additional controls to address specific or perceived risks. However,
in order to be effective against Deep Web and OSINT-based attacks, organisations
should consider undertaking additional control measures that specifically address
Deep Web and OSINT-based attacks, such as:
a) Review of the organisation's online presence and the logical, physical and
managerial controls that support the IS Security policy and protect against
accidental and deliberate disclosure of information. Most organisations will
have implemented controls to prevent data loss or disclosure in a reactive
manner, in order to address perceived risks or in response to a security
incident. It is suggested that, in order to develop this framework to prevent
Deep Web and OSINT-based attacks, the organisation should audit the
information available and controls in place so that it can evaluate the current
exposure of the organisation to Deep Web and OSINT-based attacks,
identify any specific risks and suggest controls to mitigate them.
As part of this dissertation an audit programme for evaluating and
organising controls to prevent OSINT-type attacks on an organisation has
been included in Appendix 1.
b) Instigate a corporate awareness and IS Security training campaign for all of
its users on the dangers of sharing too much information on the Internet.
Some examples of good practice are the UK Ministry of Defence's (MOD)
"Think before you Tweet" campaign (Defence Management, 2011) or the
Belgian Internet Security group "Safe Internet Banking" campaigns
"Amazing mind reader reveals his 'gift'" and "New Best Friend" (Belgian
Internet Security Group, 2013). Excellent training materials are available
from the Get Safe Online website (Get Safe Online, 2013) that elucidate the
threat from revealing too much information online and offer guidance on how
to prevent it.
c) Review, develop and test the organisation's Incident Management Policy to
ensure that should an incident occur it can be investigated, negated and
resolved as quickly as possible with the minimum of disruption to the
business operation.
5.3 Application of Recommendations to Case Studies
In the case studies it is possible to see that the use of the audit program would
have identified a number of areas where additional controls should have been
implemented to prevent the loss of the data.
Case Study 1
In case study 1 several of the management team opened an email from an
unknown source and activated hyper links within it which allowed the Trojan to be
downloaded and infiltrate the network. As part of the IS Security controls, the IS
Manager should have implemented a quarantine area for suspicious emails that
would allow each manager to verify the validity of the email before opening it.
In addition the organisation should have incorporated an IS Communications policy
which detailed how all aspects of external communication should be securely
handled. The policy should also detail how users should not download attachments
from unsolicited emails or click on hyperlinks contained therein.
Good practice dictates that the user should either:
1.) Search for the same offer on the sender's website using a proprietary search
engine. A variation of this is to copy the text of the URL and paste it as the
search string in the search engine; this may identify whether it is a scam
that has been attempted previously; or
2.) Hover the cursor over the link to identify the URL that is associated with it
and confirm whether it is associated with the sender of the email.
In both cases if there is any doubt as to the validity of the link, it should not be
activated.
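The second check, comparing the link's real destination with the claimed sender, can be sketched as a naive host comparison. This is an illustrative assumption-laden sketch; the addresses are hypothetical, and a production control would need a public-suffix-aware comparison rather than a simple suffix match:

```python
from urllib.parse import urlparse

def link_matches_sender(url: str, sender_email: str) -> bool:
    """Naive check: is the link's host the sender's mail domain or a subdomain of it?"""
    sender_domain = sender_email.rsplit("@", 1)[-1].lower()
    host = (urlparse(url).hostname or "").lower()
    return host == sender_domain or host.endswith("." + sender_domain)

print(link_matches_sender("https://www.example.com/offer", "sales@example.com"))      # True
print(link_matches_sender("http://example.com.evil.net/offer", "sales@example.com"))  # False
```

Note how the second example catches the common trick of embedding the legitimate domain at the start of a hostile hostname.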
The organisation should also undertake a review of its publicly available
information to understand how the attack was crafted and how they are exposed to
similar attacks. This attack was possible through the use of information that was
publicly available on the Internet. Whilst it might not be possible to remove archive
data, the results of any review can be used to inform the corporate risk register so
that any future emails can be treated appropriately.
The possibility of utilising different anti-malware products on the Internet and mail
gateways and on the network should also be investigated in order to increase the
chances of detecting "zero-day" malware. Software vendors have differing
priorities and response times to threats. By diversifying the products used to
protect the network, the coverage is increased and the threat reduced. Inevitably,
the costs of software licences and maintenance increase but this needs to be
considered in conjunction with the cost of another security event. It is possible to
evaluate this through the use of a "Return on Security Investment" (ROSI)
equation such as that demonstrated in figure 7. This example is taken from the
European Network and Information Security Agency website (ENISA, 2012).
ROSI = (Monetary loss reduction − Cost of solution) / Cost of solution
Figure 7: Return on Security Investment equation
With ROSI equations, the higher the ROSI figure, the more value there is in
undertaking the investment. For example, if the cost of a solution is £50,000 and
the Monetary loss reduction is £200,000 then the ROSI index is 3, namely it is
more cost effective to implement the software.
However, if the Monetary loss reduction is only £75,000 and the solution is
£50,000, the ROSI figure is only 0.5, so there is little value in implementing the
solution.
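The two worked examples above can be checked with a one-line implementation of the calculation they imply (net benefit of the solution relative to its cost):

```python
def rosi(monetary_loss_reduction: float, solution_cost: float) -> float:
    """Return on Security Investment: net benefit relative to the solution's cost."""
    return (monetary_loss_reduction - solution_cost) / solution_cost

print(rosi(200_000, 50_000))  # 3.0 -> worth implementing
print(rosi(75_000, 50_000))   # 0.5 -> little value in the solution
```

A ROSI above zero indicates the expected loss reduction exceeds the cost of the control; the higher the figure, the stronger the case for investment.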
Equations such as this are a useful guide in justifying expenditure on security
measures but can be subjective (it is difficult to place a value on an intangible
asset such as corporate reputation). Additionally, other factors need to be taken
into consideration, such as technical coordination with the existing architecture and
synergy with the organisation's culture.
Case Study 2
In the second case study we see how Deep Web and OSINT-based attacks can be
used to inform attacks that interfere with vital public services and cause major
disruption to a nation's infrastructure.
Applying the audit program here is not as straightforward as in the previous
example because the threat is spread across several organisations; however, the
principles remain the same.
Essentially, each organisation should follow the steps of the audit program:
review the information available about the organisation, review and
validate the controls designed to prevent the unauthorised disclosure of information,
and promote good practice through training and awareness.
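Where the programme must be applied across several organisations, those steps could be tracked centrally. The following is a hypothetical sketch of such a tracking structure (the step wording, class and organisation names are illustrative assumptions, not part of the audit program in Appendix 1):

```python
from dataclasses import dataclass, field

# The three core steps of the audit program, as described in the text.
AUDIT_STEPS = [
    "Review information publicly available about the organisation",
    "Review and validate controls preventing unauthorised disclosure",
    "Promote good practice through training and awareness",
]


@dataclass
class AuditRecord:
    """Tracks one organisation's progress through the audit program."""
    organisation: str
    completed: set = field(default_factory=set)

    def complete(self, step: str) -> None:
        if step not in AUDIT_STEPS:
            raise ValueError(f"Unknown audit step: {step}")
        self.completed.add(step)

    @property
    def finished(self) -> bool:
        return self.completed == set(AUDIT_STEPS)


# An overarching body could then report which organisations lag behind:
records = [AuditRecord("Water utility"), AuditRecord("Grid operator")]
records[0].complete(AUDIT_STEPS[0])
laggards = [r.organisation for r in records if not r.finished]
```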
What is required in this example is an overarching organisation or government
body to enforce the implementation of the program and so ensure a more
orchestrated response to Deep Web and OSINT-based threats.
The UK government has recognised the growing importance of a cohesive IS
Security response and, according to the UK Cyber Security Strategy 2011, is
investing £650 million in cyber security over the next four years (Cabinet Office, 2011).
Whilst the threat of Deep Web and OSINT-based attacks is not specifically
mentioned, there is a call for a more joined-up response from all parties
(government, business and individuals).
5.4 Summation
Online information is a fact of life. The commoditisation of IS services such as
SaaS, cloud storage and second- and third-party service providers has led to the
de-perimeterisation of business services. Combined with the development of multiple
media streams, the digital footprint of organisations is growing. In this
situation the effectiveness of technical controls in stopping Deep Web and OSINT-
based attacks is limited, as the attack is based on publicly available information
rather than being a direct attack on the network.
There are two facets to an organisation's digital footprint, a good side and a
bad. Unstructured data leakage is uncontrolled and bad for business, as it can
foster a situation that encourages OSINT-based attacks.
A mature digital footprint, by contrast, is one where information is spread via a
controlled release, and can be very beneficial to an organisation. Successful
organisations will thrive if they can effectively manage the information available
about them, and this is attained through appropriate logical, physical and managerial
controls combined with a robust review process and comprehensive staff training
that promotes good security practice.
Through the use of the audit program it is possible to assess how an organisation
is promoted and perceived on the Internet, how information arrives in the public
domain and how it can be restricted to dissuade Deep Web and OSINT-based
attacks.
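The first of those assessments, reviewing what is publicly visible about an organisation, can begin with ordinary search-engine operators. The following sketch builds a short list of such queries; the domain name is a placeholder and the operator set is an assumption based on operators commonly supported by major search engines, not a definitive audit procedure:

```python
# Placeholder domain for the organisation under review.
DOMAIN = "example.co.uk"

# Illustrative queries an auditor might run to gauge the digital footprint.
queries = [
    f"site:{DOMAIN} filetype:pdf",                    # published documents (check metadata)
    f"site:{DOMAIN} filetype:xls OR filetype:xlsx",   # exposed spreadsheets
    f'"{DOMAIN}" -site:{DOMAIN}',                     # third-party mentions of the organisation
    f'site:linkedin.com "{DOMAIN.split(".")[0]}"',    # staff profiles referencing it
]

for q in queries:
    print(q)
```

Each result set would then feed the second audit step: validating whether the exposed material should have been released at all.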
  • 1. University of Glamorgan Prifysol Morgannwg Faculty of Advanced Technology M.Sc. Project Careless e-Talk Costs Money; The Risk of Open Source Intelligence based attacks Student Name: John Dunne Student Enrolment Number: 11058374 Primary Project Supervisor: Konstantinos Xynos Secondary Project Supervisor: Iain Sutherland Academic Year: 2012/ 2013 Scheme: Part-time
  • 2. 2 Faculty of Advanced Technology STATEMENT OF ORIGINALITY This is to certify that, except where specific reference is made, the work described in this project is the result of the investigation carried out by the student, and that neither this project nor any part of it has been presented, or is currently being submitted in candidature for any award other than in part for the M.Sc. award, Faculty of Advanced Technology from the University of Glamorgan. Signed Student
  • 3. 3 Abstract Deep Web and Open Source Intelligence (OSINT) based attacks are a growing problem within the UK and worldwide. One of the fastest emerging technological threats being faced by organisations today is that of losing sensitive or confidential data, through illegal activities such as data exfiltration and intellectual property theft. The purpose of this dissertation is to examine the level of understanding within the Information Systems (IS) Security community about the threat posed from Deep Web and OSINT-based attacks. Also, to consider whether the IS controls that are currently in place are sufficient to counter the threat. It also describes, by means of case studies, the potential threat to organisations and nation states and proposes a method of control through the adoption of an audit program to measure the exposure to OSINT-based attacks. Disclaimer A number of the websites and resources identified in this paper are part of the Internet that is at the very end of regulation, almost to the point of no regulation save the users own common sense. Any reader who seeks out these resources must use all sensible precautions when viewing or downloading content. The author of this paper bears no responsibility for the actions or consequences of any reader who utilises the resources identified herein.
  • 4. 4 Acknowledgements I would like to thank my employer Grant Thornton UK LLP, and in particular Greg Swift - National Director of Information Systems, for their support and encouragement during my studies. I would also like to thank all of my colleagues in the IS Security profession for sharing their knowledge and experience in completing the survey that informed this dissertation. I would especially like to thank Pete Wood of First Base Technologies and James Chappell of Digital Shadows for their invaluable guidance and knowledge. In addition I would also like to thank my tutors and my project supervisors Konstantinos Xynos and Iain Sutherland for their help and guidance. And finally I would like to thank my beautiful wife Helen and my children Zachary and Grace for their patience, love and support and my sister Clare for her priceless advice.
  • 5. 5 Contents Glossary 7 Introduction 8 Aims and Objectives 8 Section 1 – The Deep Web 10 1.1 What is the Deep Web? 10 1.2 How big is the Deep Web? 12 1.3 What kind of information is available in the Deep Web? 14 1.4 Accessing the information on the Deep Web 17 1.5 The Deep Web and Hacktivism 19 1.6 Extraction of data from the Deep Web 19 Section 2 – Open Source Intelligence 26 2.1 What is Open Source Intelligence? (OSINT) 26 2.2 How can OSINT Information be aggregated? 29 2.3 OSINT versus Big Data 30 Section 3 – Gaining Unauthorised Access 31 3.1 The use of "Spear Phishing" and other hacking techniques 31 3.2 Case Studies 34 Section 4 – Survey Design, Data Collection and Analysis 40 4.1 Selection and justification of the data collection tool 40 4.2 Survey Design 43 4.3 Use of the ISO 27001 standard 45 4.4 Analysis of survey results 47 Section 5 – Conclusions and Recommendations 59 5.1 Conclusions 59 5.2 Recommendations 61 5.3 Application of Recommendations to Case Studies 63 5.4 Summation 66 5.5 Suggestions for Future Research 67 References 68 Appendix 1 – Example Audit Program for Evaluating and Organising Controls to prevent OSINT–type attacks on an Organisation 71 Appendix 2 – Copy of MSc proposal 74 Appendix 3 – Survey Questionnaire 77
  • 6. 6 Appendix 4 – Survey Results 90 Appendix 5 – MSc Project Logbook 123 Appendix 6 – Electronic copy of Dissertation 132 Table 1: Top 10 search engines according to Alexa Internet analytics ..................15 Table 2: QFD chart for selection of data collection tool..........................................43 Figure 1: Screenshot of the front screen of TOR Web browser .............................17 Figure 2: Screenshot of the login page for the Silk Road .......................................18 Figure 3: Screenshot of the results returned from the Maltego software................21 Figure 4: Screenshot of Copernic Desktop Agent (CDA) software.........................22 Figure 5: Screenshot of the results returned from the FOCA application ...............23 Figure 6: Screenshot of the Shodan Internet engine..............................................24 Figure 7 Return on Security Investment equation ..................................................64
  • 7. 7 Glossary APT Advanced Persistent Threat – A complex, determined and well-resourced attack on the infrastructure of large organisations or nation states, typically orchestrated by another nation state. Controls Framework A holistic term used in this paper to describe the collection of logical, physical and managerial IS Security controls that protect an organisation. Deep Web The databases and collected resources that make up the vast, hidden part of the Internet not normally accessed by "everyday" users. Footprint(ing) The practice of researching a target to establish attack vectors or weaknesses within the infrastructure or online presence that can be exploited. Differs from OSINT in that it incorporates techniques such as port scans to identify technical weaknesses. Grey Literature Published material that is not indexed and often lacks data about the publisher. IPR Intellectual Property Rights. Legal term referring to the creation of the mind for which ownership is recognised. Keylogger Software program that, when installed on a machine (often maliciously), records all of the keys struck on a keyboard. The information can then be overtly or covertly collected by the person who installed the software. OSINT Open Source Intelligence – The practice of searching for, collating and analysing information on an intended target. Spear Phishing The practice of sending emails that contain malicious Trojans and viruses to high profile individuals in order to gain unauthorised access to their network. The emails usually contain some sort of enticement or incentive. Surface Web The webpages and other internet resources used by the majority of users. Whaling A specific form of 'phishing' or 'spear phishing' that targets upper managers in private companies and usually contain some sort of pseudo-legal or corporate instruction designed to galvanise the recipient into action.
  • 8. 8 Introduction Within the Internet there is a flood of information available on individuals and organisations that are available to those with the knowledge of how to access it. This data goes beyond what is visible on the "Surface Web" and can be extracted, aggregated and analysed to provide extremely comprehensive profiles for research or informing sophisticated cyber attacks. The data is held in what is commonly referred to as the "Deep Web" and the practice of extrapolating data from it is known as Open Source Intelligence gathering, or OSINT. Aims and Objectives The aim of this dissertation is to evaluate whether the current level of IS Security controls are sufficient to deter Deep Web and OSINT-based attacks and to make recommendations for additional controls as required. The objectives of this dissertation are: 1. To determine the current level of understanding about Deep Web and OSINT-based attacks within the IS Security community; 2. Determine the effectiveness of IS Security controls currently in place in deterring OSINT-based attacks; 3. Make recommendations as appropriate for additional logical, physical or managerial controls to guard against such attacks; In order to effectively meet the aims and objectives of this paper it has been structured as follows: The first section describes the Deep Web, its size and the type of data stored within its databases. It also describes some of the common tools for accessing and extracting the data. The second section provides an overview of the techniques of OSINT and describes how the aggregation of such data can be used to build a picture of the target for subsequent use. The third section outlines some of the techniques that are used in conjunction with the OSINT data to attack organisations and individuals (e.g. spear phishing). This
  • 9. 9 section also contains two case studies as an example to demonstrate how the practice of Deep Web and OSINT attacks are perpetrated. The fourth section describes the research methodology applied to the collection of data that informs this dissertation, describes the survey design and presents an analysis of the collated results. The final section contains the conclusions drawn from the survey results, the recommendations arising from the conclusions and suggestions for subsequent study to build on the work undertaken in this dissertation.
Section 1 – The Deep Web

1.1 What is the Deep Web?

Personal and corporate data has never been more available than it is today. With the exponential growth in corporate websites, personal web spaces, social media sites (e.g. Facebook, Bebo, Twitter, Myspace) and image sharing sites (e.g. Picasa, Flickr) there is a wide range of personal information available on the Internet. Add to this the innumerable other data sources, such as business forums (e.g. LinkedIn), special interest forums, blogs, videos, news feeds and databases, and the Internet is bloated with a plethora of information.

However, this is barely the tip of the iceberg. Behind this "Surface Web" of websites and search engines there lies a multitude of databases that dwarf the part of the Internet that the typical user will browse. This is the Deep Web.

The term "Deep Web" was initially coined by Mike Bergman, the founder of the internet research company BrightPlanet™. The term refers to the myriad of databases and other data sources that sit behind the websites that make up the "Surface Web", that is, the websites typically accessed by the majority of users. In order to differentiate between the two entities, this paper uses the term "Surface Web" to define the internet resources referenced by proprietary search engines and used by the majority of internet users, and "Deep Web" to refer to the unlinked databases that are not normally included within surface searches.

By searching the Deep Web for articles of information on an organisation or individual it is possible to build up a comprehensive picture of them and utilise it for a number of purposes, such as corporate intelligence gathering and targeted marketing. This practice is known as Open Source Intelligence, or OSINT, gathering. However, the data can also be used for more nefarious purposes, such as committing identity fraud, informing "spear phishing" attacks and stealing or contravening Intellectual Property Rights (IPR).
In their seminal paper "Accessing the Deep Web", the authors describe the Deep Web thus:
"The Web has been rapidly "deepened" by massive databases online and current search engines do not reach most of the data on the Internet. While the surface Web has linked billions of static HTML pages, a far more significant amount of information is believed to be "hidden" in the deep Web, behind the query forms of searchable databases…. Such information may not be accessible through static URL links because they are assembled into Web pages as responses to queries submitted through the query interface of an underlying database. Because current search engines cannot effectively crawl databases, such data remains largely hidden from users (thus often also referred to as the invisible or hidden Web)." (He, Patel, Zhang, & Chang, 2007)

On the Surface Web, applications known as "Web Crawlers"1 operate behind search engines to find the information that is being searched for. When a term or search string is entered into the search engine, the web crawler software scans the internet for relevant hyperlinks and metadata within websites' code and content. The software maps the structure of the website, records and describes all the links, and the search engine presents the results for the user to see. However, websites that do not have an index or system of visible links, or that contain data sets outside of normal Hyper Text Mark-up Language (HTML) code, are unlikely to be scanned or detected by a web crawler and are therefore considered un-indexed.

This is the structure of information within the Deep Web. In essence it is comprised of Internet resources that are not displayed via searches using conventional search engines. Even though the vast majority of the information is publicly accessible, it does not appear on search engines or, if it does, the relevance rating is so low that the result is unlikely to be viewed by the enquirer.
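The un-indexed gap described above comes down to what a crawler's parser can see. The following sketch (Python standard library only; the HTML page is invented for illustration) shows the link-and-metadata extraction step: the anchors and meta tags are harvested, while the search form yields nothing for the crawler to follow — which is exactly why the database behind it stays in the Deep Web.

```python
from html.parser import HTMLParser

class LinkExtractor(HTMLParser):
    """Collects hyperlinks and metadata the way a crawler's parser would."""
    def __init__(self):
        super().__init__()
        self.links = []      # hrefs found in <a> tags
        self.metadata = {}   # name -> content from <meta> tags

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "a" and "href" in attrs:
            self.links.append(attrs["href"])
        elif tag == "meta" and "name" in attrs:
            self.metadata[attrs["name"]] = attrs.get("content", "")

page = """<html><head><meta name="description" content="Example site"></head>
<body><a href="/about.html">About</a><a href="/contact.html">Contact</a>
<form action="/search"><input name="q"></form></body></html>"""

parser = LinkExtractor()
parser.feed(page)
print(parser.links)      # the crawler can follow these
print(parser.metadata)   # indexed alongside the page
# The <form> yields no link: content behind it stays un-indexed.
```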
In the 2008 paper "Google's Deep Web crawl", the authors recognise the difficulty of extracting data from the Deep Web:

"The Deep Web, i.e., content hidden behind HTML forms, has long been acknowledged as a significant gap in search engine coverage. Since it represents a large portion of the structured data on the Web, accessing Deep-Web content has been a long-standing challenge for the database community." (Madhavan, Ko, Kot, Ganapathy, Rasmussen, & Halevy, 2008)

1 Web Crawler – an automated piece of software that browses the World Wide Web, typically for the purpose of Web indexing.
1.2 How big is the Deep Web?

Estimates of the size of the Deep Web vary. In 2001 the paper "The Deep Web: Surfacing Hidden Value" made the following assertions about its size:

• Public information on the deep Web is currently 400 to 550 times larger than the commonly defined World Wide Web.
• The deep Web contains 7,500 terabytes of information, compared to nineteen terabytes of information in the surface Web.
• The deep Web contains nearly 550 billion individual documents, compared to the one billion of the surface Web.
• More than 200,000 deep Web sites presently exist.
• Sixty of the largest deep-Web sites collectively contain about 750 terabytes of information — sufficient by themselves to exceed the size of the surface Web forty times.
• On average, deep Web sites receive fifty per cent greater monthly traffic than surface sites and are more highly linked to than surface sites; however, the typical (median) deep Web site is not well known to the Internet-searching public.
• The deep Web is the largest growing category of new information on the Internet.
• Deep Web sites tend to be narrower, with deeper content, than conventional surface sites.
• Total quality content of the deep Web is 1,000 to 2,000 times greater than that of the surface Web.
• Deep Web content is highly relevant to every information need, market, and domain.
• More than half of the deep Web content resides in topic-specific databases.
• A full ninety-five per cent of the deep Web is publicly accessible information — not subject to fees or subscriptions.
(Bergman, 2001)

In 2007 a more accurate analysis was used to calculate the extent of the Deep Web. The calculation confirmed the previous estimate that the Deep Web is considerably larger than the Surface Web:

"Using overlap analysis between pairs of search engines, it was estimated that 43,000–96,000 "deep Web sites" and an informal estimate of 7,500 terabytes of data exist—500 times larger than the surface Web."
(He, Patel, Zhang, & Chang, 2007)

More recent papers suggest that, now that the size of the visible Internet has expanded, the gap between the two realms has reduced. However, in its present state, the Deep Web still dwarfs the Surface Web:

"Today's internet is significantly bigger with an estimated 555 million domains, each containing thousands or millions of unique web pages. As
the web continues to grow, so too will the Deep Web and the value attained from Deep Web content." (Pederson, 2013)
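The overlap analysis cited by He et al. is, in essence, a capture-recapture estimate: if two search engines index sites independently, the total population can be inferred from how much their indexes overlap. A minimal sketch, using purely hypothetical figures rather than the study's data:

```python
def lincoln_petersen(n1, n2, overlap):
    """Estimate a total population size from two independent samples.

    n1, n2  -- number of deep-web sites indexed by each engine
    overlap -- number of sites that appear in both indexes
    """
    if overlap == 0:
        raise ValueError("no overlap: the estimate is unbounded")
    return n1 * n2 / overlap

# Hypothetical figures: engine A indexes 1,200 deep-web sites, engine B
# indexes 900, and only 24 sites appear in both indexes.
estimate = lincoln_petersen(1200, 900, 24)
print(round(estimate))  # 45000
```

With so little overlap between the two indexes, the estimated population (45,000 here) is far larger than either index alone, which is how ranges of the order 43,000–96,000 sites arise from comparatively small samples.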
1.3 What kind of information is available in the Deep Web?

There is a vast repository of information on the Deep Web. It is comprised of data from a number of sources, such as:

• Research databases such as Wikipedia, Ebscohost2 and Jstor3. Databases such as these have their own search engines that can extract data from the database tables. However, the user must be able to access the website and utilise its search function, rather than conducting a string-specific search using tools from outside the website. Alternatively, the database tables may contain data in formats not supported by the web browser software. Additionally, the majority of website search engines maintain archive databases of popular searches and analytical information within their cache that is available online.

2 EBSCOhost is a powerful online reference system accessible via the Internet. It offers a variety of proprietary full text databases and popular databases from leading information providers. (EBSCOhost, 2013)
3 JSTOR is a digital library of more than 1,500 academic journals, books, and primary sources. (JSTOR, 2013)

According to the Alexa Internet analytics website, the top ten Internet search engines are shown below (position in world rankings according to Alexa monitoring software shown in the third column):

 #   Site           Rank according to Alexa   Host Country
 1   Google         2                         USA
 2   Yahoo          4                         USA
 3   Baidu          5                         China
 4   QQ             8                         China
 5   Windows Live   9                         USA
 6   Google India   12                        USA
 7   Yahoo! Japan   15                        USA
 8   Bing           16                        USA
 9   Sina.com.cn    17                        China
 10  Yandex.ru      18                        Russia

Table 1: Top 10 search engines according to Alexa Internet analytics (Top Sites, 2013)

• Dynamic content (e.g. searches) that is generated by the use of search boxes and forms on websites (e.g. the results of a search on Google, Yahoo, Bing or Facebook that are archived by the service providers);

• Unlinked content – content that does not have any public linked website pointing to it. For example, an individual may set up their own internet site and upload documents such as marketing materials or their resume. Whilst this content is available to all, it will remain hidden until the person sends the requester a link to the website;

• Private Web – websites that require authentication via a username and password, such as web-based mail accounts or data storage (e.g. cloud data storage);

• Real-time content – live streams and feeds that by their nature cannot be indexed, but are archived;

• Contextual web – web resources that are restricted by IP address, where access to some resources depends on the user's location, for example YouTube limiting content by location or the Great Firewall of China;
• Limited access content – web resources which are invisible to internet users until the link is sent to them. For example, private business webs such as the UK Business Forums website (UK Business Forums, 2013) or mail groups such as Yahoo mail or Googlemail;

• Non-HTML content – textual content encoded in multimedia formats or specific file formats not handled by search engines.

Some of the most revealing information cannot be found by using proprietary search engines, but is readily available by using the web tools built to retrieve it.
1.4 Accessing the information on the Deep Web

In recent years Google™, the Internet's largest search engine provider [according to the Internet analytics website Alexa (Top Sites, 2013)], has recognised the need to access and index the content of the Deep Web. It has published a series of papers identifying different mechanisms by which the Deep Web can be accessed through the Google search engine. However, there are still inherent issues with this type of search, because the mechanisms rely on searching for specific elements, such as entities on shopping sites, rather than an all-encompassing search that identifies all of the items that relate to the search string.

The most widely known software for accessing the Deep Web is The Onion Router, also known as TOR (Internet Defense League, 2013). TOR is accessed through an open source browser (fig.1) that protects the user's anonymity by relaying internet packets through a network of volunteer-run relays, obscuring the traffic's origin. Part of the TOR network are so-called "onion" sites, that is, sites that have the suffix .onion as the (pseudo) Top-Level Domain4 (TLD) hostname. Such addresses are not actual Domain Name System (DNS) names, and the .onion TLD is not in the Internet DNS root, but with the appropriate proxy software installed, Internet programs such as the TOR web browser can access sites with .onion addresses by sending the request through the network of TOR servers.

Figure 1: Screenshot of the front screen of the TOR web browser

4 Top Level Domain – the highest level of architecture in the hierarchical Domain Name System of the Internet, typically reserved for countries or sectors (e.g. .com, .co.uk, .gov, .ac.uk)
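Because .onion names form a pseudo-TLD outside the DNS root, software must first decide whether a request can be resolved normally or must be handed to a Tor proxy. A small illustrative check (the onion hostname below is fabricated, and the conventional SOCKS port 9050 is an assumption about a default Tor installation, not something the source states):

```python
from urllib.parse import urlsplit

def is_onion_address(url):
    """True if the URL's hostname uses the .onion pseudo-TLD.

    Such names are not in the public DNS root, so a normal resolver
    fails on them; the request must instead be routed through a Tor
    SOCKS proxy (conventionally listening on 127.0.0.1:9050).
    """
    host = urlsplit(url).hostname or ""
    return host.lower().endswith(".onion")

for url in ("http://exampleonionsite.onion/",   # fabricated onion address
            "https://www.glam.ac.uk/"):
    route = "Tor SOCKS proxy" if is_onion_address(url) else "normal DNS"
    print(url, "->", route)
```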
One of the more infamous areas of the Deep Web that can be accessed using TOR is that of the Silk Road. The Silk Road (fig.2) is an e-marketplace that has proliferated in the last few years, known primarily for the buying and selling of illegal drugs. The website administrators claim to be trustworthy and to allow only genuine (if illegal in some countries) transactions. In order to facilitate a transaction on the Silk Road the user must first create an account on the website and be in possession of a bitcoin5 account to enable the transfer of funds to the appropriate vendor. Other Deep Web sites include the Hidden Wiki, a website that links to resources for hacking, forgery, weaponry and other nefarious activities.

Figure 2: Screenshot of the login page for the Silk Road

5 Bitcoin – an electronic method of making and receiving payments.
1.5 The Deep Web and Hacktivism

Among the primary users of the Deep Web are hacktivists6, who use its resources and anonymity to promote their political ideology. Hacktivists use the Deep Web to attack systems and architectures with legal and illegal tools in order to manifest their dissent. Techniques include activities such as denial-of-service attacks, data breaches and website defacement, as well as other methods of digital sabotage. Hacktivists undertake their operations in the belief that they can effect the same changes that traditional forms of protest (e.g. civil disobedience) produce.

On the IS Security Forum website the author identifies two types of hacktivism:

"[There are] two different participative approaches to the Dark Web. The hacktivist, in fact, could surf in the hidden space for information gathering purposes, the "passive mode", and also in "active mode" conducting cyber operations similar to ones promoted in the ordinary web." (Paganini, 2012)

Typically the activities of the "passive" hacktivist outweigh the activities of the "active" hacktivist. However, they all draw from the same information source, which is ripe with information, as the author continues:

"The deep web is an ocean of information, and to find the right way in this world on the first approach may seem very complicated, but with good will and some evidence from the earliest voyages it is possible to obtain satisfactory results" (Paganini, 2012)

1.6 Extraction of data from the Deep Web

Whilst it is possible to use TOR and other tools to find data within the Deep Web, the user must utilise other pieces of software to extract and aggregate it. In the article "How to Hack a Nation's Infrastructure" the author commented: "Security experts are finding lots of holes in the software they run that, in the hands of a skilled attacker, can be exploited to grant unauthorised access."
(Ward, 2013)

6 Hacktivist – a person who makes use of hacking tools and techniques to promote a political agenda. Examples of this practice are organisations such as LulzSec and Anonymous.

Tools such as Maltego, Copernic, FOCA and Shodan can be used to extract and aggregate information from the Internet to build a comprehensive picture of the target. Whilst they can extract HTML, XML and other web-based data, it is
also possible to use resources such as media feeds, geo-location and archive data to complete the picture of the target.
Maltego

Maltego is an open source intelligence and forensics application, developed and sold by Paterva. It is capable of the in-depth mining and collation of information, which can be presented in several different formats. Maltego is also capable of identifying and displaying the complex relationships between the identified target and its subsidiary connections. For example, in figure 3 the software has identified all the sub-domains of the URL7 http://www.glam.ac.uk. This was a relatively simple exercise for the software, but it can also undertake more complex tasks, such as identifying connections to an email address, IP address or network entity. It can also cross over media platforms and find connections to other media streams, such as Twitter or Facebook.

Figure 3: Screenshot of the results returned from the Maltego software

In addition, Maltego allows the creation of custom entities, allowing it to represent any type of information as well as the basic entity types which are part of the software. The basic focus of the application is analysing real-world relationships between people, groups, websites, domains, networks, internet infrastructure, and affiliations with online services such as Twitter and Facebook.

7 URL – Uniform Resource Locator: a specific character string that represents a reference to the location of Internet resources such as web pages.
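Maltego's "transforms" can be thought of as functions that take one entity and return the entities linked to it, growing a relationship graph. The sketch below imitates that shape with invented data; it is not Paterva's API, merely an illustration of the entity-and-link model.

```python
from collections import defaultdict

# A Maltego-style graph: entities are (type, value) pairs, and a
# "transform" adds edges discovered from one entity. All of the
# domain data here is invented for illustration.
graph = defaultdict(set)

def transform_domain_to_subdomains(domain, discovered):
    """Link a domain entity to the subdomains a lookup discovered."""
    for sub in discovered:
        graph[("domain", domain)].add(("subdomain", sub))

transform_domain_to_subdomains(
    "glam.ac.uk", ["mail.glam.ac.uk", "vle.glam.ac.uk", "www.glam.ac.uk"])

for entity, linked in graph.items():
    print(entity, "->", sorted(linked))
```

Further transforms of the same shape (domain to email address, email address to social media account, and so on) are what let the tool chain one seed entity out into a full picture of the target.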
Copernic

Copernic is a commercial metasearch utility that operates on the Windows family of operating systems. The company offers a range of products that have been designed to operate in different environments. For example, once installed, the Copernic Desktop Agent (CDA) software can search for all files relating to a specific search string on either a host machine or a network and display them by category (see figure 4 for an example of the search string glam.ac.uk on the author's computer).

Figure 4: Screenshot of the Copernic Desktop Agent (CDA) software

Copernic Agent is the web-based version of the software, which can search for Internet content and display it according to category and relevance. The software also includes filters to refine the results returned and, combined with Copernic Tracker (a utility that checks for changes of content on defined pages and alerts the user), can be used to build and maintain comprehensive target profiles.
FOCA

FOCA is a freeware tool capable of undertaking foot-printing (the defining of the security profile of an organisation, such as user accounts, corporate officers and email addresses) and fingerprinting (the defining of an organisation's operating system and other technical information) in order to inform a security audit or attack. Foot-printing and fingerprinting are methodical stages of a security sweep that also includes scanning and enumeration (defining which ports and services are open and available). The freeware version performs searches on servers, domains, URLs and documents published on the web. In figure 5 it is possible to see elements of the search conducted on the same glam.ac.uk domain as used in the Maltego example.

Figure 5: Screenshot of the results returned from the FOCA application

When used as part of a Deep Web or OSINT-based attack, the information gathered using this tool would inform a cyber attack such as the Network or Technical Intrusion described in the next chapter.
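Part of what makes document foot-printing of the kind FOCA performs so effective is that office files carry metadata parts naming their creators. The fragment below parses a fabricated docProps/core.xml part (the usernames are invented) to show how account names can be lifted from a published document using only the standard library; it is a sketch of the idea, not FOCA's implementation.

```python
import xml.etree.ElementTree as ET

# A fabricated example of the docProps/core.xml part found inside
# Office Open XML documents (.docx files are ZIP archives).
CORE_XML = """<cp:coreProperties
    xmlns:cp="http://schemas.openxmlformats.org/package/2006/metadata/core-properties"
    xmlns:dc="http://purl.org/dc/elements/1.1/">
  <dc:creator>j.smith</dc:creator>
  <cp:lastModifiedBy>admin-ws42</cp:lastModifiedBy>
</cp:coreProperties>"""

NS = {
    "cp": "http://schemas.openxmlformats.org/package/2006/metadata/core-properties",
    "dc": "http://purl.org/dc/elements/1.1/",
}

root = ET.fromstring(CORE_XML)
creator = root.findtext("dc:creator", namespaces=NS)
editor = root.findtext("cp:lastModifiedBy", namespaces=NS)
print(creator, editor)  # usernames that reveal an account naming scheme
```

Harvested across every document an organisation has published, names like these reveal the account naming convention and a list of valid usernames for the attacks described in the next chapter.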
Shodan

Shodan is a search engine designed to find devices and computer systems that are connected to the World Wide Web. It was launched in 2009 by a computer programmer who had conceived the idea of searching for devices linked to the Internet. Shodan users are able to find control systems for large organisations such as amusement parks, petrol stations and utility plants, as well as low-level systems such as security cameras, home heating systems and traffic lights. It is possible to access these systems because, in many cases, the default password has not been changed and the only software required to connect to them is a web browser.

The website searches the Internet for publicly accessible devices, concentrating on Supervisory Control And Data Acquisition (SCADA) systems, like those used to operate major installations. It is possible to use Shodan without creating an account, but the search results are then limited. The website proposes that the primary users of Shodan are cyber security professionals, researchers and law enforcement agencies. Whilst it has been argued that cybercriminals could also use the website, the reality is that many would typically have access to botnets that could accomplish the same task without detection.

Figure 6: Screenshot of the Shodan search engine
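What Shodan indexes are service "banners" — the headers a device returns when probed — and the fingerprinting step is essentially string parsing. A sketch over a fabricated banner (the device, product and realm are invented; this is an illustration of the idea, not Shodan's code):

```python
# A fabricated HTTP banner of the kind an exposed device might return.
BANNER = (
    "HTTP/1.1 200 OK\r\n"
    "Server: lighttpd/1.4.28\r\n"
    "WWW-Authenticate: Basic realm=\"NetCam\"\r\n\r\n"
)

def fingerprint(banner):
    """Extract the product/version and any auth realm from raw headers."""
    info = {}
    for line in banner.split("\r\n"):
        if line.lower().startswith("server:"):
            product, _, version = line.split(":", 1)[1].strip().partition("/")
            info["product"], info["version"] = product, version
        elif line.lower().startswith("www-authenticate:"):
            info["auth"] = line.split(":", 1)[1].strip()
    return info

print(fingerprint(BANNER))
# A device announcing a realm like "NetCam" with stock server software
# is exactly what turns up in searches for exposed cameras.
```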
Attackers will also use social media sites and other media platforms, such as Picasa™ or Flickr™, to add a geolocation dimension to their attack vector. In the 2011 article "In Plain View – Open Source Intelligence" the author details an example where a fellow journalist utilised available web information to identify and research a complete stranger:

"He [the journalist] saw her taking pictures with an iPhone 3G in a San Francisco Park. Searching on Flickr that night, he found the picture that she had taken, and was quickly able to work out where she lived and what her apartment looked like, simply by examining her photo stream." (Bradbury, April 2011)

Commercial Deep Web harvesting

Companies such as BrightPlanet and Digital Shadows, along with independent IS Security professionals, use these applications, and their own specially developed tools and techniques, to develop an internet profile of their client which can then be used to inform the risk register.

Now that the location of the information is known and the tools are available to collect it, the information can be harvested, aggregated and analysed using OSINT techniques in order to develop a profile of the organisation. From there it can be determined which is the best method of attack.
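The Flickr example above turns on geotagged EXIF data, which stores coordinates as degree/minute/second values plus a hemisphere reference. Converting them to the decimal degrees a mapping service expects is simple arithmetic; the coordinates below are illustrative, not taken from the incident described.

```python
def dms_to_decimal(degrees, minutes, seconds, ref):
    """Convert an EXIF-style degrees/minutes/seconds tuple plus
    hemisphere reference (N/S/E/W) to signed decimal degrees."""
    value = degrees + minutes / 60 + seconds / 3600
    return -value if ref in ("S", "W") else value

# e.g. 37 deg 46' 12.0" N, 122 deg 25' 48.0" W (roughly San Francisco)
lat = dms_to_decimal(37, 46, 12.0, "N")
lon = dms_to_decimal(122, 25, 48.0, "W")
print(round(lat, 4), round(lon, 4))  # 37.77 -122.43
```

One such conversion per photo, plotted over a photo stream, is all it takes to turn casual uploads into a map of a person's movements.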
Section 2 – Open Source Intelligence

2.1 What is Open Source Intelligence (OSINT)?

Now that the scope of the data sources has been identified as wider than the snippets available on the Surface Web, the next stage is to extract and develop them to build a comprehensive picture of the target. An individual data item has an intrinsic value but, once it is combined with other data items and their relationships analysed, the value increases exponentially.

OSINT is the practice of searching for and aggregating information on a target organisation or individual in order to develop a profile of them. In the 2008 paper "Using open source data in developing competitive and marketing intelligence" the author suggests the following definition:

"Open source intelligence, more commonly known as OSINT, is an information processing discipline. More specifically for the purposes of this paper, it is defined as the scanning, finding, gathering, exploitation, validation, analysis, and sharing with intelligence-seeking clients of publicly available print and digital/electronic data from unclassified, non-secret, and "grey literature" sources. Grey literature is published material that is not indexed and often lacks data about the publisher." (Fleisher, 2008)

True OSINT is an intelligence-level activity conducted by nation states in order to inform the threat analysis against themselves or their enemies. In the paper "Can Open Source Intelligence emerge as an indispensible discipline for the Intelligence community in the 21st Century?", presented to the Research Institute for European and American Studies (RIEAS), the author argues:

"OSINT offer governments what they cannot get from their close networks and they have the capability of choosing what they need from the variety of products available.
This indicates that OSINT has opened a new window to government agencies and departments since they can fill the gaps that emerge in their analyses by including the added value of open sources." (Minas, 2010)
However, OSINT is being used by a wider audience than just the intelligence services of nation states; it is also being used by people every day to resolve problems. In the paper "Intelligence in the internet age: The emergence and evolution of Open Source Intelligence" the authors argue that the Internet is changing the way problems are resolved by changing the dynamic between "crystallised"8 and "fluid"9 intelligence. By sharing problems, solutions and information in ever greater quantities, people are populating the Internet with more and more information (Glassman & Kang, 2012). Whilst this undoubtedly has benefits for the growth and development of human understanding, it is also creating a huge database of targets and their vulnerabilities for those of less noble intent to exploit. Criminals could potentially search for targets with specific vulnerabilities and target them with "false flag" solutions (e.g. a piece of software purporting to be a patch but that is in reality a Trojan or other malware).

In the commercial world the use of OSINT has been practised for a number of years and is referred to as Competitive and Marketing Intelligence (C/MI). C/MI can be described as the:

"systematic, targeted, timely and ethical effort to collect, synthesize, and analyse competition, markets and the external environment in order to produce actionable insights for decision-makers." (Fleisher, 2008)

The rationale and benefit for conducting C/MI is that it should better inform the practitioner with regard to the current market and their competitors' abilities, which in turn should underpin better decision making and lead to enhanced economic/financial performance (Fahey, 2007). The difference between C/MI and OSINT is that C/MI is restricted to published information, such as public accounts and marketing material, or contacts within competitors prepared to share information.
OSINT, however, makes use of the "grey literature" resources available to provide the researcher with a greater degree of raw material.

8 Crystallised intelligence – human behaviour that has been developed through experience and is part of normal behaviour.
9 Fluid intelligence – experience that is gained as the individual meets new challenges and resolves problems. It is updated to reflect changes in the situation.
Technical examples of grey literature include technical reports, working papers or patents. Social or personal grey literature could come from blogs, social media posts, online biographies or the website entries of business forums. OSINT also includes information from resources that may not be immediately visible to the average user, e.g. the Deep Web.
2.2 How can OSINT information be aggregated?

The easiest way to answer this question is through an example. An attacker identifies a target (either for commercial or political gain) but knows only limited information about them; perhaps their email address, but not much else. Using Maltego, the attacker discovers that the target has entered their email address on a number of classic car websites as part of their mailing lists. Using the geo-location facility, they identify a number of tweets and social media entries that have been posted from various classic car rallies, parties, adverts selling cars and other events. Using image recognition software, they initiate a search using Copernic on the target's face, which returns a number of locations and events where the target is positively identified. The grouping and rationalisation utilities allow the attacker to establish the favoured locations of the target, which may lead to the identification of their home address. The software may also identify posts or websites that display a number of pieces of personal information, such as friends, pets, schools attended, possibly even parents' names – all useful information that is typically used as answers to security verification questions, such as those used by banks or other institutions when a password change request is received.

In this example it is possible to see that a lot of information can be aggregated in a very short space of time. However, the cybercriminal will use the information gleaned to inform an attack on the target using one, a series or all of the techniques identified in the next chapter.
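Mechanically, the aggregation in this example boils down to folding records from many sources into one profile keyed on the single attribute the attacker started with. A toy version, with entirely fictional records:

```python
# Records "harvested" from different (entirely fictional) sources,
# each carrying the same seed attribute: the email address.
records = [
    {"source": "car-forum",  "email": "jd@example.org", "handle": "mg_fan_71"},
    {"source": "photo-site", "email": "jd@example.org", "location": "Cardiff"},
    {"source": "social",     "email": "jd@example.org", "pet": "Rex"},
    {"source": "unrelated",  "email": "zz@example.org", "handle": "other"},
]

def build_profile(email, records):
    """Fold every record matching the seed email into one profile."""
    profile = {"email": email}
    for rec in records:
        if rec["email"] == email:
            for key, value in rec.items():
                if key not in ("source", "email"):
                    profile[key] = value
    return profile

print(build_profile("jd@example.org", records))
```

Each record is near-worthless alone; merged, the profile already holds a likely home town, an online handle and a pet's name — the raw material for the security-question attacks described above.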
2.3 OSINT versus Big Data

During the research phase of this paper it was noted that some authors had confused OSINT with Big Data. For clarity, the practice of Big Data is described here and compared to OSINT. Gartner describes the term "big data" as:

"…high-volume, high-velocity and high-variety information assets that demand cost-effective, innovative forms of information processing for enhanced insight and decision making." (Gartner, 2013)

Some fields of study, e.g. meteorology or the biomedical study of genomics, can involve the collection of Petabytes10 or Exabytes11 of data. As more and more data is collected, the tools and processes that have traditionally been used to analyse it have become ineffective, thus requiring more innovative solutions. Typically, Big Data is processed using parallel processing techniques whereby the data is divided between several hundred computers, each analysing a portion of the whole data set. The results from each are collated and amalgamated for analysis. This is not the same as OSINT, as it involves the wholesale analysis of a collated data set, rather than the targeted search for information and its amalgamation and analysis.

10 Petabyte – a multiple of the unit byte for digital information. 1 petabyte = 1,000 terabytes.
11 Exabyte – similarly a unit of measurement for digital information. 1 exabyte = 1,000 petabytes.
Section 3 – Gaining Unauthorised Access

3.1 The use of "Spear Phishing" and other hacking techniques

In order to make use of the information gleaned from Deep Web and OSINT activities, the cybercriminal needs to combine it with hacking techniques, such as the ones identified below, in order to gain access to the target organisation. Such techniques were traditionally the preserve of those persons who had sufficient technical knowledge to develop the software used to trick their intended victims. However, with the increase of videos and tutorials on the Internet showing how to apply these techniques, the threat has become much more widespread. The techniques below are not ranked in any particular order, and their effectiveness is reliant on the ability of the person using them.

Spear Phishing

A targeted communication, usually an email with a malware-infected attachment or hyperlink, is sent to a specific set of user accounts at the target organisation (e.g. the sales team, marketing or senior management). The email usually contains a special offer or other prize enticing the user to download an attachment or click on a URL within the body of the email. When the payload is activated, the attacker uses it to gain access to the target network. An example of this is the attacks perpetrated on Yahoo and Hotmail accounts in 2011 (McMillan, 2011), whereby thousands of individuals were duped into either downloading a virus or redirecting their browsers to a malicious website. A variation on this type of attack is "Whaling".

Whaling

Whaling is a more specific form of spear phishing that targets upper managers in private companies. The objective is to deceive them into divulging the confidential company information on their hard drives. The content of a whaling attack email is often written as a legal subpoena, customer complaint or executive issue, and usually involves some kind of falsified company-wide concern.
The goal is to force the manager into an action similar to that of a phishing attack, whereby they download a document or click on a link.

In the 2008 FBI subpoena whaling scam, 20,000 corporate CEOs were targeted. Approximately 10% clicked on the whaling link, believing it
would download a "special" browser add-on to view the entire subpoena document. In truth, the linked software was a key logger that secretly recorded the CEOs' passwords and forwarded those passwords to the attackers. As a result, each of the 2,000 compromised companies was further hacked in some way.

Network Intrusion / Technical Intrusion / Procedural Compromise

Network intrusion is an attack where a hacker has successfully foot-printed an organisation and worked out what its security measures are (e.g. identifying the exact versions of the internet server, antivirus and firewall software in use), through the use of tools such as FOCA. They can then craft a virus or Trojan to take advantage of the holes in the security architecture and utilise the "lag" time between the vulnerability being discovered and the patch being released to attack the organisation. Such attacks are often known as "zero day" attacks, an example of which is the Conficker worm (Microsoft, 2013).

A more technical type of attack uses reconnaissance to discover the layout of an IS infrastructure, which then helps focus efforts on the weakest points of the organisation's defences. This may involve attacking a lower priority organisation that may have lesser security controls and, once access has been gained, attempting to elevate the account's privileges to gain access to better protected organisations that are connected to it; for example, the Gary McKinnon attack on the US government infrastructure:

"Once he was inside a network, especially a military network, McKinnon found that other computer systems considered him a trusted user. This was how he was able to get into the Pentagon's network. "It was really by accident," he says." (Computer Weekly, 2008)

Having used the information gathered to understand the organisation's security processes in detail, the attacker is able to bypass them.
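One defensive heuristic against the deceptive links used in spear phishing and whaling is to compare an anchor's visible text with its real destination: a link whose text names one domain but whose href points somewhere else is a classic tell. A minimal checker (the phishing markup is fabricated, and this is one heuristic among many, not a complete filter):

```python
from html.parser import HTMLParser
from urllib.parse import urlsplit

class LinkMismatchChecker(HTMLParser):
    """Flags anchors whose visible text names a different host
    than the one the underlying href actually points to."""
    def __init__(self):
        super().__init__()
        self._href = None
        self._text = []
        self.suspicious = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href", "")
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            shown = "".join(self._text).strip()
            real = urlsplit(self._href).hostname or ""
            # Visible text looks like a URL but is absent from the href.
            if "." in shown and shown.lower().rstrip("/") not in self._href.lower():
                self.suspicious.append((shown, real))
            self._href = None

checker = LinkMismatchChecker()
checker.feed('<a href="http://evil.example.net/sub">www.fbi.gov/subpoena</a>')
print(checker.suspicious)  # [('www.fbi.gov/subpoena', 'evil.example.net')]
```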
Data Exfiltration

It is possible to construct viruses that, once installed on the host machine, steal data in small, piecemeal chunks and pass it out through the network firewall controls as part of normal network traffic. Infection is typically through a user opening an email attachment and, once the payload is executed, the data is exfiltrated. In addition, the data packets are typically encrypted using a custom encoding or encryption algorithm to further obfuscate the real intentions of the software from the security controls (CodeInjector, 2013). This is a very difficult attack to guard against as normal Intrusion Detection / Prevention Systems (IDPS) are unable to distinguish between the malware traffic and normal network traffic. It requires a stateful IDPS12 that can determine the destination of the packets and either block the traffic or alert the IS Security officer to any suspicious activity. In the 2003 paper "A Stateful Intrusion Detection System for World-Wide Web Servers" the authors describe a system for detecting and preventing such malicious traffic. (Vigna, Robertson, Kher, & Kemmerer, 2003)

Targeted Personal Attacks / Social Media Impersonation

Another method for attacking an organisation is to use hostile reconnaissance to identify the target's personal profile and acquaintances, and then masquerade as a friend to gain access to their personal data. As previously stated, a large proportion of this is now online and can be used to inform an attack. In a paper from the SANS Institute the following comment is made about social media:

"Social media provides an attack vector which can enable an attack on the organization. But that is not the only risk. Social media is a tool, and there may be consequences if that tool is not used properly. A firearm is a tool that someone can use to provide protection, but improper use of the firearm could lead to shooting oneself in the foot. Social media can work in the same way. Social media can also be used in a positive way as a tool to make money, enhance the business, or reduce business costs" (Shullich, 2011)

The attacker compromises the social media account of an associate of a targeted individual, or impersonates an apparently legitimate entity that the target would be interested in. The attacker then posts infected hyperlinks that, when the targeted individual follows them, allow the attacker access to their network.
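The chunking-and-encoding tactic described above can be sketched in a few lines. This toy example (the payload text and chunk size are invented for illustration) shows why each individual fragment is unremarkable to a signature-based filter, yet the full document is recoverable by the attacker.

```python
import base64

CHUNK = 48  # bytes of plaintext per fragment; small enough to resemble routine traffic

def to_chunks(data: bytes, size: int = CHUNK):
    """Split data into small pieces and encode each one, so that no single
    outbound request contains a recognisable block of the stolen document."""
    return [base64.urlsafe_b64encode(data[i:i + size]).decode()
            for i in range(0, len(data), size)]

def reassemble(chunks):
    """What the attacker's server does with the collected fragments."""
    return b"".join(base64.urlsafe_b64decode(c) for c in chunks)

secret = b"Technical drawings for an unreleased product line. " * 4
parts = to_chunks(secret)
# Each part could ride out as an innocuous-looking query parameter, e.g.
#   GET /weather?city=<part>
# which is indistinguishable, packet by packet, from normal web traffic.
assert reassemble(parts) == secret
print(len(parts), "fragments of at most", CHUNK, "plaintext bytes each")
```

Because no single packet is suspicious, only a control that aggregates traffic over time (the stateful IDPS described above) can notice the pattern.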
Other attacks similar to the techniques described above include helpdesk coercion, where the attacker uses the information gained to pose as a legitimate member of staff and dupe IS staff into granting access. Similarly, attackers can impersonate the IS staff and target users to obtain sensitive information.

12 IDPS software that is capable of analysing the traffic of a network and identifying whether amalgamated blocks of data would contravene the security thresholds that have been set.
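The stateful check defined in footnote 12 can be modelled very simply: keep a running total per destination and alert when the aggregate, rather than any single packet, crosses a threshold. The addresses and threshold below are illustrative assumptions.

```python
from collections import defaultdict

class StatefulVolumeMonitor:
    """Toy stateful IDPS check: any single small packet passes, but the
    aggregated volume sent to one destination eventually trips an alert."""

    def __init__(self, threshold_bytes: int):
        self.threshold = threshold_bytes
        self.sent = defaultdict(int)  # destination -> cumulative bytes observed

    def observe(self, dst: str, nbytes: int) -> bool:
        """Record one outbound packet; return True once dst exceeds the threshold."""
        self.sent[dst] += nbytes
        return self.sent[dst] > self.threshold

monitor = StatefulVolumeMonitor(threshold_bytes=10_000)
# Eight ordinary-sized packets to the same (hypothetical) external address:
alerts = [monitor.observe("203.0.113.7", 1500) for _ in range(8)]
print(alerts)  # the seventh packet pushes the running total past 10,000 bytes
```

A stateless filter sees eight harmless 1,500-byte packets; the stateful monitor sees 12,000 bytes flowing to one destination, which is the distinction the dissertation draws.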
3.2 Case Studies

As a means of demonstrating some of the risks to organisations from Deep Web and OSINT-based attacks, two case studies that incorporate some of the techniques highlighted previously have been included in this chapter. The first case study is based on a fictional organisation that could be representative of any Small or Medium-sized Enterprise (SME) and describes some of the Deep Web research and phishing techniques used to bypass the security controls of the network. The second examines the international dimension of Deep Web and OSINT-based attacks and the global threat of Advanced Persistent Threats (APTs)13.

Case Study 1

The first example concerns a medium sized manufacturing firm based in the UK that employs approximately 1,000 staff. It has a fairly flat management structure and administration is split between the development, operations, sales, Human Resources (HR), IS and facilities departments. The IS department supports the internal network and utilises a national provider for access to the WAN and the Internet. Services requiring larger resources or skill sets, such as SAN provision or SaaS14 for technical applications, are outsourced to service providers and the contract is managed by the IS Manager. Network controls include password and data access profiles for all staff. Management controls include an IS Security policy that all staff are required to read and agree to during their induction. The network is not physically segregated, but logical controls limit departmental access to separate shares. There is no on-going staff training programme in IS Security or data handling. Staff are permitted to use the email system for limited personal use and, whilst access to the internet is monitored for inappropriate use, there is no restriction on viewing or connecting to personal sites such as social media and media sharing.

In addition there is no Social Media policy, nor is there any reference to what staff can post on their own accounts. Staff are trusted to exercise good judgement.

13 APTs are usually nation states that have the resources and capability to conduct sophisticated, high volume attacks on an opposing nation's infrastructure.

14 Software as a Service – the provision of applications or network services to the organisation by a third party agency, e.g. Microsoft 365
The company holds the patent on a number of small but successful inventions and is active in the development of new products. As part of an annual staff message the Chief Executive announces the development of a number of new products that will significantly increase the profitability of the company, but confirms that it will take a number of weeks to get the products to market because of operational issues. He also mentions that they are planning a wide-ranging marketing campaign that will guarantee the success of the company, but asks that nobody makes any announcements so that the full impact of the campaign is not diminished.

Despite the warning, several of the staff make oblique (and sometimes direct) references to the new products in their profiles on social media sites, in emails to friends and via other media streams. Departments in the company also make discreet references to a new product in emails to suppliers and clients. Both activities draw the attention of cyber criminals, who recognise that the theft of the IPR would be extremely valuable to the company's competitors.

The criminals begin by "foot-printing" the organisation and identifying the names and email addresses of the management team. By using Maltego, Copernic and the other tools and techniques described previously to search the Surface Web and Deep Web, the cyber criminals discover that the Chief Executive and several of his senior staff have more than a passing interest in golf. In addition, the company has previously sponsored competitions at the local golf club. They craft an innocuous looking email that has an attractive golfing membership deal and brochure included as an attachment and send it to the Senior Management Team (SMT) of the company. The members of the SMT receive the email and, believing that it is both genuine and safe since it has passed through the network security, click on the attachment.

When the attachment is activated the brochure opens but, at the same time, a malicious Trojan programme also installs an enterprise level account on the host machine. Once installed it sends an activation message back to the cyber criminal. This is an example of spear phishing as described in the previous chapter.
Once the activation message is received the cyber criminal logs onto the host machine and installs a rootkit15 to collect keystrokes and screen captures of any sites that have a Hypertext Transfer Protocol Secure (https) designation. The company uses a proprietary antivirus package that, whilst it is automatically updated with the latest virus definition files, does not detect the Trojan as it was installed in the "lag time" between the Trojan being propagated and the patch being released. However, once the virus definition files update, the Trojan is detected and the IS Manager instigates a clean-up routine that removes it. He does not, however, notice that a number of the machines have an additional access account on them, or that a rootkit has been installed.

Using the enterprise level account the cyber criminals are now free to search the data directories of the company for documentation relating to as yet unpatented or unreleased inventions. They identify a number of documents, including the technical drawings for several designs, which they either FTP16 to their own server or email to a dummy account under their control. The cyber criminals now have a choice of how to maximise payment from the company; they can either:

• Sell the designs they have stolen to a competitor, typically in another geographical region to the original manufacturer so that legal arguments over IPR are complex and protracted, or
• Attempt to blackmail the company for the return of the data.

Rather than use an off-shore bank to collect payment, the criminals can demand payment via bitcoins, which can be instantly transferred across several nation states and collected anonymously. Typically, most criminals will attempt extortion and then, whether payment is received or not, sell the designs anyway.

15 A rootkit is a type of software, often malicious, that has been designed to hide itself or the existence of certain processes or programs from the normal methods of detection and enable continued privileged access to a computer.

16 File Transfer Protocol (FTP) is a standard network protocol used to transfer files from one host to another over a TCP-based network, such as the Internet.
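The check the IS Manager skipped, comparing each machine's local accounts against an approved baseline, is straightforward to automate. The sketch below uses hypothetical account names to show the idea; it is not a complete audit tool.

```python
# Approved accounts for a given machine (hypothetical baseline):
BASELINE = {"administrator", "backup_svc", "jsmith"}

def rogue_accounts(current: set, baseline: set = BASELINE) -> set:
    """Return accounts present on the machine but absent from the baseline.
    Anything returned here warrants investigation, even after a 'clean-up'."""
    return current - baseline

# After the antivirus clean-up, the Trojan is gone but its account remains:
after_cleanup = {"administrator", "backup_svc", "jsmith", "ent_support"}
print(rogue_accounts(after_cleanup))  # {'ent_support'} - the attacker's account
```

Run periodically against every host, a set-difference like this would have surfaced the enterprise level account the attackers left behind.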
Case Study 2

The second case study focuses on recent industry and media reports of alleged, highly organised Advanced Persistent Threats (APTs) undertaken by nation states. Ever since the publication of a story about the creation of the W32.Stuxnet worm by the USA and Israeli security forces, the existence of APTs has been in the public domain. However, the creation of such complex malware is highly technical and only targets specific systems. In the Symantec dossier on the W32.Stuxnet worm, the authors comment that:

"Stuxnet is a threat targeting a specific industrial control system likely in Iran, such as a gas pipeline or power plant. The ultimate goal of Stuxnet is to sabotage that facility by reprogramming programmable logic controllers (PLCs) to operate as the attackers intend them to, most likely out of their specified boundaries." (Falliere, Murchu, & Chien, 2011)

More recently Mandiant has published a report on the activities of Unit 61398 of the Chinese army. Mandiant maintain that for at least the past 7 years this unit has been systematically attacking the infrastructures of the western world, gaining unauthorised access and stealing the IPR of a broad range of industries. The executive summary of the most recent report contains this damning indictment:

"Our analysis has led us to conclude that APT1 is likely government-sponsored and one of the most persistent of China's cyber threat actors. We believe that APT1 is able to wage such a long-running and extensive cyber espionage campaign in large part because it receives direct government support. In seeking to identify the organization behind this activity, our research found that People's Liberation Army (PLA's) Unit 61398 is similar to APT1 in its mission, capabilities, and resources. PLA Unit 61398 is also located in precisely the same area from which APT1 activity appears to originate." (Mandiant, 2013)

Additionally, controversy rages within the UK and the western world about the possibility that non-national companies supporting the IS infrastructure are spying and stealing IPR. Huawei, one of the largest telecommunications companies in the world, provides IS services to a number of western governments and companies but has been accused of spying on and stealing IPR from its customers. (Osawa, 2013)
The Internet has several similar stories concerning suspected attacks and exploits by APTs. However, rather than relying on the services of specialist programmers and knowledge combined with trial runs in test environments, this case study is based on the hackers having only generalised technical knowledge and an ability to utilise the information contained within the Deep Web.

A rogue nation state, terrorists or hacktivists plot to disrupt the critical national infrastructure of a western nation using Deep Web, OSINT and hacking techniques. Through the use of foot-printing techniques the culprits identify a number of lightly protected national assets, as well as the logon details of multiple social media user accounts. As described previously, this information is readily available on the Deep Web and can be easily extracted and aggregated.

At a specified point in time the hackers initiate phase 1 of the attack, whereby they attack and disable the networks of several public and corporate services. An example of this could be the disruption of the management and logistics software for high street supermarket chains. Without the ability to plan and execute deliveries from the central warehouses to the retail outlets, the shops would begin to run low on goods. In addition, the hackers also initiate Distributed Denial of Service attacks on better protected national assets (e.g. the police, government websites and the military), which would hamper their ability to communicate effectively with each other and the public.

At the same time phase 2 is initiated, whereby a number of false rumours are circulated using the compromised social media accounts. Examples of the type of rumour could be "Low crop yields cause supermarkets to run short of food" or "Government hiding extent of food shortage to avoid panicking public".

The rumours are picked up and forwarded by other social media users and rapidly spread in a "viral" fashion, which causes panic buying at the supermarkets, compounding the problem of supply. A real-life example of this scenario is the panic buying of fuel during the threat of a tanker driver strike in March 2012. (BBC, 2012) The government are forced to use public resources (police, PCSOs etc.) to guard resources and maintain order, and may consider suspending these internet services, which would further inflame the situation as it would give credence to rumours of government complicity in the crisis.

In this scenario it is easy to see how a country could quickly become paralysed through the use of minimal technical resources and knowledge. Such actions could be used by hostile nation states to divert attention from strategic military actions (e.g. the invasion of another country), or by terrorists in order to blackmail governments into releasing prisoners or using their influence at an international level (e.g. the United Nations) to the benefit of the perpetrators.
Section 4 – Survey Design, Data Collection and Analysis

In order to determine the level of awareness of Deep Web and OSINT-based attacks, and how effective the current IS Security controls framework is in protecting organisations from them, a survey was used to poll a selection of IS Security professionals. The term "Controls Framework" is used in this paper to describe the logical, physical and managerial IS Security controls of an organisation.

4.1 Selection and justification of the data collection tool

There are a number of data collection tools that could have been used to collect the primary data used in this paper. Amongst the most appropriate are:

1. Structured Person-to-Person Interviews
2. Telephone Interviews
3. Postal Questionnaire
4. Electronic Survey

Description of Data Collection Tools

Person-to-Person Structured Interviews

Structured interviews are the most flexible, and perhaps the most prestigious, of all research techniques. The respondent is contacted in advance to confirm their participation and a date arranged. On this date the researcher meets with the respondent and uses a pre-defined questionnaire to interview them whilst their answers are recorded (either transcribed or by using audio-visual equipment). The same questionnaire is utilised for all of the respondents so that a direct comparison of responses can be made and ambiguities kept to a minimum. Interviews are an excellent form of primary research but can be expensive to undertake and, due to the time limitations involved, it can be difficult to collate a statistically relevant sample of respondents. Also, if the sample of respondents is not sufficiently large, the views of one respondent may "skew" the results of the group towards erroneous conclusions.
Telephone Interviews

A telephone questionnaire is similar to person-to-person interviews in that it follows a structured format that is determined in advance. The respondent is again contacted in advance, usually by letter or email, which is then followed up with a telephone call to confirm their participation and to arrange a suitable time. The respondent is then telephoned at the agreed time and the interview is conducted. Telephone interviews have an advantage over postal questionnaires in that the data is collected directly from the respondent, who can be asked to clarify ambiguous points. However, this technique can be perceived to carry a certain stigma from overuse by telemarketing companies and may not necessarily communicate the gravitas required of the research.

Postal Questionnaire

A postal questionnaire is an effective way of obtaining qualitative primary research. It follows a number of stages from conception to collating the data, and includes piloting the questionnaire (to eliminate any possible mistakes such as vague or misleading questions) and tabulating the results. For the option of using a postal questionnaire to be considered it would have to incorporate a large proportion of respondents, which is time consuming and costly. In addition, the response rate cannot be guaranteed and this method does not have the flexibility afforded by electronic methods.

Electronic Survey

According to the book "Mail and Internet Surveys" by Don A. Dillman, the best method is an electronic survey, as it utilises the power of the Internet and computers to disseminate and tailor the questionnaire. (Dillman, 2000) In addition, it adds a gravitas to the study as all of the targeted respondents are technology professionals.

In order to objectively ascertain the best method the options were assessed against a set of criteria. These criteria are:
1. Cost: Which of the options offers the best chance of providing the most quality responses, in the amount and scope of information required, for the financial outlay?
2. Depth: Which option can cover all of the areas identified in the depth required?
3. Flexibility: Which option is best at covering unforeseen circumstances such as unpredicted or unexpected answers?
4. Time: Which option will guarantee the return of the information within the time constraints to enable the author to make use of it? Also, what amount of time will a respondent need to answer the survey? Will an electronic form be easier to complete than a paper one?

It was decided to use an objective technique such as the Ishikawa Quality Function Deployment (QFD) chart (Inwood & Hammond, 1993), adjusted to suit the purpose of determining the best method to use. The decision matrix uses a similar scoring system to the QFD chart and is demonstrated below:

Key
Strong correlation of criteria against method = ++ (+2)
Correlation between criteria and method = + (+1)
No correlation between criteria and method = 0 (0)
Negative correlation between criteria and method = - (-1)
Strong negative correlation between criteria and method = -- (-2)

              Person to person   Telephone    Postal          Electronic
              interviews         Interviews   Questionnaire   Survey
Cost          -                  +            +               ++
Depth         ++                 +            -               +
Flexibility   +                  +            --              ++
Time          ++                 ++           +               ++
Total         4                  5            -1              7

Table 2: QFD chart for selection of data collection tool

From the chart it was determined that an electronic survey was the most suitable research method to use, and primary research was collected using this method.

4.2 Survey Design

Once the research method had been agreed upon, the next step was to design the questionnaire and select the best target group. It was decided that the survey should be directed at IS Security practitioners from within the IS Security community. The rationale for this was that the perception of the Deep Web and OSINT is still very low, certainly within the general populace, and so sending an electronic survey to a general audience may not generate a suitably informed response. It must be emphasised that, whilst this was not intended to be a representative sample, the results give a clear indication of the opinion and depth of feeling within the chosen sector. Considerations for the type of questions to be asked included:

• The size and type of organisation that the respondent works for;
• Awareness of the Deep Web and OSINT within the organisation;
• Activities that would make the organisation a target for attack;
• Awareness of the availability of publicly held information about the organisation;
• Attempts to analyse publicly held information;
• Current managerial, physical and logical IS Security controls;
• Additional organisational security controls.

It was also considered important that the respondents have an opportunity to voice their opinions and to add any comments as to what they consider important controls for controlling and deterring Deep Web and OSINT-based attacks. With these questions in mind the questionnaire was divided into six subject areas:

1. Introduction
2. The Organisation
3. Deep Web and Open Source Intelligence
4. Organisational Information Security Controls
5. Additional Organisational Controls
6. Concluding remarks

A sample of the questionnaire can be seen in Appendix 3. Questions about the country in which the respondent is located were omitted from the final survey, as it was considered that Deep Web and OSINT-based attacks are not geography specific. Once the final draft of the survey had been approved, the URL was finalised and included in a blanket email to the selected target audience.
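The scoring used in the Table 2 decision matrix in section 4.1 is simple enough to recompute. The sketch below maps the chart's symbols to the numeric weights implied by the Total row (the Time row symbols are transcribed on the assumption they match the published totals of 4, 5, -1 and 7).

```python
# Symbol weights implied by the QFD-style key: ++ = 2 down to -- = -2.
WEIGHT = {"++": 2, "+": 1, "0": 0, "-": -1, "--": -2}

matrix = {  # criterion -> score per method, in Table 2 column order
    "Cost":        ["-",  "+",  "+",  "++"],
    "Depth":       ["++", "+",  "-",  "+"],
    "Flexibility": ["+",  "+",  "--", "++"],
    "Time":        ["++", "++", "+",  "++"],
}
methods = ["Person to person", "Telephone", "Postal", "Electronic"]

totals = {m: sum(WEIGHT[row[i]] for row in matrix.values())
          for i, m in enumerate(methods)}
print(totals)
# {'Person to person': 4, 'Telephone': 5, 'Postal': -1, 'Electronic': 7}
```

The electronic survey's total of 7 is the margin on which the method was chosen; making the arithmetic explicit also makes the decision auditable if the criteria or weights are later revised.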
4.3 Use of the ISO 27001 standard

In order to provide a measure of definition within the survey, the respondents were asked to compare their organisation's security policy to a common IS Security standard. For this purpose the standard selected was ISO 27001, published by the ISO/IEC (International Organization for Standardization / International Electrotechnical Commission). ISO 27001, or "ISO/IEC 27001:2005 – Information Technology – Security techniques – Information Security Management Systems – Requirements" to give it its full title, is a widely published and accepted standard that, according to the ISO/IEC,

"..specifies the requirements for establishing, implementing, operating, monitoring, reviewing, maintaining and improving a documented Information Security Management System within the context of the organization's overall business risks. It specifies requirements for the implementation of security controls customized to the needs of individual organizations or parts thereof." (ISO/IEC, 2012)

Organisations are able to certify themselves against the ISO 27001 standard as an external validation of the measure of IS Security they have in place. The question of whether any of the respondents had attained the standard was not included within the survey, and none of the respondents within the survey confirmed that they had attained certification. The control objectives included within the standard are:

• Security Policy
• Organisation of Information Security
• Asset Management
• Human Resources Security
• Physical and Environmental Security
• Communications and Operations Management
• Access Control
• Information Systems Acquisition, Development and Maintenance
• Information Security Incident Management
• Business Continuity Management
• Compliance
Properly implemented, the controls detailed within the standard should give a large measure of protection against IS Security threats. The standard is under review at present and in 2013 the ISO issued a final draft detailing a number of new controls which address advances in technology and systems management (e.g. the greater use of external service providers). However, it is understood that none have been included that specifically address the threat of data exfiltration or OSINT-based attacks.
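Since this dissertation proposes an audit programme to measure exposure, a self-assessment against the control objectives listed above can be sketched as a simple scored checklist. The scoring scale and the sample statuses below are the author's illustrative assumptions, not part of the standard itself.

```python
CONTROL_OBJECTIVES = [
    "Security Policy", "Organisation of Information Security",
    "Asset Management", "Human Resources Security",
    "Physical and Environmental Security",
    "Communications and Operations Management", "Access Control",
    "Information Systems Acquisition, Development and Maintenance",
    "Information Security Incident Management",
    "Business Continuity Management", "Compliance",
]
STATUS_SCORE = {"implemented": 2, "partial": 1, "absent": 0}

def coverage(assessment: dict) -> float:
    """Percentage of the maximum attainable score across all eleven
    objectives; any objective not assessed is treated as absent."""
    got = sum(STATUS_SCORE[assessment.get(c, "absent")] for c in CONTROL_OBJECTIVES)
    return round(100 * got / (2 * len(CONTROL_OBJECTIVES)), 1)

# Hypothetical self-assessment for a small organisation:
sample = {"Security Policy": "implemented", "Access Control": "implemented",
          "Human Resources Security": "partial"}
print(coverage(sample), "%")  # 22.7 %
```

A low coverage figure does not prove vulnerability to OSINT-based attacks specifically, but it flags the gaps an attacker's reconnaissance would most readily exploit.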
4.4 Analysis of survey results

Over 400 IS Security professionals were emailed with an explanatory note of the reason for the research and the URL to the questionnaire. There were 74 respondents in total, although some chose to skip certain questions, either for reasons of security or relevance to their particular sector. The results were analysed and have been presented in the following paragraphs. The sanitised raw data has been included in the appendices.

Question 1: How many full-time employees currently work for your organisation?

The largest group of respondents (41%) was the 5000+ bracket, followed by the 1 to 100 bracket (21%). This is in line with the supposition that the majority of respondents will be employed in large organisations, or are security consultants on contract to large organisations, and typically have the requirement and resources to manage IT Security within their operation.

Question 2: What is the primary industry sector that your organisation operates in?

The largest group of respondents are within Information Technology (27%), followed by Financial Services (23%). Analysis of the third group (Other) indicated that these were primarily self-employed IS Security consultants. It is suggested that the rationale for so many responses from the Financial Services sector is that they have a greater requirement for compliance and external validation than the majority of the other sectors. Additionally, there is a perception that these are a greater target for hackers since there are greater rewards (e.g. financial or in terms of data value) if the control framework can be compromised.

Question 3: At what level would you currently rate awareness of the Deep Web and OSINT within your organisation?

Not surprisingly, the majority of the respondents rated awareness of the Deep Web and OSINT-based attacks within their organisations as Low or Negligible (39% and 28% respectively). This may be because of the low media coverage of Deep Web and OSINT-based attacks, in relation to other attacks such as Technical Intrusion or Social Media Impersonation.

Question 4: Is your organisation engaged in technological or controversial activities that would make it a high profile target of criminals or hacktivists?

The results for this question were fairly evenly split, with approximately half of the respondents' organisations considering that they are involved in activities that would make them a potential target for crime or hacktivism. Given that almost half of the respondents (47%) answered Yes to this question, in conjunction with the result of the previous question (i.e. low awareness), there is a high probability that Deep Web and OSINT-based attacks could proliferate in these "perfect storm" conditions.

Question 5: How would you currently rate the threat of Deep Web attacks (i.e. the aggregated data analysis of the publicly available information) to your organisation?

The largest group of respondents to this question rated the threat as medium, with low being the second highest group. This can be attributed to a number of factors:

1. Those organisations that responded have assessed the threat of Deep Web and OSINT attacks and implemented appropriate controls to mitigate it;
2. The threat has not been properly assessed;
3. The belief that their organisation benefits from the "Security through Obscurity" principle, namely that there are several organisations out there and that they are not high profile enough to warrant attacking.

This last point is particularly erroneous due to the nature of Deep Web and OSINT-based threats. If a criminal identifies the organisation as a potential target through Deep Web harvesting they will attack, regardless of how many other organisations exist.

Question 6: Has your organisation ever carried out any analysis of the publicly available information on your organisation?
  • 49. 49 Over half the respondents have carried out an analysis of the publicly available information for their organisation. This is only to be expected in the media-centric world in which businesses currently operate, where poor Public Relations (PR) can adversely affect share price. Question 7: Are you aware of any targeted attacks (e.g. phishing emails or similar) on high profile individuals in your organisation, or on your organisation as a whole? Over half the respondents could confirm that there had been a targeted attack on their organisation or on high profile individuals within it. This was an expected result and can possibly be attributed due to the proliferation of material available on the internet, which allows cyber criminals to inform their attacks. Alternatively it could be because the respondents to the survey are the security officers in large organisations that are more likely to be the target of criminals or hacktivists due to their profile or activities. Question 8: Has your organisation implemented an Information Systems Security policy? Reassuringly the majority of respondents confirmed that their organisation had implemented an IS Security policy. In the modern corporate business culture that requires a high level of regulation and compliance, this is an expected control. Question 9: Has it been designed to incorporate good practice over its logical, physical and managerial controls? (For this purpose as defined by ISO 27001) Again, the majority of respondents confirmed that they had utilised a standard, typically the ISO 27001 standard, when designing the organisation's IS Security policy. This is because it is well established and is seen as a de-facto measure of IS Security. Question 10: Has the policy been approved and supported by senior management? Again the majority of respondents confirmed that the IS Security policy had been approved by the organisation's Senior Management Team. It is suggested that this
  • 50. 50 is because the policy is a corporate-level document and requires the approval and support of the organisation's executive in order to be effective. Question 11: Has the policy been distributed to all staff? 78% of the respondents confirmed that the policy had been distributed to all staff. Again this in keeping with the answers from the previous question in that the policy is an organisation-wide document. The circulation of this policy is important in the promotion of an IS Security aware culture within the organisation and, whilst not explicit in the prevention of Deep Web and OSINT-based attacks, is an important aspect in the security framework for preventing the same. Question 12: Has the policy been confirmed as read and understood by all staff (either through return of a signed acceptance sheet or via electronic confirmation)? Confirmation that the IS Security policy has been received and understood is an important aspect of its promotion. 53% of the respondents confirmed that the policy had been confirmed as read and understood with a further 23% confirming that it had been partially received. More worryingly 20% of the respondents said that it had not been, or could not be, confirmed, which suggests that 1 in 5 of all users have not read or do not understand the dictates and restrictions of the policy. Education and confirmation of employees understanding about their Information Security responsibilities is an important aspect of the IS Security framework and imperative to preventing Deep Web and OSINT-based attacks. Question 13: Is the IS Security policy enforced by logical controls? For example; processes that restricts access to data to authorised personnel, database encryption, email monitoring or Data Centric security applications such as Boole server The majority of respondents (68%) confirmed that the IS Security policy is enforced through logical controls (e.g. 
processes that restrict access to data to authorised personnel, database encryption, email monitoring or Data Centric security applications such as Boole server), which is essential in the provision of a mature IS Security framework. 30% of the respondents confirmed that the policy was not
  • 51. 51 or only partially enforced by logical controls, which suggests a disconnect between the managerial and operational controls. In such organisations it is easier for staff to circumvent control procedures and manipulate processes and data to their own ends.
  • 52. 52 Question 14: Is staff compliance with the policy regularly tested through measures such as user access and internal network penetration testing, reviews of audit and access logs? Again, over half the respondents (52%) confirmed that staff compliance with the IS Security policy is regularly tested. More disturbingly, 40% confirmed that they did not test, or only partially tested, compliance with the policy. Compliance with the defined controls is crucial in all aspects of IS Security in order to prevent cyber attacks, OSINT-based or otherwise. Question 15: How often do staff receive training on IS Security? From the graph it is possible to see that, by a large majority, staff in the respondents' organisations receive annual training in IS Security (38%). The next largest group is that which receives training on induction (27%), although it was not possible to determine from the results whether this includes staff that receive training at induction and then none afterwards. A large percentage (18%) receive no training, which would leave the organisations involved susceptible to all manner of IS attacks, and particularly those of a subtle nature such as Deep Web or OSINT-based attacks, as they may not be perceived as threatening. [Chart: IS Security Training frequency: Monthly, Quarterly, 6 Months, Annually, On Induction, No Training]
  • 53. 53 Question 16: How often are IS Security messages published within the organisation? The majority of respondents confirmed that security messages are published in response to a security event (43%), with the second largest group of respondents having an on-going security awareness initiative of some description (27%). One of the most powerful controls within the IS Security armoury is the fostering and maintaining of an IS Security culture. This is often confused with paranoia about security events, which can obstruct the operation of the business, but a true IS Security awareness programme is embedded within the culture of the organisation, reinforced through regular training programmes, and enhances and protects the business operation. [Chart: frequency of Security Messages: Continually, Monthly, Quarterly, Every 6 Months, As Required, Never] Question 17: Has the organisation implemented an Incident Management policy and procedure to minimise the impact of security incidents? The majority of respondents confirmed that they had an Incident Management policy in place, with one quarter of the respondents only partially implementing a policy and a small percentage not having implemented a policy at all. This is a fundamental control in regard to business resilience and is crucial in recovering from any type of incident. Any organisation that does not have a clear Incident Management policy, or has an incomplete or untested one, will then have
  • 54. 54 the impact and effects of a cyber attack compounded and take longer and cost more to recover from. Question 18: Has the policy been invoked within the last 12 months? The majority of respondents (41%) confirmed that their Incident Management policy had been invoked within the last 12 months. A further 13% confirmed that there had been a partial invocation in the same period. This demonstrates the necessity of having properly developed and implemented Incident Management procedures, as over half of the respondents had need of their preparations. Question 19: Was the reason for invoking the policy identified and controls implemented to prevent re-occurrence? The greatest number of respondents to this question chose not to answer (43%), possibly because they did not wish to reveal the nature of the incident. The second largest group confirmed that they had identified and resolved the issue (37%). A documented, approved and implemented Incident Management policy requires effective post-event review to ensure that any issues are effectively mitigated to prevent re-occurrence.
  • 55. 55 Question 20: What was the impact of the incident on the organisation? From the chart it is possible to see that the majority of events were classified as minor (32%), with negligible as the second highest group (17%). It is impossible to say whether the implementation of an Incident Management policy was the deciding factor in preventing these incidents from having more severe outcomes. [Chart: Event Severity: Negligible, Minor, Moderate, Significant, Not Applicable] Question 21: Has your organisation implemented a Social Media policy that contains clear and concise guidance on what, how and where information about the organisation should be published? Approximately one half of the respondents confirmed that their organisations had implemented a Social Media policy describing how information about the organisation should be published. In the modern business age all organisations need a comprehensive approach to how information about themselves is published and disseminated. With the growth in available media channels, both those over which an organisation can exercise control (e.g. company website, press releases, corporate social media sites, blogs) and those where it has less control (e.g. staff personal social media pages, consumer forums) need to be managed and guided to ensure that the overall internet presence is positive.
  • 56. 56 Those without a Social Media policy, or with only a partial one (approx. 52% of respondents), risk having possibly sensitive information disseminated to the Internet which could be used to formulate a Deep Web type of attack. Question 22: Has your organisation implemented an Email policy? Again, the overwhelming majority of respondents confirmed that there is an implemented policy that specifies the use of the facility within the organisation. This is an important control in protecting against Deep Web and OSINT-based attacks because the information sent in emails can be harvested and used to attack the organisation. Question 23: Have your staff been trained on good email security practice such as not opening suspicious emails or emails from unsolicited sources? Two thirds of the respondents confirmed that staff in their organisations have been trained on good security practices. However, it is possible to compare this with the results from question 15 and consider whether, once the staff had been trained, there was any follow-up training or a continuation training programme. Clear and concise policies, appropriate technical controls and staff training are the three most important elements in preventing Deep Web or OSINT-based attacks. Question 24: Has your organisation implemented gateway controls to quarantine suspicious emails and alert the intended recipient for release? The majority of respondents have implemented this control, which is important in ensuring that suspicious emails and attachments are not imported onto the organisation's network. Those organisations that did not have this control may have implemented other compensatory controls, such as limiting incoming attachment types (e.g. JavaScript). Question 25: Does your organisation have a documented and published disciplinary process for infringements of IS Security controls or policies?
Again, the majority of organisations responded that this control is in place, but perhaps a more relevant question would have been "how many times have
  • 57. 57 staff been disciplined for security breaches?" Whilst it is important to have clear, concise policies, it is also important that those policies are enforced with appropriate sanctions against the perpetrator, and for those sanctions to be clear to all staff. Having this control in place will help ensure that staff do not inadvertently publish information, as the threat of sanction will reinforce the mind-set. Question 26: Do you regularly update your malware software in order to ensure that you have the greatest possibility of detecting and stopping malicious code or Trojan programmes designed to circumvent your security? Unsurprisingly, the majority of responses to this question were positive (91%), with a small minority answering no or partially. The minority answers may be due to confusion over the wording of the question, as most software products have the capability of automatically updating with definition files. Alternatively, in systems requiring high availability, the infrastructure may be operated in a protected "shell" restricting all updates (malicious or otherwise) to ensure continuity. Updates are tested first on a mirrored system and then implemented at low demand times, or in a phased process, to ensure continuity of service. This control is important because, as we have seen in case study 1, the network was infected by the Trojan and root kit during the "lag" time between the virus being released and the patch being applied. Question 27: Have you implemented any of the following Data Leakage Prevention (DLP) controls within your organisation? (Please tick all that apply) Of the 45 respondents, three quarters confirmed that they have implemented additional DLP controls within their organisation. These were evenly split across hardware, software and managerial controls, with two respondents adding comments: one had implemented an audit control over flash memory devices issued to its staff, and the other had blocked access to web mail and file sharing sites.
An area for further research might include probing what manner of additional controls have been implemented and how effective they are.
  • 58. 58 Question 28: Do you regularly measure and report to the Senior Management Team or Executive on the effectiveness of IS controls? 43% of the respondents regularly report on the effectiveness of the IS controls to the organisation's executive body but, more disturbingly, 57% do not. In order for any control framework to be effective it must have (and be seen to have) senior management support. It is possible that this figure is actually higher, with a greater percentage of the respondents incorporating IS Security metrics within an overall management report rather than a specific security report that is delivered regularly. Question 29: In your opinion, what control/s would be most effective in securing your organisation against a Deep Web or OSINT type attack? Of the 75 respondents to the survey, 23 offered additional comments regarding Deep Web attacks. Of the 23 respondents, 8 (34%) suggested that additional training is required to raise awareness of the issue among the employees and senior management of the organisation, and a similar number suggested the implementation of greater technical controls (firewall technologies, SIEM systems, etc.) for limiting data leakage. Other suggestions included the regular scanning of Deep Web sites and other forums (e.g. pastebin) for compromised credentials or other confidential information, regular penetration testing, and better techniques for eliciting senior management understanding and support.
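The suggestion of scanning paste sites for compromised credentials can be approximated with a short script. The sketch below is illustrative rather than prescriptive: it assumes the text dumps have already been collected into a local directory (automated scraping is subject to each site's terms of use), and `example.com` is a placeholder for the organisation's real mail domain.

```python
import re
from pathlib import Path

# Pattern for email addresses in the organisation's domain.
# "example.com" is a placeholder -- substitute the corporate domain.
CORP_EMAIL = re.compile(r"[A-Za-z0-9._%+-]+@example\.com", re.IGNORECASE)

def scan_dumps(dump_dir):
    """Return a mapping of dump file name -> corporate addresses found in it."""
    hits = {}
    for path in Path(dump_dir).glob("*.txt"):
        text = path.read_text(errors="ignore")
        found = sorted(set(CORP_EMAIL.findall(text)))
        if found:
            hits[path.name] = found
    return hits
```

Any hits would then feed the incident management process described earlier (credential resets, user notification and a review of how the data leaked).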
  • 59. 59 Section 5 – Conclusions and Recommendations Having described Deep Web and OSINT-based attacks and the types of tools and attacks that can be launched against organisations, illustrated them with case studies, and collected and analysed the results of the survey, it is now possible to draw conclusions on how effective the controls framework is and whether additional controls are necessary. 5.1 Conclusions From the survey results it is possible to draw the following conclusions: 1. All of the organisations have implemented an IS Security controls framework which incorporates the logical, physical and managerial controls that underpin the IS Security policy. Additionally, most organisations have implemented supplementary controls such as Social Media policies, email policies, DLP software, email monitoring or restriction of USB ports. This is in response to the realisation that threats to data security extend wider than the organisation's internal network. However, OSINT-based attacks target resources that have already been "leaked" outside of the organisation and, whilst the controls framework will assist in protecting the organisation from a direct attack, it does not directly address this risk. 2. Whilst over half the respondents have carried out an analysis of the publicly available information for their organisation, a large proportion have not. This was expected, as the proliferation of media streams continues and more organisations struggle to control how information about themselves is published. There is also greater appreciation of the risk of loss of reputation that can adversely affect share price. 3. Awareness of the threat from Deep Web and OSINT-based attacks was significant, but awareness was greatest in those organisations that are either highly regulated (e.g. financial services) or more technically informed (e.g. Information Technology).
As the survey was directed at IS Security personnel the majority of respondents were aware of the risks but that appreciation was not reflected in the senior management. In addition
  • 60. 60 almost half of the respondents confirmed that their organisations were involved in activities that would make them a potential target for crime or hacktivism. These two factors make for perfect conditions in fostering an OSINT-based attack. 4. Most organisations undertake some form of regular training of their staff in IS Security matters, but this may be in regard to required compliance with applicable legislation rather than a realisation of the need to continually update staff on the threat to IS Security. By far the most common training programme is conducted annually (38%) or on induction into the organisation (27%). A large percentage (18%) receive no training, which would leave the organisations involved susceptible to all manner of IS attacks, and particularly those of a subtle nature such as Deep Web or OSINT-based attacks, as they may not be perceived as a threat. 5. As expected, most organisations had either fully or partially developed and implemented an Incident Management policy. There was also a minority that confirmed they had no such policy in place. Most respondents also confirmed that they had experienced a security incident within the last 12 months. This is a fundamental control in regard to business resilience and is crucial in the successful recovery from any type of security or business continuity incident. Any organisation that does not have a clear Incident Management policy, or has an incomplete or untested one, will then have the impact and effects of a Deep Web type of attack compounded, and it will take longer and cost more to recover from.
  • 61. 61 5.2 Recommendations As mentioned previously, a number of the respondents confirmed that they had implemented additional controls to address specific or perceived risks. However, in order to be effective against Deep Web and OSINT-based attacks, organisations should consider undertaking additional control measures that specifically address these attacks, such as: a) Review the organisation's online presence and the logical, physical and managerial controls that support the IS Security policy and protect against accidental and deliberate disclosure of information. Most organisations will have implemented controls to prevent data loss or disclosure in a reactive manner, in order to address perceived risks or in response to a security incident. It is suggested that, in order to develop this framework to prevent Deep Web and OSINT-based attacks, the organisation should audit the information available and the controls in place so that it can evaluate its current exposure to Deep Web and OSINT-based attacks, identify any specific risks and suggest controls to mitigate them. As part of this dissertation, an audit programme for evaluating and organising controls to prevent OSINT-type attacks on an organisation has been included in Appendix 1. b) Instigate a corporate awareness and IS Security training campaign for all users on the dangers of sharing too much information on the Internet. Some examples of good practice are the UK Ministry of Defence's (MOD) "Think before you Tweet" campaign (Defence Management, 2011) or the Belgian Internet Security group "Safe Internet Banking" campaigns "Amazing mind reader reveals his 'gift'" and "New Best Friend" (Belgian Internet Security Group, 2013). Excellent training materials are available from the Get Safe Online website (Get Safe Online, 2013) that elucidate the threat from revealing too much information online and offer guidance on how to prevent it.
  • 62. 62 c) Review, develop and test the organisation's Incident Management Policy to ensure that should an incident occur it can be investigated, negated and resolved as quickly as possible with the minimum of disruption to the business operation.
  • 63. 63 5.3 Application of Recommendations to Case Studies In the case studies it is possible to see that the use of the audit program would have identified a number of areas where additional controls should have been implemented to prevent the loss of the data. Case Study 1 In case study 1, several of the management team opened an email from an unknown source and activated hyperlinks within it, which allowed the Trojan to be downloaded and infiltrate the network. As part of the IS Security controls, the IS Manager should have implemented a quarantine area for suspicious emails that would allow each manager to verify the validity of an email before opening it. In addition, the organisation should have incorporated an IS Communications policy which detailed how all aspects of external communication should be securely handled. The policy should also detail that users should not download attachments from unsolicited emails or click on hyperlinks contained therein. Good practice dictates that the user should either: 1) Search for the same offer on the sender's website using a proprietary search engine. A variation of this is to copy the text of the URL and paste it as the search string in the search engine website; this may identify whether it is a scam that has been attempted previously; or 2) Hover the cursor over the link to identify the URL that is associated with it and confirm whether it is associated with the sender of the email. In both cases, if there is any doubt as to the validity of the link, it should not be activated. The organisation should also undertake a review of its publicly available information to understand how the attack was crafted and how it is exposed to similar attacks. This attack was possible through the use of information that was publicly available on the Internet.
Whilst it might not be possible to remove archive data, the results of any review can be used to inform the corporate risk register so that any future emails can be treated appropriately.
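The second of the link-verification checks described above (hovering over a link to compare its destination with the sender) can also be sketched programmatically, for example as part of a mail gateway filter. This is a minimal illustration under stated assumptions: the function and its name are the author's own, and the naive suffix match does not handle multi-part public suffixes (e.g. .co.uk) or redirect chains, both of which attackers exploit in practice.

```python
from urllib.parse import urlparse

def link_matches_sender(link_url, sender_email):
    """Check whether a hyperlink's host falls under the sender's mail domain.

    A mismatch suggests the link is suspicious and, per the good practice
    above, should not be activated; a match is not proof of legitimacy.
    """
    sender_domain = sender_email.rsplit("@", 1)[-1].lower()
    link_host = (urlparse(link_url).hostname or "").lower()
    # Accept an exact match or a genuine subdomain of the sender's domain.
    return link_host == sender_domain or link_host.endswith("." + sender_domain)
```

For example, a link to https://mail.example.com would pass for a sender at example.com, whereas the classic look-alike host example.com.evil.net would not.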
  • 64. 64 The possibility of utilising different anti-malware products on the Internet and mail gateways and on the network should also be investigated in order to increase the chances of detecting "zero-day" malware. Software vendors have differing priorities and response times to threats. By diversifying the products used to protect the network, the coverage is increased and the threat reduced. Inevitably, the costs of software licences and maintenance increase, but this needs to be considered in conjunction with the cost of another security event. It is possible to evaluate this through the use of a "Return on Security Investment" (ROSI) equation such as that demonstrated in figure 7. This example is taken from the European Network and Information Security Agency website (ENISA, 2012). Figure 7: Return on Security Investment equation, ROSI = (Monetary loss reduction - Cost of solution) / Cost of solution. With ROSI equations, the higher the ROSI figure, the more value there is in undertaking the investment. For example, if the cost of a solution is £50,000 and the monetary loss reduction is £200,000 then the ROSI index is 3, namely it is cost effective to implement the software. However, if the monetary loss reduction is only £75,000 and the solution costs £50,000, the ROSI figure is only 0.5, so there is little value in implementing the solution. Equations such as this are a useful guide in justifying expenditure on security measures but can be subjective (it is difficult to place a value on an intangible asset such as corporate reputation). Additionally, other factors need to be taken into consideration, such as technical coordination with the existing architecture and synergy with the organisation's culture.
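The ROSI arithmetic above can be captured in a small helper. The sketch below reproduces the two worked examples from the text; the formula follows the ENISA form of monetary loss reduction less the cost of the solution, divided by the cost of the solution.

```python
def rosi(monetary_loss_reduction, solution_cost):
    """Return on Security Investment index:
    (monetary loss reduction - cost of solution) / cost of solution."""
    return (monetary_loss_reduction - solution_cost) / solution_cost

# Worked examples from the text:
print(rosi(200_000, 50_000))  # 3.0 -> worthwhile investment
print(rosi(75_000, 50_000))   # 0.5 -> little value in the solution
```

As the text notes, the output is only a guide: the monetary loss reduction figure is an estimate and intangible assets such as reputation are hard to price.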
  • 65. 65 Case Study 2 In the second case study we see how Deep Web and OSINT-based attacks can be used to inform attacks that interfere with vital public services and cause major disruption to a nation's infrastructure. Applying the audit program here is not as straightforward as in the previous example because the threat is spread across several organisations; however, the principles remain the same. Essentially, each organisation should follow the steps of the audit programme, namely: review the information available about the organisation, review and validate the controls designed to prevent the unauthorised disclosure of information, and promote good practice through training and awareness. What is required in this example is an overarching organisation or government body to enforce the implementation of the programme to ensure a more orchestrated response to Deep Web and OSINT-based threats. The UK government has recognised the growing importance of a cohesive IS Security response and, according to the UK Cyber Security Strategy 2011, is investing £650 million in cyber security over the next four years (Cabinet Office, 2011). Whilst the threat of Deep Web or OSINT-based attacks is not specifically mentioned, there is a call for a more joined-up response from all parties (government, business and individuals).
  • 66. 66 5.4 Summation Online information is a fact of life. The commoditisation of IS services such as SaaS, cloud storage and 2nd and 3rd party service providers means the de-perimeterisation of business services. Combined with the development of multiple media streams, the digital footprint of organisations is getting bigger. In this situation the effectiveness of technical controls in stopping Deep Web and OSINT-based attacks is limited, as the attack is based on publicly available information rather than being a direct attack on the network. There are two facets to an organisation's digital footprint, the good side and the bad. Unstructured data leakage is not controlled and is bad for business, as it can foster a situation which encourages OSINT-based attacks. However, a mature digital footprint is one where information is spread via controlled release, and it can be very beneficial to an organisation. Successful organisations will thrive if they can effectively manage the information available about them, and this is attained through appropriate logical, physical and managerial controls combined with a robust review process and comprehensive staff training that promotes good security practice. Through the use of the audit program it is possible to assess how an organisation is promoted and perceived on the internet, how the information arrives in the public domain and how it can be restricted to dissuade Deep Web and OSINT-based attacks.