Security in Distributed Databases
A DIRECTED STUDY PROJECT SUBMITTED TO THE FACULTY
OF THE GRADUATE SCHOOL OF COMPUTER INFORMATION SYSTEMS IN
PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF
MASTER OF SCIENCE IN INFORMATION SYSTEMS
TO DR. EDWIN R. OTTO
CIS590 ASYNCHRONOUS ONLINE
BY
IAN N. LEE
NOVEMBER 2002
ABSTRACT
Practically all large computer database (DB) systems are distributed. Large business
systems must support various users running common applications across different
locations. A distributed database management system (DDBMS) encompasses these
applications, their underlying support software, the hardware they are run on, and the
communication links that connect the nodes. The largest and best-known DDBMS is the
set of computers, software, and services comprising the Internet, which is so pervasive
that it connects with almost every existing DDBMS today.
Security embraces many aspects of a DDBMS, including hardware design, operating
systems, networks, compilers, user applications programs, and the users themselves.
Vulnerabilities of computer systems range from the possibility of a trusted employee’s
selling (or being forced to reveal) secrets to a competitor, disk failures that produce
system crashes, unauthorized operating system penetration, inferences about confidential
data through carefully chosen queries posed to a DB, loss of data due to natural causes,
acquisition of data through wiretapping, denying access to the DBs by flooding the
system, or harming the DBs with malicious software.
Many textbooks and periodicals were used, and many subjects, including DB security
experts and former hackers, were interviewed for this research. Many security theories
and products for DDBMSs were critiqued. Also, trends of computer crimes and statistics
were analyzed. The conclusion is somewhat surprising. DDBMS security is not
improving. In fact, it is getting more complex and problematic.
CHAPTER ONE
INTRODUCTION
Context of the Problem
Security is a goal in every organization. In today's computer-driven world, every
business has some sensitive and crucial data and processes that need to be secured. For
some organizations, it is imperative that they have a "breach-free" DB. Security is an
increasingly pressing issue due to the rise of internal and external hackers, terrorists, and thieves who
are costing businesses billions of dollars each year. Many organizations, such as banks
and hospitals, are now less concerned about whether cutting-edge technology actually works
than about the security of their infrastructures. While the rise of DB computer
technology used in the workplace has been exponential, business security methods and
products have been minimal until recent years. More and more people are recognizing
that their DBs are extremely valuable but vulnerable resources, and need to apply the
appropriate safety measures. As with the protection of airports due to the recent terrorist
events, the computer DB security industry is just beginning to catch up with its need for
better protection of hardware, software, data, and networks.
Ensuring the security of data in large DBs can be a difficult and expensive task. The
level of security implementation should depend upon how valuable or sensitive the data
is. One million dollars would be guarded more carefully than a couple of dollars. Thus,
when deciding on the level of security to impose upon a DB, it is necessary to consider
the implications a security violation would have for the organization that owns the data. If, for example,
criminals gained access to a police computer system in which details of surveillance
operations were stored, the consequences would be very serious.
Many vendors and products provide security for DDBMSs. With so many DB
security measures available -- security kernels, access control, encryption, cryptography,
firewalls, intrusion detection systems, auditing mechanisms -- it would seem that computer DB
security is a solved problem. Why, then, are DDBMSs so insecure? Why do we
hear about more and more computer security breaches in the media? Why are things not
improving? These are some of the questions this paper attempts to probe.
Statement of the Problem
A computerized DB is built to make business easier and faster. However, security is a major
concern, particularly when the system is distributed. Compared to centralized
database management systems (CDBMSs), the component partitions of DDBMSs are
distributed over various nodes of the network. Depending on the specific update and
retrieval traffic, a distributed DB usually performs better than a centralized system. However,
a DDBMS is more convoluted and harder to control. Therefore, it can be less secure and
offer more opportunities for intruders than a CDBMS.
Although security is a great concern, a distributed DB is necessary and better in many
ways. This dispersion is especially important for large and complex organizations such
as research organizations, educational institutions, and human-services agencies. DDBMSs allow different
organizations to connect and "shake hands" with one another, although they might use
different hardware, software, and even DBs. In a DDBMS, organizations can interchange
the role of being the server or the client depending on who is giving or requesting the
information.
The purpose of this proposal is to discuss several trends and concerns pertaining to
security in DDBMSs. Businesses should design and implement security systems that
enable people to perform their functions, excluding unauthorized individuals. The paper
will then focus on the Internet[1] and the challenges inherent in protecting these
DDBMSs.
The questions to be addressed are as follows:
• What is a database management system?
• What is a DDBMS?
• What are the advantages of DDBMS over CDBMS?
• What are the disadvantages of DDBMS?
• What strategies can be used by companies to ensure security in their DDBMSs?
• How can companies improve their distributed DB security processes, and what
are the implications of these improved security processes for their businesses and
internal information?
[1] For reasons of simplification, the words "Internet" and "World Wide Web" are used synonymously.
• What are the advantages and disadvantages of the many security techniques and
products developed for protecting DDBMSs?
• What are the motivations of a hacker?
• What is the future of security in DDBMSs in supporting organizations?
Significance of the Study
In today’s business world, information is more readily available than ever. Companies’
DBs are inescapably connected, since businesses worldwide are increasingly dependent
on digital communications. But the passion for speed and convenience has a price, and
that is the increased exposure to security breaches. Companies need to understand and
prepare for these kinds of risks.
DDBMSs are very popular. One segment that is growing at an enormous rate is the
Internet. By the year 2005, more than half a billion people will be involved in
Internet activities, and there are already over 100 million host computers connected to the Internet[2]
(Jamsa, 2002, p.37). Perhaps the Internet is growing too fast: it is full of security
loopholes that could have been prevented if only the developers had taken the time to
analyze potential problems.
Systems are even more vulnerable when they are connected to the Internet continuously. Users who
connect to the World Wide Web (WWW) via cable modems and Digital Subscriber Line
(DSL) never have to dial in to their online provider. With the constant availability of the
Internet come more opportunities for people to engage in a little snooping. These
nosey individuals can cause all kinds of mischief, from peering into personal files to
planting malicious software such as worms, viruses, and Trojan horses.

[2] For further information, see Appendix One: Statistics about the Growth of the Internet.
So how can a business implement and maintain a secure DDBMS? Organizations have
been struggling with this issue for years and have professionals who analyze and locate
security vulnerabilities, yet we hear news about companies being penetrated daily. If no
one can maintain complete security, what can be done? The goal, then, is not to have
100 percent security, but adequate security. Much like locking the car doors when
parked at a mall, there are many precautions one can take to prevent an
unauthorized person from simply walking into a computer room.
This paper is addressed both to college students and to IT professionals who have, or
would like to have, some basic understanding of computer security and/or DDBMSs.
Several sectors and professionals, such as network security specialists, DB
administrators, computer consultants, security educators, and the industry at large, may
benefit from this report, as may the government through the
Department of Homeland Security. Since September 11, 2001, online traffic has been
monitored heavily to avert online and DB terrorist attacks. Many terrorists would love to
get their hands on some of the United States' top-secret information and use it against
America. Such intrusions must be prevented at all costs.
Objectives of The Study
The main objectives here are to analyze DDBMS security theories, issues, and
products, especially those concerning the Internet and networking, and to produce a meticulous
report befitting a student in a master's program. Other objectives are:
• To delineate current DB security issues and strategies related to hardware,
software, and data.
• To identify the effects of these issues and of specific DB security breaches.
• To develop attackers' profiles and identify what they are seeking.
• To analyze the trends of computer crimes.
• To explore the future of DDBMSs.
Research Methodology - Review of the Literature[3] for Secondary Research
Principles of Distributed Database Systems, by Ozsu, M. and Valduriez, P., provides a
clear and comprehensive understanding of the fundamentals of data communications
essential to information and DB management. The second edition explores the latest
trends of the Internet, Intranet, the Web, DDBMS, and e-security. It explicitly explains
information communications in the business environment.
Security In Computing, Second Edition, by Pfleeger, C. is a very detailed textbook
which covers viruses, worms, Trojan horses, firewalls, e-mail privacy, DB systems,
wireless networks, and their security, including administering installations of PCs,
UNIX, and networked environments.

[3] There are many other reference books used herein. They can be found in the WORKS CITED section.
Implementing and Managing E-security, by Nash, A., Duane, W., Joseph C., and
Brink, D. This book provides comprehensive coverage of the emerging technology that
uses digital certificates to secure Internet transactions. Beginning with an introduction to
cryptography, this book explains the technology that creates a public key infrastructure
(PKI), and outlines the necessary steps for implementing PKI in both business-to-
business (B2B) and business to consumer (B2C) environments.
Secure Computers and Networks: Analysis, Design, and Implementation, by Fisch,
E., and White, G. This book outlines the most basic and practical measures for
protecting DB systems from intrusion by all but the most talented, persistent hackers. It
enables CIOs, MIS department managers, and systems and applications programmers to
maintain security surveillance over a variety of network configurations.
Netware Security, by Steen, W. This book details how security is implemented in a
variety of computing environments and the ways in which its checkpoints can be avoided
by trespassers intent on doing harm. This book provides understandable descriptions of
all security basics and includes a list of third-party utilities for increasing or testing the
reliability of the DB.
Secondary Research – Other Literature
Collected information also came from the Internet and bound literary periodicals.
Sources on DB security progress, new innovations, software, and hardware
improvements were researched, studied, and analyzed. Statistical information was also
collected to relate these developments to the business world and to observe how they
advanced the management and economic goals of an organization. Other sources
collected from the Internet and periodicals helped to develop hackers’ profiles and to
identify tips and tricks in their trade. Several articles on DDBMSs security outline basic
security objectives, such as confidentiality, availability, integrity, authenticity, and usage.
These sources also covered wiretapping, hacking, message confidentiality
violations, and security implementation.
Primary Research – Interviews
Many DDBMS security experts and specialists, as well as hackers who have broken into
many DBs without authorization, have been contacted regarding computer security. Websites,
phonebooks, and many libraries have been researched to examine organizations that
provide security for DDBMSs. Then, many non-profit organizations that regulate DB
security standards have also been contacted. As for the hackers, a friend told a story
about a person who was jailed for breaking into a bank's DB system. This person had
also been transferring funds into his own bank account for two years before getting caught. In an
interview, he disclosed why it was so easy for him to commit these acts.
Luckily, he also led me to some of his friends who were also hackers. The interviews
with three former hackers have revealed the psychology and motivation behind their
actions.
Technical Advisor
The technical accuracy of this paper has been checked with many DB security experts,
especially Dr. J.W. Rider, who taught me CIS515 AD (Strategic Planning for Database
Systems) during the summer of 2002.
CHAPTER TWO
DISTRIBUTED DB SYSTEMS & COMPUTER SECURITY
What is a Database Management System?
According to the 1996 Computer Desktop Encyclopedia, database management systems
are defined as “software that controls the organization, storage, retrieval, security and
integrity of data in a database" (Freeman, 1996, p.208). They accept requests from
applications and instruct the operating system to transfer the requested data. DBMSs
may work with traditional programming languages, such as SQL and C++, or they may
include their own programming language for application development. These languages
might allow new categories of data to be added to, deleted from, or changed in the DB
without any disruption to the current infrastructure.
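The request/response role of a DBMS described above can be illustrated with a minimal sketch using Python's built-in sqlite3 module; the table and column names are purely illustrative.

```python
import sqlite3

# An in-memory database stands in for the DBMS, which controls
# organization, storage, and retrieval on behalf of the application.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employees (emp_no INTEGER PRIMARY KEY, name TEXT)")

# The application issues requests; the DBMS performs the underlying I/O.
conn.execute("INSERT INTO employees VALUES (?, ?)", (1, "Alice"))
conn.execute("INSERT INTO employees VALUES (?, ?)", (2, "Bob"))

# A new category of data is added without disrupting existing rows.
conn.execute("ALTER TABLE employees ADD COLUMN department TEXT")

rows = conn.execute("SELECT name FROM employees ORDER BY emp_no").fetchall()
names = [r[0] for r in rows]
print(names)  # ['Alice', 'Bob']
```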
What is a Distributed Database Management System?
A distributed database is a collection of multiple, logically interrelated databases distributed over a
computer network (Fryer, 1997, p. 154). A DDBMS is then defined as the software
management system that supports and protects the organization's DB and makes the
distribution mostly transparent to the end-users. The two important terms in these
definitions are "logically interrelated" (protocols for data collection, verification, and
use are management functions) and "distributed over a computer network" (data is
stored in more than one location).
DDBMSs can be classified according to a number of criteria. Technically, some of these
criteria are as follows: degree of coupling, interconnection structure, interdependence of
components, and synchronization between components (e.g., firewall, backups, and
security).
Degree of coupling refers to the proximity of the processing elements that are connected
to each other. This can be measured as the ratio of the amount of data swapping to the
amount of local processing used in performing a task. If the communication is done over
a DB network, there exists a weak coupling among the processing elements. However, if
components are shared, there exists a strong coupling. Shared components can be either
primary or secondary memory storage devices.
As for the interconnection structure, one can talk about those scenarios that have a
point-to-point interconnection between processing elements, as opposed to those which
use a common interconnection channel (Narins, 2000, p.203). The processing elements
might depend on each other quite strongly in the running of a task, or this
interdependence might be as minimal as passing messages at the beginning of an
execution and reporting the outcomes at the completion of the processing of the data.
Data exchange between processing elements might be maintained by synchronous or by
asynchronous means (Narins, 2000, p.237). But "handshaking" (making sure the other
side has received the correct packets of data and commands) and control must remain
under the control of the DDBMS protocol. Note that some of these criteria are not
entirely independent. For example, if the relationship between processing components is
synchronous, one would anticipate the processing elements to be greatly interdependent
and, possibly, to work in a strongly coupled fashion (Stollings, 1999, p.8).
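The handshaking just described can be sketched as a toy acknowledgement protocol; the checksum scheme and message strings are illustrative assumptions, not part of any specific DDBMS.

```python
import zlib

def send_packet(payload: bytes):
    """The sender attaches a checksum so the receiver can verify the transfer."""
    return payload, zlib.crc32(payload)

def receive_packet(payload: bytes, checksum: int) -> str:
    """The receiver acknowledges (ACK) only when the checksum matches;
    a NAK signals the sender to retransmit."""
    return "ACK" if zlib.crc32(payload) == checksum else "NAK"

packet, crc = send_packet(b"UPDATE inventory SET qty = 5")
print(receive_packet(packet, crc))      # ACK: correct packet received
print(receive_packet(b"garbled", crc))  # NAK: retransmission required
```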
It can be stated that the main reason behind distributed processing is to be better able to
solve the complicated and messy problems that organizations face today by using a
variation of the divide-and-conquer rule. If the necessary component support for
distributed processing can be put in place, then it might be possible to solve these
complicated problems simply by separating them into smaller pieces, distributing them
to different software groups that work on different nodes, and producing a system that
runs on multiple processing elements yet works efficiently toward the execution of a
common assignment.
Distributed processing means that the location of the data determines where the
application processing is completed. The application function must be "maneuvered" to
the place where the data is found. To access data located at site 1, an application
program at that location is needed. To access data at site 2, there must exist another
program at that site. A logical process requiring data located at different sites,
therefore, has to be split into two processes, each implemented by a
different application program. Alternatively, the data of one site may have to be moved
to another location, because treating the entire transaction as an atomic unit of execution
becomes harder to respect; in this situation, the application itself may need to maintain data
integrity within the logical transaction. Usually, distributed processing involves two or
more computers connected together in a network. Application software, systems
software, and hardware are used to allow the enterprise to handle data that is divided
among the multiple computing locations.
Figure One: A small DDBMS, in which client, peer, and server nodes exchange request and reply messages.
What are the Advantages of Distributed Database Management Systems?
DDBMSs can be heterogeneous, give better response times, use different architectures,
and offer better performance, availability, and reliability (system resilience for disaster
recovery). Also, in comparison to a CDBMS, a DDBMS can be "up", "down", or only
"partially down" at any given time. This is ideal for businesses that cannot afford for the
DB system to be completely "down" (Schneier, 2000, p.122).
Distributed DBs bring the benefits of distributed computing to the DB management
domain. A distributed computing system consists of a number of processing constituents,
not necessarily homogeneous, that are interconnected by a computer network and that
cooperate in performing certain designated functions. As stated previously, distributed
computing systems partition a large, unmanageable problem into smaller segments and
solve it efficiently in a coordinated fashion. The financial viability of this approach rests
on two factors: (1) more computing power is harnessed to solve an elaborate task, and
(2) each autonomous processing element can be handled independently and develop its
own applications.
Distributed processing is a better match to the organizational structure of today's widely
dispersed enterprises, and such systems are more responsive and more reliable (e.g.,
better disaster recovery, control, etc.). More importantly, many of the current
applications and utilizations of computer technology are inherently distributed.
Multimedia applications such as medical imaging and manufacturing control systems are
two examples of such applications.
Most distributed DBMSs are designed to gain maximum benefit from data localization.
In other words, the full benefits of reduced contention and reduced communication
overhead can be obtained by proper fragmentation and distribution of the database
(Stollings, 1999, p.7).
Many other advantages of DDBMSs range from sociological reasons for decentralization
to better economics. All of these can be distilled to four fundamentals that may also be
viewed as advantages of DDBMS technology. Some include the following: handling of
distributed data with different levels of transparency (transparent management of
distributed and replicated data), data independence, network transparency, replication
transparency, fragmentation transparency, and expanded availability and reliability
through distributed copies and backup transactions. Others that might be advantageous
include the following: easier expansion (adding more data, increasing database sizes, or
adding more processors), controlling redundancy, providing continuous storage for
program objects and data structures, permitting inferencing and actions, providing
multiple user interfaces, representing elaborate relationships among data, and sustaining
integrity constraints. Further improvements might be the potential for enforcing
standards and reduced application development time (Narins, 2000, p.478).
What are the Disadvantages of Distributed Database Management Systems?
Technically, a DDBMS is extremely complicated, and no vendor has all the solutions. One
problem is that data collection is not usually done instantaneously or in real time. One
location might have to wait for the system to quiesce before it can reconcile its data with
what the other locations have. Other problems are: how to distribute and sustain data
consistency, how to keep communications costs down, how to avoid deadlock situations,
and how to handle data synchronization. Company guidelines need to regulate how and
where local data should be stored, named, accessed, updated, backed up, restored,
protected, and shared. Still more challenges of a DDBMS include the following: location
transparency (the DB design has to be performed so that the most frequently accessed
data is stored at as many nodes as possible, so that access becomes local), site autonomy
(local data is owned and managed with local accountability), referential integrity (the
DDBMS has to ensure that the correct "referential" state exists within the system among
the data it represents), partition independence (a DDBMS that supports partition
independence would allow horizontal and vertical partitioning, but would act as if the
data were not partitioned at all), and replication independence (copies of files could be
distributed for local processing effectiveness, but refreshed dynamically when updates
are posted to the parent files).
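The horizontal and vertical partitioning mentioned above under partition independence can be sketched as follows; the employee relation and its site assignments are hypothetical.

```python
# Each row of the hypothetical relation: (emp_no, name, salary, site)
employees = [
    (1, "Alice", 90000, "NY"),
    (2, "Bob",   75000, "LA"),
    (3, "Carol", 80000, "NY"),
]

# Horizontal partitioning: whole rows are assigned to nodes by a predicate.
ny_fragment = [row for row in employees if row[3] == "NY"]
la_fragment = [row for row in employees if row[3] == "LA"]

# Vertical partitioning: columns are split, repeating the key in each
# fragment so the original relation can be rebuilt by a join on emp_no.
name_fragment   = [(row[0], row[1]) for row in employees]
salary_fragment = [(row[0], row[2]) for row in employees]

# Partition independence: the union of the horizontal fragments
# recovers the complete relation, as if it had never been partitioned.
assert sorted(ny_fragment + la_fragment) == sorted(employees)
```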
A DDBMS is usually more complex than a CDBMS, and this can be undesirable. First,
data may be replicated (copied) in a distributed environment. A DDBMS can be
designed so that the entire database (or portions of it) resides at different sites of a
computer network. It is not necessary to have every site on the network contain the entire
DB; it is only essential that there be more than one site where the DB resides. The
possible duplication of data items is mainly due to trustworthiness, reliability, and
efficiency considerations. Consequently, the DDBMS is responsible for (1) selecting one of
the stored copies of the requested data for access in the case of retrievals, and (2) making
sure that the effect of an update is reflected in each and every copy of that data
item.
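These two responsibilities (reading from any one copy, and reflecting updates in every copy) can be modeled with a toy replicated store; the class and its interface are illustrative only.

```python
class ReplicatedStore:
    """Toy model of replication in a DDBMS: reads may use any single
    replica, but every update must be propagated to all replicas."""

    def __init__(self, n_replicas: int = 3):
        self.replicas = [{} for _ in range(n_replicas)]

    def write(self, key, value):
        # Responsibility (2): the update is reflected in every copy.
        for replica in self.replicas:
            replica[key] = value

    def read(self, key, site: int = 0):
        # Responsibility (1): any one stored copy is selected for retrieval.
        return self.replicas[site][key]

store = ReplicatedStore()
store.write("balance", 100)
print(store.read("balance", site=2))  # 100, served from a single replica
```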
Second, if some sites crash (e.g., by either hardware or software malfunction), or if some
communication links fail (making some of the sites unreachable) while
an update is being processed, the system must ensure that the effects will be reflected in
the data residing at the failing or unreachable locations as soon as the system can recover
from the downtime.
The third point is that since each site cannot have instantaneous information on the
actions currently being carried out at the other sites, the synchronization of transactions
on multiple sites is considerably more complicated than for a centralized system (Atre,
1993, p. 30).
Other disadvantages include cost, distribution of control, technical difficulties, and
security. Cost – DDBMSs require additional hardware, thus creating a financial
burden[4]. Perhaps a large fraction of the expense lies in the fact that additional and more
complex software and communication may be essential to solve some of the technical
problems. The development of better software engineering techniques (e.g., distributed
filters, debuggers, etc.) should help in this regard.
The largest financial constraint, however, might be the replication of effort (manpower).
When computer facilities are set up at different locations, it becomes necessary to hire
skilled or trainable people to maintain these facilities (Stollings, 1999, p.8). This usually
raises personnel costs and the cost of data processing operations. Therefore, the trade-off
between maximizing profitability through more efficient and timely use of information
and the increased manpower costs has to be analyzed carefully to obtain the optimal cost
results.

[4] Given the trend toward decreasing hardware costs, this is not a significant factor.
Distribution of Control – This point was stated previously as an advantage of DDBMSs.
Unfortunately, distribution creates problems of synchronization and coordination.
Distributed control can, therefore, easily become a liability if care is not taken to adopt
adequate protocol policies to deal with these command and control issues.
Good Security Is Good Business
One of the major advantages of centralized databases has been the control they supply over
access to data. Security can easily be controlled in one central location, with the
DBMS enforcing the rules. In a DDBMS, however, a network is involved, and a network is a
medium with its own security requirements. There are serious problems in
maintaining adequate security over computer DBs and networks. Thus the security
problems in DDBMSs are by nature more problematic than in centralized ones.
So how can organizations maintain secure DDBMSs? Businesses have been struggling
with this issue for many years, and there are many professionals who analyze and attempt
to solve security vulnerabilities, yet we constantly hear news about well-established
companies' DBs being hacked, despite their best efforts to keep such information quiet so
as not to alarm the public about their security problems. For example, when a hacker from
Russia broke into Citibank's DB and stole $12 million in 1995, Citibank announced that
it had been hacked and instituted tighter security measures to prevent such attacks
from occurring in the future. Nevertheless, thousands of investors panicked and
withdrew their savings immediately after Citibank's announcement. Ultimately Citibank
recovered, but the damage had been done (Schneier, 2000, p.391).
Public Concerns for Internet Security
The Internet is certainly the most sophisticated DDBMS ever developed. Many
companies use the World Wide Web to communicate and network because it is too
complex and expensive to connect their sites physically with dedicated cables. Some
organizations use virtual private networks (VPNs) running over the Internet to connect
their DDBMSs. The Internet, therefore, contains millions and millions of computers,
connected in an inconceivably elaborate physical network. Each computer can have
hundreds of software programs running on it; some of these programs interact with other
programs on the same computer, and some interact with programs on other computers
across the network. The system accepts user input from millions of people around the
world, sometimes all at the same time (Schneier, 2000, p. 6).
A Brief History of Internet and Web Security
The Internet was once known as the Arpanet, a network that was created by the
Advanced Research Projects Agency (ARPA - now known as the Defense Advanced
Research Projects Agency, DARPA). It was first started around 1970, during the
Cold War, and was intended for only a few trusted military personnel and
researchers. Thus, the Internet back then was for informational purposes only (Narins,
2000, p.202).
Then came the personal computer revolution and by the early 1990s, more and more
people were beginning to turn to the computer for personal as well as business reasons.
Nowadays, it seems everyone is using the Internet. Electronic commerce, or e-commerce
– doing business on the World Wide Web – is perhaps the greatest thing in the modern
corporate world. Many large and small companies conduct their business daily
with customers and business partners around the world via the Internet (Todd, 2000,
p.141). They might even set up e-portals to allow their employees to connect to the
company's DB from home. So, as the Internet became more available to the
masses, more and more people surfed the web to purchase items, send e-mails, share
digital photos, and conduct business from their computers, transferring sensitive
information online without a second thought.
It did not take long before computer DB and Internet security became a major social and
economic problem. People who break into other people's computers can, at the very least,
cause mischief, or, at the very worst, destroy a company's sensitive information and
even its entire infrastructure. The security advisories put out by the CERT Coordination
Center[5] and Symantec in the past few years reveal that hackers are exploiting a wide
variety of database platforms via the Internet, especially DDBMSs, because of their lack
of centralized security monitoring.

[5] The CERT Coordination Center is a partially government-funded organization reporting
mainly on Internet and DB security. It is located in Pittsburgh, PA, and its Web address is:
http://www.cert.org/
Businesses might tighten their security by showing certain fields only to certain users. One
group of users might be allowed to see a set of common fields (employee name,
employee number), whereas only certain users might be allowed to see specific fields
(health insurance information, salary). This is all a conventional computer security
problem solved by authentication protocols and access control lists.
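A minimal sketch of such field-level access control in Python follows; the roles, field names, and record layout are hypothetical.

```python
# Hypothetical access-control list: each role maps to the set of
# fields its members are authorized to see.
FIELD_ACL = {
    "staff": {"employee_name", "employee_number"},
    "hr":    {"employee_name", "employee_number", "salary", "health_insurance"},
}

def visible_fields(record: dict, role: str) -> dict:
    """Return only the fields the given role may see."""
    allowed = FIELD_ACL.get(role, set())
    return {field: value for field, value in record.items() if field in allowed}

record = {"employee_name": "Alice", "employee_number": 17,
          "salary": 90000, "health_insurance": "Plan B"}
print(visible_fields(record, "staff"))
# {'employee_name': 'Alice', 'employee_number': 17}
```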
There are other problems. Much more difficult is dealing with the situation where one
person is allowed to make queries and see aggregate information but is not permitted to
see individual entries. The problem is one of “inference”; this user can often infer
information about individuals by making queries about groups.
A possible solution to this problem is to scrub the data beforehand; data from the 1960
U.S. census, for example, was secured in this manner. Only one record in a thousand was
made available for statistical analysis, and those records had names, addresses, and other
sensitive data deleted. The Census Bureau also used other methods: data with extreme
values were suppressed, and noise was added to the system. These sorts of protections
are complicated, and subtle attacks often remain. If someone wants to know the income
of the one wealthy family in a neighborhood, it might still be possible to infer it from the
data by applying some educated guesses.
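The census-style scrubbing described above (sampling, identifier removal, value suppression, and added noise) can be sketched as follows. The field names, thresholds, and rates are illustrative assumptions.

```python
import random

def scrub(records, sample_rate=0.001, cap=200000, noise=5000, seed=7):
    """Release a scrubbed statistical extract of the records.

    Keeps roughly one record in a thousand, drops identifying fields,
    suppresses extreme values, and adds random noise. The field names
    and thresholds are illustrative assumptions.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    released = []
    for record in records:
        if rng.random() >= sample_rate:
            continue                              # withhold this record
        income = min(record["income"], cap)       # suppress extreme values
        income += rng.randint(-noise, noise)      # add noise
        released.append({"income": income})       # names/addresses dropped
    return released
```

Even with all three protections applied, the subtle attacks mentioned above (educated guesses about a known outlier) can still leak information, which is why scrubbing alone is not sufficient.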
Another possible solution is to limit the types of queries that someone can make to the
DB. This is also difficult to perfect. A hacker who knows the kinds of queries that are
allowed will try to figure out some mathematical way of deducing the information one
wants from the information one is permitted to get. And the situation can worsen if the
hacker is allowed to add and delete data in the DB. If a searcher wants to learn something
about a particular person, one might be able to add a couple hundred records to the DB
and then make general queries about the population into which one added his / her target.
Since one knows all the data one added, one can infer data about one's target. A whole
set of related attacks follows from this idea.
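The insertion attack sketched above can be made concrete with a toy example. The table layout, names, and salary figures are hypothetical.

```python
# One real record the attacker is not allowed to read directly.
db = [("target", "ZIP-21", 95000)]  # (name, neighborhood, salary)

# The attacker inserts a couple hundred records with salaries he chose.
PLANTED, KNOWN_SALARY = 200, 40000
db.extend((f"fake{i}", "ZIP-21", KNOWN_SALARY) for i in range(PLANTED))

def sum_salary(neighborhood):
    """The only query the DB permits: an aggregate SUM over a group."""
    return sum(salary for _, zone, salary in db if zone == neighborhood)

# Subtracting the known planted total from the aggregate reveals the
# target's individual salary, despite the aggregate-only restriction.
inferred_salary = sum_salary("ZIP-21") - PLANTED * KNOWN_SALARY
```

The aggregate-only interface is honored at every step, yet the attacker recovers an individual value exactly.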
CHAPTER THREE
COMPUTER CRIMES
Software Bugs
A secure DB is one that only authorized users can use. It also must be free of
undesirable effects or "bugs". As anyone who has worked with computers for any
length of time has discovered, software bugs in even the simplest DBs are very
prevalent. Most bugs are inadvertent programming errors, but some are intentional bits
of code placed in the program by the vendors in order to aid debugging and then
forgotten by the rollout date (Stein, 1998, pp.155-162). The larger and more
sophisticated a DB is, the more likely it is to contain bugs that can cause security
problems. Even a simple bug can be disastrous if not caught early. For example, an
infection can cause a spreadsheet document to be corrupted. This can lead to the printer
format contamination, and eventually the whole DB network is compromised. Many
computer hackers are constantly looking for bugs in server software because each bug
represents a potential portal of entry in the DB. By carefully crafting the input fed to the
DB server or by manipulating the server’s environment in a controlled way, the wicked
hacker can trick the software into performing some actions that no one could ever
imagine (e.g., giving the hacker access to the inside of the DB system).
Distributed Denial-of-Service Attacks
I have personally experienced my e-mail account being unable to accept any more
messages because some companies were sending me a "ton of junk mail" (a.k.a. spam)
every day. The idea behind a distributed denial-of-service attack is the same. These
attacks begin when the attackers break into hundreds or thousands of insecure computers,
called zombies, on the DB network or the Internet, and install an attack program. Then
the perpetrator coordinates them to invade the target at the same time. The target site is
attacked from many places at once; it cannot defend against them all, and the whole DB
crashes and fails.
These attacks are incredibly difficult, if not impossible, to defend against. In a
traditional denial-of-service attack, the victim computer might be able to figure out where
the attack is coming from and block those connections. But in a DDBMS, there might not
be a single source of attack. Also, if it is a public Internet site, the computer cannot shut
down all connections except the ones it knows can be trusted.
Although no general defense exists, companies can still protect themselves against
distributed denial-of-service attacks by continuously monitoring their network
connections as well as having backup servers and routers. Sometimes the specific bugs
exploited in the attacks can be fixed, but many cannot.
Malicious Software
Many distributed denial-of-service attacks have used worms, viruses, and Trojan
horse programs to propagate the zombie tools and then automatically launch the attack
on some code word posted in a public forum. Worms, viruses, and Trojan horses can also
do other harm to computers and DBs. Worms are devastating programs written by
people to scan and exploit vulnerabilities in software (Narins, 2000, p.108); they usually
replicate themselves in the disk and memory and eventually invade other computers. A
worm might multiply itself in one computer so often that it causes the computer and
eventually the entire DB to crash. Sometimes programmed in separate segments, a worm
is introduced secretively into the target system either mischievously or with the intent of
sabotaging information.
Viruses are software used to infect a computer. Much like the worm, usually the virus
code is unexpectedly buried by the programmer within an existing program. Once that
program is executed, the virus code becomes active and attaches copies of itself to other
programs in the computer. Infected programs replicate the virus to other programs and
eventually to the entire DB. Viruses can infect the “boot sector” of floppy disks or hard
drives, infect other software that uses macros, or spread as attachments to e-mail all over
the Internet. Many people fear that there are many more viruses planted out there, which
are dormant but expected to "come alive" at certain times6.
6 Note that a virus cannot be attached to the data. It must be affixed to an executable program downloaded
into or already installed in the computer. The virus-attached program must be executed in order to activate
the virus.
Another type of malicious software is the Trojan horse, which is designed to fool the
naïve user into thinking it is a useful or benign program. For example, one type of Trojan
horse - the logic bomb - is a piece of code that a programmer writes into a large software
application that starts misbehaving if, for example, the programmer is ever deleted from
the payroll file. Back Orifice is a popular Trojan horse for Microsoft Windows. If it is
installed on a computer, a remote hacker can effectively take it over via the Internet. The
hacker can upload and download files, delete files, run programs, change configurations,
take control of the input and output devices, and see whatever is on the server’s monitor
(Schneier, 2000, p. 156).
Figure Two: Examples of anti-virus, anti-worm, and anti-Trojan horse software:
(Sources: http://www.diamondcs.com.au/ and
http://www.all.search.shopping.yahoo.com/search/all?p=anti+virus)
CHAPTER FOUR
SECURITY PLANS FOR DDBMSS
As previously stated, organizations must prevent unauthorized users from accessing their
data and resources illegitimately. Different organizations require different levels of
security protection. The DB security administrator should first develop a checklist – a
plan for comprehensive DB security. A risk analysis might be necessary to determine the
susceptibility of a computing system to various kinds of security failure. Risk analysis
can be performed by reviewing the general threats to the security of the system, such as
programmer sabotage, and then determining whether each threat could affect the system
in question. A threat that could affect a system adversely is called a vulnerability.
Companies should build a practical DDBMS architecture with the appropriate security in
mind. They should start with a good security policy with clearly defined goals and aims
that are justifiable, coherent, and consistent. For example, users should be advised to
make their passwords hard to crack7 and not to share them with anyone. No one should
bring non-company approved software to work or download programs from the Internet.
According to Dr. Kris Jamsa, a best-selling author of nearly 30 books on all aspects of
DB computing, below is a list of appropriate security questions to consider:
• What universal groups are necessary in the organization?
• What global groups are necessary in the organization?
• How will we utilize the built-in local groups?
• What local groups are necessary in the organization?
• What filters are necessary for group policies in the organization?
• What policies are required for Active Directory objects in the organization?
• What policies are required for the file system in the organization?
• What policies are required for registries in the organization?
• What policies are required for system services in the organization?
7 To enhance security, passwords should be lengthy, with a combination of letters and numbers mixed
together.
• What policies are required for network accounts in the organization?
• What policies are required for local computers in the organization?
• What policies are required for Event Logs in the organization?
• What policies are required for restricted groups in the organization?
• How will one perform network logon and authentication in the organization?
• What approach does one take with smart cards in the organization?
• What approach does one take with certificate mapping in the organization?
• How does one implement Public Key Infrastructure within the organization?
• How does one implement the Encrypting File System in the organization?
• How will one provide authentication for remote access users?
• How does one protect the organization’s Web site?
• How does one implement code signing in the organization?
After answering the questions in depth, the DB administrator needs to design adequate
physical, hardware, software, and data security and test this by hiring some qualified
security experts and hackers to analyze and evaluate the beta system at a remote location
before implementing the new company security program. If upgrading to a better
security system, make sure all potential safeguarding problems are accounted for. Start
with the weakest security link first, and then move to the next most vulnerable part ASAP.
Physical and Hardware Security
Physical security of processing computers and networks in a DB influences how security
must be implemented. Without physical network security, the ability to perform
encryption algorithms efficiently is critical. Without physical security of nodes,
encryption can be used to keep data private, but data can still be deleted maliciously
(Mullender, 1989, pp.30-31).
The server, the operating system, the wires in an office LAN, the dedicated phone lines,
possibly ISDN or DSL, the dial-up connections, satellites, the fiber optics, etc. all must
be installed, configured, and operated correctly and securely. It is a good idea to have
tamperproof, tamper resistant, or tamper evident hardware. There are many precautions
that can be taken to prevent a stranger from walking in through a computer’s front door,
such as locking the door or having an armed individual guard it.
Software Security
Computers make access requests only under the control of the software programs. A
program operates under the control of or in the name of a user. Thus, the accesses made
by a program are in all likelihood the result of requests from a user. However, programs can
also be altered, that is, a program is actually a series of bits in memory, and those bits can
be read, written, modified, and deleted as any data can be. While a user’s program may
be designed to read a file, the program, with minor modification to the executable code,
could instead write or expunge the file. Those modifications can be the result of
hardware mistakes and failures, a lapse in the logic of the program, or a change
influenced by some other program in the system. Hardware errors do not happen often,
and checking circuitry is built into computers to detect such mistakes before they affect a
computation. Unintentional or not, user errors are much more difficult to discover and
prevent. Programs are stored either in files or main memory, or both. Thus, the first line
of defense against program errors is memory protection, which is designed to prevent
one person from deliberately or accidentally accessing files and memory assigned to
another person. While these controls protect users from one another, they are far less
effective at safeguarding users from errors in their own program logic (i.e.,
computer bugs or logical errors).
A second protection against program errors is careful and thorough software
engineering, including structured and modular design, program analysis, and the use of
programming management professionals. Such programming practices help to protect a
user from unintentional errors.
The third form of defense against program errors is software testing. Unfortunately, the
best that testing can confirm is the presence of errors and not their absence. Nonetheless,
a thoroughly tested program can give credibility to the assurance that a program is error-
free. Of special interest is that the software that controls access of any subject to any
object also protects itself against access by all unauthorized subjects. Nevertheless, the
access control program itself represents a significant vulnerability: defeat or bypass the
access control program, and the thief can obtain unhindered access to all system
services. For this reason, on more secure computing systems, the access control function
is separated among several different modules: one to control access to memory, another
to control access to files, etc. In this way, defeating one module does not immediately
open up all the DB system's resources to illegitimate uses.
A related question is confirmation of the validity of the access control software itself,
ensuring that it will permit all and only authorized accesses. Clearly, access control
procedures are valid only if they are implemented properly. Good software engineering
practices for the design and installation of the access control software are combined with
explicit control over its modifications, once installed, to ensure the correct and effective
functioning of the access control software. Also, if using software programs from
vendors, the most updated versions should be used because they are more likely to be
free of known bugs.
Data Security
Maintaining the security of data such as a payroll file or a digitized graphical image
requires consideration of the security of the entire computing DB system, including the
internal data. In today's information age, data security has become of significant
importance because of the literally millions of users in cyberspace who might,
accidentally or intentionally, invade and compromise the integrity of data on any
computer connected to the Internet and other DBs.
DDBMS Products and Techniques
Domain Name Service (DNS) Security
DNS is basically a large DDBMS. Most computers on the Internet – nodes, routers, and
hosts – have a domain name like “maryland.edu” or “hotmail.com.” These names are
designed to be remembered by humans and are used to build things like URLs and e-mail
addresses. However, computers do not understand domain names; they understand IP
addresses like 192.168.10.0. IP addresses are then used to route
packets around the DB network.
Among other things, the DNS converts domain names to IP addresses. When a computer
is handed a domain name, it requests a DNS server to translate that domain name into an
IP address. Then it knows to which computer to send the packet of information.
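The translation step can be illustrated with a toy stub resolver. A real resolver sends queries over the network to a DNS server; here a local table stands in for the server's answers, and the mappings shown are made up.

```python
# Hypothetical DNS table: made-up mappings standing in for answers a
# real DNS server would return over the network.
DNS_TABLE = {
    "maryland.edu": "192.168.10.1",
    "hotmail.com": "192.168.10.2",
}

def resolve(domain):
    """Translate a domain name to the IP address packets are routed to."""
    try:
        return DNS_TABLE[domain]
    except KeyError:
        # A real resolver would report NXDOMAIN for an unknown name.
        raise LookupError(f"NXDOMAIN: {domain}")
```

Notice that the caller simply trusts whatever mapping comes back: if an attacker rewrites the table, resolve() faithfully returns the attacker's address, which is precisely the weakness discussed below.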
The problem with the DNS system is that it lacks security. So when a computer sends a
query to a DNS server and receives a response, it assumes that the response is correct and
that the DNS server is honest. However, the DNS server may not be honest, since hackers
could have compromised it. And the reply that the computer gets from the DNS server
might not have even come from the DNS server; it could have been a faked reply from an
imposter. If the attackers make changes in the DNS tables (the actual data that translates
domains to IP addresses and vice versa), computers will automatically accept the validity
of the modified tables.
Therefore, it is not difficult to imagine the kinds of computer invasions that could result8.
The attackers can trick users into believing that they are coming from trusted sources
(changing the DNS tables to make it look like the attackers' computers are trusted IP
addresses). The attackers can hijack DB network connections (changing the DNS tables so that users
8 For more information, see Appendix One for the number of packets required to reach 50% success
probability for various numbers of open queries.
wanting to connect to legitimate websites actually make their connections with the
imposters). Attackers are capable of doing all sorts of things. And DNS servers might
have an automatic update procedure. If one DNS server records a change, it tells the other
DNS servers, and they believe it. So if the attacker can make a change at a few certain
points, that change can propagate across the entire DB or the Internet.
Cryptography
The word cryptography comes from the Greek for "secret writing". Cryptography is one
important tool used to preserve both the confidentiality and the integrity of a DB (Stein,
1998, p. 15). Confidential data are encrypted to prevent their disclosure to unauthorized
people. Furthermore, encryption usually prevents unauthorized and undetected
modification: someone might be able to scramble the bits of an encrypted text so that the
bits decrypt to nothing meaningful, but without hacking the encryption, no one can alter a
specific field of the underlying plaintext (the original message) data from “one” to “two”
(Ralston, et. al, 2000, p. 249).
One significant use of cryptography is to compute a cryptographic checksum, a function
that depends upon every bit of a section of data and also upon the key used for the
cryptographic function. For example, a weak cryptographic checksum is the parity of a
string of bits; any odd number of changes to the string affects the parity. The
cryptographic checksum is computed when the section of data is created and again when
it is used; if the data has been changed between origin and use, the value of the checksum
at time of use will almost certainly not match that computed at time of origin, a
signal that the data has been altered (Schneier, 2000, p.88). It is recommended that users
of cryptography change their encryption keys regularly. If hackers steal or figure out the
key, they can read the plaintext.
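A keyed cryptographic checksum of the kind described above can be sketched with HMAC-SHA256, one standard construction. The key and messages below are made up for illustration.

```python
import hashlib
import hmac

def checksum(key, data):
    """Keyed cryptographic checksum: depends on every bit of the data
    and on the secret key (HMAC-SHA256 is one standard construction)."""
    return hmac.new(key, data, hashlib.sha256).hexdigest()

key = b"demo-secret-key"                      # illustrative key
tag_at_origin = checksum(key, b"pay Bob $100")
tag_at_use = checksum(key, b"pay Bob $900")   # data altered in transit

# The mismatch between the two tags signals that the data was changed.
```

Because the checksum also depends on the key, an attacker who alters the data cannot forge a matching tag without first stealing the key.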
Cryptography, while very powerful, is still subject to security breaches. The reason is
that cryptography is a branch of mathematics, which is logical. In the physical world,
however, things can be much messier. Cryptography is based on hypotheses and
theories, and in order to accept the conclusions, the premises, the models, and
the relationship between the theories and reality must all be accepted. That is
sometimes very difficult to achieve. People do not always follow the rules. Sometimes
they do just the opposite. Hardware and software can be the same way. They break
down, misbehave, and fail.
Also, it does not matter how good the cryptography is or what the key length is; weak
passwords will help hackers to break into the DB. For example, hackers can use
L0phtCrack, an NT password-auditing tool that can test an encrypted password against an
8-megabyte dictionary of popular passwords in seconds.
E-Mail Security
E-mail security uses both the OpenPGP and S/MIME protocols. OpenPGP is the protocol
in PGP (Pretty Good Privacy) and its variants; S/MIME is the Internet standard protocol
in just about everything else. Secure e-mail relies on cryptography, which performs two
valuable functions: it provides a digital signature for authenticity and encryption for privacy. In
order to send an encrypted mail message, first the message that contains the sensitive data
is written. The sender obtains the public key of the receiver. A bulk encryption key is
generated, and then the sensitive data is encrypted with this key. After the document is in
ciphertext, the bulk encryption key is encrypted, using the receiver’s public key. The
message is now ready to be delivered. The receiver uses the private key in order to gain
access to the bulk encryption key. The receiver then uses the bulk encryption key to
decrypt the document and recover the plaintext (Todd, 2000, p.371).
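The hybrid scheme just described can be sketched as a toy: a textbook RSA keypair with tiny primes stands in for the receiver's public/private keys, and a SHA-256 keystream XOR stands in for the bulk cipher. Neither is secure; they only illustrate the flow. (The modular inverse via pow(e, -1, m) requires Python 3.8 or later.)

```python
import hashlib
import secrets

# Receiver's toy RSA keypair: n = p*q, e is public, d is private.
p, q = 61, 53
n, e = p * q, 17
d = pow(e, -1, (p - 1) * (q - 1))  # modular inverse; Python 3.8+

def bulk_crypt(key, data):
    """Toy bulk cipher: XOR with a SHA-256 keystream (its own inverse)."""
    stream = hashlib.sha256(key).digest()
    while len(stream) < len(data):
        stream += hashlib.sha256(stream).digest()
    return bytes(a ^ b for a, b in zip(data, stream))

# Sender: generate a random bulk key, encrypt the sensitive data with
# it, then wrap the bulk key with the receiver's public key (n, e).
bulk_key = secrets.randbelow(n - 2) + 2
ciphertext = bulk_crypt(str(bulk_key).encode(), b"sensitive data")
wrapped_key = pow(bulk_key, e, n)

# Receiver: unwrap the bulk key with the private key d, then decrypt.
recovered_key = pow(wrapped_key, d, n)
plaintext = bulk_crypt(str(recovered_key).encode(), ciphertext)
```

The design choice is the same as in real S/MIME or PGP: the slow public-key operation is applied only to the short bulk key, while the fast symmetric cipher handles the message body.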
Digital Signatures
E-mail also uses digital signature techniques. These help to assure both the sender
and the receiver that the message has not been tampered with during transmission. When
the user indicates through the e-mail interface that the message should have a digital
signature, the message is hashed to produce a message digest, which is then encrypted
with the sender's private key. The document and the signed digest are then sent to the
receiver. The e-mail interface
will indicate to the receiver that the message contains a digital signature. In the
verification of the digital signature, the sender’s public key is used to decrypt the digital
signature. The document is then hashed by the receiver to generate its own 128-bit9
number. If the decrypted digital signature matches the generated 128-bit number, the
receiver knows that the sender is really the person who is indicated on the message and
that the body of the message has not been breached before the receiver has gotten it.
Mathematically, the ability to sign a message using public encryption depends on the fact
that the encryption and decryption algorithms are the inverse of one another, that is:
9 If the key is n bits long, then there are 2^n possible keys. So, if the key is 128 bits long, there are trillions
of possible keys. This means it is usually impossible for hackers to recover the plaintext using
"brute-force" attacks.
p = d(e(p))
where p = plaintext, e = encryption procedure based on the encryption key, and
d = decryption procedure based on the decryption key.
Of course, the physical world is more complicated. Just as one does not encrypt
messages with public-key encryption algorithms (one encrypts a message key), one also
does not sign messages directly. Instead, this person takes a one-way hash of a message
and then signs the hash. Again, signing the hash is a few orders of magnitude faster, and
there can be mathematical security problems with signing messages directly.
Also, most digital signature algorithms do not actually encrypt the messages that are
signed. One makes some calculation based on the message and his / her private key to
generate the signature. This signature is affixed to the message. The other end makes
another calculation based on the message, the signature, and the public key to verify the
signature. Even without the private key, the hacker can verify the signature.
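The hash-then-sign flow described above can be sketched with textbook RSA on tiny primes. This is insecure and purely illustrative; the hash is reduced modulo the toy key so it fits, and pow(e, -1, m) requires Python 3.8 or later.

```python
import hashlib

# Toy RSA signing key (tiny primes, insecure, illustration only).
p, q = 1009, 1013
n, e = p * q, 65537
d = pow(e, -1, (p - 1) * (q - 1))  # private exponent; Python 3.8+

def digest(message):
    """One-way hash of the message, reduced into the toy key's range."""
    return int.from_bytes(hashlib.sha256(message).digest(), "big") % n

def sign(message):
    """Signer: a private-key operation on the hash, not on the message."""
    return pow(digest(message), d, n)

def verify(message, signature):
    """Anyone holding the public key (n, e) can check the signature."""
    return pow(signature, e, n) == digest(message)
```

As the text notes, verification uses only the public values (n, e): the message is never encrypted by the signature, and anyone, including a hacker, can check it.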
Encryption Rules
Encryption is intended to overcome the problem of people who bypass the security
controls of the DDBMS and gain direct access to the data or to the DB or who breach the
communication lines (Bell & Grimson, 1992, p.293). It is standard practice to encrypt
passwords, but it is possible also to encrypt data and messages, and this encryption may
well be desirable in a DDBMS environment. The plaintext is subjected to an encryption
algorithm, which scrambles the “plaintext” into “ciphertext” using an encryption key.
Unless the recipient knows the encryption key, it will be virtually impossible to decipher
the "ciphertext" (Bell & Grimson, 1992, p.294). In fact, the dominant issue is not the
difficulty of breaking the code, but rather the security of the encryption keys (Schneier, p.
86).
There are two principal methods of encryption: the Data Encryption Standard and public
key encryption. The Data Encryption Standard (DES) was adopted by the National
Bureau of Standards in 1977 and is widely used. DES uses a 64-bit key (56 effective
bits) and the algorithm is available on an LSI chip that is capable of processing text at a
rate of over one megabit per second (Stein, 1998, p. 16).
An alternative approach is provided by the public key (“trap door”) cryptosystems. The
idea here is to assign two keys to each user: an encryption key for encrypting plaintext,
and a decryption key for deciphering ciphertext. A user’s encryption key is published in
much the same way as a telephone number so that anyone can send coded messages, but
only the person who holds the corresponding decryption key can unlock that message. It
is nearly impossible to deduce the decryption key from the encryption key (Stein, 1998 p.
16).
Firewalls
A firewall is a product that protects a company’s internal network from the rest of the
world. It has to act as a gatekeeper. It keeps intruders out and internal users in. It has to
figure out which bits are harmful and deny them entry. It has to do this without
unreasonably delaying traffic. Also once attackers bypass the firewall into the DBs, the
firewall is no longer a safeguard. Since about 70 percent of all computer attacks come
from the outside10
, firewalls are worth considering for most businesses.
Hackers can use Trojan horses to penetrate firewalls, exploiting some kind of bug in the DB
that will open a connection between the hacker outside the firewall and the computer
inside the firewall. If it all works, the hacker gets inside.
Early firewalls were known as packet filters. The firewall would look at each
packet header coming in and decide whether to admit or drop it, depending on a myriad
of programmed rules. The first packet filters were very inefficient by today's standards
and let in many packets that were better kept out. Eventually firewall technology got
better: instead of looking at each packet individually, today's firewalls keep
information about the status of the network and what types of packets to look for. Still,
firewalls have only so much memory inside, and slow and persistent attacks can often
get through.
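A stateless packet filter of the kind described above can be sketched as an ordered rule list with a default-deny fallback. The addresses and ports below are illustrative.

```python
# Ordered packet-filter rules; the first matching rule wins, and a
# packet matching no rule is dropped (default deny).
RULES = [
    {"action": "deny",  "src_ip": "203.0.113.9"},  # known-bad host
    {"action": "allow", "dst_port": 80},           # inbound web traffic
    {"action": "allow", "dst_port": 443},
]

def filter_packet(header):
    """Return 'allow' or 'deny' for a packet header (a field dict)."""
    for rule in RULES:
        conditions = {k: v for k, v in rule.items() if k != "action"}
        if all(header.get(k) == v for k, v in conditions.items()):
            return rule["action"]
    return "deny"  # default deny
```

Note that placing the deny rule first matters: with the allow rules ahead of it, the bad host could still reach port 80. Such rule-ordering subtleties are one reason early packet filters admitted packets that were better kept out.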
For further protection, some companies have two firewall systems: one for the outside
world, and another one with more restrictions against the insiders. In a distributed
environment, implementations of the appropriate firewall should be done on all the
nodes. Remember: a system is only as secure as its weakest link.
10 According to a study by the Computer Security Institute in 1998.
Figure Three: Illustration of how a firewall provides the "gateway protection" for
corporations. (Source: http://www.f-secure.com/products/network_security/)
Anti-Viral, Trojan Horse, and Worm Utilities
Well-tested virus, Trojan horse, and worm checking utilities should be obtained. For
example, anti-virus software programs keep a database of virus footprints or signatures –
bits of code that are known to be parts of viruses – and when they find the same markings
on a file, they know the computer has been infected. Most virus utilities of today can
scan the entire computer for bit strings that signify the virus and then execute the
automatic program to remove the virus and restore normal state (Schneier, 2000, p. 158).
Scanning can occur on a scheduled maintenance basis, such as nightly, or it can be
triggered by an action, such as when a user downloads a file. Many new viral checkers
are also able to repair infected systems. Also, because new viruses are being written and
introduced in cyberspace daily, be sure to get the type of checker that can be updated
easily from a signature file maintained on the manufacturer's FTP or Web site.
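Signature-based scanning as described above amounts to searching file contents for known byte patterns. The signatures below are made-up stand-ins, not real virus footprints.

```python
# Made-up signature database: byte patterns standing in for the real
# virus footprints an anti-virus vendor would distribute.
SIGNATURES = {
    "Example.A": b"\xde\xad\xbe\xef",
    "Example.B": b"EVIL_PAYLOAD",
}

def scan(data):
    """Return the names of all known signatures found in the data."""
    return [name for name, pattern in SIGNATURES.items() if pattern in data]
```

This naive substring search is exactly what polymorphic and encrypted viruses defeat, since their bytes differ on every infection.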
However, virus checkers are not perfect. There are many viruses that mutate with every
infection11 (polymorphic viruses) and use cryptography to hide their footprints (encrypted
viruses). Also, there might be many false alarms that can cause massive hysteria within
the entire organization, leading to investigations and loss of productivity. Lastly, anti-
virus companies often release updates that catch particular viruses as soon as they appear,
but if a virus can infect 10 million computers (one estimate of the ILOVEYOU viral
infections from the Philippines, which caused billions of dollars in total collateral
damage) in the hours before a fix is released, the damage is done (Mills-Abreu, 2001, p.
2).
Even with the best anti-malicious software programs, to reduce the chances of being
exposed to and affected by viruses, Trojan horses, and worms, computer users should
adopt certain habits of safety. Strange or unexpected attachments that show up in e-mail
should not be opened. If one is not sure why someone one knows is sending a certain
message with an attachment, it is better to check with the sender whether the attachment
really came from them.
Kerberos Protocol
Kerberos is an authentication and authorization system from the MIT computer lab
originally developed for Project Athena in the 1980s (Jamsa, 2002, p. 302). Kerberos
provides mutual authentication between clients and servers, and server to server, unlike
other protocols (such as NTLM) that authenticate only the client. Kerberos operates
on the assumption that the initial electronic communications between clients and servers
11 A virus detector cannot find viruses it has never seen before.
are done on an unsecured network. Networks that are not secure can be easily monitored
by people who want to impersonate a client or server in order to gain access to
confidential information.
Here are a few basic concepts of Kerberos. A shared secret is shared only by those
needing to know the secret. The secret might be between two people, two computers,
three servers, etc. The shared secret is limited to the minimum parties necessary to
accomplish the required assignment, and it allows those who know the shared secret to
verify the identity of others who also know the shared secret. Kerberos depends on
shared secrets to perform its authentication. Kerberos uses secret key cryptography as the
mechanism for implementing shared secrets. Symmetric encryption, in which a single
key is used for both encryption and decryption, is used for shared secrets in Kerberos.
One side encrypts information, and another side successfully decrypts the information;
this is proof of the knowledge of the shared secret between the two entities (Todd, 2000,
p. 68).
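The proof-of-knowledge idea described above can be sketched as a challenge-response exchange. HMAC here stands in for the symmetric encryption Kerberos actually uses, and the secret and challenge values are made up.

```python
import hashlib
import hmac

# Shared secret known only to the two parties (made up for illustration).
SHARED_SECRET = b"k3rb-demo-secret"

def respond(secret, challenge):
    """Client: keyed response proving knowledge of the secret without
    ever transmitting the secret itself."""
    return hmac.new(secret, challenge, hashlib.sha256).digest()

def authenticate(challenge, response):
    """Server: recompute with its own copy of the secret and compare
    in constant time."""
    expected = hmac.new(SHARED_SECRET, challenge, hashlib.sha256).digest()
    return hmac.compare_digest(expected, response)
```

In practice the server issues a fresh random challenge for each attempt, so that an eavesdropper on the unsecured network cannot replay an old response.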
Figure Four: Illustration of the Kerberos protocol. (Source: http://www.f-secure.com/products/network_security/)
Public Key Infrastructure (PKI)
PKI provides the framework that allows one to deploy security services based on
encryption. Other infrastructures using purely symmetric (secret key) encryption have
been attempted and have failed due to the management, scalability, and life cycle issues
they encountered. PKI allows one to create the identities and the associated trust that one
needs for identification and authentication processes, and to manage the public/private
key-based encryption that provides a much more scalable solution than previous
encryption and security infrastructures (Nash, et. al, 2001, p. 6).
Biometric Authentication
Biometric identification gives DDBMSs strong authentication security. Biometric
authentication schemes rely on proving “what you are.” In these system plans, some
unique physical characteristic of the person being identified is used. This might include a
user’s fingerprint, retinal pattern, voiceprint or possibly the way in which one signs one’s
signature (Nash, et. al, 2001, p. 359). Biometrics is a good security measure because a
person's physical attributes are not easily duplicated. However, no system is perfect. For
example, in a system that uses fingerprint authorization, a user might be denied service
if his / her finger is cut, too dirty, or too sweaty (Nash, et. al, 2001, p.360). If a
person can't persuade the system that it is he or she, this is called a false negative
(Schneier, 2000, p. 143). Also, if someone steals a fingerprint or a picture of the user to
bypass authentication, biometric security is useless because the user cannot easily change
his / her fingerprints or face. This is not like a digital certificate, where the DB security
administrators can issue that person another one. Biometric identification will become more
and more popular as technology improves at correcting for day-to-day variations in
individuals' physical features.
Smart Cards
Smart cards are basically tiny microprocessors with memory embedded inside a piece of
plastic much like a credit card. Their uses range from simple phone card styles of
applications to complex PKI devices that support cryptographic technology.
Smart cards are built to a set of standards, with the ISO7816 being one of the most
important. The ISO7816 standards define the shape, thickness, contact positions,
electrical signals, protocols, and some operating system functions to be supported by
smart cards.
Smart cards are recognized by the small golden seal on one side of the card. This is a
set of gold electrical contacts used to connect to the smart card. In fact, the microprocessor
hides underneath those gold contacts (Nash, 2001, p.337). The most serious problem
with smart cards is that they can be stolen, thus allowing hackers to get into the DB.
Indeed, the DB system does not really authenticate the person; it authenticates the card.
Therefore, most secure DDBMSs today combine smart cards with other authentication
tools such as passwords to overcome this vulnerability.
Other types of access control mechanisms are capabilities, which are, effectively, tokens,
identification cards, or tickets that a user must possess in order to access an object, and
group authorizations, in which subjects are allowed access to objects based on defined
membership in a group, such as all employees of a single department or the collaborators
on a research project. The objects whose access is controlled in a computing system
include memory, storage media, I/O devices, computing time, files, and communication
paths. Although the nature of access control to these objects is the same, access is
controlled by different schemes. Memory can be controlled through hardware features,
such as base and bounds registers or dynamic address translation (virtual memory,
paging, or segmentation). Access to I/O devices and computing time (i.e., to the use of
the CPU) is more frequently controlled by requiring the intervention of the operating
system to access the device. The operating system is assisted by the hardware in that,
although the machine can run in two or more states, direct access to I/O devices is
permitted only from the more privileged system state. Files and communication paths
are typically controlled by checking permission at the first access. Such accesses are
requested from the operating system.
Backups – Even if hackers and their malicious software do not attack the system, there is
still the chance that it will be burned by fire, dropped, or be at ground zero when a soda
pop spills. Although backups may be expensive and require extra space, back up
computer files and equipment on a regular basis (Stein, 1998, p.114).
CHAPTER FIVE
ADVANCE CONCEPTS AND TERMINOLOGIES OF
SECURITY IN A DISTRIBUTED ENVIRONMENT
It is very difficult to notice things like users not logging on for long periods of time, users
who transfer too much data and/or stay on too long, multiple logins on different
computers, and logins from an incorrect machine account. In this chapter, we will
examine further details behind the ideas and concepts that we touched on earlier to
provide the necessary security for ever more complex DDBMSs.
Secrecy – Users must be able to keep data secret; that is, prevent other users from
peeking at confidential information.
Privacy – Users must be guaranteed that the information they give is used only for the
purpose for which it was given.
Impersonation – Because it is hard to identify users, typically only an IP address is
identified; this poses problems. These addresses can be spoofed to appear to come
from legitimate sources. This is an issue with non-distributed database designs as well.
Availability – This is about making sure that the use of data, programs, and other system
resources will not be denied to authorized persons, programs, or systems (Schneier, 2000,
p.122).
Multitier Architecture of DDBMSs
Having multitier architecture makes DDBMSs’ security more challenging. Possible
security services in a multitier architecture include the following: authorization,
authentication, nonrepudiation, confidentiality, and data integrity (Buretta, 1997, pp.98-
99).
Authorization control means permitting the right user to run the right transaction at the
right time. In a DDBMS, the creator becomes the “owner” of the objects, and there is an
authorization matrix. Distributed authorization control means that the organization
integrates remote user authentication, views, and user groups, and that everyone uses
the same authorization process (e.g., when one copy of a data item is updated, all other
copies are updated).
Authentication is the process of having each user, node, host, or application server prove
it is who it claims to be. Authorization is the process of ensuring that each authenticated
user has the necessary permission level to perform the requested tasks (Narins, 2000,
p.182). It is about the continuity of relationships, knowing whom to trust and whom not
to trust. UNIX, for example, has three possible access permissions for the owner of a
resource to grant: read, write, and execute. These permissions are independent of each
other. Someone who has only read permission for a particular resource cannot write to or
execute it. Someone who has only write permission can change the resource but cannot
read it. Someone who has both read and write permission can do both, and so on.
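The independence of these permission bits can be sketched in Python (a minimal illustration of the idea, not actual UNIX code; the names and values follow the conventional octal bit assignments):

```python
# Conventional octal bit values for read, write, and execute.
R, W, X = 4, 2, 1

def can(perm_bits: int, requested: int) -> bool:
    """True only if every requested permission bit is present."""
    return (perm_bits & requested) == requested

write_only = W              # e.g. a drop-box file: writable but not readable
print(can(write_only, W))   # True  - may change the resource
print(can(write_only, R))   # False - may not read it
print(can(R | W, R | W))    # True  - read and write together
```

Because each bit is checked separately, granting one right never implies another.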
Nonrepudiation is ensuring that authenticated and authorized users may not deny that
they used an allowed resource. Confidentiality prevents unauthorized users from
accessing confidential information. Data integrity preserves the genuineness of the data
from the sender to the receiver and prevents these data from being modified in an
unauthorized manner. Global integrity assures that data, programs, and other system
resources are protected against revelation to unauthorized individuals, programs, or
systems (Buretta, 1997, pp.98-99).
Issues in Replicated DDBMSs
There are several requirements for implementing adequate security in a replicated DB
environment, as with DDBMSs generally. The first is that all stored and/or displayed
passwords must be encrypted so that unauthorized persons and processes cannot steal
them. Passwords are compared against the secure perimeter or central DB for validity.
It may also be wise to lock a user’s node after a certain number of bad password
attempts.
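These two measures — never storing cleartext passwords and locking out after repeated failures — can be sketched as follows. This is an illustrative Python sketch only (the threshold, the use of a salted hash, and all names are my assumptions, not prescriptions from the text):

```python
import hashlib, os

MAX_ATTEMPTS = 3  # illustrative threshold; a real policy is site-specific

def hash_password(password: str, salt: bytes) -> bytes:
    # Store only the salted hash, never the cleartext password.
    return hashlib.sha256(salt + password.encode()).digest()

salt = os.urandom(16)
stored = hash_password("s3cret", salt)   # what the central DB keeps
failures = 0
locked = False

def try_login(password: str) -> bool:
    global failures, locked
    if locked:
        return False
    if hash_password(password, salt) == stored:
        failures = 0
        return True
    failures += 1
    if failures >= MAX_ATTEMPTS:
        locked = True   # node locked after repeated bad attempts
    return False

print(try_login("guess1"), try_login("guess2"), try_login("guess3"))
print(try_login("s3cret"))  # False - account is now locked
```

Note that once the lockout triggers, even the correct password is rejected until an administrator intervenes.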
Pseudo-user accounts, those established for systems to automatically log users on to the
DB, are common in distributed DB environments. These accounts must comply with the
organization's security policies, and knowledge of their passwords should be limited. All
file systems, raw devices, and/or DB structures used to store queued data and/or
messages must be secure also. This item points out the many avenues in a DDBMS that
are available to unauthorized users, and which must be protected. Finally, encryption
techniques must be integrated within the replication service. Encryption prevents the
breaching of the data transmitted over the network (Buretta, 1997, p.203).
DDBMSs may use either application- or data-level security. Application-level security
is programmed into the application logic and theory (Stollings, 1999, p.11). Each
application is responsible for governing user access to the data. Data level security is
implemented in the DB engine. Profiles of acceptable data items and operations should be
stored and checked by the DB engine against the end-user's permission status on each DB
operation. Donald Burleson of Oracle Tuning Consultants, a well-known DDBMS expert,
recommended that application-level security be removed and replaced with data-level
security to make the DDBMS more secure. The argument is that a skilled end-user with
commonly available development tools could easily write an application that does not
follow the organization's security policy. Such a security flaw may be created either
unintentionally by the designer or intentionally by someone with malicious intent.
When data-level security is implemented, such security holes are closed (Burleson, 1994,
pp.208-219).
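The difference can be sketched with an in-memory SQLite database standing in for the DB engine (the table, users, and profile structure are all hypothetical illustrations): every operation is forced through the engine-level check, so no application, however written, can bypass the policy.

```python
import sqlite3

# Hypothetical permission profiles stored with the DB engine, not in
# application code: (user, table) -> set of allowed operations.
profiles = {("clerk", "employee"): {"SELECT"}}

class GuardedDB:
    """Every operation passes through the engine-level check, so an
    application cannot skip the organization's security policy."""
    def __init__(self):
        self.conn = sqlite3.connect(":memory:")
        self.conn.execute("CREATE TABLE employee (name TEXT, salary INT)")
        self.conn.execute("INSERT INTO employee VALUES ('Ann', 30000)")

    def run(self, user, op, sql):
        if op not in profiles.get((user, "employee"), set()):
            raise PermissionError(f"{user} may not {op} employee")
        return self.conn.execute(sql).fetchall()

db = GuardedDB()
print(db.run("clerk", "SELECT", "SELECT name FROM employee"))  # [('Ann',)]
try:
    db.run("clerk", "UPDATE", "UPDATE employee SET salary = 99999")
except PermissionError as e:
    print(e)  # clerk may not UPDATE employee
```

With application-level security, the equivalent check would live in each application, where a skilled end-user could simply omit it.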
Issues in Multilevel Security
Some information is more sensitive than other information. It is common practice in many
organizations (e.g., the military, financial brokerage firms) to classify information
according to various security levels (e.g., top-secret, secret, confidential and
unclassified), and to assign an appropriate security authorization level to each user.
People working with this data need security clearances proportional to the highest
classification of information with which they are working. Someone with a secret
clearance, for example, cannot access top-secret information, but can see information
that is unclassified, confidential, and secret. A security classification level is assigned to
objects (data, files, and so forth) and a clearance level is assigned to users. The
classification and clearance levels are thus ranked (Bell & Grimson, 1992, p.286):
top secret > secret > confidential > unclassified
The rules governing this multilevel security model, where clearance (A) is the clearance
level of user A and classification (x) is the classification level of object x are:
User A can read object x if and only if clearance (A) ≥ classification (x),
User A can update object x if and only if clearance (A) = classification (x).
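The two rules above can be expressed directly in code. This is a minimal Python sketch of the check itself (the level names come from the text; everything else is illustrative):

```python
# Levels ranked as in the text: top secret > secret > confidential > unclassified.
LEVELS = {"unclassified": 0, "confidential": 1, "secret": 2, "top secret": 3}

def can_read(clearance: str, classification: str) -> bool:
    # Read allowed iff clearance(A) >= classification(x).
    return LEVELS[clearance] >= LEVELS[classification]

def can_update(clearance: str, classification: str) -> bool:
    # Update allowed iff clearance(A) == classification(x).
    return LEVELS[clearance] == LEVELS[classification]

print(can_read("secret", "confidential"))    # True  - reading down is allowed
print(can_read("secret", "top secret"))      # False - no reading up
print(can_update("secret", "confidential"))  # False - update only at own level
```

The equality rule for updates is what prevents a high-clearance user from writing sensitive data down into a low-classification object.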
Advantages of the Multilevel Model
One of the advantages of the multilevel model is that it not only supports strong content-
independent access controls, but also restricts the flow of information by ensuring that
information can only flow upwards throughout the design. It is not possible for a user
with a particular clearance level to simply copy sensitive data and thereby make it
accessible to other users with a lower clearance level.
Multilevel security was designed to manage multiple levels of classification in a single
system. In order to develop a multilevel security policy for a DDBMS, it is generally
assumed that the operating system on which the DDBMS is built also has a multilevel
security policy in operation at the file level. Such an operating system makes use of a
reference monitor that has the following characteristics:
It is called up every time a user requests access to an object.
It supports a multilevel security policy.
It is tamper-proof.
It is sufficiently small that it can be thoroughly tested and the code formally
verified as correct (Bell & Grimson, 1992, p. 287).
Improvement for the Multilevel Model
Multilevel security, however, might also need to separate mutually distrustful
users (Schneier, 2000, p. 127). For example, in a computerized medical DB, with
patients able to access their accounts, the DB manager wants to prevent Patient A from
seeing Patient B’s medical record, even though both accounts might be classified at the
same level.
Thus DDBMSs built on top of such operating systems normally use the operating
system’s reference monitor by careful mapping of the DB objects (relations, tuples,
attributes) onto files. The easiest method is to use the relation as the security granule
(i.e., the basic unit of data to which security controls can be applied). Thus every tuple
in a relation will be at the same classification level, and the relation is then mapped
directly onto an operating system file. If rows (or columns) of a relation have different
security classifications, then the relation must be separated into units of the same
classification level, which can then be mapped onto individual files. With horizontal
fragmentation by row, the original relation can be restored by applying the set UNION
operator to the fragments. With vertical partitioning by columns, it is imperative to
repeat the primary key in each partition; reconstitution is then carried out by
performing a relational JOIN on the fragments (Stollings, 1999, p.7).
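Both reconstructions can be demonstrated with an in-memory SQLite database (chosen purely for illustration; the fragment and column names are hypothetical):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Horizontal fragmentation: rows of different classification levels kept in
# separate tables (files); the set UNION restores the original relation.
cur.execute("CREATE TABLE emp_unclassified (id INT, name TEXT)")
cur.execute("CREATE TABLE emp_secret (id INT, name TEXT)")
cur.execute("INSERT INTO emp_unclassified VALUES (1, 'Ann')")
cur.execute("INSERT INTO emp_secret VALUES (2, 'Bob')")
rows = cur.execute(
    "SELECT * FROM emp_unclassified UNION SELECT * FROM emp_secret"
).fetchall()
print(sorted(rows))  # [(1, 'Ann'), (2, 'Bob')]

# Vertical partitioning: the primary key is repeated in each partition so
# a relational JOIN can reconstitute the relation.
cur.execute("CREATE TABLE emp_public (id INT, name TEXT)")
cur.execute("CREATE TABLE emp_private (id INT, salary INT)")
cur.execute("INSERT INTO emp_public VALUES (1, 'Ann')")
cur.execute("INSERT INTO emp_private VALUES (1, 30000)")
joined = cur.execute(
    "SELECT p.id, p.name, s.salary FROM emp_public p "
    "JOIN emp_private s ON p.id = s.id"
).fetchall()
print(joined)  # [(1, 'Ann', 30000)]
```

Repeating the primary key in every vertical partition is what makes the JOIN-based reconstitution possible; without it, the fragments could not be correlated.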
If content-dependent control is required, then the services of the operating system
cannot be used, and hence, security would have to be delivered by the DDBMS directly.
In this case, the DDBMS would, therefore, require its own “reference monitor”.
Work on reference monitors has led to the development of secure operating systems for a
DDBMS based on a trusted kernel, which contains all of the security-related functions.
One of the problems with many security techniques is that the code concerned with
authorization checks, and other security-related logic, is spread throughout the system
(query interface, system catalogue, etc.). The focus of the “trusted kernel approach” is to
centralize all security-related information and processes within the trusted kernel. In a
multilevel security system, the reference monitor corresponds directly to the trusted
kernel. However, having multilevel security does not guarantee reliability and
trustworthiness (Bell & Grimson, 1992, p.288).
The Granularity of the Data Object
As a data or security administrator, one fundamental decision that must be made is the
basic unit or granule to which security control can be applied. The granularity of the data
object can range from an individual attribute within a tuple of a relation to a whole
relation or even to the entire DB. The finer the granularity of data objects reinforced by
the DB system, the more precise the access rights can be. Generally speaking, a fine
granularity system will incur a much higher administration overhead than a coarse
granularity system. For example, if the granule is, in fact, the whole DB, then almost all
we need to do is maintain a list of authorized users for that DB, indicating what
type of access rights the users have (read, write, or modify). At the other extreme, if the
data granule is an individual attribute or data item, then we have to keep the same
information for each attribute or data item. Of course, defaults will be operative
depending on the security policy of the organization.
Security Policies for Control of Access
For control of access, there are four main types of security policies within organizations:
Need to know
Maximized sharing
Open systems
Closed systems
Many organizations adopt a policy that confines access to information on a need to know
basis. Thus, if a user requires read (write) access to a particular set of data objects, then
that specific user is granted the appropriate rights to those objects and to those objects
alone. For example, if security controls are placed at the relation level (i.e., the granule is
a relation), then once a user has access to any part of a relation, the user will have
automatic access to all attributes and to all tuples within that relation. If the granule is the
individual attributes, then it is possible to implement a very precise “need to know”
system.
However, while a “need to know” security policy might well be right for many high-
security applications, it might not be appropriate for normal commercial information
processing environments where data sharing is an important and fundamental goal of the
DB or web approach. At the other end of the security policy spectrum from need to
know, we have a policy of maximized sharing, in which the objective is to
facilitate as much data sharing as possible. Maximized data
sharing does not mean that all users have full access rights to all the data in the DB; only
those parts of the DB that really need to be protected are guarded. For example, those
who are using a patient DB for epidemiological research need full access to all the
clinical information about the patients, plus probably, data such as the patient’s age, sex,
occupation, etc. They do not, however, need access to personal identification information
such as names and social security numbers.
An open policy for access control means that, by default, users have full access to
the data unless access is explicitly forbidden. Such an approach facilitates data sharing
but has a disadvantage in that the omission or unintentional deletion of an access rule
results in (possibly classified) data being made accessible to everyone.
In a closed policy for access control, on the other hand, access to all data is implicitly
forbidden, unless access privileges to that data are explicitly permitted. Such a policy is
used by organizations that follow the “need to know” approach. Closed systems are
clearly more secure than open systems. Moreover, since the default in a closed system is
to forbid access, errors in the rules will restrict rather than open up access.
Information on access privileges is often called a user profile, which explicitly describes
an individual user’s access rights (privileges to sensitive materials) to data objects within
the system. User profiles are generally represented in the form of an authorization matrix
in which the users (or user groups) form the rows of the matrix, and the data objects form
the columns of the authorization matrix (Bell & Grimson, 1992 p. 285).
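An authorization matrix of this kind can be sketched as a nested mapping, here with a closed-policy default (the users, objects, and rights are purely illustrative):

```python
# A user profile represented as an authorization matrix: rows are users,
# columns are data objects, and cells hold the granted access rights.
matrix = {
    "alice": {"patients": {"read", "write"}, "billing": {"read"}},
    "bob":   {"billing":  {"read", "write"}},
}

def allowed(user: str, obj: str, right: str) -> bool:
    # Closed-policy default: anything not explicitly granted is forbidden.
    return right in matrix.get(user, {}).get(obj, set())

print(allowed("alice", "billing", "read"))   # True
print(allowed("alice", "billing", "write"))  # False - not in her profile
print(allowed("bob", "patients", "read"))    # False - forbidden by default
```

Note how an omitted entry restricts rather than opens access, which is exactly the fail-safe property claimed for closed systems above.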
Types of Access Controls
For access control at the DB level, four main types of controls can be identified:
Content-independent
Content-dependent
Statistical control
Context-dependent.
With content-independent control, a user is allowed (or not) to access a specific data
object without regard to the content of the data object. An example of content-
independent access control would be that ‘user A is allowed read access to the employee
relation’. Checking can be done at compile time, since the actual value of the data object
does not have to be examined.
With content-dependent control, gaining access to a data object by an individual user
depends on the content of the DB and hence can only be checked at run-time. This
method involves a greater degree of overhead than content-independent control. An
example of a content-dependent access control would be: “Employee A is permitted to
update the salary files of an employee provided the current salary is less than $15,000”.
Under statistical control, the user is permitted to perform statistical operations such as
SUM, AVERAGE, and so on, on the data but is not allowed to access the individual
records. For example, a user might be allowed to count the number of patients in the DB
suffering from a particular disease, but is not allowed to see the diagnosis of a particular,
individual patient (Bell & Grimson, 1992, p. 286).
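A naive statistical-control gatekeeper can be sketched with an in-memory SQLite database (the table, the keyword check, and all names are illustrative assumptions; a production system would need far more robust query analysis, and must also guard against inference from sequences of overlapping statistical queries):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE patient (name TEXT, diagnosis TEXT)")
conn.executemany("INSERT INTO patient VALUES (?, ?)",
                 [("Ann", "flu"), ("Bob", "flu"), ("Eve", "asthma")])

AGGREGATES = ("COUNT", "SUM", "AVG", "MIN", "MAX")

def statistical_query(sql: str):
    # Only statistical operations may touch the data; row-level selection
    # of individual records is refused.
    body = sql.upper().split(None, 1)[1]  # text after SELECT
    if not any(body.startswith(a) for a in AGGREGATES):
        raise PermissionError("only statistical queries are allowed")
    return conn.execute(sql).fetchall()

print(statistical_query("SELECT COUNT(*) FROM patient WHERE diagnosis='flu'"))
# [(2,)] - how many flu patients, but never who they are
try:
    statistical_query("SELECT name FROM patient WHERE diagnosis='flu'")
except PermissionError as e:
    print(e)
```

The user learns that two patients have the flu but is prevented from retrieving their identities.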
With context-dependent control, the user’s access privileges depend on the context in
which the request is being made. For example, a user may only be allowed to modify a
student’s grade in a course if the user is the course professor. A user in the personnel
administration department may be allowed to update an employee’s salary, but only
between the hours of 9 a.m. and 5 p.m. on weekdays.
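The salary example above can be sketched as a context check in Python (the department name and rule encoding are illustrative assumptions):

```python
from datetime import datetime

def may_update_salary(user_dept: str, when: datetime) -> bool:
    # Context-dependent rule from the text: personnel staff may update a
    # salary, but only between 9 a.m. and 5 p.m. on weekdays.
    in_hours = 9 <= when.hour < 17 and when.weekday() < 5
    return user_dept == "personnel" and in_hours

print(may_update_salary("personnel", datetime(2002, 11, 4, 10, 30)))  # True (a Monday)
print(may_update_salary("personnel", datetime(2002, 11, 3, 10, 30)))  # False (a Sunday)
print(may_update_salary("marketing", datetime(2002, 11, 4, 10, 30)))  # False (wrong dept.)
```

Because the decision depends on the clock and the requester's role at the moment of the request, such checks can only be made at run time, like content-dependent controls.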
Security Facilities in SQL
The database language SQL [ANS92] defines a standard means of accessing data
organized according to the relational model. The basic security features provided through
SQL for many DBs are generally implemented through a few different mechanisms
(Bell & Grimson, p. 288):
Figure Five: The ANSI-SPARC three-level architecture
The View Mechanism
The Authorization Rules
In the ANSI-SPARC division of the architecture of DDBMSs, the
security features fall into three levels (i.e., conceptual, internal, and external). This design
is motivated mainly by the need to provide concurrency and data
independence (Burleson, 1994, pp.192-193). Users can access the DB through a logical
external schema or user view (external level), which is then mapped onto the global
logical schema (conceptual level), which is then mapped onto the physical storage
(internal level). Data independence insulates applications using different views from
each other and from the underlying details of physical storage.
[The Figure Five diagram appears here: user languages access external schemas, which are mapped onto the conceptual schema, the internal schema, and the physical level.]
Views also provide users
with a means of logically structuring the data in such a way that it is meaningful to them
and of filtering out data that are not relevant (Narins, 2000, p.202). An important and
very useful byproduct of the “view mechanism” is the security that it provides. Users can
only access data through views or a mask and hence are automatically prevented from
accessing data that is not contained in their own view or mask.
The SQL view definition facility permits users to generate views that combine data from
any number of relations. The original relations of the conceptual schema on which the
views are defined are known as the base relations or base tables. In essence, the view
mechanism creates a virtual table, against which the user can issue queries in just the
same way as against a base table.
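The view-as-mask idea can be demonstrated with an in-memory SQLite database (a stand-in chosen purely for illustration; the table and column names echo the earlier epidemiological-research example and are otherwise hypothetical):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE patient (name TEXT, ssn TEXT, age INT, diagnosis TEXT)")
conn.execute("INSERT INTO patient VALUES ('Ann', '123-45-6789', 54, 'flu')")

# The view acts as a mask: researchers query it like a base table, but the
# identifying columns of the base relation are simply not there.
conn.execute("CREATE VIEW research AS SELECT age, diagnosis FROM patient")

print(conn.execute("SELECT * FROM research").fetchall())  # [(54, 'flu')]
try:
    conn.execute("SELECT ssn FROM research")
except sqlite3.OperationalError as e:
    print(e)  # the masked column cannot even be named
```

From the researcher's side, the virtual table is indistinguishable from a base table; the masked attributes are not merely hidden but absent.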
In a DDB with nodal autonomy, security of data will ultimately be the responsibility of
the local DDBMS (Bell & Grimson, 1992, p. 292). However, once a remote user has
been granted permission to access local data, the local site no longer has any means of
further ensuring the security of that data. This is because such an access entails copying
of the data across the DB network. Issues such as the relative security level of the
receiving site and the security of the network now have to be taken into account. There is
no point in having a secure site send confidential data over an insecure communications
line, or send confidential data to an insecure site. The following are some security
issues that are peculiar to DDBMSs:
Identification and authentication rules
Distribution of authorization rules
Encryption rules
Global view mechanisms
Identification and Authentication Rules
The term authentication is more complex than previously mentioned. It is also used to
refer to several distinct, though related, processes. A more demanding requirement is for
each party to be able to convince a judge or arbiter that a certain transaction was
performed by the other (nonrepudiation). Document content may be required to be
authenticated, in the sense that its integrity is checked and guaranteed to be accurate, in
addition to the original document authorship. In theory, authentication is quite distinct
from the provision of
confidentiality, but in practice authentication and confidentiality usually go together (Bell
& Grimson, 1992, p.292).
When a user attempts to access any computer DB system, they must first identify
themselves (e.g. by giving a name) and then authenticate that identification (e.g. by
typing a handle name and a password). In order to allow users to access data at remote
sites in a DDBMS, it would be necessary to store the user names (identification
information) and passwords (authentication information) at all sites (Bell & Grimson,
1992, p.293). This duplication of essential security information is in itself a security risk,
even though the passwords might be stored in encrypted form. Also, having a valid user
name does not necessarily mean that it is authenticated. The same goes for passwords.
A better approach that avoids duplication is to allow users to identify themselves at one
site, called their home site, and for that site to perform the authentication. Once that user
has been admitted by the DDBMS at their own site, all other sites will accept the user as
the right user. This does not of course mean that the user now has unlimited access to all
data in the DDBMS, as they would still have to satisfy local access controls on the data.
Naturally such a system depends on sites being able to satisfactorily identify themselves
first to one another, and this can be done in exactly the same way as for users by giving a
site identifier followed by a password.
Distribution of Authorization Rules
Since the data itself is distributed, it is best to store authorization rules for
access to a data object at the site at which that data object is stored. The alternative is to
replicate all rules fully at all sites. This would enable authorization checks to be
performed at compilation time for content-independent control and at the beginning of
execution for content-dependent controls. However, this early validation of access rights
to remote objects has to be offset against the cost of supplying and maintaining fully
replicated rules.
Global View Mechanisms
It is relatively straightforward to provide support for views (or masks) in a DDBMS, the
views or masks themselves being defined in the normal way on top of global relations (Bell
& Grimson, 1992, p.295).
Indeed, from the point of view of both data independence and security, the view
mechanism is, if anything, even more useful in a distributed environment. DDBMSs are
typically much larger and more complex than centralized DBs, and views provide a good
means of giving users a mask of only the relevant data. Moreover, from the
security viewpoint, there are likely to be many more users in the distributed environment,
and view mechanisms enable users to be classified into groups. However, complex
views, especially those requiring joins across several tables, can be very expensive to
implement and maintain. For a DDBMS that allows fragmentation and partitioning of
global relations, the run-time performance penalty is likely to be quite severe.
Fortunately, however, the use of powerful query optimizers can help by combining the
view definition with the global query and optimizing the whole process of view
materialization and querying against the view as a unit.
Object-Oriented Distributed Databases
Object-oriented databases were first developed to ease the complexity of DDBMSs. The
distributed object model is a valuable extension of object-orientation to the distributed
system environment. Instead of having objects located on a single system, objects are
physically distributed on different processors or computers. Distributed object systems
may use the client-server architecture with the objects being managed by servers, and the
clients making requests of the servers to access the objects’ methods (Narins, 2000,
p.203).
However, the object-oriented database model has complicated the development of
adequate DDBMS security. O-O databases do not figure prominently in the
distributed database market (Bobak, 1996, p. 438). As with stand-alone DDBMSs, the
relational model is by far the most popular. DDBMSs can, however, offer some object-
oriented features within the relational model. Such systems are typically referred to as
object-relational DDBMSs. They usually provide the ability to store and access data
types such as sound and video. These features may be significant to a DDBMS designer
as these large data types can generate a huge load on the network when propagated. O-O
databases are nonetheless rapidly gaining popularity relative to RDBs.
CHAPTER SIX
THE HUMAN FACTOR
Security is never black and white, and context often matters more than technology.
Security in the real world does not fit into neat little boxes. That is why there will never
be a perfectly secure DDBMS, or perfect security of any kind, for that matter: the people
who run and use these systems are not perfect, and their judgments, morals, and values
differ.
An attacker can be an outsider or an insider. For example, it can be a kid with a PC trying
to hack into a commercial DB for fun, a consultant or contractor upgrading old software,
or an employee trying to break into his company’s DB system to increase his/her salary.
People can steal hardware, software, and data. Because perpetrators are everywhere and
they are hard to catch, security managers have to assume that every person is a
potential hacker. And the only way to stop a hacker is to think like one: they have to
consider every possible angle of attack. Perhaps the great Benjamin Franklin said it
best: “the only way to be secure is to never feel safe”.
So who are the minds behind the malicious computer attacks? Why would they risk
going to jail for hacking? It is important to understand who the attackers are, what
they want, and how to deal with the threats they represent. But first, when it comes to
people, let’s start with the users.
The Users
The users often represent the weakest link in the security chain and are frequently
responsible for the failure of security systems. Many users do not understand risks and
security vulnerabilities. Some users do not understand computers. Some users believe
what computers tell them. For example, they think computers can never make a
mathematical mistake.
Security can be the opposite of convenience. One usually is sacrificed in favor of the
other. DB users always want to get the data anytime and anywhere. They seem to be
more concerned about their convenience, privacy,12 and anonymity than their company’s
security. Users become paranoid about passwords, biometrics, access cards and other
security requirements. They work around security procedures whenever possible and will
12
The Supreme Court has insinuated that it is a right guaranteed by the Constitution. Democracy is built
upon the notion of privacy. People want to be secure in their conversation, their papers, and their
computers. Your privacy rights might vary at work, depending on your company policy.
undermine the system at every turn. People often use the same passwords for all of their
computer needs and/or passwords that are easy for them to remember, which makes them
easy for hackers to crack. Sometimes users go as far as to place their passwords on a
self-stick note on their monitor, or to share passwords with co-workers trying to help
them get some work done around the office. When deadlines loom and work needs to
get done, security is the first thing compromised. It only takes the weakest
password for a hacker to crack the DB. Managers of the database must decide which
risks are acceptable and which security measures are practical. Also when protection is
too stringent, legitimate users will be irritated and complain. And if it is too lax, nobody
will notice until a security problem has occurred (Mullender, 1989, p.118). In fact, it is
human nature to think that bad things (e.g., security breaches) only happen to other people
or bigger establishments. There are no perfect security systems of any type. We need to
decide what particular balance is right for us and then create security that enforces that
balance.
Social Engineering
Social engineering is the hacker term for a con game: persuade an insider to do what the
attacker wants. It is very effective. Social engineering bypasses cryptography, computer
security, network security, and everything else technological. It goes straight to the
weakest link in any security system: the poor human being trying to get his job done and
wanting to help out if possible. Social engineering is more prevalent in a distributed
environment, since such environments are not small enough for everyone to know
everyone.
And if insiders ever turn to the other side and attack their company’s DB, they can be
impossible to stop because they are the exact people DB administrators are forced to
trust. Insiders might be less likely to attack a system than outsiders are, but systems are
much more vulnerable to these people (Schneier, 2000, p.48).
The Hackers
Many years ago, the term hacker denoted the brilliant, constructive computer geniuses
who launched the information revolution. They were people dedicated to the high-
minded ideal of demystifying technology, creating for the good, and helping others
through technology’s strengths. These were, indeed, the first hackers. There are many
individuals in the computer industry today who pride themselves on being hackers in the
truest sense of the word: people interested in solving problems and creating solutions
through the use of computers and technology (Jamsa, 2002, p. 24).
In recent years, however, the media has given the name hacker to a new class of programmer. These programmers (known within the computer industry by a variety of disparaging names, including “crackers,” “phrackers,” and “phreakers”) tear down computers and DB systems rather than help society. Crackers are programmers specializing in breaking into proprietary systems, either as a prank or to corrupt data. Phrackers are a special class of hackers who devote their time to hacking out programs that either deliver free telephone calls or otherwise penetrate the computers and DBs of telephone companies. Phreakers use stolen telephone information (including calling card numbers, cell phone numbers, and so forth) to access other computers. While many of these crackers and phrackers try to penetrate systems simply for personal pleasure, the number of crackers who participate in industrial espionage and sabotage is also increasing. Business, after all, is like war. Imagine business rivals breaking into competitors’ DBs, stealing inside knowledge or intellectual property, and gaining unfair advantages. This could put a company out of business before it could catch the thieves. Whether for personal satisfaction or for other purposes, these code breakers13 pose a singular threat to DDBMSs.
As mentioned above, hackers come in many different categories. Crackers specialize in cracking systems, and the majority of them are a cut below most hackers in technical expertise. The stereotypical cracker is a high-school kid on a home PC, clumsily crashing systems through brute-force password attacks (Jamsa, 2002, p. 35). These crackers pose the greatest risk to DDBMSs precisely because they do not really know what they are doing. A good analogy is an untrained gun user, who may be more dangerous than a trained one.
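The brute-force attack mentioned above is simple to sketch. The following toy Python example (the target password, alphabet, and function name are hypothetical, chosen purely for illustration) enumerates candidate strings in order and counts the attempts, showing both why a short password falls quickly and why each added character multiplies the attacker's work:

```python
import itertools
import string

# Toy sketch of a brute-force password attack: enumerate every string
# over a given alphabet, shortest first, until one matches the target.
def brute_force(target, alphabet, max_len):
    """Try every string up to max_len; return (guess, attempts) on a hit."""
    attempts = 0
    for length in range(1, max_len + 1):
        for combo in itertools.product(alphabet, repeat=length):
            attempts += 1
            if "".join(combo) == target:
                return "".join(combo), attempts
    return None, attempts

# A short, lowercase-only password falls almost instantly: the whole
# search space is only 26 + 26**2 + 26**3 = 18,278 strings.
guess, attempts = brute_force("cab", string.ascii_lowercase, 3)
print(guess, attempts)

# The defense is length and character variety, because the space
# grows exponentially with each added position.
for n in (3, 6, 8):
    print(n, len(string.ascii_lowercase) ** n)
```

Eight lowercase characters already yield over 200 billion combinations, which is why password length policies blunt this particular, unsophisticated style of attack.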
Hacker Profiling
Generally, most hackers are young, male, and have low self-esteem. They might be young “techno-junkies” with incredible egos and a need to share the details of their system conquests with one another. They usually do not have much money, but they often have a lot of time. They usually have their own counterculture: hacker names or handles, lingo, and rules (Schneier, 2000, p. 44). Most of them are driven by a desire to
13
Despite the existence of an appropriate and applicable definition for individuals committing computer crimes, the media has categorized all system intruders as hackers. Therefore, this paper will mostly use the generic term “hacker” to refer to the intruders against whom organizations are learning to protect themselves.
Master Thesis Security in Distributed Databases- Ian Lee
  • 3. CHAPTER ONE INTRODUCTION Context of the Problem Security is a goal in every organization. In today’s computer driven world, every business has some sensitive and crucial data and processes that need to be secured. For some organizations, it is absolute that they have a “breach-free” DB. Security is an emerging issue due to the rise of internal and external hackers, terrorists, and thieves who are costing businesses billions of dollars each year. Many organizations banks and hospitals, are becoming less concerned about cutting-edge technology actually working than they are about the security of their infrastructures. While the rise of DB computer technology used in the workplace has been exponential, business security methods and products have been minimal until recent years. More and more people are recognizing that their DBs are extremely valuable but vulnerable resources, and need to apply the appropriate safety measures. As with the protection of airports due to the recent terrorist events, the computer DB security industry is just beginning to catch up with its need for better protection of hardware, software, data, and networks. Ensuring the security of data in large DBs can be a difficult and expensive task. The level of security implementation should depend upon how valuable or sensitive the data is. One million dollars would be guarded more carefully than a couple of dollars. Thus, when deciding on the level of security to impose upon a DB, it is necessary to consider the implications of this violation for the organization that owns the data. If, for example, 3
  • 4. criminals gained access to a police computer system in which details of surveillance operations, and such were stored, the consequences would be very serious. There are so many vendors and products that provide security to DDBMSs. Between DB system security measures -- security kernels, access control, encryption, cryptography, firewalls, intrusion detection systems, auditing mechanisms -- it seems as if computer DB security is an issue of yesterday. Why then, are DDBMSs so insecure? Why are we hearing more and more computer security breaches in the media? Why are things not improving? These are some of the questions this paper attempts to probe. Statement of the Problem A computerized DB is built to make business easier and faster. However, security is most concerned, particularly when it involves a distributed system. Compared to centralized database management systems (CDBMSs), the component partitions of DDBMSs are distributed over various nodes of the network. Depending on the specific update and retrieval traffic, a distributed DB is usually better than a centralized system. However, DDBMS is more convoluted and harder to control. Therefore, it can be less secured and offers more opportunities for intruders than a CDBMS. Although security is a great concern, a distributed DB is necessary and better in many ways. This dispersion is especially important for large and complex organizations such as, researches, educational institutions, and human services. DDBMSs allow different organizations to connect and “shake hands” with one another, although they might use 4
  • 5. different hardware, software, and even DBs. In a DDBMS, organizations can interchange the role of being the server or the client depending on who is giving or requesting the information. The purpose of this proposal is to discuss several trends and concerns pertaining to security in DDBMSs. Businesses should design and implement security systems that enable people to perform their functions, excluding unauthorized individuals. The paper will then focus on the Internet1 and challenges inherent in protecting both these DDBMSs. The questions to be addressed are as follows: • What is a database management system? • What is a DDBMS? • What are the advantages of DDBMS over CDBMS? • What are the disadvantages of DDBMS? • What strategies can be used by companies to ensure security in their DDBMSs? • How can companies improve their distributed DB security processes, and what are the implications of these improved security processes on their businesses and internal information? 1 For reasons of simplification, the words “Internet” and “World Wide Web” are used synonymously. 5
  • 6. • What are the advantages and disadvantages of the many security techniques and products developed for protecting DDBMSs? • What are the motivations of a hacker? • What is the future of security in DDBMSs in supporting organizations? Significance of the Study In today’s business world, information is more readily available than ever. Companies’ DBs are inescapably connected, since businesses worldwide are increasingly dependent on digital communications. But the passion for speed and convenience has a price, and that is the increased exposure to security breaches. Companies need to understand and prepare for these kinds of risks. DDBMSs are very popular. One segment that is growing at an enormous rate is the Internet. By the year 2005, more than half a billion Internet users will be involved in Internet activities. There are over 100 million host computers connected to the Internet2 (Jamsa, 2002, p.37). Perhaps the Internet is growing too fast because it is full of security loopholes that could have been prevented if only the developers had taken their time to analyze potential problems. Security is even more vulnerable when the Internet is running continuously. Users who connect to the World Wide Web (WWW) via cable modems and Direct Service Line 2 For further information, see Appendix One: Statistics about the Growth of the Internet. 6
  • 7. (DSL) never have to dial in to their online provider. With the constant availability of the Internet, more opportunities for people to engage a little snooping would increase. These nosey individuals can cause all kinds of mischief from peering into personal files to planting malicious software such as worms, viruses, and Trojan horses. So how can a business implement and maintain a secure DDBMS? Organizations have been struggling with this issue for years and have professionals who analyze and locate security vulnerabilities, yet we hear news about companies being penetrated daily. If no one can maintain complete security, how can it be done? The goal, then, is not to have 100 percent security, but adequate security. Much like locking the car doors when a customer is parked in a mall, there are many precautions one can take to prevent an unauthorized person from walking in through a computer room. This paper is addressed to both college students and to IT professionals, who have or would like to have some basic understanding of computer security and/or DDBMSs. Several sectors and professionals, such as, network security specialists, DB administrators, computer consultants, security educators, and the industry, may also benefit from this report. The government may also benefit from this report through the Department of Homeland Security. After September 11, 2001, online traffic has been monitored heavily to avoid online and DB terrorist attacks. Many terrorists would love to get their hands on some of the United States’ top-secret information and to use it against America. This intervention should be prevented at all cost. Objectives of The Study 7
  • 8. The main objectives here are to analyze the DDBMS security theories, issues, and products, especially those concerning Internet and networking, and produce a meticulous report of a student in a master’s program. Other objectives are: • To delineate current DB security issues and strategies related to hardware, software, and data. • To identify the effects of these issues and specific DB security breaches • To develop attackers’ profiles, and for what they are seeking. • To analyze the trends of computer crimes • To explore the future of DDBMSs. Research Methodology - Review of the Literature3 for Secondary Research Principles of Distributed Database Systems, by Ozsu, M. and Valduriez, P. provides a clear and comprehensive understanding of the fundamentals of data communications essential to information and DB management. The second edition explores the latest trends of the Internet, Intranet, the Web, DDBMS, and e-security. It explicitly explains information communications in the business environment. Security In Computing, Second Edition, by Pfleeger, C. is a very detailed textbook which covers viruses, worms, Trojan horses, firewalls, e-mail privacy, DB systems, 3 There are many other reference books that are used herein. They can be found in the WORKS CITED section. 8
  • 9. wireless networks, and their security, including administrating its installations of PCs, UNIX, and Networked environments. Implementing and Managing E-security, by Nash, A., Duane, W., Joseph C., and Brink, D. This book discusses comprehensive coverage of this emerging technology that uses digital certificates to secure Internet transactions. Beginning with an introduction to cryptography, this book explains the technology that creates a public key infrastructure (PKI), and outlines the necessary steps for implementing PKI in both business-to- business (B2B) and business to consumer (B2C) environments. Secure Computers and Networks: Analysis, Design, and Implementation, by Fisch, E., and White, G. This book outlines the basic, most practical measures needed to take in protecting DB systems from intrusion by all but the most talented, persistent hackers. It enables CIOs, MIS department managers, systems and applications programmers, and a variety of network configurations to maintain security surveillances. Netware Security, by Steen, W. This book details how security is implemented in a variety of computing environments and the ways in which its checkpoints can be avoided by trespassers intent on doing harm. This book provides understandable descriptions of all security basics and includes a list of third-party utilities for increasing or testing the reliability of the DB. Secondary Research – Other Literatures Collected information also came from the Internet and bound literary periodicals. Sources on DB security progress, new innovations, software, and hardware 9
  • 10. improvements were researched, studied, and analyzed. Collected also was the statistical information that compares its relationship to the business world and observes how this change improved the management and economic goals of an organization. Other sources collected from the Internet and periodicals helped to develop hackers’ profiles and to identify tips and tricks in their trade. Several articles on DDBMSs security outline basic security objectives, such as confidentiality, availability, integrity, authenticity, and usage. These sources also included the following: wiretapping, hacking, message confidentiality violations, and security implementation. Primary Research – Interviews Many DDBMS security experts and specialists as well as hackers who have broken into many unauthorized DBs have been contacted regarding computer security. Websites, phonebooks, and many libraries have been researched to examine organizations that provide security for DDBMSs. Then, many non-profit organizations that regulate DB security standards have also been contacted. As for the hackers, a friend revealed a story about a person who was jailed for breaking into a bank’s DB system. This person also transferred some funds into his bank account for two years before getting caught. In an interview, and he disclosed the reason why it was so easy for him to commit these acts. Luckily, he also led me to some of his friends who were also hackers. The interviews with three former hackers have revealed the psychology and motivation behind their actions. Technical Advisor 10
  • 11. The technical accuracy of this paper has been checked with many DB security experts, especially Dr. J.W. Rider, who I taught me CIS515 AD (Strategic Planning for Database Systems) during the Summer of 2002. CHAPTER TWO DISTRIBUTED DB SYSTEMS & COMPUTER SECURITY What is a Database Management System? According to the 1996 Computer Desktop Encyclopedia, database management systems are defined as “software that controls the organization, storage, retrieval, security and integrity of data in a database” (Freeman, 1996, p.208). They accept requests from the applicants and instruct them on operating systems in order to transfer requested information. DBMSs may work with traditional programming languages, such as SQL and C++ , or they may include their own programming language for application development. These languages might allow new categories of data to be added, deleted from, or changed in the DB without any disruption to the current infrastructure. What is a Distributed Database Management System? DDBMS is a collection of multiple, logically interrelated databases distributed over a computer network (Fryer, 1997, p. 154). A DDBMS is then defined as the software management system that supports and protects the organization’s DB and makes the distribution mostly transparent to the end-users. The two important terms in these definitions are “logically interrelated (protocols for data collection and verification and use are management functions)” and “distributed over a computer network (data is 11
  • 12. stored in more than one location).” DDBMSs can be classified according to a number of criteria. Technically, some of these criteria are as follows: degree of coupling, interconnection structure, interdependence of components, and synchronization between components (e.g., firewall, backups, and security). Degree of coupling refers to the proximity of the processing elements that are connected to each other. This can be measured as the ratio of the amount of data swapping to the amount of local processing used in performing a task. If the communication is done over a DB network, there exists a weak coupling among the processing elements. However, if components are shared, there exists a strong coupling. Shared components can be both primary or secondary memory storage devices. As for the interconnection structure, one can talk about those scenarios that have a point-to-point interconnection between processing elements, as opposed to those which use a common interconnection channel (Narins, 2000, p.203). The processing elements might depend on each other quite strongly in the running of a task, or this interdependence might be as minimal as passing messages at the beginning of an execution and reporting the outcomes at the completion of the processing of the data. Data exchange between processing elements might be maintained by synchronous or by asynchronous means (Narins, 2000, p.237). But “handshaking” (making sure the other 12
• 13. side has received the correct packets of data and commands) and control must remain under the control of the DDBMS protocol and instructions. Note that some of these criteria are not entirely independent of one another. For example, if the relationship between processing components is synchronous, one would anticipate the processing elements to be greatly interdependent and, possibly, to work in a strongly coupled fashion (Stollings, 1999, p.8). It can be stated that the main reason behind distributed processing is to be better able to solve the complicated and messy problems that organizations face today by using a variation of the divide-and-conquer rule. If the necessary support components for distributed processing can be assembled, then it might be possible to solve these complicated problems simply by separating them into smaller pieces and distributing them to different software groups that work on different nodes, producing a system that runs on multiple processing elements but can work efficiently toward the execution of a common assignment. Distributed processing means that the whereabouts of the data determines where the application processing is completed. The application function must be “maneuvered” to the place where the data is found. To access data located at site 1, an application program at that location is needed. To access data at site 2, there must exist another program at that point. A logical process requiring data that is located at different sites, therefore, has to be split into two processes, with each piece implemented by a different application program. The data of one site may have to be moved to another location because processing the entire transaction – the atomic unit of execution – becomes 13
• 14. harder to guarantee. Therefore, in this situation, the application may need to maintain data integrity within the logical transaction. Usually, distributed processing involves two or more computers connected together in a network. Application software, systems software, and hardware are used to allow the enterprise to handle data that is divided among the multiple computing locations.

Figure One: A Small DDBMS (client, peer, and server nodes exchanging request and reply messages)

What are the Advantages of Distributed Database Management Systems?

DDBMSs can be heterogeneous, give better response time, can use different architectures, and have better performance, availability, and reliability (system resilience for disaster recovery). Also, in comparison to a centralized DBMS (CDBMS), a DDBMS can be “up”, “down”, or “partial” at times. This is ideal for businesses that cannot afford the DB system to be completely “down” (Schneier, 2000, p.122). 14
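The “up, down, or partial” behavior described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not any vendor's implementation; the site names and account record are hypothetical. A read is served by the first reachable replica, so the system remains partially available even while one site is down.

```python
# Minimal sketch of partial availability in a replicated DDBMS.
# Site names and records are hypothetical illustrations.

class Node:
    def __init__(self, name, data, up=True):
        self.name = name
        self.data = data      # local copy of (part of) the DB
        self.up = up          # simulated site status

    def read(self, key):
        if not self.up:
            raise ConnectionError(f"site {self.name} is down")
        return self.data[key]

def read_any(replicas, key):
    """Return the value from the first reachable replica."""
    for node in replicas:
        try:
            return node.read(key)
        except ConnectionError:
            continue            # that site is down; try the next one
    raise ConnectionError("all replicas are down")

replicas = [
    Node("baltimore", {"acct42": 100.0}, up=False),  # this site has crashed
    Node("denver",    {"acct42": 100.0}),            # still up
]
print(read_any(replicas, "acct42"))  # served by denver: 100.0
```

A centralized DBMS with its single site down could answer nothing; here the query is simply served by the surviving copy.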
• 15. Distributed DBs bring the benefits of distributed computing to the DB management domain. A distributed computing system consists of a number of processing elements, not necessarily homogeneous, that are interconnected by a computer network and that cooperate in performing certain designated functions. As stated previously, distributed computing systems partition a large, unmanageable problem into smaller segments and solve it efficiently in a coordinated fashion. The financial viability of this approach rests on two reasons: (1) more computing power is harnessed to solve an elaborate task, and (2) each autonomous processing element can be handled independently and develop its own applications. Distributed processing is a better match for the organizational structure of today’s widely dispersed enterprises, and such systems are more responsive and more reliable (e.g., better disaster recovery, control, etc.). More importantly, many of the current applications and utilizations of computer technology are inherently distributed. Multimedia applications such as medical imaging and manufacturing control systems are two examples of such applications. Most distributed DBMSs are designed to gain maximum benefit from data localization. In other words, the full benefits of reduced contention and reduced communication overhead can be obtained by a proper fragmentation and distribution of the database (Stollings, 1999, p.7). 15
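The data localization described above can be sketched as horizontal fragmentation: rows are split across sites by a predicate, so a regional query runs entirely at one site with no cross-site communication. This is a hedged illustration only — the site names, the routing rule, and the customer rows are all hypothetical.

```python
# Sketch of horizontal fragmentation for data localization.
# Site names, the routing predicate, and the rows are hypothetical.

FRAGMENTS = {
    "site_east": [],   # holds customers with region == "east"
    "site_west": [],   # holds customers with region == "west"
}

def site_for(row):
    """Routing predicate: which fragment owns this row."""
    return "site_east" if row["region"] == "east" else "site_west"

def insert(row):
    FRAGMENTS[site_for(row)].append(row)

def query_region(region):
    """Runs entirely at one site -- no cross-site communication."""
    site = "site_east" if region == "east" else "site_west"
    return [r["name"] for r in FRAGMENTS[site] if r["region"] == region]

insert({"name": "Acme",   "region": "east"})
insert({"name": "Zenith", "region": "west"})
print(query_region("east"))   # ['Acme']
```

A query that did not match the fragmentation predicate would have to visit every site, which is why the choice of fragmentation must follow the actual access patterns.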
• 16. Many other advantages of DDBMSs range from sociological reasons for decentralization to better economics. All of these can be distilled to four fundamentals that may also be viewed as advantages of DDBMS technology. Some include the following: handling of distributed data with different levels of transparency (transparent management of distributed and replicated data), data independence, network transparency, replication transparency, fragmentation transparency, and expanded availability and reliability through distributed copies and backup transactions. Others that might be advantageous include the following: easier expansion (adding more data, increasing database sizes, or adding more processors is much easier), controlling redundancy, providing continuous storage for program objects and data structures, permitting inferencing and actions, providing multiple user interfaces, representing elaborate relationships among data, and sustaining integrity constraints. Further improvements might be the potential to enforce standards and reduced application development time (Narins, 2000, p.478).

What are the Disadvantages of Distributed Database Management Systems?

Technically, a DDBMS is extremely complicated, and no vendor has all the solutions. One problem is that data collection is not usually done instantaneously or in real time. One location might have to wait for the computer system to stop or close before it can determine what data the others have. Other problems are: how to distribute and sustain data consistency, how to keep the communications costs down, how to avoid deadlock situations, and how to handle data synchronization. Company guidelines need to regulate how and where local data should be stored, named, accessed, updated, backed 16
• 17. up, restored, protected, and shared. Still more challenges for a DDBMS include the following: location transparency (the DB design has to be performed so that the most frequently accessed data is stored at as many nodes as possible so that access becomes local), site autonomy (local data is owned and managed with local accountability), referential integrity (the DDBMS has to ensure that the correct “referential” state exists within the system among the data that is represented), partition independence (a DDBMS that supports partition independence would allow horizontal and vertical partitioning, but would act as if the data were not partitioned at all), and replication independence (copies of files could be distributed for local processing effectiveness, but refreshed dynamically when updates are posted to the parent files). A DDBMS is usually more complex than a CDBMS, and this can be undesirable. First, data may be replicated (copied) in a distributed environment. A DDBMS can be designed so that the entire database (or portions of it) resides at different sites of a computer network. It is not necessary to have every site on the network contain the entire DB; it is only essential that there be more than one site where the DB resides. The possible duplication of data items is mainly due to trustworthiness, reliability, and efficiency considerations. Consequently, the DDBMS is responsible for (1) selecting one of the stored copies of the requested data for access in the case of retrievals, and (2) making sure that the consequence of an update is reflected in each and every copy of that data item. Second, if some sites crash (e.g., by either hardware or software malfunction), or if some 17
• 18. communication connections fail (making some of the sites unreachable) while an update is being processed, the system must ensure that the effects will be reflected in the data residing at the failing or unreachable locations as soon as the system can recover from the downtime. The third point is that since each site cannot have instantaneous information on the actions currently being carried out at the other sites, the synchronization of transactions on multiple sites is considerably more complicated than for a centralized system (Atre, 1993, p. 30). Other disadvantages include: cost, distribution of control, technical difficulties, and security. Cost – DDBMSs require additional hardware, thus creating a financial burden4. Perhaps a large fraction of the expense lies in the fact that additional and more complex software and communication may be essential to solve some of the technical problems. The development of better software engineering techniques (e.g., distributed filters, debuggers, etc.) should help in this regard. The largest financial constraint, however, might be the replication of effort (manpower). When computer facilities are set up at different locations, it becomes necessary to hire skilled or trainable people to maintain these facilities (Stollings, 1999, p.8). This usually results in a rise in personnel costs and in the cost of data processing operations. Therefore, the trade-off between maximizing profitability due to more efficient and timely use of information and the increased manpower costs has to be analyzed carefully 4 Given the trend toward decreasing hardware costs, this is not a significant factor. 18
• 19. to obtain the optimal cost results. Distribution of Control – This point was stated previously as an advantage of DDBMSs. Unfortunately, distribution creates problems of synchronization and coordination. Distributed control can, therefore, easily become a liability if care is not taken to adopt adequate protocol policies to deal with these command and control issues.

Good Security Is Good Business

One of the major advantages of centralized databases has been the control they supply over access to data. Security can easily be controlled in one central location with the DBMS enforcing the rules. However, a DDBMS involves a network, which is a medium that has its own security requirements. There are serious problems in maintaining adequate security over computer DBs and networks. Thus the security problems in DDBMSs are by nature more difficult than in centralized ones. So how can organizations maintain secure DDBMSs? Businesses have been struggling with this issue for many years, and there are many professionals who analyze and attempt to solve security vulnerabilities, yet we constantly hear news about well-established companies’ DBs being hacked despite their best efforts to keep this information quiet so as not to alarm the public about their security problems. For example, when a hacker from Russia broke into Citibank’s DB and stole $12 million in 1995, Citibank announced that it had been hacked into and instituted tighter security measures to prevent such attacks from occurring in the future. Nevertheless, thousands of investors panicked and 19
• 20. withdrew their savings immediately after Citibank’s announcement. Ultimately Citibank recovered, but the damage had been done (Schneier, 2000, p.391).

Public Concerns for Internet Security

The Internet is certainly the most sophisticated DDBMS ever developed. Many companies use the World Wide Web to communicate and network because it is too complex and expensive to connect their sites with dedicated physical cables. Some organizations use virtual private networks (VPNs) running over the Internet to connect their DDBMSs. The Internet therefore contains millions and millions of computers, connected in an inconceivably elaborate physical network. Each computer can have hundreds of software programs running on it; some of these programs can interact with other programs on the computer; some of them interact with other programs on other computers across the network. The system accepts user input from millions of people around the world, sometimes all at the same time (Schneier, 2000, p. 6).

A Brief History of the Internet and Web Security

The Internet was once known as the Arpanet, a network that was created by the Advanced Research Projects Agency (ARPA – now known as the Defense Advanced Research Projects Agency, DARPA). It was first started around 1969-1970, during the Cold War tension, and was intended for only a few trusted military personnel and researchers. Thus, the Internet back then was for informational purposes only (Narins, 2000, p.202). 20
• 21. Then came the personal computer revolution, and by the early 1990s more and more people were beginning to turn to the computer for personal as well as business reasons. Nowadays, it seems everyone is using the Internet. Electronic commerce, or e-commerce – doing business on the World Wide Web – is perhaps the greatest thing in the modern corporate world. Many large and small companies conduct their business daily with their customers and business partners around the world via the Internet (Todd, 2000, p.141). They might even set up an e-portal to allow their employees to connect to their company’s DB from home. So, as the Internet became more available to the masses, more and more people surfed the web to purchase items, send e-mails, share digital photos, and conduct business from their computers, transferring sensitive information online without a second thought. It did not take long before computer DB and Internet security became a major social and economic problem. People who break into other people’s computers, at the very least, can cause mischief, or at the very worst, can destroy a company’s sensitive information and even its entire infrastructure. The security advisories put out by the CERT Coordination Center5 and Symantec in the past few years reveal that hackers are exploiting a wide variety of database platforms via the Internet, especially DDBMSs, because of their lack of centralized security monitoring. 5 The CERT Coordination Center (CERT/CC) is a partially government-funded organization reporting mainly about Internet and DB security. It is located in Pittsburgh, PA, and its Web address is: http://www.cert.org/ 21
• 22. Businesses might tighten their security by showing certain fields only to certain users. One group of users might be allowed to see a set of common fields (employee name, employee number), whereas only certain users might be allowed to see specific fields (health insurance information, salary). This is a conventional computer security problem solved by authentication protocols and access control lists. There are other problems. Much more difficult is dealing with the situation where one person is allowed to make queries and see aggregate information but is not permitted to see individual entries. The problem is one of “inference”; this user can often infer information about individuals by making queries about groups. A possible solution to this problem is to scrub the data beforehand; data from the 1960 U.S. census, for example, was secured in this manner. Only one record in a thousand was made available for statistical analysis, and those records had names, addresses, and other sensitive data deleted. The Census Bureau also used other methods: data with extreme values were suppressed, and noise was added to the system. These sorts of protections are complicated, and subtle attacks often remain. If someone wants to know the income of the one wealthy family in a neighborhood, it might still be possible to infer it from the data by applying some educated guesses. Another possible solution is to limit the types of queries that someone can make to the DB. This is also difficult to perfect. A hacker knowing the kinds of queries that are allowed will try to figure out some mathematical way of deducing the information one 22
• 23. wants from the information one is permitted to get. And situations can worsen if the hacker is allowed to add and delete data from the DB. If a searcher wants to learn something about a particular person, one might be able to add a couple hundred records to the DB and then make general queries about the population to which one added his/her target. Since one knows all the data one added, one can infer data about one’s target. A whole set of related attacks follows from this idea.

CHAPTER THREE
COMPUTER CRIMES

Software Bugs

A secure DB is one that only authorized users can use, and no one else. It also must be free of undesirable effects or “bugs”. As anyone who has worked with computers for any length of time has discovered, software bugs in even the simplest DBs are very prevalent. Most bugs are inadvertent programming errors, but some are intentional bits of code placed in the program by the vendors in order to aid debugging and then forgotten by the rollout date (Stein, 1998, pp.155-162). The larger and more sophisticated a DB is, the more likely it is to contain bugs that can cause security problems. Even a simple bug can be disastrous if not caught early. For example, an infection can cause a spreadsheet document to be corrupted. This can lead to printer format contamination, and eventually the whole DB network is compromised. Many computer hackers are constantly looking for bugs in server software because each bug represents a potential portal of entry into the DB. By carefully crafting the input fed to the DB server or by manipulating the server’s environment in a controlled way, the wicked 23
• 24. hacker can trick the software into performing actions that no one could ever imagine (e.g., giving the hacker access to the inside of the DB system).

Distributed Denial-of-Service Attacks

I have personally experienced my e-mail account being unable to accept any more messages because some companies were sending me a “ton of junk mail” (a.k.a. spam) every day. The idea behind distributed denial-of-service is the same. These attacks begin when the attackers break into hundreds or thousands of insecure computers, called zombies, on the DB network or the Internet, and install an attack program. Then the perpetrator coordinates them to invade the target at the same time. The target site is attacked from many places at once; it cannot defend against them all, and the whole DB crashes and fails. These attacks are incredibly difficult, if not impossible, to defend against. In a traditional denial-of-service attack, the victim computer might be able to figure out where the attack is coming from and block such connections. But in a DDBMS, there might not be a single source of attack. Also, if it is a public Internet site, the computer cannot shut down all connections except for the ones it knows can be trusted. Although no general defense exists, companies can still protect themselves against distributed denial-of-service attacks by continuously monitoring their network connections as well as having backup servers and routers. Sometimes the specific bugs exploited in the attacks can be fixed, but many cannot. 24
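One form of the connection monitoring mentioned above can be sketched as a per-source rate limiter. This is a simplified illustration, not a complete DDoS defense — a distributed attack spreads its load across many sources precisely to evade per-source limits — and the window, limit, and addresses below are hypothetical.

```python
# Sketch of per-source connection monitoring: any source exceeding a
# request budget inside a sliding time window is dropped.
# The thresholds and source addresses are hypothetical.

import time
from collections import defaultdict, deque

WINDOW = 1.0      # seconds
LIMIT = 5         # max requests per source per window

history = defaultdict(deque)   # source address -> recent request times

def allow(source, now=None):
    now = time.monotonic() if now is None else now
    q = history[source]
    while q and now - q[0] > WINDOW:   # drop requests outside the window
        q.popleft()
    if len(q) >= LIMIT:
        return False                   # looks like flooding; refuse it
    q.append(now)
    return True

# a single zombie hammering the server is throttled after its budget
results = [allow("10.0.0.7", now=0.1 * i) for i in range(10)]
print(results.count(True))   # 5 of the 10 requests get through
```

A well-behaved source that stays under the budget is never affected, which is the whole appeal of this kind of monitoring as a first line of defense.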
• 25. Malicious Software

Many distributed denial-of-service attacks have come from worms, viruses, and Trojan horse programs used to propagate the zombie tools and then to automatically launch the attack upon some code word from a public forum. Worms, viruses, and Trojan horses can also do other harm to computers and DBs. Worms are devastating programs written by people to scan for and exploit vulnerabilities in software (Narins, 2000, p.108); they usually replicate themselves on disk and in memory and eventually invade other computers. A worm might multiply itself in one computer so often that it causes the computer, and eventually the entire DB, to crash. Sometimes programmed in separate segments, a worm is introduced secretively into the target system either mischievously or with the intent of sabotaging information. Viruses are software used to infect a computer. Much like the worm, the virus code is usually buried unexpectedly by the programmer within an existing program. Once that program is executed, the virus code becomes active and attaches copies of itself to other programs in the computer. Infected programs replicate the virus to other programs and eventually to the entire DB. Viruses can infect the “boot sector” of floppy disks or hard drives, infect other software that uses macros, or spread as attachments to e-mail all over the Internet. Many people fear that there are many more viruses planted out there, which are dormant but expected to “come alive” at certain times6. 6 Note that a virus cannot be attached to data. It must be affixed to an executable program downloaded into or already installed on the computer. The virus-attached program must be executed in order to activate the virus. 25
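Because, as footnote 6 notes, a virus must attach itself to an executable program, one common countermeasure — sketched here only as an illustration, not something the text prescribes — is integrity checking: record a cryptographic hash of each clean program and flag any later change. The file name and contents below are hypothetical stand-ins.

```python
# Sketch of integrity checking: since a virus must modify an executable
# to infect it, comparing a stored hash of the clean program against the
# current contents reveals tampering. Contents here are hypothetical.

import hashlib

def fingerprint(contents: bytes) -> str:
    return hashlib.sha256(contents).hexdigest()

# baseline recorded while the program was known to be clean
baseline = {"payroll.exe": fingerprint(b"original machine code")}

def check(name: str, contents: bytes) -> bool:
    """True if the program still matches its clean baseline."""
    return fingerprint(contents) == baseline[name]

print(check("payroll.exe", b"original machine code"))        # True
print(check("payroll.exe", b"original machine code+VIRUS"))  # False
```

The baseline itself must of course be stored where malware cannot rewrite it, or the check proves nothing.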
• 26. Another type of malicious software is the Trojan horse, which is designed to fool the naïve user into thinking that it is a useful or benign program. For example, one type of Trojan horse – the logic bomb – is a piece of code that a programmer writes into a large software application that starts misbehaving if, for example, the programmer is ever deleted from the payroll file. Back Orifice is a popular Trojan horse for Microsoft Windows. If it is installed on a computer, a remote hacker can effectively take it over via the Internet. The hacker can upload and download files, delete files, run programs, change configurations, take control of the input and output devices, and see whatever is on the server’s monitor (Schneier, 2000, p. 156).

Figure Two: Examples of anti-Trojan-horse, anti-worm, and anti-virus software (Sources: http://www.diamondcs.com.au/ and http://www.all.search.shopping.yahoo.com/search/all?p=anti+virus)

CHAPTER FOUR
SECURITY PLANS FOR DDBMSS

As previously stated, organizations must prevent unauthorized users from accessing their data and resources illegitimately. Different organizations require different levels of security protection. The DB security administrator should first develop a checklist – a plan for comprehensive DB security. A risk analysis might be necessary to determine the 26
• 27. susceptibility of a computing system to various kinds of security failure. Risk analysis can be performed by critiquing the general threats to the security of the system, such as programmer sabotage, and then determining whether the threats could affect the system in question. A threat that could affect a system adversely is called a vulnerability. Companies should build a practical DDBMS architecture with the appropriate security in mind. They should start with a good security policy with clearly defined goals and aims that are justifiable, coherent, and consistent. For example, users should be advised to make their passwords hard to crack7 and not to share them with anyone. No one should bring non-company-approved software to work or download programs from the Internet. According to Dr. Kris Jamsa, a best-selling author of nearly 30 books on all aspects of DB computing, below is a list of appropriate security questions to consider:
• What universal groups are necessary in the organization?
• What global groups are necessary in the organization?
• How will we utilize the built-in local groups?
• What local groups are necessary in the organization?
• What filters are necessary for group policies in the organization?
• What policies are required for Active Directory objects in the organization?
• What policies are required for the file system in the organization?
• What policies are required for registries in the organization?
• What policies are required for system services in the organization?
7 To enhance security, passwords should be lengthy, with a combination of letters and numbers mixed together. 27
• 28.
• What policies are required for network accounts in the organization?
• What policies are required for local computers in the organization?
• What policies are required for Event Logs in the organization?
• What policies are required for restricted groups in the organization?
• How will one perform network logon and authentication in the organization?
• What approach does one take with smart cards in the organization?
• What approach does one take with certificate mapping in the organization?
• How does one implement Public Key Infrastructure within the organization?
• How does one implement the Encrypting File System in the organization?
• How will one provide authentication for remote access users?
• How does one protect the organization’s Web site?
• How does one implement code signing in the organization?
After answering the questions in depth, the DB administrator needs to design adequate physical, hardware, software, and data security and test it by hiring some qualified security experts and hackers to analyze and evaluate the beta system at a remote location before implementing the new company security program. If upgrading to a better security system, make sure all potential safeguarding problems are accounted for. Start with the weakest security link first, and then move to the next most vulnerable part ASAP.

Physical and Hardware Security 28
• 29. Physical security of processing computers and networks in a DB influences how security must be implemented. Without physical network security, the ability to perform encryption algorithms efficiently becomes critical. Without physical security of nodes, encryption can be used to keep data private, but data can still be deleted maliciously (Mullender, 1989, pp.30-31). The server, the operating system, the wires in an office LAN, the dedicated phone lines, possibly ISDN or DSL, the dial-up connections, satellites, the fiber optics, etc. all must be installed, configured, and operated correctly and securely. It is a good idea to have tamperproof, tamper-resistant, or tamper-evident hardware. There are many precautions that can be taken to prevent a stranger from walking in through a computer’s front door, such as locking the door or having an armed individual guard it.

Software Security

Computers make access requests only under the control of software programs. A program operates under the control of, or in the name of, a user. Thus, the accesses made by a program are in all likelihood the result of requests from a user. However, programs can also be altered; that is, a program is actually a series of bits in memory, and those bits can be read, written, modified, and deleted as any data can be. While a user’s program may be designed to read a file, the program, with minor modification to the executable code, could instead write or expunge the file. Those modifications can be the result of hardware mistakes and failures, a lapse in the logic of the program, or a change influenced by some other program in the system. Hardware errors do not happen often, 29
• 30. and checking circuitry is built into computers to detect such mistakes before they affect a computation. Unintentional or not, user errors are much more difficult to discover and prevent. Programs are stored either in files or main memory, or both. Thus, the first line of defense against program errors is memory protection, which is designed to prevent one person from deliberately or accidentally accessing files and memory assigned to another person. While these controls protect users from one another, they are far less effective at safeguarding users from errors in their own program logic (i.e., computer bugs or logical errors). A second protection against program errors is careful and thorough software engineering, including structured analysis and design, program analysis, and the use of professional programming management. Such programming practices help to protect a user from unintentional errors. The third form of defense against program errors is software testing. Unfortunately, the best that testing can confirm is the presence of errors and not their absence. Nonetheless, a thoroughly tested program can give credibility to the assurance that a program is error-free. Of special interest is that the software that controls access of any subject to any object must also protect itself against access by all unauthorized subjects. Nevertheless, the access control program itself represents a significant vulnerability: defeat or bypass the access control program, and the thief can obtain unhindered access to all system services. For this reason, on more secure computing systems, the access control function is separated among several different modules: one to control access to memory, another 30
• 31. to control access to files, etc. In this way, defeating one module does not immediately open up all the DB system’s resources to illegitimate uses. A related question is confirmation of the validity of the access control software itself, ensuring that it will permit all authorized users and only those. Clearly, access control procedures are valid only if they are implemented properly. Good software engineering practices for the design and installation of the access control software are combined with explicit control over its modifications, once installed, to ensure the correct and effective functioning of the access control software. Also, if using software programs from vendors, the most updated versions should be used because they are more likely to be free of bugs.

Data Security

Maintaining the security of data such as a payroll file or a digitized graphical image requires consideration of the security of the entire computing DB system, including the internal data. In today’s information age, data security has become of significant importance because of the literally millions of users in cyberspace who might, accidentally or intentionally, invade and compromise the integrity of data on any computer connected to the Internet and other DBs.

DDBMS Products and Techniques

Domain Name Service (DNS) Security

DNS is basically a large DDBMS. Most computers on the Internet – nodes, routers, and hosts – have a domain name like “maryland.edu” or “hotmail.com.” These names are 31
• 32. designed to be remembered by humans and are used to build things like URLs and e-mail addresses. However, computers do not understand domain names; they understand numeric IP addresses like 192.168.10.0. IP addresses are then used to route packets around the DB network. Among other things, the DNS converts domain names to IP addresses. When a computer is handed a domain name, it asks a DNS server to translate that domain name into an IP address. Then it knows to which computer to send the packet of information. The problem with the DNS is that it lacks security. So when a computer sends a query to a DNS server and receives a response, it assumes that the response is correct and that the DNS server is honest. However, the DNS server may not be honest, since hackers could have compromised it. And the reply that the computer gets from the DNS server might not even have come from the DNS server; it could have been a faked reply from an imposter. If the attackers make changes in the DNS tables (the actual data that translates domains to IP addresses and vice versa), computers will automatically accept the validity of the modified tables. Therefore, it is not difficult to imagine the kinds of computer invasions that could result8. The attackers can trick users into believing that they are coming from trusted sources (change the DNS tables to make it look like the attackers’ computers have trusted IP addresses). The attackers can hijack DB network connections (change the DNS tables so that users 8 For more information, go to Appendix One to see the number of packets required to reach 50% success probability for various numbers of open queries. 32
• 33. wanting to connect to legitimate websites actually make the connections with the imposters). Attackers are capable of doing all sorts of things. And DNS servers might have an automatic update procedure: if one DNS server records a change, it tells the other DNS servers, and they believe it. So if the attacker can make a change at a few key points, that change can propagate across the entire DB or the Internet.

Cryptography

The word cryptography comes from the Greek for “secret writing”. Cryptography is one important tool used to preserve both the confidentiality and the integrity of a DB (Stein, 1998, p. 15). Confidential data are encrypted to prevent their disclosure to unauthorized people. Furthermore, encryption usually prevents unauthorized and undetected modification: someone might be able to scramble the bits of an encrypted text so that the bits decrypt to nothing meaningful, but without breaking the encryption, no one can alter a specific field of the underlying plaintext (the original message) data from “one” to “two” (Ralston, et al., 2000, p. 249). One significant use of cryptography is to compute a cryptographic checksum, a function that depends upon every bit of a section of data and also upon the key used for the cryptographic function. For comparison, a weak, non-cryptographic checksum is the parity of a string of bits; any odd number of changes to the string affects the parity. The cryptographic checksum is computed when the section of data is created and again when it is used; if the data has been changed between origin and use, the value of the checksum at time of use will almost certainly not match that computed at time of origin, a 33
signal that the data has been altered (Schneier, 2000, p. 88). It is recommended that users of cryptography change their encryption keys regularly; if hackers steal or figure out a key, they can read the plaintext. Cryptography, while very powerful, is still subject to security breaches. The reason is that cryptography is a branch of mathematics, which is logical. In the physical world, however, things can be messy and unpredictable. Cryptography is based on hypotheses and theories, and in order to accept its conclusions, one must also accept the premises, the models, and the relationship between the theory and reality. That is sometimes very difficult to achieve. People do not always follow the rules; sometimes they do just the opposite. Hardware and software can be the same way: they break down, misbehave, and fail. Also, it does not matter how good the cryptography is or what the key length is; weak passwords will help hackers break into the DB. For example, hackers can use L0phtCrack, an NT password-auditing tool that can test an encrypted password against an 8-megabyte dictionary of popular passwords in seconds. E-Mail Security E-mail uses both the OpenPGP and S/MIME protocols. OpenPGP is the protocol in PGP (Pretty Good Privacy) and its variants; S/MIME is the Internet standard protocol in just about everything else. Some e-mail uses cryptography, which performs two valuable functions: it provides a digital signature for authenticity and encryption for privacy. In
order to send an encrypted mail message, first the message that contains the sensitive data is written. The sender obtains the public key of the receiver. A bulk encryption key is generated, and the sensitive data is encrypted with this key. After the document is in ciphertext, the bulk encryption key itself is encrypted using the receiver's public key. The message is now ready to be delivered. The receiver uses his or her private key to recover the bulk encryption key, and then uses the bulk encryption key to decrypt the document and uncover the plaintext (Todd, 2000, p. 371). Digital Signatures Digital signature techniques are also used by e-mail. They help assure both the sender and the receiver that the message has not been tampered with during transmission. When the user indicates through the e-mail interface that the message should carry a digital signature, the message is hashed to produce a message digest, which is then encrypted with the sender's private key. The document and the signed message digest are sent to the receiver, and the e-mail interface indicates to the receiver that the message contains a digital signature. To verify the digital signature, the sender's public key is used to decrypt it, and the receiver independently computes a 128-bit digest of the document. If the decrypted digital signature matches the generated 128-bit number, the receiver knows that the sender is really the person indicated on the message and that the body of the message has not been tampered with in transit. Mathematically, the ability to sign a message using public-key encryption depends on the fact that the encryption and decryption algorithms are the inverse of one another, that is: (If a key is n bits long, then there are 2^n possible keys; a 128-bit key thus allows an astronomical number of possibilities.)
This makes it practically impossible for hackers to recover the plaintext using "brute-force" key-search attacks.
p = d(e(p)), where p = plaintext, e = encryption procedure based on the encryption key, and d = deciphering procedure based on the decryption key. Of course, the physical world is more complicated. Just as one does not encrypt messages directly with public-key encryption algorithms (one encrypts a message key), one also does not sign messages directly. Instead, the signer takes a one-way hash of the message and then signs the hash. Signing the hash is a few orders of magnitude faster, and there can be mathematical security problems with signing messages directly. Also, most digital signature algorithms do not actually encrypt the messages that are signed. The signer makes a calculation based on the message and his or her private key to generate the signature, which is affixed to the message. The other end makes another calculation based on the message, the signature, and the public key to verify the signature. Anyone, even without the private key, can verify the signature. Encryption Rules Encryption is intended to overcome the problem of people who bypass the security controls of the DDBMS and gain direct access to the data or to the DB, or who tap the communication lines (Bell & Grimson, 1992, p. 293). It is standard practice to encrypt passwords, but it is possible also to encrypt data and messages, and this encryption may well be desirable in a DDBMS environment. The plaintext is subjected to an encryption algorithm, which scrambles the plaintext into "ciphertext" using an encryption key.
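The inverse property p = d(e(p)) can be demonstrated with a toy RSA key pair. This is a sketch only: the primes below are tiny textbook values chosen for illustration, not a usable key size.

```python
# Toy RSA sketch of the inverse property p = d(e(p)).
# Tiny primes for illustration only; real keys use moduli of 1024+ bits.
p_prime, q_prime = 61, 53
n = p_prime * q_prime                 # public modulus: 3233
phi = (p_prime - 1) * (q_prime - 1)   # 3120
e_key = 17                            # public exponent, coprime with phi
d_key = pow(e_key, -1, phi)           # private exponent: modular inverse of e_key

def e(p):
    """Encryption procedure based on the (public) encryption key."""
    return pow(p, e_key, n)

def d(c):
    """Deciphering procedure based on the (private) decryption key."""
    return pow(c, d_key, n)

plaintext = 65
assert d(e(plaintext)) == plaintext   # p = d(e(p))
# Signing applies the procedures in the opposite order: the same
# inverse relationship makes signatures verifiable with the public key.
assert e(d(plaintext)) == plaintext
```

The second assertion is what makes hash-then-sign work: a value transformed with the private key can be checked by anyone holding the public key.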
Unless the recipient knows the encryption key, it will be virtually impossible to decipher the ciphertext (Bell & Grimson, 1992, p. 294). In fact, the dominating issue is not the difficulty of breaking the code, but rather the security of the encryption keys (Schneier, p. 86). There are two principal methods of encryption: the Data Encryption Standard and public-key encryption. The Data Encryption Standard (DES) was adopted by the National Bureau of Standards in 1977 and is widely used. DES uses a 56-bit key (carried in a 64-bit block with parity bits), and the algorithm is available on an LSI chip capable of processing text at a rate of over one megabit per second (Stein, 1998, p. 16). An alternative approach is provided by the public-key ("trap door") cryptosystems. The idea here is to assign two keys to each user: an encryption key for encrypting plaintext, and a decryption key for deciphering ciphertext. A user's encryption key is published in much the same way as a telephone number, so that anyone can send coded messages, but only the person who holds the corresponding decryption key can unlock a message. It is nearly impossible to deduce the decryption key from the encryption key (Stein, 1998, p. 16). Firewalls A firewall is a product that protects a company's internal network from the rest of the world. It has to act as a gatekeeper: it keeps intruders out and internal users in, and it has to figure out which bits are harmful and deny them entry. It has to do this without
unreasonably delaying traffic. Also, once attackers bypass the firewall and get into the DBs, the firewall is no longer a safeguard. Since about 70 percent of all computer attacks come from the outside (according to a 1998 study by the Computer Security Institute), firewalls are worth considering for most businesses. Hackers can use Trojan horses to penetrate firewalls, exploiting some kind of bug in the DB that will open a connection between the hacker outside the firewall and the computer inside it. If it all works, the hacker gets inside. Early firewalls were known as packet filters. The firewall would look at each packet header coming in and decide whether to admit it or drop it, according to a set of rules configured by administrators. The first packet filters were very inefficient by today's standards and let many packets in that were better kept out. Eventually firewall technology got better: instead of looking at each packet individually, today's firewalls keep state about the network and what types of packets to look for. Still, firewalls have only so much memory for this state, and slow, persistent attacks can often get through. For further protection, some companies have two firewall systems: one for the outside world, and another with more restrictions to guard against insiders. In a distributed environment, an appropriate firewall should be implemented on every node. Remember: a system is only as secure as its weakest link.
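The packet-filter behavior described above can be sketched as a first-match rule table. The rules, networks, and ports below are hypothetical examples; a real firewall matches on many more header fields.

```python
import ipaddress

# Hypothetical rule set: (source network, destination port, action).
# The first matching rule decides; unmatched packets are dropped.
RULES = [
    (ipaddress.ip_network("192.168.10.0/24"), 80,  "accept"),  # internal web traffic
    (ipaddress.ip_network("0.0.0.0/0"),       23,  "deny"),    # block telnet from anywhere
    (ipaddress.ip_network("0.0.0.0/0"),       443, "accept"),  # allow HTTPS
]

def filter_packet(src_ip, dst_port):
    """Return the action of the first matching rule; default-deny otherwise."""
    src = ipaddress.ip_address(src_ip)
    for network, port, action in RULES:
        if src in network and dst_port == port:
            return action
    return "deny"  # packets matching no rule are kept out

print(filter_packet("192.168.10.7", 80))  # accept
print(filter_packet("203.0.113.5", 23))   # deny
```

The default-deny fall-through at the end reflects the closed stance most firewalls take: anything not explicitly permitted is refused.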
Figure Three: Illustration of how a firewall provides "gateway protection" for corporations. (Source: http://www.f-secure.com/products/network_security/) Anti-Viral, Trojan Horse, and Worm Utilities Well-tested virus, Trojan horse, and worm checking utilities should be obtained. For example, anti-virus software programs keep a database of virus footprints or signatures – bits of code that are known to be parts of viruses – and when they find the same markings on a file, they know the computer has been infected. Most virus utilities today can scan the entire computer for bit strings that signify a virus and then execute an automatic program to remove the virus and restore the normal state (Schneier, 2000, p. 158). Scanning can occur on a scheduled maintenance basis, such as nightly, or it can be triggered by an action, such as when a user downloads a file. Many new viral checkers are also able to repair infected systems. Also, because new viruses are written and introduced in cyberspace daily, be sure to get the type of checker that can be updated easily by downloading a file maintained on the manufacturer's FTP or Web site.
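The signature-matching idea behind these utilities can be sketched in a few lines. The signature database below is entirely made up for illustration; real products use far larger databases and more sophisticated matching.

```python
# Sketch of signature-based scanning: compare file bytes against a
# database of known "footprints". These signatures are invented examples.
SIGNATURES = {
    "EXAMPLE.VIRUS.A": b"\xde\xad\xbe\xef",
    "EXAMPLE.WORM.B":  b"LOVE-LETTER-FOR-YOU",
}

def scan_bytes(data):
    """Return the names of any known signatures found in the data."""
    return [name for name, sig in SIGNATURES.items() if sig in data]

clean = b"quarterly sales report"
infected = b"header...LOVE-LETTER-FOR-YOU...payload"
print(scan_bytes(clean))     # []
print(scan_bytes(infected))  # ['EXAMPLE.WORM.B']
```

Updating the checker amounts to replacing the SIGNATURES table with a newer one downloaded from the vendor, which is why fresh signature files matter so much.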
However, virus checkers are not perfect. There are many viruses that mutate with every infection (polymorphic viruses) and use cryptography to hide their footprints (encrypted viruses); moreover, a virus detector cannot find viruses it has never seen before. Also, there might be many false alarms, which can cause massive hysteria within the entire organization, leading to investigations and loss of productivity. Lastly, anti-virus companies often release updates that catch particular viruses as soon as they appear, but if a virus can infect 10 million computers (one estimate of the ILOVEYOU viral infections from the Philippines, which caused billions of dollars in total collateral damage) in the hours before a fix is released, the damage is done (Mills-Abreu, 2001, p. 2). Even with the best anti-malicious-software programs, computer users should adopt certain habits of safety to reduce the chances of being exposed to and affected by viruses, Trojan horses, and worms. Strange or unexpected attachments that show up in e-mails should not be opened. If one is not sure why a known correspondent is sending a certain message with an attachment, it is better to check with the sender whether the attachment really came from them. Kerberos Protocol Kerberos is an authentication and authorization system from the MIT computer lab, originally developed for Project Athena in the 1980s (Jamsa, 2002, p. 302). Kerberos provides mutual authentication for both servers and clients, and server to server, unlike other protocols (such as NTLM) that authenticate only to the client. Kerberos operates on the assumption that the initial electronic communications between clients and servers
are done on an unsecured network. Networks that are not secure can easily be monitored by people who want to impersonate a client or server in order to gain access to confidential information. Here are a few basic concepts of Kerberos. A shared secret is shared only by those needing to know it. The secret might be between two people, two computers, three servers, etc. The shared secret is limited to the minimum parties necessary to accomplish the required task, and it allows those who know the shared secret to verify the identity of others who also know it. Kerberos depends on shared secrets to perform its authentication, and it uses secret-key cryptography as the mechanism for implementing them. Symmetric encryption, in which a single key is used for both encryption and decryption, is used for shared secrets in Kerberos. One side encrypts information, and the other side successfully decrypts it; this is proof of knowledge of the shared secret between the two entities (Todd, 2000, p. 68).
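The shared-secret proof can be sketched as a challenge-response exchange. Note the simplification: Kerberos itself uses encrypted timestamps and tickets issued by a key distribution center; the HMAC exchange below only illustrates the underlying idea of proving knowledge of a key without transmitting it.

```python
import hashlib
import hmac
import os

# Hypothetical shared secret known to both parties.
shared_secret = b"k3rb3r0s-demo-key"

def respond(secret, challenge):
    """Prove knowledge of the secret by keyed-hashing the challenge."""
    return hmac.new(secret, challenge, hashlib.sha256).hexdigest()

# The verifier issues a fresh random challenge (prevents replay).
challenge = os.urandom(16)

# The claimant answers with an HMAC of the challenge under the secret.
response = respond(shared_secret, challenge)

# The verifier, knowing the same secret, recomputes and compares.
expected = respond(shared_secret, challenge)
print(hmac.compare_digest(response, expected))  # True
```

Because the secret itself never crosses the wire, an eavesdropper on the unsecured network sees only the random challenge and an opaque response.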
Figure Four: Illustration of the Kerberos protocol. (Source: http://www.f-secure.com/products/network_security/) Public Key Infrastructure (PKI) PKI provides the framework that allows one to deploy security services based on encryption. Other infrastructures using purely symmetric (secret-key) encryption have been attempted and have failed due to the management, scalability, and life-cycle issues they encountered. PKI allows one to create the identities and the associated trust that one needs for identification and authentication processes, and to manage the public/private key-based encryption that provides a far more scalable solution than previous encryption and security infrastructures (Nash, et al., 2001, p. 6). Biometric Authentication Biometric identification gives DDBMSs strong authentication security. Biometric
authentication schemes rely on proving "what you are." In such schemes, some unique physical characteristic of the person being identified is used. This might include a user's fingerprint, retinal pattern, voiceprint, or possibly the way in which one signs one's signature (Nash, et al., 2001, p. 359). Biometrics are a good security measure because a person's physical attributes are not easily duplicated. However, no system is perfect. In a system that uses fingerprint authorization, for example, a user might be denied service if his or her finger is cut, too dirty, or too sweaty (Nash, et al., 2001, p. 360). If a person cannot persuade the system that he or she is who he or she claims to be, this is called a false negative (Schneier, 2000, p. 143). Also, if someone steals a fingerprint or a picture of the user to bypass authentication, biometric security is useless, because the user cannot easily change his or her fingerprints and face. This is not like a digital certificate, where the DB security administrators can simply issue that person another one. Biometric identification will become more and more popular as technology improves at correcting for day-to-day variations in an individual's physical features. Smart Cards Smart cards are basically a tiny microprocessor with memory embedded inside a piece of plastic much like a credit card. Their use ranges from simple phone-card styles of applications to complex PKI devices that support cryptographic technology. Smart cards are built to a set of standards, with ISO 7816 being one of the most important. The ISO 7816 standards define the shape, thickness, contact positions, electrical signals, protocols, and some operating system functions to be supported by smart cards.
Smart cards are recognizable by the small gold seal on one side of the card. This is a set of gold electrical contacts used to connect to the smart card; in fact, the microprocessor is hidden underneath those gold contacts (Nash, 2001, p. 337). The most serious problem with smart cards is that they can be stolen, thus allowing hackers to get into the DB. Indeed, the DB system does not really authenticate the person; it authenticates the card. Therefore, most secure DDBMSs today combine smart cards with other authentication tools, such as passwords, to overcome this vulnerability. Other types of access control mechanisms are capabilities, which are, effectively, tokens, identification cards, or tickets that a user must possess in order to access an object, and group authorizations, in which subjects are allowed access to objects based on defined membership in a group, such as all employees of a single department or the collaborators on a research project. The objects whose access is controlled in a computing system include memory, storage media, I/O devices, computing time, files, and communication paths. Although the nature of access control to these objects is the same, access is controlled by different schemes. Memory can be controlled through hardware features, such as base and bounds registers or dynamic address translation (virtual memory, paging, or segmentation). Access to I/O devices and computing time (i.e., to the use of the CPU) is more frequently controlled by requiring the intervention of the operating system to access the device. The operating system is assisted by the hardware in that, although the machine can run in two or more states, direct access to I/O devices is permitted only from the more privileged system state. Files and communications paths
are typically controlled by a permission check on the first access. Such accesses are requested from the operating system. Backups – Even if hackers and their malicious software do not attack the system, there is still the chance that it will be burned by fire, dropped, or be at ground zero when a soda pop spills. Although this might be expensive and require extra space, back up the computer files and equipment on a regular basis (Stein, 1998, p. 114). CHAPTER FIVE ADVANCED CONCEPTS AND TERMINOLOGIES OF SECURITY IN A DISTRIBUTED ENVIRONMENT It is very difficult to notice things like users not logging on for long periods of time, users who transfer too much data and/or stay on too long, multiple logins on different computers, and logins from an incorrect machine account. In this chapter, we will examine further details behind the ideas and concepts that we have touched on earlier, to provide the necessary security for ever more complex DDBMSs. Secrecy – Users must be able to keep data secret; that is, prevent other users from peeking at confidential information. Privacy – Users must be guaranteed that the information they give is used only for the purpose for which it was given. Impersonation – Because it is hard to identify users, only an IP address is typically
identified; this poses problems. These addresses can be changed and made to seem to come from legitimate sources. This is an issue with non-distributed database designs as well. Availability – This is about making sure that the use of data, programs, and other system resources will not be denied to authorized persons, programs, or systems (Schneier, 2000, p. 122). Multitier Architecture of DDBMSs Having a multitier architecture makes DDBMS security more challenging. Possible security services in a multitier architecture include the following: authorization, authentication, nonrepudiation, confidentiality, and data integrity (Buretta, 1997, pp. 98-99). Authorization control means permitting the right user to run the right transaction at the right time. In a DDBMS, the creator becomes the "owner" of the objects, and there is an authorization matrix. Having distributed authorization control means that the organization has integration, remote user authentication, views, and user groups, and everyone is using the same authorization process (e.g., when one data item gets updated, all its copies get updated). Authentication is the process of having each user, node, host, or application server prove it is who it claims to be. Authorization is the process of ensuring that each authenticated user has the necessary permission level to perform the requested tasks (Narins, 2000, p. 182).
It is about the continuity of relationships, knowing whom to trust and whom not to trust. UNIX, for example, has three possible access permissions for the owner of a resource to assign: read, write, and execute. These permissions are independent of each other. Someone who has only read permission for a particular resource cannot write or execute it. Someone who has only write permission can change the resource but cannot read it. Someone who has both read and write permission can do both, and so on. Nonrepudiation is ensuring that authenticated and authorized users may not deny that they used an allowed resource. Confidentiality prevents unauthorized users from accessing confidential information. Data integrity preserves the genuineness of the data from the sender to the receiver and prevents the data from being modified in an unauthorized manner. Global integrity assures that data, programs, and other system resources are protected against disclosure to unauthorized individuals, programs, or systems (Buretta, 1997, pp. 98-99). Issues in Replicated DDBMSs There are several ways to implement adequate security in a replicated DB environment, as with DDBMSs generally. The first is that all stored and/or displayed passwords must be encrypted so that unauthorized persons and processes may not steal them. Passwords are compared against the secure perimeter or central DB for validity. Sometimes it might be a good idea to have the user's node locked after a certain number of bad password attempts.
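The password handling just described — storing only encrypted/hashed passwords and locking a node after repeated failures — can be sketched as follows. The account data, threshold, and iteration count are illustrative assumptions, not recommendations.

```python
import hashlib
import os

# Sketch of the policy above: store only password hashes and lock an
# account after a fixed number of failed attempts. Values are examples.
MAX_ATTEMPTS = 3

def hash_password(password, salt):
    """Derive a hash so the plaintext password is never stored."""
    return hashlib.pbkdf2_hmac("sha256", password.encode(), salt, 100_000)

salt = os.urandom(16)
accounts = {"alice": {"hash": hash_password("s3cret", salt),
                      "salt": salt, "failures": 0, "locked": False}}

def try_login(user, password):
    acct = accounts[user]
    if acct["locked"]:
        return "locked"
    if hash_password(password, acct["salt"]) == acct["hash"]:
        acct["failures"] = 0
        return "ok"
    acct["failures"] += 1
    if acct["failures"] >= MAX_ATTEMPTS:
        acct["locked"] = True
    return "denied"

for guess in ("password", "letmein", "12345"):
    try_login("alice", guess)
print(try_login("alice", "s3cret"))  # locked: even the right password is refused now
```

After the threshold is reached, even the correct password is refused until an administrator unlocks the account, which blunts online guessing attacks.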
Pseudo-user accounts, those established for systems to automatically log users on to the DB, are common in distributed DB environments. These accounts must comply with the organization's security policies, and knowledge of their passwords should be limited. All file systems, raw devices, and/or DB structures used to store queued data and/or messages must be secured as well. This points out the many avenues in a DDBMS that are available to unauthorized users and that must be protected. Finally, encryption techniques must be integrated within the replication service. Encryption prevents the breaching of the data transmitted over the network (Buretta, 1997, p. 203). DDBMSs may use either application- or data-level security. Application-level security is programmed into the application logic (Stallings, 1999, p. 11): each application is responsible for governing user access to the data. Data-level security is implemented in the DB engine. Profiles of acceptable data items and operations should be stored and checked by the DB engine against the end-user's permission status on each DB operation. Donald Burleson of Oracle Tuning Consultants, a well-known DDBMS expert, recommended that application-level security be taken out and replaced with data-level security to make the DDBMS more secure. The argument for this is that a skilled end-user with commonly available development tools could easily write an application that does not follow the organization's security policy. Such a security flaw may be created either unintentionally by the designer or intentionally by someone with malicious intent. When data-level security is implemented, such security holes are closed (Burleson, 1994, pp. 208-219).
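The distinction matters because data-level checks sit inside the access layer itself, where no application can route around them. A minimal sketch, with a hypothetical role/table permission model and an in-memory store standing in for the DB engine:

```python
# Sketch of data-level security: the permission check lives inside the
# data access layer, so every read/write passes through it regardless of
# which application issues it. Roles and tables are hypothetical.
PERMISSIONS = {
    ("clerk",   "employees"): {"read"},
    ("manager", "employees"): {"read", "write"},
}

class DataLevelStore:
    def __init__(self):
        self._tables = {"employees": []}

    def _check(self, role, table, op):
        if op not in PERMISSIONS.get((role, table), set()):
            raise PermissionError(f"{role} may not {op} {table}")

    def read(self, role, table):
        self._check(role, table, "read")
        return list(self._tables[table])

    def write(self, role, table, row):
        self._check(role, table, "write")
        self._tables[table].append(row)

store = DataLevelStore()
store.write("manager", "employees", {"name": "Lee"})
print(store.read("clerk", "employees"))  # clerk may read...
try:
    store.write("clerk", "employees", {"name": "X"})
except PermissionError as err:
    print(err)                           # ...but not write
```

Contrast this with application-level security, where each program would carry its own copy of these checks and a home-grown tool could simply omit them.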
Issues in Multilevel Security Some information is more sensitive than other information. It is common practice in many organizations (e.g., the military, financial brokerage firms) to classify information according to various security levels (e.g., top-secret, secret, confidential, and unclassified), and to assign an appropriate security authorization level to each user. People working with this data need security clearances proportional to the highest classification of information with which they are working. Someone with a secret clearance, for example, cannot access top-secret information, but can see information that is unclassified, confidential, or secret. A security classification level is assigned to objects (data, files, and so forth) and a clearance level is assigned to users. The classification and clearance levels are thus ranked (Bell & Grimson, 1992, p. 286): top secret > secret > confidential > unclassified The rules governing this multilevel security model, where clearance(A) is the clearance level of user A and classification(x) is the classification level of object x, are: User A can read object x if and only if clearance(A) ≥ classification(x); User A can update object x if and only if clearance(A) = classification(x). Advantages of the Multilevel Model One of the advantages of the multilevel model is that it not only supports strong content-independent access controls, but also restricts the flow of information by ensuring that information can only flow upwards through the levels. It is not possible for a user with a particular clearance level to simply copy sensitive data and thereby make it accessible to other users with a lower clearance level.
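The two rules above translate directly into code. This sketch encodes the four levels from the text as integers and checks the read and update conditions:

```python
# Sketch of the multilevel rules: read requires clearance >= classification;
# update requires clearance == classification (so data can only flow upward).
LEVELS = {"unclassified": 0, "confidential": 1, "secret": 2, "top secret": 3}

def can_read(clearance, classification):
    """User A can read object x iff clearance(A) >= classification(x)."""
    return LEVELS[clearance] >= LEVELS[classification]

def can_update(clearance, classification):
    """User A can update object x iff clearance(A) == classification(x)."""
    return LEVELS[clearance] == LEVELS[classification]

print(can_read("secret", "confidential"))    # True: reading down is allowed
print(can_read("secret", "top secret"))      # False: cannot read up
print(can_update("secret", "confidential"))  # False: writes only at one's own level
```

The strict equality on updates is what blocks the copy-down attack described above: a secret-cleared user cannot write what they read into a confidential object.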
Multilevel security was designed to manage multiple levels of classification in a single system. In order to develop a multilevel security policy for a DDBMS, it is generally assumed that the operating system on which the DDBMS is built also has a multilevel security policy in operation at the file level. Such an operating system makes use of a reference monitor that has the following characteristics: It is invoked every time a user requests access to an object. It supports a multilevel security policy. It is tamper-proof. It is sufficiently small that it can be thoroughly tested and the code formally verified as correct (Bell & Grimson, 1992, p. 287). Improvement of the Multilevel Model Multilevel security, however, might also need to provide separation between mutually distrustful users (Schneier, 2000, p. 127). For example, in a computerized medical DB with patients able to access their accounts, the DB manager wants to prevent Patient A from seeing Patient B's medical record, even though both accounts might be classified at the same level. Thus DDBMSs built on top of such operating systems normally use the operating system's reference monitor by carefully mapping the DB objects (relations, tuples, attributes) onto files. The easiest method is to use the relation as the security granule (i.e., the basic unit of data to which security controls can be applied). Thus every tuple in a relation will be at the same classification level, and the relation is then mapped
directly onto an operating system file. If rows (or columns) of a relation have different security classifications, then the relation must be separated into units of the same classification level, which can then be mapped onto individual files. With horizontal fragmentation by row, the original relation can be restored by applying the set UNION operator to the fragments. With vertical partitioning by column, it is imperative to repeat the primary key in each partition, and reconstitution is then carried out by performing a relational JOIN on the fragments (Stallings, 1999, p. 7). If content-dependent control is required, then the services of the operating system cannot be used, and hence security would have to be delivered by the DDBMS directly. In this case, the DDBMS would therefore require its own "reference monitor". Work on reference monitors has led to the development of secure operating systems for a DDBMS based on a trusted kernel, which contains all of the security-related functions. One of the problems with many security techniques is the code concerned with authorization checks: the security-related issues are spread throughout the system (query interface, system catalogue, etc.). The focus of the "trusted kernel approach" is to centralize all security-related information and processes within the trusted kernel. In a multilevel security system, the reference monitor corresponds directly to the trusted kernel. However, having multilevel security does not guarantee reliability and trustworthiness (Bell & Grimson, 1992, p. 288). The Granularity of the Data Object
As a data or security administrator, one fundamental decision that must be made is the basic unit, or granule, to which security control can be applied. The granularity of the data object can range from an individual attribute within a tuple of a relation to a whole relation, or even to the entire DB. The finer the granularity of data objects supported by the DB system, the more precise the access rights can be. Generally speaking, a fine-granularity system will incur a much higher administration overhead than a coarse-granularity system. For example, if the granule is, in fact, the whole DB, then all we need do is maintain a list of authorized users for that DB, indicating what type of access rights the users have (read, write, or modify). At the other extreme, if the data granule is an individual attribute or data item, then we have to keep the same information for each attribute or data item. Of course, defaults will be operative depending on the security policy of the organization. Security Policies for Control of Access For control of access, there are four main types of security policies within organizations: need to know, maximized sharing, open systems, and closed systems. Many organizations adopt a policy that confines access to information on a need-to-know basis. Thus, if a user requires read (or write) access to a particular set of data objects, then that specific user is granted the appropriate rights to those objects and to those objects alone. For example, if security controls are placed at the relation level (i.e., the granule is
a relation), then once a user has access to any part of a relation, the user will have automatic access to all attributes and to all tuples within that relation. If the granule is the individual attribute, then it is possible to implement a very precise "need to know" system. However, while a "need to know" security policy might well be right for many high-security applications, it might not be appropriate for normal commercial information-processing environments, where data sharing is an important and fundamental goal of the DB or web approach. At the other end of the security policy spectrum from need to know, we have a policy of maximized sharing, in which the objective is to facilitate as much data sharing as possible. Maximized data sharing does not mean that all users have full access rights to all the data in the DB; only those parts of the DB that really need to be protected are guarded. For example, those who are using a patient DB for epidemiological research need full access to all the clinical information about the patients, plus, probably, data such as the patient's age, sex, occupation, etc. They do not, however, need access to personal identification information such as names and social security numbers. An open policy for access control means that by default users have full access to the data unless access is explicitly forbidden. Such an approach facilitates data sharing, but it has the disadvantage that the omission or unintentional deletion of an access rule results in (possibly classified) data being made accessible to everyone.
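The open and closed defaults named in the policy list can be sketched side by side. The users and objects below are hypothetical; the point is only where each policy places the burden of an explicit rule.

```python
# Sketch of the two defaults: an open policy permits unless a rule
# forbids; a closed policy forbids unless a rule permits.
FORBIDDEN = {("intern", "salaries")}          # open policy: explicit deny list
ALLOWED   = {("intern", "phone_directory")}   # closed policy: explicit grant list

def open_policy(user, obj):
    return (user, obj) not in FORBIDDEN       # default: allow

def closed_policy(user, obj):
    return (user, obj) in ALLOWED             # default: deny

# If the rule guarding an object is accidentally deleted, the open
# policy exposes it, while the closed policy still refuses access.
print(open_policy("intern", "medical_records"))    # True  (unlisted => allowed)
print(closed_policy("intern", "medical_records"))  # False (unlisted => denied)
```

This asymmetry is why errors in a closed system's rules restrict access rather than open it up.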
In a closed policy for access control, on the other hand, access to all data is implicitly forbidden unless access privileges to that data are explicitly granted. Such a policy is used by organizations that follow the "need to know" approach. Closed systems are clearly more secure than open systems. Moreover, since the default in a closed system is to forbid access, errors in the rules will restrict rather than open up access. Information on access privileges is often called a user profile, which explicitly describes an individual user's access rights (privileges to sensitive materials) to data objects within the system. User profiles are generally represented in the form of an authorization matrix, in which the users (or user groups) form the rows of the matrix and the data objects form the columns (Bell & Grimson, 1992, p. 285). Types of Access Controls For access control at the DB level, four main types of controls can be identified: content-independent, content-dependent, statistical control, and context-dependent. With content-independent control, a user is allowed (or not allowed) to access a specific data object without regard to the content of the data object. An example of content-independent access control would be that "user A is allowed read access to the employee relation". Checking can be done at compile time, since the actual value of the data object
does not have to be examined. With content-dependent control, an individual user's access to a data object depends on the content of the DB and hence can only be checked at run time. This method involves a greater degree of overhead than content-independent control. An example of content-dependent access control would be: "Employee A is permitted to update the salary file of an employee provided the current salary is less than $15,000". Under statistical control, the user is permitted to perform statistical operations such as SUM, AVERAGE, and so on, on the data, but is not allowed to access individual records. For example, a user might be allowed to count the number of patients in the DB suffering from a particular disease, but not to see the diagnosis of a particular, individual patient (Bell & Grimson, 1992, p. 286). With context-dependent control, the user's access privileges depend on the context in which the request is being made. For example, a user may only be allowed to modify a student's grade in a course if the user is the course professor. A user in the personnel administration department may be allowed to update an employee's salary, but only between the hours of 9 a.m. and 5 p.m. on weekdays. Security Facilities in SQL The database language SQL [ANS92] defines a standard means of accessing data organized according to the relational model. The basic security features provided through
SQL for many DBs are generally implemented through two main facilities (Bell & Grimson, p. 288): the view mechanism and the authorization rules.

[Figure Five: The ANSI-SPARC three-level architecture — external schemas (user views), a conceptual schema, and an internal schema mapped onto physical storage]

In the ANSI-SPARC division of the architecture of DDBMSs shown above, the security features fall into three levels (i.e., conceptual, internal, and external). This architecture is motivated mainly by the need to provide concurrency and data independence (Burleson, 1994, pp. 192-193). Users can access the DB through a logical external schema or user view (external level), which is then mapped onto the global logical schema (conceptual level), which is in turn mapped onto the physical storage (internal level). Data independence insulates applications using different views from each other and from the underlying details of physical storage. Views also provide users
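The way an external view exposes only part of a base relation can be illustrated with SQLite, whose view facility follows the SQL standard. The table, column, and view names below are invented for illustration:

```python
import sqlite3

# A base table holding confidential data, and a view (virtual table) that
# exposes only the non-confidential columns.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employee (name TEXT, dept TEXT, salary INTEGER)")
conn.execute("INSERT INTO employee VALUES ('Lee', 'IT', 42000)")
conn.execute("CREATE VIEW employee_public AS SELECT name, dept FROM employee")

# The view is queried exactly like a base table, but the salary column is
# simply not part of it, so it can never be read through this view.
row = conn.execute("SELECT * FROM employee_public").fetchone()
print(row)  # ('Lee', 'IT')
```

A user whose only access path is `employee_public` is automatically prevented from seeing salaries, which is precisely the security byproduct of the view mechanism discussed in the text.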
with a means of logically structuring the data in such a way that it is meaningful to them, and of filtering out data that are not relevant (Narins, 2000, p. 202). An important and very useful byproduct of the view mechanism is the security that it provides. Users can only access data through views, or masks, and hence are automatically prevented from accessing data that is not contained in their own view or mask. The SQL view definition facility permits users to generate views that combine data from any number of relations. The original relations of the conceptual schema on which the views are defined are known as the base relations or base tables. In essence, the view mechanism creates a virtual table, against which the user can issue queries in just the same way as against a base table.

In a DDB with nodal autonomy, security of data will ultimately be the responsibility of the local DDBMS (Bell & Grimson, 1992, p. 292). However, once a remote user has been granted permission to access local data, the local site no longer has any means of further ensuring the security of that data. This is because such an access entails copying the data across the DB network. Issues such as the relative security level of the receiving site and the security of the network now have to be taken into account. There is no point in having a secure site send confidential data over an insecure communications line, or in sending the confidential data to an insecure site. The following are some security issues that are peculiar to DDBMSs:

- Identification and authentication rules
- Distribution of authorization rules
- Encryption rules
- Global view mechanisms

Identification and Authentication Rules

The term authentication is more complex than previously mentioned; it is also used to refer to several distinct, though related, processes. A more demanding requirement is for each party to be able to convince a judge or arbiter that a certain transaction was performed by the other (nonrepudiation). Document content may also need to be authenticated, in the sense that its integrity is checked and guaranteed to be accurate in addition to verifying the original document authorship. In theory, authentication is quite distinct from the provision of confidentiality, but in practice authentication and confidentiality usually go together (Bell & Grimson, 1992, p. 292). When a user attempts to access any computer DB system, they must first identify themselves (e.g., by giving a name) and then authenticate that identification (e.g., by typing a handle name and a password). In order to allow users to access data at remote sites in a DDBMS, it would be necessary to store the user names (identification information) and passwords (authentication information) at all sites (Bell & Grimson, 1992, p. 293). This duplication of essential security information is in itself a security risk, even though the passwords might be stored in encrypted form. Also, having a valid user name does not necessarily mean that it is authenticated; the same goes for passwords. A better approach that avoids duplication is to allow users to identify themselves at one
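The remark that stored passwords should at least be kept in encrypted form is usually realized with a salted, one-way hash rather than reversible encryption. A hedged sketch follows; the function names, salt size, and iteration count are illustrative choices, not taken from any particular DDBMS:

```python
import hashlib
import hmac
import os

# Store only a random salt and a one-way hash of the password, never the
# password itself; an attacker who copies the table cannot recover passwords.
def make_password_record(password):
    salt = os.urandom(16)
    digest = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, 100_000)
    return salt, digest

def authenticate(password, record):
    salt, digest = record
    candidate = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, 100_000)
    # Constant-time comparison avoids leaking information through timing.
    return hmac.compare_digest(candidate, digest)

record = make_password_record("s3cret")
print(authenticate("s3cret", record))  # True
print(authenticate("wrong", record))   # False
```

Even so, as the text notes, replicating such records at every site multiplies the number of places an attacker can steal them from.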
site, called their home site, and for that site to perform the authentication. Once that user has been admitted by the DDBMS at their own site, all other sites will accept the user as authenticated. This does not, of course, mean that the user now has unlimited access to all data in the DDBMS, as they would still have to satisfy local access controls on the data. Naturally, such a system depends on sites being able to identify themselves satisfactorily to one another first, and this can be done in exactly the same way as for users: by giving a site identifier followed by a password.

Distribution of Authorization Rules

Since the data itself is distributed, it is best to store the authorization rules for access to a data object at the site at which that data object is stored. The alternative is to replicate all rules fully at all sites. This would enable authorization checks to be performed at compilation time for content-independent controls and at the beginning of execution for content-dependent controls. However, this early validation of access rights to remote objects has to be offset against the cost of supplying and maintaining fully replicated rules.

Global View Mechanisms

It is relatively straightforward to provide support for views (or masks) in a DDBMS, the views or masks themselves being defined in the normal way on top of global relations (Bell & Grimson, 1992, p. 295). Indeed, both from the point of view of data independence and security, the view
mechanism is, if anything, even more useful in a distributed environment. DDBMSs are typically much larger and more complex than centralized DBs, and views provide a good means of presenting users with a mask of only the relevant data. Moreover, from the security viewpoint, there are likely to be many more users in the distributed environment, and view mechanisms enable users to be classified into groups. However, complex views, especially those requiring joins across several tables, can be very expensive to implement and maintain. For a DDBMS which allows fragmentation and partitioning of global relations, the run-time performance penalty is likely to be quite severe. Fortunately, however, the use of powerful query optimizers can help, by combining the view definition with the global query and optimizing the whole process of view materialization and querying against the view as a unit.

Object-Oriented Distributed Databases

Object-oriented databases were first developed to ease the complexity of DDBMSs. The distributed object model is a valuable extension of object orientation to the distributed system environment. Instead of having objects located on a single system, objects are physically distributed on different processors or computers. Distributed object systems may use the client-server architecture, with the objects being managed by servers and the clients making requests of the servers to access the objects’ methods (Narins, 2000, p. 203). However, the object-oriented database model has complicated the development of adequate DDBMS security. O-O databases do not figure prominently in the
distributed database market (Bobak, 1996, p. 438). As with stand-alone DBMSs, the relational model is by far the most popular. DDBMSs can, however, offer some object-oriented features within the relational model. Such products are typically referred to as object-relational DDBMSs. They usually provide the ability to store and access data types such as sound and video. These features may be significant to a DDBMS designer, as these large data types can generate a huge load on the network when propagated. Nevertheless, O-O databases are rapidly gaining popularity relative to RDBs.

CHAPTER SIX

THE HUMAN FACTOR

Security is never black and white, and context will often matter more than technology. Security in the real world does not fit into neat little boxes. That is why there will never be a perfectly secured DDBMS, or perfect security of any kind for that matter, because the people who run and use these systems are not perfect; their judgments, morals, and values differ. An attacker can be an outsider or an insider. For example, it can be a kid with a PC trying to hack into a commercial DB for fun, a consultant or contractor upgrading old software, or an employee trying to break into his company’s DB system to increase his or her salary. People can steal hardware, software, and data. Because perpetrators are everywhere and hard to catch, security managers have to assume that every person is a potential hacker. And the only way to stop a hacker is to think like one: they have to think about every possible angle of attack. Perhaps the great Benjamin Franklin said it
best: “the only way to be secure is to never feel safe”. So who are the minds behind malicious computer attacks? Why would they risk going to jail for hacking? It is important to understand who the attackers are, what they want, and how to deal with the threats they represent. But first, when it comes to people, let us start with the users.

The Users

The users often represent the weakest link in the security chain and are frequently responsible for the failure of security systems. Many users do not understand risks and security vulnerabilities. Some users do not understand computers. Some users believe whatever computers tell them; for example, they think computers can never make a mathematical mistake. Security can be the opposite of convenience; one is usually sacrificed in favor of the other. DB users always want to get at the data anytime and anywhere. They seem to be more concerned about their convenience, privacy,12 and anonymity than their company’s security. Users become paranoid about passwords, biometrics, access cards, and other security requirements. They work around security procedures whenever possible and will

12 The Supreme Court has insinuated that privacy is a right guaranteed by the Constitution. Democracy is built upon the notion of privacy. People want to be secure in their conversations, their papers, and their computers. Your privacy rights might vary at work, depending on your company policy.
ruin the system at every turn. People often use the same password for all of their computer needs, and/or passwords that are easy for them to remember, which makes it easy for hackers to crack them. Sometimes users go as far as to place their passwords on a self-stick removable note on their monitor, or to share passwords with co-workers who are trying to help them get some work done around the office. When deadlines loom and work needs to get done, security is the first thing to be compromised. It only takes the weakest password for a hacker to crack the DB. Managers of the database must decide which risks are acceptable and which security measures are practical. When protection is too stringent, legitimate users will be irritated and complain; if it is too lax, nobody will notice until a security problem has occurred (Mullender, 1989, p. 118). In fact, it is human nature to think bad things (e.g., security breaches) only happen to other people or to bigger establishments. There are no perfect security systems of any type. We need to decide what particular balance is right for us and then create security that enforces that balance.

Social Engineering

Social engineering is the hacker term for a con game: persuade an insider to do what the attacker wants. It is very effective. Social engineering bypasses cryptography, computer security, network security, and everything else technological. It goes straight to the weakest link in any security system: the poor human being trying to get his job done and wanting to help out if possible. Social engineering is more prevalent in a distributed environment, since such an organization is not so small that everyone knows everyone.
And if insiders ever turn to the other side and attack their company’s DB, they can be almost impossible to stop, because they are the very people DB administrators are forced to trust. Insiders might be less likely to attack a system than outsiders are, but systems are much more vulnerable to them (Schneier, 2000, p. 48).

The Hackers

Many years ago, the term hacker described the brilliant, constructive computer geniuses who launched the information revolution. They were people dedicated to the high-minded ideal of demystifying technology, creating for the good, and helping others through technology’s strengths. These were, indeed, the first hackers. There are many individuals in the computer industry today who pride themselves on being hackers in the truest sense of the word: people interested in solving problems and creating solutions through the use of computers and technology (Jamsa, 2002, p. 24). In recent years, however, the media has given the name hacker to a new class of programmer. These programmers (known within the computer industry by a variety of disparaging names, including “crackers,” “phrackers,” and “phreakers”) tear down computer and DB systems rather than helping society. Crackers are programmers specializing in breaking into proprietary systems, either as a prank or to corrupt data. Phrackers are a special class of hackers who devote their time to hacking out programs that either deliver free telephone calls or otherwise penetrate the computers and DBs of telephone companies. Phreakers use stolen telephone information (including calling card numbers, cell phone numbers, and so forth) to access other computers. While many of
these crackers and phrackers try to penetrate systems simply for personal pleasure, the number of crackers who participate in industrial espionage and sabotage is also increasing. Business, after all, is like war. Imagine business rivals breaking into competitors’ DBs, stealing inside knowledge or intellectual property, and gaining unfair advantages. This could put a company out of business before it can catch the thieves. However, whether for personal satisfaction or for other purposes, these code breakers13 pose a singular threat to DDBMSs. As mentioned above, hackers come in many different categories. Crackers specialize in cracking systems, and the majority of crackers are a cut below most hackers in technical expertise. The stereotypical cracker is a high-school kid on a home PC, clumsily crashing systems through brute-force password attacks (Jamsa, 2002, p. 35). These crackers pose the greatest risk to DDBMSs because they do not really know what they are doing. A good analogy is that an untrained gun user may be more dangerous than a trained one.

Hacker Profiling

Generally, most hackers are young, male, and have low self-esteem. They might be young “techno-junkies” with incredible egos and a need to share details of system conquests with one another. They usually do not have a lot of money, but they sometimes have a lot of time. They usually have their own counterculture: hacker names or handles, lingo, and rules (Schneier, 2000, p. 44). Most of them are driven by a desire to

13 Despite the existence of an appropriate and applicable definition for individuals committing computer crimes, the media has categorized all system intruders as hackers. Therefore, this paper will mostly use the generic term “hacker” to refer to the intruders against whom organizations are learning to protect themselves.