Data Protection and Rapid Recovery From Attack With A Private File
When a personal computer is attacked, the most difficult thing to recover is personal data. The operating system and applications can be reinstalled, returning the machine to a functional state and usually eradicating the attacking malware in the process. Personal data, however, can only be restored from private backups – if they even exist. Once lost, personal data can only be recovered through repeated effort (e.g. rewriting a report) and in some cases can never be recovered (e.g. digital photos of a one-time event). To protect personal data, we house it in a file server virtual machine running on the same physical host. Personal data is then exported to other virtual machines through specialized mount points with a richer set of permissions than the traditional read/write options. We implement this private file server virtual machine using a modified version of the NFS server installed in a virtual machine under various virtualization environments such as Xen and VMware. We demonstrate how this architecture provides protection of personal data as well as rapid recovery from attack. Specifically, we demonstrate how an intrusion detection system can be used to stop a virtual machine in response to signs of compromise, checkpoint its current state and restart the virtual machine from a trusted checkpoint of an uncompromised state. We show how our architecture can be used to defend against 5 of the 9 new viruses posted on CERT since April 21, 2004. We also demonstrate that by placing the user’s applications in a virtual machine rather than directly on the base machine we can provide near instant recovery from even a successful attack. Finally, we quantify the overhead costs of this architecture by running a series of benchmarks on both Windows and Linux in the base machine as well as on an NFS partition mounted in a virtual machine.
1. Introduction

Worms and viruses have entered the consciousness of the majority of personal computer users. Even novice users are aware of the attacks that can come in the form of email from a friend or a pop-up ad from a web site. The goals of an attack can vary from using the compromised system to attack others, to allowing a remote attacker to harvest data from the system, to outright corruption of the system.

Fully restoring a compromised system is a painful process often involving reinstalling the operating system and user applications. This can take hours or days even for trained professionals with all the proper materials readily on hand. For average users, even assembling the installation materials (e.g. CDs, manuals, configuration settings, etc.) may be an overwhelming task, not to mention correctly installing and configuring each piece of software.

To make matters worse, the process of restoring a compromised system to a usable state can frequently result in the loss of any personal data stored on the system. From the user’s perspective, this is often the worst outcome of an attack. System data may be painful to restore, but it can be restored from public sources. Personal data, however, can be restored only from private backups and the vast majority of personal computer users do not routinely back up their data. Once lost, personal data can only be recovered through repeated effort (e.g. rewriting a report) and in some cases can never be recovered (e.g. digital photos of a one-time event).

We propose the use of a specialized private file server virtual machine to provide added protection for personal data. This file server virtual machine is made accessible only to other clients running on the same host by way of a local virtual network segment. Personal data is housed in the private file server and exported through specialized mount points with a richer set of permissions than the traditional read/write options. This architecture provides a number of benefits including 1) the opportunity to separate personal data into multiple classes to which different finer grained permissions can be applied, 2) the separation of personal data from system data allowing each to be backed up and restored appropriately, 3) the ability to rapidly install or restore virtual machines containing fully configured applications and services, and 4) rapid recovery from attack by rolling back system data to a known good state without losing recent changes to personal data.

In Section 2, we describe our architecture and its benefits in detail. In Section 3, we compare our architecture to other solutions with similar goals such as system backup utilities and network booting facilities. In Section 4, we describe how it can be used to protect user data against specific attacks. In Section 5, we quantify the overheads associated with this architecture by running a variety of benchmarks on a prototype implemented using a modified version of NFS in conjunction with virtual machines in both Xen and VMware. We discuss related work in Section 6, future work in Section 7 and finally, conclusions in Section 8.

2. Architecture

Figure 1 illustrates the main components of our architecture. A single physical host is home to multiple logical machines. First, there is the base machine (labeled with a 1 in the diagram). This base machine contains a virtualization environment which can be implemented as a base operating system running a virtual machine system such as VMware or as a virtual machine monitor such as Xen. Second, there is a virtual network that is accessible only to this base machine and any virtual machine running on this host. Third, there is a file system virtual machine which has only one network interface, on the local virtual network. This file system virtual machine is the permanent home for personal data and exports subsets of this personal data store via specialized mount points to local clients. Fourth, there are virtual machine appliances. These virtual machines house system data such as an operating system and user applications. They can also house locally created data temporarily.

Virtual machine appliances can have two network interfaces – one on the physical network bridged through the base machine and one on the local virtual network. Depending on its function, a virtual machine appliance may not need one or both of these network interfaces. For example, you may choose to browse the web in a virtual machine appliance with a connection to the physical network but with no interface on the local virtual network to prevent an attack from even reaching the file server virtual machine. Similarly, you might choose to configure a virtual machine with access only to the local virtual network if it has no need to reach the outside world.

2.1. Base Machine

We have implemented several prototypes of this architecture using either Linux or Windows as the base operating system and Xen or VMware as the virtual machine monitor. There are several other excellent virtual machine systems we could have used, but our purpose was not a comparison of existing virtual machine systems. We chose VMware for its robustness, ease of use and support of Windows guest operating systems. We chose Xen for its lower overhead [Xen03][CDD+04].

Regardless of implementation, the base machine is used to create the local virtual network, the file system virtual machine and the virtual machine appliances. It is used to assign resources to each of these guests. It can also be used to save or restore checkpoints of virtual machine appliance images.
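
The interface choices described in Section 2 can be summarized as a simple reachability check. The following is a minimal sketch, assuming a hypothetical `Guest` record and made-up guest names; it is not part of the prototype and only models which guests share a network segment with the file server virtual machine.

```python
from dataclasses import dataclass

PHYSICAL = "physical-network"
LOCAL_VIRTUAL = "local-virtual-network"


@dataclass(frozen=True)
class Guest:
    name: str
    networks: frozenset  # network segments this guest has an interface on


def can_reach(src, dst):
    """Guests can talk directly only if they share a network segment."""
    return bool(src.networks & dst.networks)


# Hypothetical guests: the file server sits only on the local virtual network.
file_server = Guest("file-server-vm", frozenset({LOCAL_VIRTUAL}))
browser = Guest("browser-appliance", frozenset({PHYSICAL}))
desktop = Guest("desktop-appliance", frozenset({PHYSICAL, LOCAL_VIRTUAL}))
archive = Guest("archive-appliance", frozenset({LOCAL_VIRTUAL}))

reachability = {g.name: can_reach(g, file_server) for g in (browser, desktop, archive)}
internet_facing = sorted(g.name for g in (file_server, browser, desktop, archive)
                         if PHYSICAL in g.networks)
print(reachability)
print(internet_facing)
```

In this model the browsing appliance has no path to the file server at all, while the file server itself never appears among the guests exposed to the physical network.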
We also use the base machine as a platform for monitoring the behavior of each guest. For example, in our prototype, we ran an intrusion detection system on the base machine. It can also be used as a firewall or NAT gateway further controlling access to even those virtual machine appliances with interfaces on the physical network. Note that the base machine can monitor both the incoming and outgoing network traffic from the virtual machine appliances. It can detect both attack signatures in incoming traffic and unexpected behavior in outgoing traffic. For example, it could indicate that all outgoing network traffic from a particular virtual machine appliance should be POP or SMTP. In such a configuration, unexpected traffic such as an outgoing ssh connection that would normally not raise alarms could be detected as a sign of a possible attack. (Is this a good example?)

The security of the base machine is key to the security of the rest of the system. Therefore, in our prototype, we “hardened” the base machine by strictly limiting the types of applications running on the base machine. Normal user activity takes place in the virtual machine appliances. We also closed all network ports on the base machine. It would also be possible to open a limited number of ports for remote administration, but since each open port is a potential entry point for attack, it is important to carefully secure each open port.

2.2. File System Virtual Machine

We implemented the file system virtual machine using a modified version of Sun’s Network File System (NFS) version 3 running in a Linux guest virtual machine. Virtual machine appliances using both Linux and Windows as the guest OS mount personal data over NFS across the local virtual network. For Linux, there is an open source NFS client. For Windows, we used a commercial NFS client from Labtam [Labtam]. Note that our modifications are only to the NFS file server code. Unmodified NFSv3 clients can be used with our modified server.

Much like the base machine, the file system virtual machine is hardened against attack by stripping away any unnecessary applications and closing all unnecessary network ports. It is easier to secure a system with a limited number of well-defined services than a general purpose machine. All the software in the file system virtual machine is focused on exporting personal data to local clients and on facilitating maintenance of that data such as backup, the creation of particular exported volumes and the setting of permissions that each client can have to the exported volumes.

The file system virtual machine is additionally protected by only being reachable over the local virtual network. Attacks cannot target the file system virtual machine directly. They could only reach the file system virtual machine by first compromising a virtual machine appliance. This would require two successful exploits – one against an application running in a virtual machine appliance and one against the NFS server running on the file server virtual machine.

Personal data is housed in the file system virtual machine and subsets of it are exported to virtual machine appliances. This allows you to restrict both the amount of personal data a given virtual machine can see and the access rights it has to that data. For example, if you have a virtual machine appliance running a web server, you may choose to export only the portion of the personal data store that you wish to make available on the web. You can export portions of your user data store with different permissions in different virtual machine appliances. For example, you may mount a picture collection as read only in the virtual machine you use for most tasks and then only mount it writeable in a virtual machine used for importing and editing images. This would prevent your collection of digital photos from being deleted by malware that compromises your normal working environment. Similarly, you may choose to make your financial data accessible only within a virtual machine running only Quicken or you may choose to make old, rarely changing data read-only except in the rare case that you really do want to change it. It is simple to have multiple mount points within the same virtual machine. You can mount some portions of your personal data store read only and others read/write into the same virtual machine appliance.

We also implemented a richer set of mount point permissions to allow “write-rarely” or “read-some” semantics. Specifically, we modified NFS to add read and write rate-limiting capability to each mount point in addition to full read or write privileges. One can specify the amount of data that can be read or written per unit of time. For example, a mount point could be classified as reading at most 1% of the data under the mount point in 1 hour. Such a rule could prevent downloaded code from rapidly scanning the user’s complete data store.

Figure 2 shows an example of an /etc/exports file with read and write limits. The first line indicates that the client at IP Address foo can read bar.

This required …describe patch

Still could allow malware that reads very slowly through your data store, but another good hurdle
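
The read limit described in Section 2.2 (for example, at most 1% of the data under a mount point per hour) can be illustrated with a sliding-window byte budget. This is only a sketch under stated assumptions: the class name `RateLimitedMount`, its parameters and the sliding-window policy are hypothetical and are not taken from the modified NFS server, whose patch is not described in this draft.

```python
from collections import deque


class RateLimitedMount:
    """Hypothetical per-mount-point read budget (not the authors' NFS patch)."""

    def __init__(self, total_bytes, max_percent, window_seconds):
        self.budget = total_bytes * max_percent // 100  # bytes allowed per window
        self.window = window_seconds
        self.history = deque()  # (timestamp, nbytes) of reads already granted
        self.used = 0           # bytes granted inside the current window

    def allow_read(self, nbytes, now):
        # Forget reads that have aged out of the sliding window.
        while self.history and now - self.history[0][0] >= self.window:
            _, old = self.history.popleft()
            self.used -= old
        if self.used + nbytes > self.budget:
            return False  # over budget: the server would refuse or delay the read
        self.history.append((now, nbytes))
        self.used += nbytes
        return True


# 1% of a 1000-byte data store per hour gives a 10-byte budget.
mount = RateLimitedMount(total_bytes=1000, max_percent=1, window_seconds=3600)
results = [
    mount.allow_read(6, now=0),      # within budget
    mount.allow_read(6, now=10),     # would exceed 10 bytes in the same hour
    mount.allow_read(6, now=3600),   # first read has aged out of the window
]
print(results)
```

A write limit could follow the same pattern with a separate budget. As noted above, a budget of this kind slows a scan of the data store but does not stop malware that reads patiently.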
Read and write limits are one example of a richer set of mount point permissions that can be used to help protect against attack. Append-only permissions could be used to prevent removal or corruption of existing data. For example, a directory containing photos could be mounted append-only in one virtual machine appliance allowing it to add photos, but not to delete existing photos. To delete photos, the same directory could either be temporarily mounted with expanded permissions or another virtual machine with expanded permissions could be used. Another example would be restricting the size or file extension of files that are created (e.g. no “exe” files). We did not implement these additional permission types, but they would each be a relatively straightforward addition to what we have already implemented.

One benefit of the file server virtual machine is that it contains exactly that irreplaceable personal data that is most important to back up regularly. In this way, our architecture helps facilitate efficient back-up of user data.

2.3. Virtual Machine Appliances

Virtual machine appliances isolate system state much like the file system virtual machine isolates personal data. Each virtual machine appliance contains a base OS and any number of user level applications from desktop productivity software to server software. They can have network interfaces on the physical network allowing communication with the outside world. They can also have network interfaces on the local virtual network over which they can mount subsets of personal data from the file server virtual machine.

There can be multiple mount points from the file system virtual machine into a client. Each mount point can have different permissions to allow finer grain control over the allowable access patterns. For example, in a single virtual machine you might mount your email BLAH and your

Just as the file server virtual machine facilitates the back-up of personal data, virtual machine appliances facilitate the back-up of system data. System data, however, changes at a different pace than personal data. Specifically, changes to system data can be clearly identified (e.g. when a new application, upgrade or patch is applied). Also, since system data is recoverable from public sources, it can be less crucial that we capture all the changes to system state.

In our prototype, we save known-good checkpoints of each virtual machine appliance. One important use of a known-good checkpoint is restoring a compromised virtual machine appliance from a trusted snapshot. Any changes made within the virtual machine appliance since the checkpoint would be lost, but changes to personal data mounted from the file server machine would be preserved. In this way, personal data does not become an automatic casualty of the process of restoring a compromised system.

The file system virtual machine is hardened against attack by carefully controlling the software that is run within it. Virtual machine appliances, however, will continue to run an unpredictable mix of user applications including some high-risk applications. As a result, they may be susceptible to attack through an open network port running a vulnerable service or through a user-initiated download such as email or web content.

Compromised virtual machine appliances can often be automatically detected by the intrusion detection system running on the base machine. In our prototype, we have demonstrated that we can detect an attack when a Snort rule is triggered and, in response, stop and checkpoint the compromised virtual machine as well as restart a known good checkpoint of the same machine. This process is nearly instantaneous – requiring only sufficient time to move the failed system image to a well-known location and move a copy of a trusted snapshot into place. Changes in the system configuration made after the trusted image was created would be lost as would any information stored directly in the corrupted guest. However, the checkpoint image would provide an immediately functional computing platform that would mount the user’s data store from the file system virtual machine.

We do this with a combination of Snort, snort rules, shell scripts etc. details

To prevent future attacks, the trusted image should also be updated to patch the exploited vulnerability. Thus the user is notified when an automatic restoration action takes place. Analysis of the corrupted image and/or secure logs collected by the virtual machine monitor [King03b] could provide clues to what needs to be modified.

The corrupted image can be saved or shipped to a system administrator for analysis and even possible recovery of data stored inside. During this analysis and recovery process, the user would still have a functional computing platform with access to the majority of their data. This is a significant improvement over the extended down time that is often required when restoring a compromised system today.
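
The restoration step described in Section 2.3 (move the failed image to a well-known location, then move a copy of the trusted snapshot into place) can be sketched as follows. This is an illustrative sketch only, not the prototype's scripts: the function `restore_appliance`, the file names and the `stop_vm`/`start_vm` callbacks are hypothetical stand-ins for the hypervisor-specific commands, and small text files stand in for disk images.

```python
import shutil
import tempfile
from pathlib import Path


def restore_appliance(live_image, trusted_snapshot, quarantine_dir, stop_vm, start_vm):
    """Swap a compromised image for a copy of the trusted snapshot.

    stop_vm and start_vm are hypothetical callbacks standing in for the
    hypervisor-specific commands; they are not a real Xen or VMware API.
    """
    stop_vm()
    quarantine_dir.mkdir(parents=True, exist_ok=True)
    saved = quarantine_dir / (live_image.name + ".compromised")
    shutil.move(str(live_image), str(saved))               # keep failed image for analysis
    shutil.copy2(str(trusted_snapshot), str(live_image))   # trusted copy goes into place
    start_vm()
    return saved


# Demonstration with small placeholder files instead of real disk images.
events = []
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    live = root / "mail-appliance.img"
    trusted = root / "mail-appliance.trusted.img"
    live.write_text("compromised state")
    trusted.write_text("known-good state")
    saved = restore_appliance(live, trusted, root / "quarantine",
                              stop_vm=lambda: events.append("stop"),
                              start_vm=lambda: events.append("start"))
    restored_text = live.read_text()
    saved_text = saved.read_text()
    trusted_still_present = trusted.exists()
print(events, restored_text, saved_text, trusted_still_present)
```

Copying rather than moving the trusted snapshot keeps it available for later restorations, and the quarantined image remains available for the analysis described above.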
A restarted system will still have the vulnerability that was originally attacked. Therefore, we limit the number of automatic restarts. For example, after three restarts of a given image, any further compromise will result in stopping the virtual machine and checkpointing, but not in restarting the “trusted” snapshot. Another option would be to leave the virtual machine running but to pull its access to the physical network and the local virtual network. This is similar to turning off or disabling the network port for a machine demonstrating signs of compromise until a system administrator can resolve the problem.

It is worth noting that users can also trigger the restoration process manually if they suspect a compromise. Users could also use the restoration process to roll back the state of a virtual machine appliance for any other reason (e.g. they installed a piece of software and simply don’t want to keep it in the system).

Similarly, the restoration process can be used to recover from accidental system corruption, e.g. from a routine patch or upgrade that introduced instability into the system. Many users are rightfully nervous about possible system corruption from routine software updates. In addition to the promise of safer and better operation, any update represents a non-negligible chance of system corruption.

Checkpoints allow regular upgrades to virtual machine appliances to be tested without risk. You can apply a patch or an upgrade to software in the system secure in the knowledge that if it causes instability you can return to the last checkpoint. Many users do not regularly apply patches and system upgrades because of the risk of instability. Stable checkpoints would encourage users to be compliant with upgrade requests by allowing them to easily experiment with the upgraded image. Reducing the risk of regular upgrades and patches is another subtle way in which virtual machine appliances enhance system security.

Another crucial benefit of virtual machine appliances is that they not only provide rapid recovery from attack by restoring a known good snapshot, but they also provide rapid first time installation of software systems. Anyone who has struggled for hours to install and configure software that is already running on another machine will appreciate this benefit. Preconfigured virtual machines with fully functional, preconfigured web servers, database servers, etc. would save new users hours of headaches assembling and installing all the dependencies. This is similar to the benefits of LiveCDs that allow users to experiment with fully configured versions of software, but without the drawbacks of slow, removable, unmodifiable media.

To quantify this benefit, Table 1 lists the time it took us to install a variety of software. The measurements listed reflect local experiments installing software when the user had already successfully installed the software at least once before. The time it takes a new user to install this software could be significantly higher as they frequently run into problems that can delay them for hours or even days (witness the many installation FAQs and installation questions posted to newsgroups across the Internet).

The times in Table 1 can also be considered a measure of the time saved whenever a checkpoint of a virtual machine appliance is used to recover from attack. Each time a virtual machine appliance is recovered from a known good state, this is a lower bound on the time saved in reinstallation. If it has been some time since the user installed the software, the time savings are likely to be even higher as they must spend time gathering the installation materials and possibly stumbling into some of the same errors a new user would.

We measured the times in Table 1 locally, but in retrospect, we would love to see average expert and novice install times routinely listed for all software. Such metrics could be reported by the software developers and/or by third party evaluators.

Software                                                   Installation Time for an Experienced User (minutes)
Base Windows Desktop Install
Windows Desktop Install with an array of user level software
Base Linux Desktop Install (RedHat)
Base Linux Desktop Install (Ubuntu)
Linux Base Installation with Apache Web Server
Linux Base Installation with MySQL
Linux Base Installation with sendmail

Checkpoints can also be used to transfer working system images from one physical host to another. For example, users can move copies of a working virtual machine appliance onto both their laptop and desktop machines. As another example, a completely configured FTP server virtual machine appliance could be duplicated and transferred to any other system – transferred from one machine to another for the same user, transferred from one user to another or even downloaded from a web site.
In many ways, we see checkpoints of virtual machine appliances as a new model for software distribution. Installing and configuring server software is often difficult and time-consuming. A pre-configured virtual machine appliance could be delivered to a user with well-defined resource requirements and connections to the rest of the system including the characteristics of any mount points into the user’s data store.

The term “appliance” implies a well-defined purpose, well-defined connections to the rest of the world and a minimum of unexpected side effects. Physical appliances typically specify their resource requirements and can be replaced with an equivalent model if they malfunction. In the case of a virtual machine appliance, a user would load it on their system and plug it into their data store by mapping its defined mount points to the exports from the local file system virtual machine. If the virtual machine appliance is attacked or malfunctions, it would be straightforward to replace it with a new functional equivalent without losing your personal data.

This would provide a new platform for value added services including configuration, testing and characterization of virtual machine appliances. Those who produce virtual machine appliances could compete to produce appliances that have the right combinations of features, that are easy to “plug in”, that have a good track record of being resistant to attacks, that use fewer system resources or that set and respect tight bounds on their expected behavior. Appliances that reliably provide the advertised service without violating their resource requirements would have value to users. In fact, we expect that most users would like to view computers as appliances that reliably perform the advertised actions without unanticipated side effects. Virtual machine appliances would be a good step in this direction. Validation of these virtual machine appliances may also be more straightforward because they would be designed to run in isolation from other user level applications.

Virtual machine appliances are particularly attractive in the context of open source software because any number of applications could be distributed together in a virtual machine appliance without concern for licensing requirements of each individual software package. This could be a significant hurdle for commercial software, however, without new licensing models. Similarly, developers of open source software could distribute virtual machine appliances with a complete development environment including source code with all the proper libraries required for compilation and software to support debugging and testing.

The number and type of applications in each virtual machine appliance can be tailored to the usage requirements and desired level of security. At one extreme, there could be only one virtual machine appliance containing all the software normally installed on a user’s base machine. At the other extreme, there could be many virtual machine appliances each with a subset of the user’s software.

Multiple virtual machine appliances allow finer grained control over resources required, expected behavior and the subset of personal data accessed. For example, a web server virtual machine appliance may be given read only access to the content it is serving and may be prevented from establishing outgoing network connections. Thus even if the web server is attacked, the damage done to the user’s system is minimized. The attacker would also be prevented from harvesting information from the rest of the user’s data store and their ability to use the system as a launching pad for other attacks would be diminished.

When each virtual machine appliance has a small number of applications, it is easier to characterize expected behavior, which of course makes it easier for intrusion detection software running on the base machine to watch for signs of a compromised system. It is also easier to configure the virtual machine appliance with the smallest set of rights to personal data that is necessary to accomplish the task.

However, each additional virtual machine appliance requires additional memory when executing and additional disk space to store the operating system and other common files. Multiple virtual machine appliances also make it more difficult to share data between applications. For these reasons, it is best to group as many applications with similar requirements together as possible.

Taken to the extreme, however, isolation could mean a separate virtual machine for each application. We are not advocating this extreme. It is easier for users when they can exchange data between applications, and many applications with similar resource, data and security requirements can and should be grouped together. In our experience with our prototype, we have found that a good strategy is to isolate those applications with special security needs. For example, applications that are commonly attacked (e.g. server software such as web servers or database servers) are good candidates for their own virtual machine appliance. Similarly, applications requiring access to sensitive personal data such as financial data are also good candidates for their own virtual machine appliance.
(Doe we want to mention this? )One of the more diffi- virtual machine appliance onto the users system. Tools
cult cases is applications such as web-browsing and that made the creation, inspection and validation of
email access. These applications are common sources these contracts easier for users and developers would
of attacks. However, they are also applications that be a helpful addition to such a system.
users like to have tightly integrated with their personal
data. For applications such as these, there is a range of
solutions along a spectrum of security and ease-of-use. 3. Comparison to Other Solutions
For example, we have found it effective to support
web-browsing in multiple virtual machines – one with
In this section, we compare our architecture to several
rights to access the personal data store and one with no
existing solutions such as regular full backups of a
complete system, network booting combined with im-
porting data from an independent physical file server
The base machine creates a set of resource limits for
and system reset facilities such as DeepFreeze or State-
each virtual machine appliance in several ways. First,
the base machine can allocate a limited amount of sys-
tem resources such as memory, disk space or even
How to do full backup – various methods some more
CPU time to each guest. Second, the base machine can
focused on needs of personal data and some of system
restrict access to the local virtual network and/or the
data. For example burning data to DVD or other re-
physical network connection. In either case, access can
movable media is a protable backup for personal data
be denied completely or simply restricted through fire-
but is typically not bootable so it does not restore the
wall rules. Third, the intrusion detection system run-
system to a bootable state again quickly
ning monitor the behavior of the guest for both attack
signatures and otherwise "innocent" looking traffic that is simply unexpected given the purpose of
the virtual machine appliance. For example, you might not expect to see any outgoing network
connections from a web server virtual machine. (Too many web server examples – change some of these)

These limits can be thought of as a contract of sorts with the virtual machine appliance. When a
virtual machine appliance is loaded on the system, a contract is established that limits it to its
expected set of behaviors. Virtual machine appliances whose expected behavior is well characterized
would make it easier to detect signs of attack or compromise and would therefore be more valuable to
users. Accomplishing the required functionality under a more restricted contract would be another
aspect of a high quality virtual machine appliance.

Contracts fix a fundamental problem with running new applications. Applications typically run with a
user's full rights, but there is no method for holding them accountable for doing only what is
advertised. This leads to Trojan horse exploits in which a piece of software claims to accomplish a
particular desired task but also has malicious unadvertised effects. This is similar to running
software with a reduced set of privileges, as with chroot or jail.

In our prototype, these contracts are expressed through a combination of Snort rules and limits
imposed by the virtual machine monitor. In the future, we would like to see a unified contract
language that could be used to express all aspects of the contract. Such a contract could be
inspected by the user and then loaded with the

Making a ghost image is a good way to restore the system to a fully bootable state, but it is harder
to do incrementals and expensive to do a full backup of old and new personal data, system data,
everything, on each

Also, ghost images are often not portable across machines (what CPU platform, devices supported,
etc.). One can restore to the same physical machine, but it is hard to send the compromised machine
out for recovery and/or analysis while restoring to another platform. Virtual machines abstract these
underlying details so they are portable across machines – we used the same VMWare client images for
both Linux base OS and Windows base OS.

In the case of ghost images there is also the difficulty of verifying full backups without trashing
the existing systems (or having a second duplicate system on

Can craft a combination of full ghost images plus incrementals, but each layer makes it more
difficult

Protecting data in a file system virtual machine does not remove the need for regular backups to
protect against hardware failure etc., but it does streamline the process by allowing backup efforts
to focus on the irreplaceable personal data rather than on the recoverable system data. It also
allows backup efforts to be customized to the differing needs of system data and personal data.
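
The contract idea described above (Snort rules combined with limits imposed by the virtual machine
monitor) could be pictured as a single structure that a monitor checks observations against. The
following is only a minimal sketch under stated assumptions: the class names, fields and limits are
hypothetical illustrations, not the prototype's format and not a Snort or virtual machine monitor API.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Set


@dataclass
class Contract:
    """Hypothetical contract for one virtual machine appliance (illustrative fields)."""
    name: str
    allow_outbound: bool = False              # may the appliance open outgoing connections?
    max_write_bytes_per_sec: int = 1_000_000  # arbitrary example limit
    allowed_inbound_ports: Set[int] = field(default_factory=set)


@dataclass
class Observation:
    """One sampled behavior of the appliance (assumed to come from a monitor)."""
    outbound_connections: int
    write_bytes_per_sec: int
    inbound_port: Optional[int] = None


def violations(contract: Contract, obs: Observation) -> List[str]:
    """Return human-readable descriptions of every contract term the observation breaks."""
    found = []
    if obs.outbound_connections > 0 and not contract.allow_outbound:
        found.append("unexpected outbound connection")
    if obs.write_bytes_per_sec > contract.max_write_bytes_per_sec:
        found.append("write rate above limit")
    if obs.inbound_port is not None and obs.inbound_port not in contract.allowed_inbound_ports:
        found.append(f"inbound traffic on unexpected port {obs.inbound_port}")
    return found


# Example: a web server appliance that should only accept port 80 and never dial out.
web = Contract(name="web-server", allowed_inbound_ports={80})
print(violations(web, Observation(outbound_connections=2, write_bytes_per_sec=10, inbound_port=80)))
```

A more restricted contract simply has fewer allowed ports and lower limits, which is why a
well-characterized appliance is easier to monitor.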
Tailor backup strategy to the differing needs of personal data versus system data.

System data changes more slowly and predictably, while user data is more dynamic and unpredictable
in its rate of change. To be more clear, there may be new system data written every day (log files
etc.) but these are not really user-visible changes – changes that are important to preserve in the
case of a compromised machine

In addition, there is a mismatch between the overall rate of change in system data and the
user-visible rate of change. A large percentage of the data written in the average computer system
is not directly related to user data (e.g. logs of system activity or writes to the page file).
However, it is produced constantly even when the system is otherwise idle. This activity is of no
interest to most users as long as the system continues to function. If a month's worth of such
activity were lost, users would be perfectly happy as long as the system was returned to an
internally consistent and functioning state.

User data is exactly the opposite. Overall, user data changes at a slower, human-driven pace.
However, the rate of user-visible change is high. For example, a user may only add 1 page of text to
a report in an 8 hour workday, but the loss of that one day of data would be immediately visible.
Fortunately, this means that efforts to protect user data can be effective even if targeted at a
small percentage of overall data.

TODO quantitative data on backups - references for how few people do it or expense/time involved

If the user does make routine backups of their personal data, the FS-VM can also make that process
more efficient. The FS-VM contains exactly the data that should be backed up. Even if the user does
not make routine backups, this configuration prevents loss of user data simply because it is
intermingled with high-risk system data.

Why better than backing up the whole system – 1) ability to catch an attack with Snort (if caught
with an inbound virus scanner then can prevent) – if not, then can look for outbound attack behavior
and restart from known good or just pull their network access

For system data, ease of checkpoint and restore. Appliances – ability to try out new appliances or
roll back to old

Much like bringing the benefit of a managed LAN to a personal computer – file server, net boot
server, firewall; it is bringing the benefits of network servers and separation of responsibilities
to people with just one computer, not a whole network

Net boot and mount from separate FS machine

Versus DeepFreeze or other system reset facilities – great for environments where you want a fresh
start every time – here we want to be able to make changes – download new apps and have them stay
there etc. – but isolate user data – if a VM with a new app dies you can reinstall from last known
base

TODO compare to DeepFreeze or other system reset facilities
TODO stateless Linux clients (Red Hat Fedora project)? network booting

4. Protection Against Attacks
To assess how well our architecture prevents and helps recover from attacks, we began by examining
recent CERT advisories. We found that there have been 9 advisories posted since April 21, 2004.
These 9 attacks are described briefly in Table X.

CERT attacks – how we defend against them

In addition to the CERT advisories, we also examined several well-known attacks including CodeRed,
MyDoom and Blaster. These attacks are briefly described in Table Y.

Other kinds of attacks we can defend against

For example, these mount points can specify the allowable read or write rate or prevent files with
certain properties from being created.
Action item would be to group these into categories of what fixes them

write restriction: latest CERT restriction on executables, email and file sharing (some versions of
Outlook do not allow saving executables, but many email clients do and many people don't have it
set; they also don't let you view HTML, so many people turn it off because it is too restrictive).
We are not preventing people from saving and running executables, just not saving them to the FS-VM.
We let people try them and if it goes wrong they can reset the APP-VM to a known good state. This is
a much safer/easier user experience: a safe "playpen"
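
The richer mount-point permissions mentioned above (an allowable write rate, and refusing to create
files with certain properties such as executables) might be checked along these lines. This is a
hedged sketch only: `MountPolicy`, the suffix list and the rate limit are assumptions made for
illustration and do not describe the modified NFS server itself.

```python
import os
from dataclasses import dataclass

# Illustrative suffix list; a real policy could use other file properties.
EXECUTABLE_SUFFIXES = (".exe", ".com", ".scr", ".bat", ".pif")


@dataclass
class MountPolicy:
    """Hypothetical per-mount-point policy enforced by the FS-VM."""
    read_only: bool = False
    deny_executables: bool = True
    max_write_bytes_per_sec: int = 512 * 1024  # arbitrary example limit


def may_create(policy: MountPolicy, filename: str) -> bool:
    """Decide whether a client may create `filename` under this mount point."""
    if policy.read_only:
        return False
    suffix = os.path.splitext(filename)[1].lower()
    if policy.deny_executables and suffix in EXECUTABLE_SUFFIXES:
        return False
    return True


def within_write_rate(policy: MountPolicy, bytes_written: int, seconds: float) -> bool:
    """Check an observed write burst against the mount point's allowed rate."""
    if seconds <= 0:
        raise ValueError("seconds must be positive")
    return bytes_written / seconds <= policy.max_write_bytes_per_sec


home = MountPolicy()
print(may_create(home, "report.doc"), may_create(home, "Invoice.EXE"))
```

Under such a policy an executable can still be saved and run inside the application virtual machine;
it simply cannot be written into the protected data store.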
read restriction: latest CERT restriction on read all

snort rule/restart/pull network: CodeRed (2/9/05)

Leslie - MyDoom (Bagle like this?): lots of versions; changes the registry to run itself when the
system restarts; puts copies of itself in shared directories. What does it do? Mass emailing. Snort
could catch it and go back to a known good state that wouldn't have the registry mods. Can't NFS
mount the registry. Teatime prevents registry changes? Spybot
MyDoom doesn't do anything, just mails copies of itself; takes up network BW; leaves a backdoor that
allows later attacks to be run from this client; overwrites the hosts file so users can't access
antivirus or Windows Update sites

Kevin - CodeRed1 and CodeRed2
CodeRed: a little defacing on the main page of web pages (little writing); days 1-19 of the month
generate a list of IPs (CodeRed1.0 static list of IPs, CodeRed1.1 random generation of IPs); IIS
exploit, some kind of server attack; days 21-29 DoS attack on popular sites. CodeRed1 stays in
memory; CodeRed2 writes an executable, so could do blocking of executables and limits on outside
connections. A reboot would kill off CodeRed1 and it won't come back unless reinfected.

Watch Snort logs and if an attack is detected, restart the VM. Only restart N times and then alert
the user that this Snort rule constantly fails. Restart it but without a bridged connection, or
possibly pause the VM? Equivalent of pulling a machine's network connection.

Patty – Blaster: randomly picked an IP address and then exploited an RPC flaw (port 135). Snort rule
finds it and pulls its network access. Automatically run the patch on this system??

5. Overhead of Private File Server and Virtual Machine Appliances

Clearly, this is a tax on system performance. It introduces overhead, but given the power of modern
PCs, many users will have resources to spare (disk space, CPU speed, memory) and many users will be
willing to pay for the protection of their personal data and rapid recovery from attack.

Lots of VM monitors; comparison of them all is beyond the scope of this – we experimented with
VMWare and Xen - why

Start with Xen numbers on ITL – iozone read and write. Compare that to Xen on versus planet lab.
Then examine VMWare numbers with Windows base on the ITL machines – iozone read and write and
freebench; VMWare with Linux base on the ITL machines (replace all this with Jason's machines
numbers if those are better). Dbench?

Details of experiments – machine specs, version numbers of VMWare and Xen, version numbers of iozone
and parameters to iozone, version numbers of freebench, dbench, versions of Linux and Windows on
base, in guests, what NFS server and client versions, how many runs of each experiment

What do we know from Xen paper and Freenix paper?

Conclusions - does powerful vs less powerful matter? does base OS matter? does guest OS matter? does
the VM system matter - Xen vs VMWare (yes from other paper)? does NFS vs AFS matter? break down base
OS/local FS to guest OS/NFS/AFS in FS-VM with guest OS/local FS

Conclusions – little to no overhead for freebench, overhead of up to 30% for I/O intensive for
VMWare – overhead under Xen is much lower – however Windows is most attacked so VMWare is crucial –
even at higher overheads still worth it for many users (prefer stable but slower appliance to
constant worry about attack)

VMWare is robust and it supports Windows VMs; Xen supports XenoLinux today and plans for
XenoWindows. For Windows VMs is VMWare the only choice? Win4Lin, Win, Crossover Office
What's wrong with VMWare? 1) we would prefer a base OS that is a simple hardened VM manager; VMWare
is not that. VMWare ESX - is it that?

Xen has lower overhead but does not support Windows. Since the majority of high-impact worms and
viruses target Windows systems, it is especially important to support Windows guests. However, there
is on-going work on XenoWindows, or the ability to support Windows guests on Xen.
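
One way to state overhead figures such as those noted in this section is as the percentage slowdown
of a run inside a virtual machine relative to the same run on the base machine. A minimal sketch,
assuming each benchmark reports an elapsed time; the function name is hypothetical and the numbers in
the usage line are made up for illustration, not measurements from this work.

```python
def relative_overhead(base_seconds: float, vm_seconds: float) -> float:
    """Percentage slowdown of the VM run relative to the base-machine run."""
    if base_seconds <= 0:
        raise ValueError("base_seconds must be positive")
    return 100.0 * (vm_seconds - base_seconds) / base_seconds


# Made-up example: a run taking 100 s on the base machine and 130 s in a guest.
print(relative_overhead(100.0, 130.0))
```

For throughput-style results (for example bytes per second) the same idea applies with the roles
reversed: the overhead is the fractional drop in throughput.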
Ways to further reduce or avoid overhead – spectrum of security and recoverability versus overhead

Another good strategy: if we find that it is expensive to go to the NFS/AFS server and cheaper to
stay in a VM, then we are better off the more things are treated as system data (kept local); only
pay the expense on mounted

Some longer range plans include examining file traces to quantify the percentage of file system
traffic directed to system data versus user data on a set of desktop systems. The lower the
percentage of traffic directed to user data, the easier it will be to efficiently protect the user
data without significant overhead.
Also, not all "unrecoverable" data needs to be classified as personal data and stored in the file
server virtual machine. For example, web server logs may be useful and interesting but not worth
preserving in case of attack.

6. Related Work
Virtual machine technology has been available for over 30 years on mainframes [VM370, Goldberg74],
but it is relatively new for commodity personal computers [Denali02, VMWare, Xen03]. Today, virtual
machine technology on commodity hardware has many applications including providing multiple
operating system platforms on the same physical machine, building efficient honeypot machines and as
a stable platform for OS development. As the computing power and storage capacities of commodity
platforms increase, additional applications of virtual machine technology are expected.

Several systems have used virtual machine technology to enhance system security and fault tolerance.
Bressoud and Schneider developed fault-tolerant systems using virtual machine technology to replicate
the state of a primary system to a spare back-up system [Bressoud96]. Dunlap et al. used virtual
machines to provide secure logging and replay [Dunlap02]. King and Chen used virtual machine
technology and secure logging to determine the cause of an attack after it has occurred. Reed et al.
used virtual machine technology to bring untrusted code safely into a shared computing environment.
We focus on the related problem of rapid system restoration and protection of user data. We are
unaware of another system that has separated user data and system data in the way we are proposing
and optimized the handling of each to provide rapid system restoration after an attack.

We propose the use of virtual machine technology to isolate and protect user data from the rest of
the system and to provide rapid restoration of system state. Virtual machine technology has been
available for over 30 years on mainframes [VM370, Goldberg74], but it is relatively new for
commodity personal computers [Denali02, VMWare, Xen03]. Today, virtual machine technology on
commodity hardware has many applications including providing multiple operating system platforms on
the same physical machine, building efficient honeypot machines and as a stable platform for OS
development. As the computing power and storage capacities of commodity platforms increase, virtual
machine technology is also being used to provide enhanced system services such as secure logging
[Chen01] and backtracking of intrusions [King03b].
We propose using virtual machine technology to provide rapid restoration of a compromised system.
Recently, virtual machine technology has been used to identify the specific vulnerabilities that
allowed an attack to succeed so that similar attacks can be prevented in the recovered system. We
apply virtual machine technology to a related problem – that of rapid recovery from an attack.

TODO on summary of other VM systems besides VMWare

Qemu and Bochs

7. Future Work
Improvements to base systems – nice GUIs for easily manipulating VMs, unified system to express all
elements of the contract between the base machine and a VM, and tools to help create, inspect and
validate these contracts

Right now not for the novice user but it could be - VMWare GUI is pretty easy to use; integrate some
of these configurations by default. Right now they don't give you a choice of a VM with an interface
on a local virtual network and the real network; you have to configure that custom. Creates a market
for easy to use appliance VMs; sell your VM to others if it is easy to use and the "plugs" are well
defined
More benchmarks, especially Windows system benchmarks

We would also like to investigate logging data modifications in the file system virtual machine on a
per client basis. Logging data modifications would allow the FS-VM to roll back the changes of a
compromised VM when suspicious behavior is detected.
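
Per-client logging of data modifications, as proposed above, could look roughly like the following
toy in-memory model. It is a sketch under stated assumptions rather than a design for the FS-VM: the
class, its methods and the client names are hypothetical, file contents are plain strings, and a file
touched by the compromised client is simply restored to the version that preceded that client's
earliest logged write (a fuller design would have to reconcile later writes by other clients).

```python
class ModificationLog:
    """Toy stand-in for an FS-VM data store that remembers who changed what."""

    def __init__(self):
        self.files = {}  # path -> current contents
        self.log = []    # (client, path, previous contents or None), oldest first

    def write(self, client, path, data):
        """Record the previous version, then apply the write immediately."""
        self.log.append((client, path, self.files.get(path)))
        self.files[path] = data

    def rollback_client(self, client):
        """Undo every logged write made by `client`, newest first."""
        kept = []
        for entry_client, path, previous in reversed(self.log):
            if entry_client != client:
                kept.append((entry_client, path, previous))
                continue
            if previous is None:
                self.files.pop(path, None)   # file did not exist before this write
            else:
                self.files[path] = previous  # restore the saved earlier version
        kept.reverse()
        self.log = kept


store = ModificationLog()
store.write("app-vm-1", "/home/u/report.txt", "draft 1")
store.write("app-vm-2", "/home/u/report.txt", "GARBAGE")
store.write("app-vm-2", "/home/u/virus.exe", "payload")
store.rollback_client("app-vm-2")  # e.g. after an intrusion detection alert
print(store.files)
```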
In addition, the FS-VM could log modifications to the user's data store. As in a log-structured file
system, the data written would immediately be visible in the file system; however, earlier versions
of the file system could be recovered if garbage collection of old data is delayed. Such a delay in
garbage collection would allow the user or the intrusion detection system time to detect the attack.
Once detected, the log could be truncated at a point before the attack began. The length of the log
could be based on the amount of time required to detect an attack. Such a delay would also provide
an undo facility to recover from accidental destruction of user data. The maintenance VM could be
used to control rollback of the FS-VM.

List all the ways we help improve security – ability to characterize and hold virtual machine
appliances to their resource limits, finer grain control of subsets of personal data, reducing the
risk of regular patches and upgrades, collecting personal data together allows resources for regular
backup to be focused on the unrecoverable user data, automatic restart of a compromised appliance to
a known good state, ability to send checkpoints of compromised machines for analysis and recovery in
parallel with running at least temporarily on the last known good snapshot, ability to hold personal
data in a virtual machine that is not even directly accessible from remote machines without at least
two separate successful exploits, one against the hardened file system virtual machine

Other benefits – ability to share working virtual machine appliances across physical machines, thus
facilitating a new software distribution method with many advantages including saving users the
headaches of initial installation and configuration
Recovering a compromised computer system can be a painful process, often involving reinstalling the
operating system and user applications. In addition, any personal data stored on the machine is
often lost unless a recent backup exists. We describe the fundamental differences between user data
and system data which lead to different approaches to protecting and recovering each class of data
after an attack.

We have quantified overhead in a prototype system and described ways to further improve the
prototype for both ease of use and further reductions in overhead

9. References

[Bressoud96] T. Bressoud and F. Schneider. Hypervisor-based fault tolerance. ACM Transactions on
Computer Systems, 14(1):80-107, February 1996.

[Chen01] P. Chen and B. Noble. When virtual is better than real. Proceedings of the 2001 Workshop on
Hot Topics in Operating Systems (HotOS), p. 133-138, May 2001.

[CDD+04] B. Clark, T. Deshane, E. Dow, S. Evanchik, M. Finlayson, J. Herne, J. Matthews. Xen and the
Art of Repeated Research. 2004 USENIX Annual Technical Conference, FREENIX Track, June 2004.

[Denali02] A. Whitaker, M. Shaw, S. Gribble. Scale and Performance in the Denali Isolation Kernel.
Proceedings of the 5th Symposium on Operating Systems Design and Implementation (OSDI 2002), ACM
Operating Systems Review, Winter 2002 Special Issue, pages 195-210, Boston, MA, USA, December 2002.

[Disco97] E. Bugnion, S. Devine, K. Govil, M. Rosenblum. Disco: Running Commodity Operating Systems
on Scalable Multiprocessors. ACM Transactions on Computer Systems, Vol. 15, No. 4, 1997, pp. 412-447.

[Dunlap02] G. Dunlap, S. King, S. Cinar, M. Basrai, P. Chen. ReVirt: Enabling Intrusion Detection
Analysis through Virtual Machine Logging and Replay. Proceedings of the 2002 Symposium on Operating
Systems Design and Implementation (OSDI), p. 211-224, December 2002.

[Eust99] K.F. Eustice. A universal information appliance. IBM Systems Journal, Volume 38, Number 4,
1999.

[iozone]

[King03a] S. King, G. Dunlap, P. Chen. Operating System Support for Virtual Machines. Proceedings of
the 2003 USENIX Technical Conference, June 2003.

[King03b] S. King and P. Chen. Backtracking Intrusions. Proceedings of the 19th ACM Symposium on
Operating Systems Principles, p. 223-236, December 2003.

[LKML03] K. Fraser. Post to Linux Kernel Mailing List, October 3 2003, URL
http://www.ussg.iu.edu/hypermail/linux/kernel/0310.0/0550.html accessed December 2003.

[Norm98] D. Norman. The Invisible Computer: Why Good Products Can Fail, the Personal Computer Is So
Complex, and Information Appliances Are the Solution. MIT Press, 1998.

[Dike00] J. Dike. A User-mode Port of the Linux Kernel. Proceedings of the 4th Annual Linux Showcase
& Conference (ALS 2000), page 63, 2000.

[Dike01] J. Dike. User-mode Linux. Proceedings of the 5th Annual Linux Showcase & Conference,
Oakland CA (ALS 2001), pp. 3-14, 2001.

[Dike02] J. Dike. Making Linux Safe for Virtual Machines. Proceedings of the 2002 Ottawa Linux
Symposium (OLS), June 2002.

[Goldberg74] R. Goldberg. Survey of Virtual Machine Research. IEEE Computer, p. 34-45, June 1974.

[Spafford89] E. Spafford. Crisis and Aftermath. Communications of the ACM, 32(6):678-687, June 1989.

[VM370] R. Creasy. The Origin of the VM/370 Time-Sharing System. IBM Journal of Research and
Development, Vol. 25, Number 5, page 483, 1981.

[VMWare] VMware, URL http://www.vmware.com accessed December 2003.

[Xen99] D. Reed, I. Pratt, P. Menage, S. Early, and N. Stratford. Xenoservers: Accounted Execution
of Untrusted Code. Proceedings of the 7th Workshop on Hot Topics in Operating Systems, 1999.

[Xen03] P. Barham, B. Dragovic, K. Fraser, S. Hand, T. Harris, A. Ho, R. Neugebauer, I. Pratt and A.
Warfield. Xen and the Art of Virtualization. Proceedings of the Nineteenth ACM Symposium on
Operating Systems Principles, pp. 164-177, Bolton Landing, NY, USA, 2003.

[Xen03a] P. Barham, B. Dragovic, K. Fraser, S. Hand, T. Harris, A. Ho, E. Kotsovinos, A.
Madhavapeddy, R. Neugebauer, I. Pratt and A. Warfield. Xen 2002. Technical Report UCAM-CL-TR-553,
January 2003.

[Xen03b] K. Fraser, S. Hand, T. Harris, I. Leslie, and I. Pratt. The Xenoserver Computing
Infrastructure. Technical Report UCAM-CL-TR-552, University of Cambridge, Computer Laboratory, Jan.
2003.

[Xen03c] S. Hand, T. Harris, E. Kotsovinos, and I. Pratt. Controlling the XenoServer Open Platform,
April 2003.

[ZPERF] S. Thoss. Linux on zSeries Performance Update, Session 9390. URL
http://linuxvm.org/present/SHARE101/S9390a.pdf accessed December 2003.

[freebench]

10. OUTTAKES

Figure 1: First System Configuration

2.1 Identifying User Data and System Data

User data housed in the FS-VM would typically include a user's home directory or documents
directory. Many applications are already configured to store data in these locations. User data
could also include other locations in which applications store user data (e.g. mailboxes, calendars,
archives of scanned photos). Subsets of the user's data store could be mounted into the PLATFORM-VM
at different locations and with different permissions.

System data would include the operating system image and related files as well as all installed
software. System data would typically also include any information written by the OS or user
applications such as logs or paging files.

Clearly, there is flexibility in this boundary between user and system data. For example, if a user
runs a web server, the content directory could be considered personal user data and mounted
read-only from the FS-VM. However, if the user has a static copy of that data within the FS-VM, she
might simply place a copy in the PLATFORM-VM instead. Similarly, web server logs may be considered
system data. However, if the user wanted to prevent the loss of these logs in case of attack, a
mount point could be established into the FS-VM.

2.2 Key Differences Between System Data and User Data
This separation of user data from system data is motivated by several key differences between system
data and user data. First, system data is more predictable and thus is easier to attack. Second,
system data is often more valuable to an attacker and thus more attractive to attack. Third, system
data can be restored from public sources and thus is less critical to protect. These differences
lead to different approaches to data protection and restoration.

2.2.1 System data is more predictable and thus is easier to attack.

From the earliest worms [Spafford89], attackers have exploited the homogeneity or monoculture of
system data. Attackers can target a small number of operating systems or commonly used pieces of
user-level software (e.g. web servers, email readers). The attackers themselves have easy access to
the actual binaries and typical configurations. If they discover vulnerabilities when probing their
own test system, the chances are excellent that the exact same vulnerabilities will exist on many
other systems.
User data, on the other hand, is more unpredictable in its contents, name and location. There are
exceptions to this rule. For example, some malware has exploited the homogeneity of address books in
Microsoft Outlook. However, for the most part, user data is significantly less predictable and thus
harder to exploit in an automated fashion than system data.
Malicious code or data can still be introduced into the user's data store. For example, a virus may
be written to a user's email log. However, a virus scan program could be run against the user's data
store to clean it without loss of data, and if the PLATFORM-VM was compromised it could be restored.

2.2.2 System data is often more valuable to an attacker and thus more attractive to attack.
As a general rule, user data is also less valuable to attackers. For example, a report may represent
weeks or months of careful work for a user, but to an attacker it is of no value. Similarly, a video
of a special event may be priceless to a user, but irrelevant to an attacker.
System data, on the other hand, is valuable because once compromised it can provide the attacker
with resources such as an idle CPU, free disk space or an under-utilized network connection.
Attackers can use these resources for many purposes, such as to store and serve illegal content or
as the launching pad for additional attacks.
Once a system is compromised, attackers may be able to log in at will and search through user data
in a less automated fashion. In this way, attackers may locate valuable user data such as credit
card numbers or other financial data. However, the difficulty of automation has to date limited the
scope of such attacks.

User data is rarely the primary target of an attack. However, it is often a casualty of attacks on
system software. Isolating user data from system data makes it less likely that user data will be
lost simply because malfunctioning system software makes it inaccessible.