UNIVERSITÀ DEGLI STUDI DI PISA
Department of Computer Science
DEGREE COURSE IN COMPUTER SCIENCE
DEGREE THESIS
Lock-in issues with PaaS
Supervisor:
Prof. Antonio Brogi
Candidate:
Federico Conte
ACADEMIC YEAR 2012-13
“All truths are easy to understand
once they are discovered;
the point is to discover them.”
Galileo Galilei (1564 – 1642)
Contents
Introduzione
Introduction
1. Cloud Computing
1.1. Why use Cloud Computing?
1.2. What is Cloud Computing
1.3. Similar systems and concepts
2. Vendor lock-in
2.1. Overview
2.2. Lock-in issues
2.3. Example
Conclusions
Acknowledgements
Ringraziamenti
References
INTRODUZIONE
Vendor lock-in is a problem that affects cloud computing. The aim of this
thesis is to describe and demonstrate this problem at the
Platform-as-a-Service (PaaS) level of the cloud.
Chapter 1 focuses on cloud computing and recalls its basic concepts.
Understanding what cloud computing is makes it possible to place and describe
the vendor lock-in problem properly.
The chapter begins by listing some of the problems of building an application
without cloud computing. This highlights the reasons why the technology was
born and, in turn, the potential of the cloud. A formal definition of cloud
computing follows, together with a description of the technologies on which it
is based.
Chapter 2 introduces the vendor lock-in problem. The problem exists even
without a cloud solution. Comparing “classic” lock-in with lock-in in the
cloud, however, brings out the differences between the two and shows how
serious and burdensome the latter is. Moreover, cloud providers deliberately
accentuate the problem in order to keep customers within their systems. The
chapter then examines lock-in in more depth and shows the problems it causes
for users.
By listing the various reasons why a user might decide to change provider, we
show that considering such a change is not a rare decision. Going to the heart
of the problem, we look at three possible types of vendor lock-in and at how
the problem becomes stronger depending on the cloud service model chosen.
Finally, we make things a little more concrete with a simple example
application that shows lock-in in PaaS.
In the conclusions, we report some partial solutions to the problem and touch
on the standards that are emerging.
INTRODUCTION
Vendor lock-in is a problem that afflicts cloud computing. The purpose of this
thesis is to describe and demonstrate vendor lock-in at the
Platform-as-a-Service (PaaS) level of the cloud.
Chapter 1 focuses on cloud computing and recalls its basic concepts, so that
the vendor lock-in problem can be placed and described properly.
The chapter begins by listing some of the issues of building an application
without cloud computing. This highlights the reasons why the technology was
born and, with them, its potential. A definition of cloud computing follows,
together with a description of the technologies on which it is based.
Chapter 2 introduces the problem of vendor lock-in, which exists even without
a cloud solution. Comparing “classic” lock-in with lock-in in the cloud brings
out the differences between them and shows how serious and heavy the latter
is. Moreover, cloud providers deepen the problem in order to keep customers
within their systems. The chapter then goes into detail. First, it shows what
concerns users about moving to the cloud.
By listing the various reasons why a user might decide to change provider, we
show that wanting to change is not rare. Entering the core of the issue, we
see three possible types of vendor lock-in and how the problem becomes
stronger as we go up the levels of the stack.
Finally, to make the topic concrete, we provide a simple example application
that shows lock-in in PaaS.
In the conclusions, we report some partial solutions to the problem and give
an overview of the standards that are forming.
Chapter 1
CLOUD COMPUTING
1.1 Why use Cloud Computing?
Suppose that you want to build a business application to support your ideas.
Business applications have always been expensive, because behind each one
there is a world of complexity [1].
The main task should be building the application itself (that is what you
actually need), and that is already difficult: you have to determine the goal,
design the application, pick a language, set up an IDE, write the code and
test it. In addition, you face significant challenges (Figure 1) [2], even
with the smallest application:
• install and configure a web server and a database,
• find some way to push out new versions of your code when you update your
application,
• find some way to monitor your application, so you can see whether it is
down and how much traffic you are getting,
• once that is solved and your application is ready to run, find some place
to actually put it. This requires a data center, which needs space, cooling,
power, database servers, a backup system and a network to keep everything
connected,
• you need an expert team (IT team) to install and configure everything,
• you have to deal with the operating system, virtual machines, application
servers, persistence, caching, security, and so on,
• sometimes you need to update your system or face problems such as machine
breakdowns and hard disk failures,
• you must maintain your application, writing patches to update it or fix
bugs,
• when your application starts to grow, you need to get more machines,
expand your database, etc.
Figure 1: The significant challenges of building an application
Hundreds of thousands of companies do this today; it costs a lot of money and
takes a lot of time. Besides, they have to keep paying regardless of the load
on their servers.
With Cloud Computing, instead of running your applications yourself, you run
them in a shared data center. You no longer need to be concerned about
over-provisioning for a service whose popularity does not meet its
predictions, thus wasting costly resources, or under-provisioning for one that
becomes wildly popular, thus missing potential customers and revenue [3]. In
effect, you have access to computing power when you need it: if you suddenly
need more computing power dedicated to your application, you can scale up as
much as you need almost instantly, and if your traffic goes down, you can
release your servers back into the cloud [4].
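The trade-off between over-provisioning, under-provisioning and elastic
scaling can be made concrete with a small calculation. The following is a
minimal sketch, not taken from the cited sources: the load figures, the
capacity of one server and the price are all assumed values chosen only for
illustration.

```python
from math import ceil

CAPACITY_PER_SERVER = 100   # requests per hour one server can handle (assumed)
PRICE_PER_SERVER_HOUR = 1   # cost units per server per hour (assumed)


def fixed_cost(hourly_load, servers):
    """Cost of a fixed fleet: every server is paid for in every hour."""
    return servers * PRICE_PER_SERVER_HOUR * len(hourly_load)


def unserved_requests(hourly_load, servers):
    """Requests lost because the fixed fleet is too small for the load."""
    capacity = servers * CAPACITY_PER_SERVER
    return sum(max(0, load - capacity) for load in hourly_load)


def elastic_cost(hourly_load):
    """Cost when the fleet is resized every hour to match the load."""
    return sum(
        ceil(load / CAPACITY_PER_SERVER) * PRICE_PER_SERVER_HOUR
        for load in hourly_load
    )


# A quiet service with a single traffic spike (illustrative numbers).
hourly_load = [100, 100, 900, 100]

print("fleet sized for the peak:   cost", fixed_cost(hourly_load, 9),
      "lost", unserved_requests(hourly_load, 9))
print("fleet sized for the average: cost", fixed_cost(hourly_load, 3),
      "lost", unserved_requests(hourly_load, 3))
print("elastic fleet:               cost", elastic_cost(hourly_load),
      "lost 0")
```

With these assumed numbers, sizing for the peak wastes money in the quiet
hours, sizing for the average loses requests during the spike, and the
elastic fleet pays only for what each hour requires.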
In the cloud, resources are virtual and unlimited and the details of the
physical systems on which software runs are abstracted from the user [5]. It
is like using an electric appliance: you plug it into an outlet and you do not
care how the electric power that sustains it is generated. This is possible
because electricity is virtualized, that is, it is readily available from a
wall socket that hides power generation stations and a huge distribution grid.
When extended to information technologies, this concept means delivering
useful functions while hiding their internal implementation [6].
The system is centralized, so it is easier to apply patches and upgrades to
your applications, which ensures good maintenance. The scale of cloud
computing networks and their ability to provide load balancing and fail-over
make them highly reliable.
In conclusion, the cost of building your business application is drastically
reduced, in both upfront capital expenditure and maintenance and management
costs, because cloud computing removes the significant challenges discussed at
the beginning. In particular, it reduces the need for IT staff and eliminates
the cost of buying machines and repairing broken hardware.
1.2 What is Cloud Computing
The National Institute of Standards and Technology (NIST) describes the cloud
as follows [7]:
“Cloud computing is a model for enabling convenient, on-demand
network access to a shared pool of configurable computing resources
(e.g., networks, servers, storage, applications, and services) that can be
rapidly provisioned and released with minimal management effort or
service provider interaction. This cloud model promotes availability and
is composed of five essential characteristics, three service models,
and four deployment models.”
Five Essential Characteristics [7][8]:
On-demand self-service → consumers can use cloud services when they need
them, without requiring human interaction.
Broad network access → capabilities are available over the network.
Resource pooling → the provider's computing resources are pooled to serve
multiple consumers using a multi-tenant model, with different physical and
virtual resources dynamically assigned and reassigned according to consumer
demand.
Rapid elasticity → capabilities can be elastically provisioned and released
to scale rapidly up and down as needed.
Measured service → cloud services typically use the same model as an
electricity bill: we use a certain amount of energy, and that is what we pay
for. Cloud systems automatically control and optimize resource use by taking
advantage of a metering capability. Resource usage can be monitored,
controlled, and reported. Besides, since cloud networks operate at higher
efficiencies and with greater utilization, significant cost reductions are
often achieved.
Three Service Models (Figure 2) [7][8]:
Infrastructure as a Service → It is the lowest layer of the cloud computing
stack; both PaaS and SaaS rely on it. IaaS provides storage systems, servers,
switches, routers and firewalls: everything that you normally use when
building your own hardware infrastructure can be offered as a service, so you
do not have to worry about hardware. The IaaS service provider manages all the
infrastructure, while the client is responsible for all other aspects of the
deployment. These can include the operating system, applications, and user
interactions with the system.
Platform as a Service → In the PaaS model, the whole platform is offered as a
single service, instead of offering each of the infrastructure components
separately. PaaS provides virtual machines, operating systems, applications,
services, development frameworks, transactions, and control structures. The
client can deploy its applications on the cloud infrastructure or use
applications that were programmed using languages and tools supported by the
PaaS service provider. The service provider manages the cloud infrastructure,
the operating systems, and the enabling software¹. The client is responsible
for installing and managing the application that it is deploying.
Software as a Service → SaaS simply refers to software that is provided on
demand. Traditionally, someone who wanted to use software would go to the
store, pick up some disks, take them home, and install the software on a
computer. With SaaS, they just use hosted software. The applications are
accessible from various client devices through either a thin client interface,
such as a web browser, or a program interface. The consumer does not manage or
control the underlying cloud infrastructure.
Figure 2: The three service models of the cloud
Four Deployment Models [7][8]:
Private cloud → a cloud that is accessible only within an organization. This
is useful if you want full control of the cloud, which helps from a
flexibility, security and performance perspective.
Public cloud → the cloud infrastructure is provisioned for open use by the
general public, like Amazon S3² and Gmail³.
Community cloud → the cloud infrastructure is provisioned for exclusive use
by a specific community of consumers from organizations that have shared
concerns.
Hybrid cloud → the cloud infrastructure is a composition of two or more
distinct cloud infrastructures that remain unique entities, but are bound
together by standardized or proprietary technology that enables data and
application portability.
¹ Enabling software means the modification of the design or implementation of
software to allow internationalisation to take place.
² An online file storage web service offered by Amazon Web Services.
³ A free, advertising-supported email service provided by Google.
1.3 Similar systems and concepts
Cloud Computing is the result of the evolution and adoption of existing
technologies and paradigms. Its goal is to let users benefit from all of these
technologies without needing deep knowledge of or expertise with each of them.
Cloud computing is based on several technologies [9]:
Web Applications
A web application is:
• an application that uses a web browser as a client,
• a software application written in a browser-supported programming language
that relies on a web browser to render and execute it.
Web applications make it possible to exploit cloud computing: when you use
them, you are actually accessing applications that sit on other servers,
through the web browser installed on your computer. For example, through a web
browser you can log into your account on the Google Docs website and start
writing documents as if the application were installed on your own computer.
Using a web browser, users can connect from anywhere and from any device
(device and location independence).
Clustering [10]
A computer cluster consists of a set of connected computers (nodes) that work
together. The clustering middleware is a software layer that manages all the
nodes so that users can view them as a single system. In load-balancing
clusters, the computational workload is shared among the nodes to provide
better overall performance. For example, a web server cluster may assign
different queries to different nodes, so that the overall response time is
optimized. Suppose that many people are trying to access a particular database
on the internet and that the database is in a cluster. The cluster redirects
each request to the least loaded server. Besides, if one of the servers fails,
the cluster detects the failure and stops sending users to that server.
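The dispatching behaviour just described (send each request to the least
loaded node and skip failed nodes) can be sketched in a few lines. This is an
illustrative toy, not the algorithm of any particular clustering middleware;
the class names `Node` and `LoadBalancer` and the `healthy` flag are
assumptions made for the example.

```python
class Node:
    """One server of the cluster, tracking how many requests it is serving."""

    def __init__(self, name):
        self.name = name
        self.active_requests = 0
        self.healthy = True  # set to False when the node is detected as failed


class LoadBalancer:
    """Toy dispatcher: least-loaded healthy node wins."""

    def __init__(self, nodes):
        self.nodes = list(nodes)

    def dispatch(self):
        # Failed nodes are never considered.
        candidates = [node for node in self.nodes if node.healthy]
        if not candidates:
            raise RuntimeError("no healthy node available")
        # min() returns the first node among those with the lowest load.
        target = min(candidates, key=lambda node: node.active_requests)
        target.active_requests += 1
        return target

    def finish(self, node):
        # Called when a request completes, freeing capacity on the node.
        node.active_requests -= 1


nodes = [Node("a"), Node("b"), Node("c")]
balancer = LoadBalancer(nodes)

first = [balancer.dispatch().name for _ in range(3)]   # load is spread evenly
nodes[1].healthy = False                               # node "b" fails
later = [balancer.dispatch().name for _ in range(4)]   # "b" gets no traffic
print(first, later)
```

A real cluster would detect failures with health checks and measure load by
more than a request counter; the sketch only shows the routing decision.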
Grid Computing [11]
Grid computing is a form of distributed computing in which a “super virtual
computer” is composed of a collection of computer resources acting together
to perform large tasks. Unlike clusters, grids tend to be more loosely
coupled, heterogeneous, and geographically dispersed.
Terminal services
Back in the sixties and seventies there was the mainframe, a very powerful
computer, with dumb terminals connected to it. A dumb terminal was a simple
device with no intelligence, consisting of a monitor, a keyboard and a mouse.
The mainframe processed all the operations submitted by the dumb terminals and
sent the output back to their monitors, giving every terminal a slice of time
in order to simulate a dedicated computer for each of them. Today we have
basically the same thing with different terminology: we say terminal services
server instead of mainframe, and thin clients instead of dumb terminals.
Thanks to thin clients, you can connect to a terminal services server and work
on it as if you were using your own computer. Suppose you want your employees
to work with an application, but you do not want to install it on every
machine in the office. You can install the application on the terminal
services server, so that the employees can use it through the thin clients.
This is called an application server.
Virtualization [12]
Hardware virtualization refers to the creation of a virtual machine that acts
like a real computer with an operating system. This means that you can
transfer the entire operating system, with its applications, settings, etc.,
from one piece of hardware to another and everything continues to function
properly. Before virtualization, to do the same you had to back up all the
data, install the OS on the new hardware, reinstall all the applications and
restore all the data.
A hypervisor, or virtual machine monitor (VMM), is a piece of computer
software, firmware or hardware that creates and runs virtual machines.
A type 2 (or hosted) hypervisor is software installed on your operating system
that allows you to install other operating systems; an example is VirtualBox.
A type 1 (or native) hypervisor is like an operating system that you install
on the machine where you want to run your virtual computers. Once installed,
it does not let you do anything by itself: you need to install management
software on whatever computer you are going to use to administer your virtual
computers. This software connects to the hypervisor over the network and,
through a graphical interface, lets you create all the virtual computers you
need and move them easily to any other server on your network that is running
the hypervisor. The management software can monitor the status of the physical
hardware of all your servers, so if there is a problem with the server on
which your OS is running, the OS is moved to another hypervisor. Furthermore,
the management software can turn physical servers on and off according to the
demand for virtual computers, saving energy and money.
Once people figured out that the operating system can be separated from the
hardware, some companies realized that they could sell instances of virtual
operating systems using the large under-used resources they had.
Chapter 2
VENDOR LOCK-IN
2.1 Overview
In a recent report, the European Network and Information Security Agency
(ENISA) highlighted lock-in as one of the biggest risks involved with cloud
computing [13]:
“There is currently little on offer in the way of tools, procedures or
standard data formats or services interfaces that could guarantee data
and service portability. This makes it extremely difficult for a customer
to migrate from one provider to another, or to migrate data and services
to or from an in-house IT environment. Furthermore, cloud providers
may have an incentive to prevent (directly or indirectly) the portability
of their customers services and data. This potential dependency for
service provision on a particular CP, depending on the CP's
commitments, may lead to a catastrophic business failure should the
cloud provider go bankrupt and the content and application migration
path to another provider is too costly (financially or time-wise) or
insufficient warning is given (no early warning).”
Vendor lock-in is a situation in which a customer using a product or service
cannot easily transition to a competitor's product or service. Vendor lock-in is
usually the result of proprietary technologies that are incompatible with those
of competitors. However, it can also be caused by inefficient processes or
contract constraints, among other things [14].
That said, we can already see the problem. As discussed in Chapter 1, with
Cloud Computing the cost of building a business application is drastically
reduced. So, when a company has to evaluate the possible technologies to use,
the choice to rely on a cloud provider may seem obvious if the costs of
migrating off the system are underestimated.
But what happens if the cloud provider decides to double its prices? Or if it
goes bankrupt? Or suppose that a company chooses to rely on a cloud provider,
and then its application grows to the point of requiring unexpected new
features that are not supported by that provider but are supported by the
competition.
In these situations, the company may have no choice but to migrate and to bear
the related expenses, which could be prohibitive and unexpected. Furthermore,
the company has to draw up a new management plan in order to deal with the
complexities of cloud service migration, even if it would rather avoid this
cumbersome process. Therefore, some companies may find themselves locked in to
a provider that no longer meets their needs. This fear of vendor lock-in is
considered a major impediment to cloud service adoption.
2.2 Lock-in issues
As discussed in Chapter 1, if you want to build a business application without
using cloud computing, you have to face significant challenges. Along the way,
you will make many choices about the hardware you buy, the OS you use, the
database, etc. Very probably, during the deployment phase, proprietary
operating system and database features will be used. Moreover, choices will be
made taking into account the potential of the hardware you have available.
There might have been good reasons at the time to do so (e.g. better
performance, better integration with development tools, better manageability,
etc.) but because of those decisions the cost of changing any of the
significant components of that software becomes too high to be practical. You
are locked in [15].
With cloud computing, you basically have the same thing. You choose a provider
that has made its own choices in terms of system structure, OS, DB,
hypervisor, APIs supplied, etc. From this point of view, lock-in does not seem
a big problem. However, this time you do not own all these components. You are
paying for them and adapting your business application in order to use them.
Besides, you do not have full control of your system.
If we go back to the comparison between cloud computing and electricity, we
can clearly see the problem. If you do not like the company that provides your
electricity, you can simply change it. Doing so does not require you to change
or reconfigure all the electric appliances you were using.
For more than a decade, IT managers and lawyers have been working tirelessly
to enable solutions based on common standards and protocols that can be built,
supported, swapped out and replaced, regardless of vendor. For instance, SOA
helped free us from the technological vendor lock-in model that we had a
decade ago, by motivating the transition towards interoperability-level
standards. This new degree of cloud vendor lock-in is a step backwards from
all the work that has been done with approaches such as service-oriented
architecture. Cloud computing may be erasing the gains we have made against
vendor lock-in [16].
Generally, a service provider is not willing to yield its customers to
competitors. This means that changing service provider can be a difficult
task, as we know from experience (e.g. changing our telephone operator). In
order to keep customers within their own services, many companies even build
barriers to leaving a service or platform, making the transition difficult if
not impossible.
This happens in the cloud too, making vendor lock-in even worse.
One way they do this is through security checks. The goal of security controls
is to restrict access to data. It is therefore easy for a service provider to
cite security requirements as an excuse for not handing over key parts of the
data needed for a safe transition.
So vendor lock-in is a real problem that needs to be faced and deserves
further attention. Many cloud platforms and services are proprietary, meaning
that they are built on the specific standards, tools and protocols developed
by a particular vendor for its particular cloud offering [16]. So when you
build an application that relies on a cloud solution, you have to know that
you have accepted this situation, and you should be ready to pay handsomely
when you no longer accept it. Even if you try to build a standards-compliant
application for certain interoperability functions, you will still find
lock-in.
Indeed, because standards are still being formed and developed, cloud
computing is still too immature to have reached the point where customers
demand vendor independence [16].
Despite this, users are starting to fear the vendor lock-in problem,
especially data lock-in, as shown in Figure 3.
The actual hosting of the application, the actual requirements for that
application to exist in a cloud environment, to connect to the virtualized
resources and whatever administration tools the cloud providers may give you
to configure and maintain the application, will be, for the most part, controlled
by the cloud provider [16].
Thus, many companies now using cloud services are in big trouble when it is
time to move on. And there may be many reasons why it is time to move away
from your current cloud provider [16][17]:
• it may go bankrupt,
• it may increase the cost of its services,
• it may change its terms of service,
• it may decrease its quality of service,
• it may have too many network or system outages. This is not rare and can
happen even to the major service providers: an outage of Amazon Web Services
in Northern Virginia affected Reddit, Pinterest, Airbnb, Foursquare, Minecraft
and others [18],
• it may not support some features that you later discover you need,
• it could be bought out by a larger company, resulting in a change of
policies,
• it may change the leasing terms or move geographically, which may put you
in conflict with legal requirements you have to meet.
Figure 3: What concerns you about moving to the cloud?
Three types of vendor lock-in can occur with cloud computing [19][20]:
Platform lock-in
While it is possible to write a cross-platform application so that it is OS
independent, in the cloud you cannot escape the proprietary libraries and
system configurations.
As we said, PaaS provides virtual machines, operating systems, applications,
services, development frameworks, transactions, and control structures.
While developing your application you will probably use these features, so you
will be locked in to the cloud provider. Indeed, a different provider could
implement the same features in a completely different way or may not provide
the same functionality at all. Moreover, a cloud service tends to be built on
one of several possible virtualization platforms, for example VMware or Xen.
Migrating from a cloud provider using one platform to a cloud provider using a
different platform could be very complicated.
Here is an overview of platform lock-in in the best-known PaaS providers:
• Google App Engine: Using the Java language on GAE you face some limits:
not all the APIs are available, especially those that require access to the
file system. Moreover, GAE does not support the JEE servlet. We cannot
implement custom security for our applications through the web.xml file, so we
are forced to use Google's security mechanism.
• Windows Azure: Windows Azure is particularly affected by vendor lock-in
because of its .NET framework, which is the most supported and documented
option. Although Windows Azure supports other languages and frameworks,
migrating away from it remains hard because of the lock-in on the operating
system and Azure services.
• Salesforce: Force.com allows external developers to create applications
integrated as closely as possible with the Salesforce SaaS environment. To
ensure this, all applications have to be developed using Apex, a proprietary
language similar to Java, and Visualforce, an XML-like syntax for designing
user interfaces in HTML.
Data lock-in
Since the cloud is still new, standards of ownership have not yet been
developed: for instance, it is unclear who actually owns the data once it
lives on a cloud platform. Therefore, it can be hard for a user to move data
off a cloud vendor's platform. Often you need to take the following steps.
First, you have to move the data back to the customer's site. Most of the time
the data will have been altered for compatibility with the original provider's
system, so you need to convert it back to its former state. Only then can you
move the data to the new provider's environment.
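The three steps (bring the data back, undo the provider-specific alterations,
load it into the new environment) can be sketched as a small pipeline. This is
a hypothetical illustration: the record layout of the old provider (`_pk`,
`props`, `ts_ms`), the neutral layout and the dictionary standing in for the
new provider are all invented for the example and do not describe any real
service.

```python
from datetime import datetime, timezone


def export_from_old_provider():
    """Step 1: bring the data back to the customer's site (stubbed here)."""
    return [
        {"_pk": "user:42", "props": {"name": "Ada"}, "ts_ms": 1356998400000},
        {"_pk": "user:43", "props": {"name": "Alan"}, "ts_ms": 1356998400000},
    ]


def restore_original_form(record):
    """Step 2: undo the alterations made to fit the old provider's system."""
    kind, _, ident = record["_pk"].partition(":")
    created = datetime.fromtimestamp(record["ts_ms"] / 1000, tz=timezone.utc)
    return {
        "kind": kind,
        "id": int(ident),
        "created": created.isoformat(),
        **record["props"],
    }


def import_into_new_provider(records, store):
    """Step 3: load the restored records into the new environment.

    A plain dict stands in for the new provider's storage.
    """
    for record in records:
        store[(record["kind"], record["id"])] = record
    return store


exported = export_from_old_provider()
restored = [restore_original_form(record) for record in exported]
new_store = import_into_new_provider(restored, {})
print(new_store[("user", 42)])
```

Every field that the old provider renamed or re-encoded needs its own
conversion rule in step 2, which is where most of the migration effort tends
to go.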
But what happens if you have a huge DB in the cloud (e.g. 100 TB)? You cannot
export the data to your own computer, because you do not have enough storage.
You would need an enterprise-level solution just to store it, and that does
not even count backing the data up.
This assumes that the provider even gives you direct access to the data you
have stored. If it does, it may let you move the data via a portable storage
device or via the Internet. In the second case, it can be a problem if you do
not have a reasonable time frame in which to download the data, and it will
take time to upload it again to the new cloud provider. If the provider does
not give you direct access to the data, you can retrieve by brute force only
the part of it that is visible.
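A back-of-the-envelope calculation shows why the time frame matters for a
100 TB database. The sketch below assumes decimal terabytes and a link that
is fully and continuously available; the two link speeds are illustrative
assumptions, not figures from the cited sources.

```python
SECONDS_PER_DAY = 86_400


def transfer_days(size_bytes, link_bits_per_second):
    """Days needed to move size_bytes over a link used at full speed."""
    seconds = size_bytes * 8 / link_bits_per_second
    return seconds / SECONDS_PER_DAY


SIZE_BYTES = 100 * 10**12  # 100 TB, decimal terabytes

for label, speed in [("100 Mbit/s", 100 * 10**6), ("1 Gbit/s", 10**9)]:
    print(f"{label}: {transfer_days(SIZE_BYTES, speed):.1f} days")
```

Under these assumptions the download alone takes about three months at
100 Mbit/s and more than nine days at 1 Gbit/s, and the upload to the new
provider has to be added on top.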
Suppose that you retain ownership of the data and manage to bring it
on-premises. You may find it in a format you cannot easily use, such as backup
tapes made by tools you do not own, which makes it useless. Moreover, even if
you manage to read the format, some data may be encrypted for security
reasons, which prevents logical access to it unless the cryptographic keys are
provided [21].
Tools lock-in
If the tools built to manage a cloud environment are not compatible with
different kinds of both virtual and physical infrastructure, those tools will
only be able to manage data or apps that live in the vendor's particular cloud
environment.
Let us take a simple example to clarify this point, using a service that many
people use: Gmail. In Gmail, users cannot change their email address, but they
can register a new account and transfer all their mail into it using a mail
client and IMAP. Usually, however, a Gmail account is also used for
authentication in other services: Google Reader, Google Docs, Google Plus,
YouTube and Picasa.
If a user creates a new Gmail address, she needs to create a new profile for
each service related to the old one. If instead she wants to change mail
provider, she can export the emails, but without features such as labels.
Besides, the user will lose all the services related to the old Gmail account,
and for some of them, such as Google Plus, it will be impossible to export the
contents.
In Figure 4 we can see Thorsten's Lock-in Hypothesis⁴: the higher the cloud
layer you operate in, the greater the lock-in.
This means that if you use an application in the cloud, such as an all-in-one
CRM⁵ package, you have the highest chance of getting locked in. Move one level
down to a platform in the cloud and you are somewhat less likely to get locked
in [17].
⁴ The name given to this assertion by Thorsten Von Eicken [17].
⁵ Customer relationship management (CRM) is a model for managing a company's
interactions with current and future customers. It involves using technology
to organize, automate, and synchronize sales, marketing, customer service, and
technical support [22].
This theory is based on the fact that lock-in can occur at many levels of the
stack. The higher the level, the more services a user receives and the less
control the user has over them. Besides, the more of the code is controlled by
the cloud, the more freedom you tend to lose.
Conversely, the more code is under your control, the easier it is to replicate
it elsewhere and retain freedom. Here are a number of different layers at
which you could find yourself locked in [17]:
• You may not own the application that manages your data, or you may need to
write a new one in order to change cloud provider.
• You may have used third-party web services in your application, and they
may be supported only by a particular cloud provider.
• Your application may be coded in a proprietary development environment and
may run in a proprietary run-time environment, so you will need to retrain
programmers and rewrite your application in order to move it to a different
cloud.
• Your application may use a proprietary language, so that, to perform the
same operations, you will have to use another language supported by the new
cloud provider. In this case, you will need programmers who are expert in both
languages and able to translate the affected parts in order to move the
application.
• Your data may be stored in a proprietary or hard-to-reproduce data model or
storage system, so you may need to transform all your data and the code
accessing it.
• You may not have access to your data, or you may only get it in a
proprietary format, as discussed above.
• You may not own all the side information about your application, such as
log files, analytics, metrics, history data, etc., so when you move to the new
cloud provider you will start from scratch.
• You may not control the operating system platform or the versions of
libraries and tools, so when you move the application to the new cloud you
cannot port the operational procedures.
Figure 4: Thorsten's Lock-in Hypothesis.
If you subscribe to a PaaS vendor, you will be limited to using IaaS and SaaS
products that are compatible with the Platform as a Service you choose.
This creates a "black box" effect. For instance, if you write a simple Python
application in Google App Engine, you will not have big porting problems, as
we will show in Section 2.3. But if that application starts using GAE services,
moving it will get harder.
When you move down to an infrastructure cloud, it becomes easier to see how
you can move your application stack from one provider to another. After all,
there is not much distinguishing the Linux box you get in EC2 from the Linux
box you get at GoGrid. But even here, lock-in needs to be thought about, because the
system behaviour, from storage persistence to networking details and so
on, is far from identical [17].
Users of PaaS services are raising the question of the portability of their
applications. In some cases, porting has required major changes
to the software and caused project delays and even productivity losses. This is
caused by two specific problems.
The first problem is the lack of a consistent platform definition among PaaS
providers. The second problem is the lack of credible alternative providers for a
platform: when a big company, like Microsoft or Google, creates its own PaaS
solution, this can discourage competition.
Moreover, IaaS providers, such as Amazon, have been steadily adding services
to their IaaS platforms, turning them into a simple form of PaaS. Since there are no
standards for these added services, using them will lock applications to a cloud
provider.
The greatest portability risk for PaaS over time will not lie with the formal PaaS
platforms, but with the evolution of IaaS services into PaaS services through the
addition of features, such as Amazon's Redshift or caching services [23]. Many
users of these platforms will never see themselves as PaaS users, and they will
not notice the lock-in until they try to move an application to
another provider.
2.3 Example
In order to show a realistic business case, we will suppose that a company
initially decides to use Google App Engine as its cloud
service of choice. It therefore writes an application in Python, the language
best supported by Google, using the Django framework.
Let us assume that later, for some reason, the company wants to change cloud
provider, and its choice falls on Microsoft Windows Azure.
With the goal of making the porting easier, it decides to keep the same
language and framework. By exploring this example, we will be able to point
out where the lock-in lies.
The application is a guestbook, which is very simple but complex enough to
show some of the most important features in the Cloud system.
You can access the website running on Google App Engine at the address
http://draxent-project.appspot.com.
When you access this website, you find the guestbook with all the greetings
written in it (Figure 5). Each greeting shows the username of its author
and the first line of the original message. If the line does not show
the whole message, you can click on it to see the entire message and the
optional photo attached to it. Above the book, you can find the menu with the
"write on it" and "login" buttons. The first one lets you write in the
guestbook through a form composed of a textarea and an optional file
input field. By completing and submitting the form, you write in
the guestbook, that is, in the Google database. The second one allows
authentication through Gmail.
In order to provide these features, the guestbook makes use of some cloud
services available on the Google App Engine platform:
• User authentication through the Google Accounts API.
In this way, all the complexity of creating, managing, storing and
authenticating users is taken care of by the cloud.
• Database access through Django-nonrel, a project that supports Django
on non-relational (NoSQL) databases, such as Google App Engine's
Datastore. NoSQL databases are designed to be lightweight and more
scalable, but they introduce some limitations; for example, they typically do not
support JOINs.
• File storage through Google Cloud Storage. This provides flexibility
in uploading and retrieving user-uploaded content. For example,
storage maintenance tasks, such as getting more space for our files, are
greatly simplified: there is no need to migrate the data to a different
server, since additional storage can be purchased when needed.
Figure 5: guestbook running on GAE
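The JOIN limitation mentioned in the list above can be illustrated with a minimal sketch. It does not use the Django-nonrel or Datastore APIs: plain dictionaries stand in for two kinds of stored entities, and the names `users`, `greetings` and `greetings_with_authors` are hypothetical, introduced only for this illustration. The point is that, without JOINs, the application itself has to combine the records.

```python
# Hypothetical in-memory stand-ins for two kinds of stored entities.
users = {
    "u1": {"name": "alice"},
    "u2": {"name": "bob"},
}
greetings = [
    {"author_id": "u1", "text": "Hello from Pisa"},
    {"author_id": "u2", "text": "Nice guestbook"},
    {"author_id": "u1", "text": "Second message"},
]


def greetings_with_authors(greetings, users):
    """Emulate a join between greetings and users in application code."""
    # Step 1: collect the distinct keys referenced by the greetings.
    author_ids = {g["author_id"] for g in greetings}
    # Step 2: fetch the referenced users (a batch lookup on a real store).
    fetched = {uid: users[uid] for uid in author_ids if uid in users}
    # Step 3: merge the two result sets by hand.
    return [
        {"author": fetched[g["author_id"]]["name"], "text": g["text"]}
        for g in greetings
        if g["author_id"] in fetched
    ]


rows = greetings_with_authors(greetings, users)
```

On a relational database the same result would come from a single query; here the extra lookups and the merge are part of the application code, which is one of the things that has to be revisited when the storage backend changes.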
We now explain the steps taken and the difficulties encountered
when porting the application to the Windows Azure platform.
The result of the porting can be found at the address
http://draxent-project.azurewebsites.net. Even though both platforms support the Django
framework, there is nothing like Django-nonrel available for Azure. For this
reason, Django was configured to use a standard SQLite database.
Furthermore, there were some small discrepancies in the directory structure of
the project, due to the different Django versions used.
Figure 6: Authentication porting
The first real problem occurred when porting the authentication method
(Figure 6). The Google App Engine version of the guestbook relied on the
simple Google Accounts API for that. Windows Azure provides a complex,
flexible system to handle identification, which makes it possible to use many
services (Google, Facebook, Yahoo, Live, etc.) to log in [24]. Unfortunately, the
feature proved to be almost impossible to set up because, apparently, the APIs
have not been ported to Python yet. At the time of writing, the only feasible
approach would be to implement them from scratch, using the
documentation and the examples in other languages as a guide. This could be beyond
the company's possibilities, because programmers would need specific skills
and proficiency in both Python and the language of the examples in order to do it.
For that reason, it was more convenient to store the authentication details in
the database and to implement the login and user creation processes manually.
The most obvious disadvantage is that the passwords could not be recovered, so
they were randomly regenerated; the users could then be notified and
encouraged to choose a new one.
Figure 7: File storage porting
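The manual login described above can be sketched with the Python standard library alone. This is only an illustration under stated assumptions, not the code of the ported guestbook: the function names, the record layout and the iteration count are hypothetical, and a current Python 3 interpreter is assumed. It shows a random temporary password being generated for a migrated user and only a salted hash being stored.

```python
import hashlib
import hmac
import secrets

ITERATIONS = 100_000  # assumed work factor, to be tuned for the target hardware


def hash_password(password, salt=None):
    """Return (salt, digest) for a password using PBKDF2-HMAC-SHA256."""
    if salt is None:
        salt = secrets.token_bytes(16)
    digest = hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), salt, ITERATIONS
    )
    return salt, digest


def verify_password(password, salt, expected_digest):
    """Check a login attempt using a constant-time comparison."""
    _, digest = hash_password(password, salt)
    return hmac.compare_digest(digest, expected_digest)


def migrate_user(username):
    """Create a local account with a random temporary password."""
    temporary = secrets.token_urlsafe(12)
    salt, digest = hash_password(temporary)
    # Hypothetical record layout: only the salt and the hash are stored.
    record = {"username": username, "salt": salt, "digest": digest}
    # The temporary password is returned so that it can be sent to the user.
    return record, temporary


record, temporary = migrate_user("alice")
```

In an actual Django project the built-in `django.contrib.auth` application would normally take care of hashing and verification; the sketch only makes explicit what has to be rebuilt once the platform no longer manages the users.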
By contrast, porting the cloud storage proved to be quite an easy task, because
the APIs were relatively similar (Figure 7).
Regarding the configuration of the scaling policies, a direct porting would not have
been possible, for two reasons. On the one hand, this information is stored in a
YAML file on Google App Engine, while on Azure it needs to be entered
through a web interface; on the other hand, the policies provided by the two
services are not directly comparable.
The second major problem was the data porting, which was divided into two
steps: copying the users' files stored in the cloud, and porting the database
content. The first step was carried out with a script running on Google App
Engine, using both GAE and Azure libraries. It simply lists the files in the Google
Cloud Storage (GCS) bucket and then uploads each of them to the Azure container.
The second step was more difficult. Google's database is not
available as a single file that can be downloaded, so the data had to be
extracted by brute force and then stored in a simple format. The resulting
file was then parsed by a script in the Windows Azure application, and the
information was finally written to the target database.
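The two data-porting steps described above can be sketched as follows. The sketch is self-contained and does not call the GAE or Azure libraries: `InMemoryStore` is a hypothetical stand-in for both the bucket and the container, and the choice of one JSON document per line as the "simple format" is an assumption made only for this illustration.

```python
import json


class InMemoryStore:
    """Hypothetical stand-in for a storage bucket or container."""

    def __init__(self, blobs=None):
        self._blobs = dict(blobs or {})

    def list_names(self):
        return sorted(self._blobs)

    def read(self, name):
        return self._blobs[name]

    def write(self, name, data):
        self._blobs[name] = data


def copy_all(source, target):
    """Step 1: list every file in the source and upload it to the target."""
    copied = []
    for name in source.list_names():
        target.write(name, source.read(name))
        copied.append(name)
    return copied


def export_records(records):
    """Step 2a: dump the records, one JSON document per line."""
    return "\n".join(json.dumps(r, sort_keys=True) for r in records)


def import_records(dump):
    """Step 2b: parse the dump back into dictionaries on the target side."""
    return [json.loads(line) for line in dump.splitlines() if line.strip()]


# Hypothetical sample data for the two steps.
gcs = InMemoryStore({"photos/1.jpg": b"\xff\xd8", "photos/2.jpg": b"\xff\xd9"})
azure = InMemoryStore()
copied = copy_all(gcs, azure)

greetings = [{"author": "alice", "text": "Hello"}, {"author": "bob", "text": "Hi"}]
restored = import_records(export_records(greetings))
```

A real migration script would replace the stand-in with the two providers' client libraries and would also have to deal with paging, retries and entity types that do not map directly onto JSON.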
CONCLUSIONS
To conclude this report, we give a brief overview of possible
solutions to lock-in problems.
As we said previously, cloud computing is still too immature to have reached the point
where customers demand vendor independence. For this reason, the
problem is not getting smaller. On the contrary, new proprietary features are
steadily being added to IaaS and PaaS offerings, and new SaaS products are created
that are not compatible with all cloud providers.
Vendor lock-in may therefore be unavoidable at this point, but what companies
need to do is understand up front what the exit strategy will be. Basically,
companies should add this cost to the total during the initial cost analysis.
By exit strategy we mean that a company should choose its cloud provider
on the basis of additional factors and invest in protective measures
against lock-in [25]:
• Make sure that your application can be easily ported to other clouds. In
this way, you can move it if there is a service outage.
• If your application is highly customized, you should have a different
cloud provider as a backup. Thanks to this, you will suffer less from the
lock-in problem: you can switch between the two cloud solutions depending on
convenience.
• You should know the PaaS you have chosen really well. To do so,
you should ask where your PaaS is running and
how the provider manages the risk of failure of a big cloud.
• In addition to the point above, you should also ask about redundancy and
system architecture, and evaluate all the information with the
help of network engineers and system architects.
• When you write your application, you should pick code that is
easier and faster to modify. Certain flavours and types of frameworks
and Web scripting environments are more difficult to change. In addition,
you should pick languages and frameworks that are supported by
as many cloud providers as possible.
Another way in which a company can protect itself from lock-in is based
on middleware solutions, such as mOSAIC [26] and CloudBees [27].
In these approaches a middleware layer behaves as a broker between the
application and the cloud infrastructures, providing an abstract interface for
developers and isolating them from the specific requirements of each cloud
vendor [28]. This looks like a solution to the problem, because a client that wants to
change from one cloud provider to another can delegate the task to the
middleware. Furthermore, this solution allows applications to use
features of different cloud providers (multi-cloud). However, middleware
solutions are often quite complex and heavy, and they have to be deployed in
conjunction with the application, penalizing the deployment and performance
of the software components attached to them. In addition, the source code of
middleware-dependent components will be tightly coupled to the specification
of the middleware, thereby moving the lock-in effect from vendors to
middleware [28].
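The broker idea described above can be sketched in a few lines. The sketch does not reproduce the interface of mOSAIC, CloudBees or any other real middleware: `BlobStorage`, the two adapters and `save_photo` are hypothetical names, and the adapters keep data in memory where real ones would call a provider's SDK.

```python
from abc import ABC, abstractmethod


class BlobStorage(ABC):
    """Abstract interface the application codes against (hypothetical)."""

    @abstractmethod
    def put(self, name, data):
        """Store data under the given name."""

    @abstractmethod
    def get(self, name):
        """Return the data stored under the given name."""


class ProviderAStorage(BlobStorage):
    """Adapter for a first provider, backed here by a dictionary."""

    def __init__(self):
        self._bucket = {}

    def put(self, name, data):
        self._bucket[name] = data

    def get(self, name):
        return self._bucket[name]


class ProviderBStorage(BlobStorage):
    """Adapter for a second provider with a different internal layout."""

    def __init__(self):
        self._container = []

    def put(self, name, data):
        # Drop any previous entry with the same name, then append the new one.
        self._container = [(n, d) for n, d in self._container if n != name]
        self._container.append((name, data))

    def get(self, name):
        for stored_name, data in self._container:
            if stored_name == name:
                return data
        raise KeyError(name)


def save_photo(storage, greeting_id, photo):
    """Application code: it depends only on the BlobStorage interface."""
    name = "photos/%d.jpg" % greeting_id
    storage.put(name, photo)
    return name


results = []
for storage in (ProviderAStorage(), ProviderBStorage()):
    name = save_photo(storage, 7, b"jpeg-bytes")
    results.append(storage.get(name))
```

The application function runs unchanged on both adapters, but it is now written against `BlobStorage`: this is the residual coupling to the middleware that the paragraph above points out.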
Several PaaS development groups are working to establish a set of standards
and common APIs to act as the middleware between IaaS and SaaS instalments.
These initiatives include [29]:
Cloud Application Management for Platforms (CAMP)
CAMP reduces the effort needed to move applications between clouds, offers
cloud providers and consumers a REST-based approach to application
management, and provides a common development vocabulary and API that
can work across multiple clouds without excessive adaptation, thus laying a
common basis for developing multi-cloud management tools [30].
OpenStack
OpenStack is a global collaboration of developers and cloud computing
technologists producing the ubiquitous open source cloud computing
platform for public and private clouds. OpenStack is a massively scalable cloud
operating system that controls large pools of compute,
storage, and networking resources throughout a datacenter [31]. It is
free open source software released under the terms of the Apache
License. Anyone can run it, build on it, or submit changes back to the
project. According to OpenStack.org, this approach is the only way to
remove the fear of proprietary lock-in for cloud customers and create a
large ecosystem that spans cloud providers.
Cloud Foundry
VMware has entered the cloud game by offering an open-source package
called Cloud Foundry, an open source Platform-as-a-Service released
under the terms of the Apache License. It gives developers a way to
create and deploy applications in the cloud without being locked in to a
proprietary platform. It supports vSphere, vCloud, OpenStack, and
Amazon AWS as infrastructures as a service [32].
We conclude this report by observing that the enormous potential of cloud
computing risks being seriously undermined by the vendor lock-in problem.
We hope that some of the possible solutions briefly mentioned
in this chapter will succeed in overcoming it.
ACKNOWLEDGEMENTS
I wish to thank Prof. Brogi, supervisor of this thesis, for his availability and for
stimulating me with this topic, which will definitely be useful for my future
career. Special thanks go to my brother, a source of inspiration for me. He has
been my spiritual guide throughout my life, especially on this important
occasion. To my parents, who have never held me back in any field or in any
of my choices. In particular, they allowed me to follow this passion for computer
science, to change high school and to come to Pisa. To my sister,
who has filled me with words of support in all the dark times. Then I thank
Andrea: it is thanks to his help that I can say I am satisfied with my work. Luca, whom I
consider a brother and who is my backbone in withstanding and overcoming the
difficulties of life. My girlfriend, the only source of relaxation in this hard period, my
island of peace and support; without her I would have gone mad. Roberto, a great
friend but also a kind of idol, who is a great source of inspiration for me.
Prof. Pollastri, for making me discover computer science through his
legendary Pascal lessons. Luca's father, for giving me the phrase: “the only limit
of computer science is your imagination”. Alessandra, the first to sit through my
thesis. And lastly, all my friends, who have been understanding even though I have
completely neglected them in recent months.
RINGRAZIAMENTI
Desidero ringraziare il prof. Brogi, relatore di questa tesi, per la disponibilità e
per avermi stimolato con questo argomento, che mi sarà sicuramente utile per
la mia carriera futura. Un ringraziamento particolare va a mio fratello, per me
fonte di ispirazione; per avermi fatto da guida durante la mia vita e soprattutto
in questa importante occasione. Ai miei genitori, che non mi hanno mai frenato
in nessun campo e in nessuna mia scelta. In particolare, mi hanno permesso di
seguire questa passione per l'informatica, di effettuare il cambio di liceo e di
venire a Pisa. A mia sorella, che mi ha riempito di parole di conforto in tutti i
momenti difficili. Poi ringrazio Andrea, e anche grazie al suo aiuto che posso
dire di essere soddisfatto del mio lavoro. Luca, che considero come un fratello.
che rappresenta la mia colonna portante per sopportare e superare le difficoltà
della vita. La mia ragazza, unica fonte di relax in questo duro periodo, la mia
isola di tranquillità e conforto, senza di lei sarei impazzito. Roberto, una
grande amico ma anche una sorta di idolo, che è per me una grande fonte di
ispirazione. Il prof. Pollastri, per avermi fatto scoprire l'informatica grazie alle
sue mitiche lezioni di Pascal. Il padre di Luca, per avermi regalato la frase:
“l'unico limite dell'informatica e la propria immaginazione”. Alessandra, la
prima a subirsi la mia tesi. E infine tutti i miei amici, che mi hanno compreso
nonostante li ho trascurati completamente in questi mesi.
REFERENCES
[1] What is Cloud Computing?
Last visited on http://www.youtube.com/watch?v=ae_DKNwK_ms
[2] GoogleDevelopers. Campfire One: Introducing Google App Engine.
Last visited on http://www.youtube.com/watch?v=3Ztr-HhWX1c
[3] M. Armbrust et al. A view of cloud computing. Communications of the
ACM Vol. 53 No. 4, April 2010.
[4] Cloud Computing Explained.
Last visited on http://www.youtube.com/watch?v=QJncFirhjPg
[5] B. Sosinsky. Cloud Computing Bible, chapter 1. Wiley Publishing, Inc.
2011.
[6] R. Buyya, J. Broberg, A. M. Goscinski. Cloud Computing: Principles and
Paradigms, chapter 1. Pearson-Prentice Hall. 2010.
[7] P. Mell, T. Grance. The NIST Definition of Cloud Computing.
[8] I. Jansch, V. Chin. PHP Development in the Cloud.
[9] Eli the Computer Guy. Introduction to Cloud Computing.
Last visited on http://www.youtube.com/watch?v=QYzJl0Zrc4M
[10] D. Bader, R. Pennington. Cluster Computing: Applications. Georgia Tech
College of Computing.
[11] What is grid computing? Gridcafe. E-sciencecity.org.
[12] G.J. Popek, R.P. Goldberg. Formal Requirements for Virtualizable Third
Generation Architectures. Communications of the ACM Vol. 17 No. 7,
July 1974.
[13] D. Catteddu, G. Hogben. Benefits, risks and recommendations for
information security, chapter 3. ENISA. November 2009.
[14] M. Rouse. Vendor lock-in. Techtarget.com. May 2012.
[15] M. Garnaat. Cloud Lock-In. Not your father's lock-in. Elastician.com.
April 2009.
[16] J. McKendrick. Cloud Computing's Vendor Lock-In Problem: Why the
Industry is Taking a Step Backward. Forbes.com. November 2011.
[17] T. Von Eicken. The Skinny on Cloud Lock-In. RightScale.com. February
2009.
[18] R. Dilletu. Update: Amazon Web Services Down In North Virginia —
Reddit, Pinterest, Airbnb, Foursquare, Minecraft And Others Affected.
Techcrunch.com. October, 2012.
[19] M. Hinkle. Three cloud lock-in considerations. Zenoss Blog. June 2010.
[20] L. Monni. Il Lock-In nei servizi cloud. CloudUp, CloudTalk.
[21] E. Moyle. Cloud computing vendor lock-in: Avoiding security pitfalls.
Techtarget.com. June, 2012.
[22] R. Shaw. Computer Aided Marketing and Selling. Butterworth-
Heinemann Newton. 1991.
[23] T. Nolle. Application portability in PaaS: Problems and solutions.
Techtarget.com. March, 2013.
[24] Microsoft.com. Adding Sign-On to Your Web Application Using Windows
Azure AD.
[25] A. Salkever. 5 ways to protect against vendor lock-in in the cloud.
Gigaom.com. September, 2011.
[26] E.M. Maximilien et al. Toward cloud-agnostic middlewares. In OOPSLA
'09, pages 619–626. 2009.
[27] W. Tsai et al. Service-Oriented Cloud Computing Architecture. In ITNG
'10, pages 684-689. 2010.
[28] J. Miranda et al. Identifying Adaptation Needs to Avoid the Vendor
Lock-in Effect in the Deployment of Cloud SBAs. WAS4FI-Mashups '12,
pages 12-19. September, 2012.
[29] M. Szynaka. Ask the Expert: Is PaaS vendor lock-in unavoidable?
Techtarget.com. April 2013.
[30] C. Redwood Shores. Leading Technology Vendors Announce New
Specification Designed to Ease Management of Applications Across
Public and Private Clouds. Oracle.com. August, 2012.
[31] R. Sean et al. OpenStack Training Guide. Introduction to OpenStack,
chapter 2,3. OpenStack Foundation. October, 2013.
[32] S. Higginbotham. VMware Launches Open-Source Cloud. Gigaom.com.
April, 2011.
33

More Related Content

What's hot

Lecture01: Introduction to Security and Privacy in Cloud Computing
Lecture01: Introduction to Security and Privacy in Cloud ComputingLecture01: Introduction to Security and Privacy in Cloud Computing
Lecture01: Introduction to Security and Privacy in Cloud Computingragibhasan
 
Cloud computing security from single to multi clouds
Cloud computing security from single to multi cloudsCloud computing security from single to multi clouds
Cloud computing security from single to multi cloudsCholavaram Sai
 
Ijarcet vol-2-issue-4-1405-1409
Ijarcet vol-2-issue-4-1405-1409Ijarcet vol-2-issue-4-1405-1409
Ijarcet vol-2-issue-4-1405-1409Editor IJARCET
 
Migration of Virtual Machine to improve the Security in Cloud Computing
Migration of Virtual Machine to improve the Security in Cloud Computing Migration of Virtual Machine to improve the Security in Cloud Computing
Migration of Virtual Machine to improve the Security in Cloud Computing IJECEIAES
 
Cloud Computing on ISO/IEC JTC 1
Cloud Computing on ISO/IEC JTC 1Cloud Computing on ISO/IEC JTC 1
Cloud Computing on ISO/IEC JTC 1Seungyun Lee
 
Seminar report on cloud computing
Seminar report on cloud computingSeminar report on cloud computing
Seminar report on cloud computingJagan Mohan Bishoyi
 
Security & Privacy in Cloud Computing
Security & Privacy in Cloud ComputingSecurity & Privacy in Cloud Computing
Security & Privacy in Cloud ComputingJohn D. Johnson
 
Cloud Computing Documentation Report
Cloud Computing Documentation ReportCloud Computing Documentation Report
Cloud Computing Documentation ReportUsman Sait
 
Final Year IEEE Project 2013-2014 - Cloud Computing Project Title and Abstract
Final Year IEEE Project 2013-2014  - Cloud Computing Project Title and AbstractFinal Year IEEE Project 2013-2014  - Cloud Computing Project Title and Abstract
Final Year IEEE Project 2013-2014 - Cloud Computing Project Title and Abstractelysiumtechnologies
 
Openstack & why cloud for enterprise ppt
Openstack & why cloud for enterprise pptOpenstack & why cloud for enterprise ppt
Openstack & why cloud for enterprise pptAsmaa Ibrahim
 
Introduction to cloud computing
Introduction to cloud computingIntroduction to cloud computing
Introduction to cloud computingvishnu varunan
 
Reminiscing cloud computing technology
Reminiscing cloud computing technologyReminiscing cloud computing technology
Reminiscing cloud computing technologyeSAT Publishing House
 
Cloud Computing for Universities Graduation Project
Cloud Computing for Universities Graduation ProjectCloud Computing for Universities Graduation Project
Cloud Computing for Universities Graduation ProjectMohamed Shorbagy
 

What's hot (20)

Lecture01: Introduction to Security and Privacy in Cloud Computing
Lecture01: Introduction to Security and Privacy in Cloud ComputingLecture01: Introduction to Security and Privacy in Cloud Computing
Lecture01: Introduction to Security and Privacy in Cloud Computing
 
Cloud computing security from single to multi clouds
Cloud computing security from single to multi cloudsCloud computing security from single to multi clouds
Cloud computing security from single to multi clouds
 
Ijarcet vol-2-issue-4-1405-1409
Ijarcet vol-2-issue-4-1405-1409Ijarcet vol-2-issue-4-1405-1409
Ijarcet vol-2-issue-4-1405-1409
 
Migration of Virtual Machine to improve the Security in Cloud Computing
Migration of Virtual Machine to improve the Security in Cloud Computing Migration of Virtual Machine to improve the Security in Cloud Computing
Migration of Virtual Machine to improve the Security in Cloud Computing
 
Cloud Computing on ISO/IEC JTC 1
Cloud Computing on ISO/IEC JTC 1Cloud Computing on ISO/IEC JTC 1
Cloud Computing on ISO/IEC JTC 1
 
Cloud Computing_2015_03_05
Cloud Computing_2015_03_05Cloud Computing_2015_03_05
Cloud Computing_2015_03_05
 
Seminar report on cloud computing
Seminar report on cloud computingSeminar report on cloud computing
Seminar report on cloud computing
 
Security & Privacy in Cloud Computing
Security & Privacy in Cloud ComputingSecurity & Privacy in Cloud Computing
Security & Privacy in Cloud Computing
 
Cloud Computing
Cloud ComputingCloud Computing
Cloud Computing
 
Cloud Computing Documentation Report
Cloud Computing Documentation ReportCloud Computing Documentation Report
Cloud Computing Documentation Report
 
Demystifying the cloud
Demystifying the cloudDemystifying the cloud
Demystifying the cloud
 
Final Year IEEE Project 2013-2014 - Cloud Computing Project Title and Abstract
Final Year IEEE Project 2013-2014  - Cloud Computing Project Title and AbstractFinal Year IEEE Project 2013-2014  - Cloud Computing Project Title and Abstract
Final Year IEEE Project 2013-2014 - Cloud Computing Project Title and Abstract
 
Fog doc
Fog doc Fog doc
Fog doc
 
Intro Cloud Computing
Intro Cloud ComputingIntro Cloud Computing
Intro Cloud Computing
 
Openstack & why cloud for enterprise ppt
Openstack & why cloud for enterprise pptOpenstack & why cloud for enterprise ppt
Openstack & why cloud for enterprise ppt
 
Briefing 47
Briefing 47Briefing 47
Briefing 47
 
Introduction to cloud computing
Introduction to cloud computingIntroduction to cloud computing
Introduction to cloud computing
 
B042306013
B042306013B042306013
B042306013
 
Reminiscing cloud computing technology
Reminiscing cloud computing technologyReminiscing cloud computing technology
Reminiscing cloud computing technology
 
Cloud Computing for Universities Graduation Project
Cloud Computing for Universities Graduation ProjectCloud Computing for Universities Graduation Project
Cloud Computing for Universities Graduation Project
 

Similar to Lock-in issues with PaaS

Winds of change from vender lock in to the meta cloud
Winds of change from vender lock in to the meta cloudWinds of change from vender lock in to the meta cloud
Winds of change from vender lock in to the meta cloudMunisekhar Gunapati
 
A Short Appraisal on Cloud Computing
A Short Appraisal on Cloud ComputingA Short Appraisal on Cloud Computing
A Short Appraisal on Cloud ComputingScientific Review SR
 
Introduction to cloud computing
Introduction to cloud computingIntroduction to cloud computing
Introduction to cloud computingvishnu varunan
 
Cloud computing
Cloud computingCloud computing
Cloud computingJawhar Ali
 
IRJET - Cloud Computing Application
IRJET -  	  Cloud Computing ApplicationIRJET -  	  Cloud Computing Application
IRJET - Cloud Computing ApplicationIRJET Journal
 
Load Balancing Tactics in Cloud Computing: A Systematic Study
Load Balancing Tactics in Cloud Computing: A Systematic Study    Load Balancing Tactics in Cloud Computing: A Systematic Study
Load Balancing Tactics in Cloud Computing: A Systematic Study Raman Gill
 
Cloud Computing
 Cloud Computing Cloud Computing
Cloud ComputingAbdul Aslam
 
International Journal of Engineering Research and Development (IJERD)
International Journal of Engineering Research and Development (IJERD)International Journal of Engineering Research and Development (IJERD)
International Journal of Engineering Research and Development (IJERD)IJERD Editor
 
Cloud Computing Without The Hype An Executive Guide (1.00 Slideshare)
Cloud Computing Without The Hype   An Executive Guide (1.00 Slideshare)Cloud Computing Without The Hype   An Executive Guide (1.00 Slideshare)
Cloud Computing Without The Hype An Executive Guide (1.00 Slideshare)Lustratus REPAMA
 
Cloud computing
Cloud computingCloud computing
Cloud computingleninlal
 
Cloud computing
Cloud computingCloud computing
Cloud computingsfu-kras
 
Cloud introduction
Cloud introductionCloud introduction
Cloud introductionRameshKante
 
The Nitty Gritty of Cloud Computing
The Nitty Gritty of Cloud ComputingThe Nitty Gritty of Cloud Computing
The Nitty Gritty of Cloud ComputingMike Tase
 
Efficient architectural framework of cloud computing
Efficient architectural framework of cloud computing Efficient architectural framework of cloud computing
Efficient architectural framework of cloud computing Souvik Pal
 

Similar to Lock-in issues with PaaS (20)

Cloud computing
Cloud computingCloud computing
Cloud computing
 
Microservices.pdf
Microservices.pdfMicroservices.pdf
Microservices.pdf
 
Winds of change from vender lock in to the meta cloud
Winds of change from vender lock in to the meta cloudWinds of change from vender lock in to the meta cloud
Winds of change from vender lock in to the meta cloud
 
A Short Appraisal on Cloud Computing
A Short Appraisal on Cloud ComputingA Short Appraisal on Cloud Computing
A Short Appraisal on Cloud Computing
 
Introduction to cloud computing
Introduction to cloud computingIntroduction to cloud computing
Introduction to cloud computing
 
Cloud computing
Cloud computingCloud computing
Cloud computing
 
IRJET - Cloud Computing Application
IRJET -  	  Cloud Computing ApplicationIRJET -  	  Cloud Computing Application
IRJET - Cloud Computing Application
 
Cloud computing (2)
Cloud computing (2)Cloud computing (2)
Cloud computing (2)
 
Cloud computing (1)
Cloud computing (1)Cloud computing (1)
Cloud computing (1)
 
cloud computing basics
cloud computing basicscloud computing basics
cloud computing basics
 
Load Balancing Tactics in Cloud Computing: A Systematic Study
Load Balancing Tactics in Cloud Computing: A Systematic Study    Load Balancing Tactics in Cloud Computing: A Systematic Study
Load Balancing Tactics in Cloud Computing: A Systematic Study
 
Cloud Computing
 Cloud Computing Cloud Computing
Cloud Computing
 
International Journal of Engineering Research and Development (IJERD)
International Journal of Engineering Research and Development (IJERD)International Journal of Engineering Research and Development (IJERD)
International Journal of Engineering Research and Development (IJERD)
 
Cloud Computing Without The Hype An Executive Guide (1.00 Slideshare)
Cloud Computing Without The Hype   An Executive Guide (1.00 Slideshare)Cloud Computing Without The Hype   An Executive Guide (1.00 Slideshare)
Cloud Computing Without The Hype An Executive Guide (1.00 Slideshare)
 
Cloud Computing Essay
Cloud Computing EssayCloud Computing Essay
Cloud Computing Essay
 
Cloud computing
Cloud computingCloud computing
Cloud computing
 
Cloud computing
Cloud computingCloud computing
Cloud computing
 
Cloud introduction
Cloud introductionCloud introduction
Cloud introduction
 
The Nitty Gritty of Cloud Computing
The Nitty Gritty of Cloud ComputingThe Nitty Gritty of Cloud Computing
The Nitty Gritty of Cloud Computing
 
Efficient architectural framework of cloud computing
Efficient architectural framework of cloud computing Efficient architectural framework of cloud computing
Efficient architectural framework of cloud computing
 

Lock-in issues with PaaS

  • 1. UNIVERSITÀ DEGLI STUDI DI PISA Dipartimento di informatica CORSO DI LAUREA IN INFORMATICA TESI DI LAUREA Lock-in issues with PaaS Relatore: Prof. Antonio Brogi Candidato: Federico Conte ANNO ACCADEMICO 2012-13
  • 2. “All truths are easy to understand once they are discovered; the point is to discover them.” Galileo Galilei (1564 – 1642) 1
  • 3. Contents Introduzione 3 Introduction 4 1. Cloud Computing 5 1.1. Why using the cloud? . . . . . . . . . . . . . . . . . . . . 5 1.2. What is Cloud Computing . . . . . . . . . . . . . . . . . 7 1.3. Similar systems and concepts . . . . . . . . . . . . . . . . 10 2. Vendor lock-in 13 2.1. Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . 13 2.2. Lock-in issues . . . . . . . . . . . . . . . . . . . . . . . . 14 2.3. Example . . . . . . . . . . . . . . . . . . . . . . . . . . . 22 Conclusions 27 Acknowledgements 30 Ringraziamenti 31 References 32 2
  • 4. INTRODUZIONE Il vendor lock-in è un problema che affligge la tecnologia del cloud computing. Lo scopo di questa tesi è quello di descrivere e mostrare tale problema nel livello Platform-as-a-Service (PaaS) del cloud. Il capitolo 1 è incentrato sul cloud computing e ne richiama i concetti base. Capendo che cos'è il cloud computing è possibile collocare e descrivere in modo appropriato il problema del vendor lock-in. Il capitolo inizia elencando alcuni dei problemi di costruzione di una applicazione senza utilizzare la tecnologia del cloud computing. Ciò mette in risalto i motivi per cui è nata tale tecnologia; vengono quindi messe in risalto le potenzialità del cloud. Successivamente, viene data una definizione formale di cloud computing e vengono descritte le tecnologie su cui è basato. Nel capitolo 2, viene introdotto il problema del vendor lock-in. Tale problema esiste anche senza l'utilizzo di una soluzione cloud. Confrontando, però, il “classico” lock-in con il lock-in nel cloud, risaltano le differenze tra i due e quanto grave e pesante sia il secondo. Inoltre, il problema viene accentuato appositamente dai fornitori di cloud, al fine di mantenere i loro clienti nel loro sistema. Il capitolo prosegue approfondendo il problema del lock-in e mostrando le problematiche che ne derivano agli utenti. Elencando i vari motivi per cui un utente potrebbe decidere di cambiare fornitore, mostreremo che non si tratta di una decisione rara quella di pensare di cambiare provider. Addentrandoci nel cuore del problema, vedremo tre tipi possibili di vendor lock-in e di come tale problema diventa più forte a seconda del modello di servizio cloud scelto. Per finire, concretizzeremo un minimo le cose fornendo un semplice esempio di applicazione al fine di mostrare il lock-in in PaaS. Nelle conclusioni, riporteremo alcune parziali al problema faremo un accenno agli standard che si stanno formando. 3
  • 5. INTRODUCTION Vendor lock-in is a problem that afflicts cloud computing. The purpose of this thesis is to describe and demonstrate vendor lock-in at the Platform-as-a-Service (PaaS) level of the cloud. Chapter 1 focuses on cloud computing and recalls the basic concepts needed to place and describe the vendor lock-in problem properly. The chapter begins by listing some of the issues of building an application without cloud computing. This highlights why the technology was born and brings out its potential. A definition of cloud computing follows, together with a description of the technologies on which it is based. Chapter 2 introduces the problem of vendor lock-in, which exists even without a cloud solution. Comparing the “classic” lock-in with lock-in in the cloud brings out the differences between them and shows how much more serious the latter is. Moreover, the problem is deepened by cloud providers in order to keep customers within their systems. The chapter then goes into detail. First, it shows what concerns users about moving to the cloud. By listing the various reasons why a user might decide to change provider, we show that wanting to change is not a rare decision. Entering the core of the issue, we will see three possible types of vendor lock-in and how the problem becomes stronger as we move up the levels of the stack. Finally, to make the topic concrete, we provide a simple example application that shows lock-in in PaaS. In the conclusions, we report some partial solutions to the problem and give an overview of the standards that are emerging.
  • 6. Chapter 1 CLOUD COMPUTING 1.1 Why using Cloud Computing? Suppose that you want to build a business application to support your ideas. Business applications have always been expensive, because behind each one there is a world of complexity [1]. The main task should be building the application itself (that is what you actually need), and that is already difficult: you have to determine the goal, design the application, pick a language, set up an IDE, write the code and test it. In addition, you face significant challenges [2] (Figure 1: the significant challenges of building an application), even with the smallest application:
      • install and configure a web server and a database,
      • find some way to push out new versions of your code when you update
  • 7. your application,
      • find some way to monitor your application, so you can see whether it is down and how much traffic you are getting,
      • once you have solved that and your application is ready to run, find some place to actually put it. This requires a data center with space, cooling, power, database servers, backup systems, and a network to keep everything connected,
      • hire an expert IT team to install and configure everything,
      • deal with the operating system, virtual machines, application servers, persistence, caching, security, and so on,
      • update your system from time to time and face problems such as machine breakdowns and hard disk failures,
      • maintain your application, writing patches to update it or fix bugs,
      • when your application starts to grow, get more machines, expand your database, etc.
    Hundreds of thousands of companies do this today; it costs a lot of money and takes a lot of time. Moreover, they have to keep paying regardless of the load on their servers. With Cloud Computing, instead of running your applications yourself, they run in a shared data center. You no longer need to be concerned about over-provisioning for a service whose popularity does not meet its predictions, thus wasting costly resources, or under-provisioning for one that becomes wildly popular, thus missing potential customers and revenue [3]. In effect, you have access to computing power when you need it: if you suddenly need more computing power for your application, you can scale up almost instantly, and if your traffic goes down, you can release your servers back into the cloud [4]. In the cloud, resources are virtual and appear unlimited, and the details of the physical systems on which software runs are abstracted from the user [5].
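    The over-provisioning and under-provisioning trade-off described above can be illustrated with a small sketch. It is not taken from the thesis: the function names, the per-server capacity and the flat price per server-hour are assumptions chosen only to make the arithmetic visible.

```python
import math


def fixed_cost(hourly_demand, servers_owned, price_per_server_hour):
    """Cost of keeping a fixed number of servers running every hour,
    regardless of the actual load (the traditional data center case)."""
    return servers_owned * price_per_server_hour * len(hourly_demand)


def elastic_cost(hourly_demand, capacity_per_server, price_per_server_hour):
    """Cost when, each hour, we pay only for the servers the load needs
    (an idealised pay-per-use cloud; real billing is more granular)."""
    servers_per_hour = (math.ceil(d / capacity_per_server) for d in hourly_demand)
    return sum(servers_per_hour) * price_per_server_hour


def unserved_requests(hourly_demand, servers_owned, capacity_per_server):
    """Requests lost because a fixed installation was under-provisioned."""
    capacity = servers_owned * capacity_per_server
    return sum(max(0, d - capacity) for d in hourly_demand)


if __name__ == "__main__":
    # Hypothetical load: quiet hours with one sudden peak.
    demand = [100, 100, 900, 100]
    print("fixed, sized for the peak:", fixed_cost(demand, 9, 1))
    print("elastic:", elastic_cost(demand, 100, 1))
    print("requests lost with 2 fixed servers:", unserved_requests(demand, 2, 100))
```

    Sizing for the peak wastes money in the quiet hours, while sizing for the average loses requests during the peak; the elastic case avoids both under these simplified assumptions.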
It is like using an electric appliance: you plug it into an outlet and you do not care how the electric power that sustains it is generated. This is possible
  • 8. because electricity is virtualized, that is, it is readily available from a wall socket that hides power generation stations and a huge distribution grid. When extended to information technologies, this concept means delivering useful functions while hiding their internal implementation [6]. The system is centralized, so it is easier to apply patches and upgrades to your applications, ensuring good maintenance. The scale of cloud computing networks and their ability to provide load balancing and fail-over make them highly reliable. In conclusion, the cost of building your business application is drastically reduced, in both upfront capital expenditure and maintenance and management costs, because cloud computing removes the significant challenges discussed at the beginning. In particular, it reduces the need for IT staff and eliminates the cost of buying machines and repairing broken hardware. 1.2 What is Cloud Computing The National Institute of Standards and Technology (NIST) describes the cloud as follows [7]: “Cloud computing is a model for enabling convenient, on-demand network access to a shared pool of configurable computing resources (e.g., networks, servers, storage, applications, and services) that can be rapidly provisioned and released with minimal management effort or service provider interaction. This cloud model promotes availability and is composed of five essential characteristics, three service models, and four deployment models.” Five Essential Characteristics [7][8]:
      • On-demand self-service: consumers can use cloud services when they need them, without requiring human interaction.
      • Broad network access: capabilities are available over the network.
      • Resource pooling: the provider's computing resources are pooled to serve multiple consumers using a multi-tenant model, with different physical and virtual resources dynamically assigned and reassigned
  • 9. according to consumer demand.
      • Rapid elasticity: capabilities can be elastically provisioned and released to scale rapidly up and down as needed.
      • Measured service: cloud services typically follow the same model as an electricity bill, where we use a certain amount of energy and that is what we pay for. Cloud systems automatically control and optimize resource use by taking advantage of a metering capability. Resource usage can be monitored, controlled, and reported. Moreover, since cloud networks operate at higher efficiency and with greater utilization, significant cost reductions are often achieved.
    Three Service Models [7][8] (Figure 2: the three service models of cloud):
      • Infrastructure as a Service (IaaS): it is the lowest layer of the cloud computing stack; both PaaS and SaaS rely on it. IaaS provides storage systems, servers, switches, routers and firewalls: everything you would normally use when building your own hardware infrastructure can be offered as a service, so you do not have to worry about hardware. The IaaS provider manages all the infrastructure, while the client is responsible for all other aspects of the deployment. These can include
  • 10. the operating system, applications, and user interactions with the system.
      • Platform as a Service (PaaS): in the PaaS model, the whole platform is offered as a single service, instead of offering each of the infrastructure components separately. PaaS provides virtual machines, operating systems, applications, services, development frameworks, transactions, and control structures. The client can deploy its own applications on the cloud infrastructure or use applications that were programmed using languages and tools supported by the PaaS provider. The service provider manages the cloud infrastructure, the operating systems, and the enabling software (footnote 1: enabling software means the modification of the design or implementation of software to allow internationalisation to take place). The client is responsible for installing and managing the application that it is deploying.
      • Software as a Service (SaaS): SaaS simply refers to software that is provided on demand. Traditionally, someone who wanted to use software would go to the store, pick up some disks, take them home, and install them on a computer. With SaaS, they just use hosted software. The applications are accessible from various client devices through either a thin client interface, such as a web browser, or a program interface. The consumer does not manage or control the underlying cloud infrastructure.
    Four Deployment Models [7][8]:
      • Private cloud: a cloud that is accessible only within an organization. This is useful if you want full control of the cloud, which helps with flexibility, security and performance.
      • Public cloud: the cloud infrastructure is provisioned for open use by the general public, like Amazon S3 (footnote 2: an online file storage web service offered by Amazon Web Services) and Gmail (footnote 3: a free, advertising-supported email service provided by Google).
      • Community cloud: the cloud infrastructure is provisioned for exclusive use by a specific community of consumers from organizations that have shared concerns.
      • Hybrid cloud: the cloud infrastructure is a composition of two or
  • 11. more distinct cloud infrastructures that remain unique entities, but are bound together by standardized or proprietary technology that enables data and application portability. 1.3 Similar systems and concepts Cloud Computing is the result of the evolution and adoption of existing technologies and paradigms. Its goal is to let users benefit from all of these technologies without needing deep knowledge of, or expertise with, each of them. Cloud computing is based on several technologies [9]: Web Applications A web application refers to:
      • an application that uses a web browser as a client,
      • a computer software application that is written in a browser-supported programming language and relies on a web browser to render it executable.
    Web applications make it possible to exploit cloud computing: when you use them, you are actually accessing applications that sit on other servers, through the web browser installed on your computer. For example, through a web browser you can log into your account on the Google Docs website and start writing documents as if the application were installed on your own computer. Using a web browser, users can connect from anywhere and from any device (device and location independence). Clustering [10] A computer cluster consists of a set of connected computers (nodes) that work together. The clustering middleware is a software layer that manages all the nodes so that users can view them as a single system. In load-balancing clusters, the computational workload is shared among the nodes to provide better overall performance. For example, a web server cluster may assign different queries to different nodes, so that the overall response time is optimized. So basically, let us
  • 12. say a lot of people are trying to access a particular database on the internet and that database is in a cluster. The cluster will redirect each request to the least-loaded server. Moreover, if one of the servers fails, the cluster detects it and stops sending users to that server. Grid Computing [11] Grid computing is a form of distributed computing in which a “super virtual computer” is formed by a collection of computer resources acting together to perform large tasks. Unlike cluster computing, grids tend to be more loosely coupled, heterogeneous, and geographically dispersed. Terminal services Back in the sixties and seventies, there was the mainframe, a very powerful computer, with dumb terminals connected to it. A dumb terminal was a simple device with no intelligence, composed of a monitor, keyboard and mouse. The mainframe processed all the operations issued by the dumb terminals and sent the output back to their monitors, giving each terminal a slice of time in order to simulate a dedicated computer. Today we have basically the same thing with different terminology: a terminal services server instead of a mainframe, and thin clients instead of dumb terminals. With a thin client, you can connect to a terminal services server and work on it as if you were using your own computer. Suppose you want your employees to work with an application, but you do not want to install it on every machine in the office. You can install the application on the terminal services server, so that the employees use it through their thin clients. This is called an application server. Virtualization [12] Hardware virtualization refers to the creation of a virtual machine that acts like a real computer with an operating system.
This means that you can transfer the entire operating system, with its applications, settings, etc., from one piece of hardware to another and everything continues to function properly. Before virtualization, by contrast, the same operation required you to back up all the data, install the OS on the new hardware, reinstall all the
  • 13. applications and recover all the data. A hypervisor, or virtual machine monitor (VMM), is a piece of computer software, firmware or hardware that creates and runs virtual machines. A Type 2 (or hosted) hypervisor is software installed on your operating system that allows you to install other operating systems; an example is VirtualBox. A Type 1 (or native) hypervisor is like an operating system that you install directly on the machine where you want to run your virtual computers. Once installed, it does not let you do anything else on that machine: you need to install management software on whatever computer you are going to use to administer your virtual computers. This software connects to the hypervisor over the network and, through a graphical interface, lets you create all the virtual computers you need and move them easily to any other server on your network that is running the hypervisor. The management software can also monitor the status of the physical hardware of all your servers, so if there is a problem with the server on which your OS is running, the OS is moved to another hypervisor. Furthermore, the management software can turn physical servers on and off according to the demand for virtual computers, saving energy and money. Once people figured out that the operating system can be separated from the hardware, some companies realized that they could sell instances of virtual operating systems using the large under-used resources they had.
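    The load-balancing cluster behaviour described in this section (send each request to the least-loaded node, and stop using a node once it fails) can be sketched in a few lines. This is only an illustration under simplifying assumptions: the class and method names are hypothetical, load is counted as the number of requests routed so far, and failure detection is reduced to an explicit call.

```python
class Cluster:
    """Toy load-balancing cluster: routes to the least-loaded healthy node."""

    def __init__(self, node_names):
        # Number of requests routed to each node so far.
        self.load = {name: 0 for name in node_names}
        # Nodes currently believed to be working.
        self.healthy = set(node_names)

    def mark_failed(self, name):
        """Record that a node has failed, so it receives no more requests."""
        self.healthy.discard(name)

    def route(self):
        """Pick the healthy node with the lowest load and assign it a request."""
        candidates = [n for n in self.load if n in self.healthy]
        if not candidates:
            raise RuntimeError("no healthy nodes available")
        target = min(candidates, key=lambda n: self.load[n])
        self.load[target] += 1
        return target


if __name__ == "__main__":
    cluster = Cluster(["node-a", "node-b", "node-c"])
    print([cluster.route() for _ in range(3)])
    cluster.mark_failed("node-a")
    print([cluster.route() for _ in range(4)])
```

    A real clustering middleware would measure load and detect failures itself (for example through health checks); here both are passed in by hand to keep the routing rule visible.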
  • 14. Chapter 2 VENDOR LOCK-IN 2.1 Overview In a recent report, the European Network and Information Security Agency (ENISA) highlighted lock-in as one of the biggest risks involved with cloud computing [13]: “There is currently little on offer in the way of tools, procedures or standard data formats or services interfaces that could guarantee data and service portability. This makes it extremely difficult for a customer to migrate from one provider to another, or to migrate data and services to or from an in-house IT environment. Furthermore, cloud providers may have an incentive to prevent (directly or indirectly) the portability of their customers services and data. This potential dependency for service provision on a particular CP, depending on the CP's commitments, may lead to a catastrophic business failure should the cloud provider go bankrupt and the content and application migration path to another provider is too costly (financially or time-wise) or insufficient warning is given (no early warning).” Vendor lock-in is a situation in which a customer using a product or service cannot easily transition to a competitor's product or service. Vendor lock-in is usually the result of proprietary technologies that are incompatible with those of competitors. However, it can also be caused by inefficient processes or contract constraints, among other things [14]. That said, we can already see the problem. As discussed in Chapter 1, with Cloud Computing the cost of building a business application is drastically reduced. So, when a company has to evaluate the possible technologies to use, the choice to rely on a cloud provider may seem obvious if the costs of migrating off the
  • 15. system are underestimated. But what happens if the cloud provider decides to double its prices? Or if it goes bankrupt? Or suppose that a company chooses to rely on a cloud provider, and its application then grows to the point of requiring unexpected new features that are not supported by that provider but are supported by the competition. In these situations, the company may have no choice but to migrate and face the related expenses, which could be prohibitive and unexpected. Furthermore, the company has to draw up a new management plan to deal with the complexities of cloud service migration, even if it would rather avoid this cumbersome process. Therefore, some companies may find themselves locked in to a provider that no longer meets their needs. This fear of vendor lock-in is considered a major impediment to cloud service adoption. 2.2 Lock-in issues As discussed in Chapter 1, if you want to build a business application without using cloud computing, you have to face significant challenges. Along the way, you will make many choices about the hardware you buy, the OS you use, the database, etc. Most probably, during the deployment phase, proprietary operating system and database features will be used. Moreover, choices will be made with the potential of the available hardware in mind. There might have been good reasons at the time to do so (e.g. better performance, better integration with development tools, better manageability, etc.), but because of those decisions the cost of changing any of the significant components of that software becomes too high to be practical. You are locked in [15]. Using cloud computing, you basically have the same thing. You choose a provider that has made its own choices in terms of system structure, OS, DB, hypervisor, APIs supplied, etc. So from this point of view, lock-in does not seem a big problem.
However, this time you do not own all these components. You are
  • 16. paying for them and adapting your business application in order to use them. Moreover, you do not have full control of your system. If we go back to the comparison between cloud computing and electricity, we can clearly see the problem. If you do not like the company that provides your electricity, you can just change it. Doing so does not require you to replace or reconfigure all the electric appliances you were using. For more than a decade, IT managers and lawyers have been working tirelessly to enable solutions based on common standards and protocols that can be built, supported, swapped out and replaced, regardless of vendor. For instance, SOA helped free us from the technological vendor lock-in model of a decade ago by motivating the transition towards interoperability-level standards. This new degree of cloud vendor lock-in is a step backwards from all the work that has been done with approaches such as service-oriented architecture. Cloud computing may be erasing the gains we have made in terms of vendor dependence [16]. Generally, a service provider is not willing to yield its customers to competitors. This means that changing service provider can be a difficult task, as we know from experience (e.g. changing our telephone operator). In order to keep customers within their own services, many companies even build barriers to exit from a service or platform, making the transition difficult if not impossible. This happens in the cloud, making vendor lock-in even worse. One way they do this is through security checks. The goal of security controls is to restrict access to data. It is therefore easy for a service provider to cite security requirements as an excuse for not providing the key data needed for a safe transition. So vendor lock-in is a real problem that needs to be faced and deserves further focus.
Many cloud platforms and services are proprietary, meaning that they are built on the specific standards, tools and protocols developed by a particular vendor for its particular cloud offering [16]. So when you build an application that relies on a cloud solution, you have to know that you have accepted this situation and you should be ready to pay handsomely
  • 17. when you no longer accept it. Even if you try to build a standards-compliant application for certain interoperability functions, you will still find lock-in. In fact, because standards are still being formed and developed, cloud computing is still too immature to have reached the point where customers demand vendor independence [16]. Despite this, users are starting to fear the vendor lock-in problem, especially data lock-in, as shown in Figure 3 (What concerns you about moving to the cloud?). The actual hosting of the application, the actual requirements for that application to exist in a cloud environment and to connect to the virtualized resources, and whatever administration tools the cloud provider may give you to configure and maintain the application will be, for the most part, controlled by the cloud provider [16]. Thus, many companies now using cloud services are in big trouble when it is time to move on. And there may be many reasons why it is time to move away from your current cloud provider [16][17]:
      • it may go bankrupt,
      • it may increase the cost of its services,
  • 18.
      • it may change its terms of service,
      • it may decrease the quality of its services,
      • it may have too many network or system outages. This is not rare and can happen even to major service providers: for example, an Amazon Web Services outage in Northern Virginia affected Reddit, Pinterest, Airbnb, Foursquare, Minecraft and others [18],
      • it may not support some features that you later discover you need,
      • it could be bought out by a larger company, leading to a change of policies,
      • it may change the leasing terms or shift geographically, so you may find conflicts with legal requirements you have to meet.
    Three types of vendor lock-in can occur with cloud computing [19][20]: Platform lock-in While it is possible to write a cross-platform application so that it is OS independent, in the cloud you cannot escape proprietary libraries and system configurations. As we said, PaaS provides virtual machines, operating systems, applications, services, development frameworks, transactions, and control structures. While developing your application, you will probably use these features, and so you will be locked in to the cloud provider. Indeed, a different provider could implement the same features in a completely different way or may not even provide the same functionality. Moreover, a cloud service tends to be built on one of several possible virtualization platforms, for example VMware or Xen. Migrating from a cloud provider using one platform to a cloud provider using a different platform could be very complicated. Here is an overview of platform lock-in in the best-known PaaS providers:
      • Google App Engine: using the Java language on GAE you have some limits: not all the APIs are available, especially those that require access to the file system. Moreover, GAE does not support the servlet JEE. We cannot implement custom security for our applications through the web.xml file, so we are forced to use the
  • 19. security mechanism of Google.
      • Windows Azure: Windows Azure is particularly affected by vendor lock-in because of its .NET framework, which is the most supported and documented. Although Windows Azure supports other languages and frameworks, migrating from it remains hard because of the lock-in on the operating system and on Azure services.
      • Salesforce: Force.com allows external developers to create applications integrated as closely as possible with the Salesforce SaaS environment. To ensure this, all applications have to be developed using Apex, a proprietary language similar to Java, and Visualforce, an XML-like syntax for designing user interfaces in HTML.
    Data lock-in Since the cloud is still new, standards of ownership are not yet developed; for instance, it is unclear who actually owns the data once it lives on a cloud platform. Therefore, it will be hard for a user to move data off a cloud vendor's platform. Often you need to go through the following steps. First, you have to move the data back to the customer's site. Most of the time, the data will have been altered for compatibility with the original provider's system, so you need to convert it back to its former state. Only then can you move the data to the new provider's environment. But what happens if you have a huge DB in the cloud (e.g. 100 TB)? You cannot export the data to your computer because you do not have enough storage. You would need an enterprise-level solution just to store it, and that does not even count backing the data up. This assumes that the provider even gives you direct access to the data you have stored. If it does, it may let you move the data via a portable storage device or via the Internet. In the second case, it can be a problem if you do not have a reasonable time-frame to download the data, and it will take further time to store it again with the new cloud provider.
If the provider does not give you direct access to the data, you can retrieve by brute force only the part of it that is visible.
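    To give a sense of scale for the 100 TB case mentioned above, the following back-of-the-envelope sketch estimates how long a download over the Internet would take. The link speeds and the `efficiency` factor are illustrative assumptions, not figures from the thesis, and terabytes are taken as decimal (10^12 bytes).

```python
def transfer_days(size_tb, link_mbps, efficiency=1.0):
    """Estimate the days needed to move `size_tb` terabytes over a link.

    size_tb    -- data size in decimal terabytes (10**12 bytes)
    link_mbps  -- nominal link speed in megabits per second
    efficiency -- fraction of the nominal speed actually achieved (0..1]
    """
    bits = size_tb * 10**12 * 8
    bits_per_second = link_mbps * 10**6 * efficiency
    seconds = bits / bits_per_second
    return seconds / 86400  # 86400 seconds in a day


if __name__ == "__main__":
    for mbps in (100, 1000):
        print(f"100 TB at {mbps} Mbit/s: {transfer_days(100, mbps):.1f} days")
```

    Under these assumptions the download alone takes roughly three months at 100 Mbit/s, and the same data then has to be uploaded again to the new provider.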
  • 20. Suppose that you retain ownership of the data and manage to bring it back on-site. You may find it in a format you cannot easily use, such as backup tapes made by tools you do not own, which makes it useless. Moreover, even if you manage to read the format, some data may be encrypted for security reasons, preventing logical access to it if the cryptographic keys are not provided [21]. Tools lock-in If the tools built to manage a cloud environment are not compatible with different kinds of both virtual and physical infrastructure, those tools will only be able to manage data or apps that live in the vendor's particular cloud environment. Let us clarify this point with a simple example based on a service used by many people, namely Gmail. In Gmail, users cannot change their email address, but they can register a new account and transfer all their mail into it using a mail client and IMAP. Usually, however, a Gmail account is also used for authentication in other services: Google Reader, Google Docs, Google Plus, YouTube and Picasa. If a user creates a new Gmail address, she needs to create a new profile for each service linked to the old one. If instead she wants to change mail provider, she can export the emails, but without features such as labels. In addition, the user will lose all the services linked to the old Gmail account, and for some of them, such as Google Plus, it will be impossible to export the contents. In Figure 4, we can see Thorsten's Lock-in Hypothesis (footnote 4: the title given to this assertion by Thorsten von Eicken [17]): The higher the cloud layer you operate in, the greater the lock-in. This means that if you use an application in the cloud, such as an all-in-one CRM package, you have the highest chance of getting locked in (footnote 5: customer relationship management (CRM) is a model for managing a company's interactions with current and future customers. It involves using technology to organize, automate, and synchronize sales, marketing, customer service, and technical support [22]). Move one level down to a platform in the cloud and you are somewhat less likely to get locked in [17].
  • 21. Figure 4: Thorsten's Lock-in hypothesis. This theory is based on the fact that lock-in can occur at many levels of the stack. The higher the level, the more services a user receives and the less control the user has over them. Moreover, the more of the code is controlled by the cloud, the more freedom you tend to lose. Conversely, the more code is under your control, the easier it is to replicate it elsewhere and retain freedom. Here are a number of different layers at which you could find yourself locked in [17]:
      • You may not own the application that manages your data, or you may need to write a new one in order to change cloud provider.
      • You may have used third-party web services in your application that are supported only by a particular cloud provider.
      • Your application may be coded in a proprietary development environment, and it may also run in a proprietary run-time environment. You will then need to retrain programmers and rewrite your application in order to move it to a different cloud.
      • Your application may use a proprietary language, so that, to perform the same operations, you will have to use another language supported by the new cloud provider. In this case, you will need programmers who are expert in both languages and able to translate the relevant
  • 22. parts in order to move the application.
      • Your data may have been stored in a proprietary or hard-to-reproduce data model or storage system. You may then need to transform all your data and the code that accesses it.
      • You may not have access to your data, or you may be able to get it only in a proprietary format, as discussed before.
      • You may not own all the side information about your application, such as log files, analytics, metrics, historical data, etc. So when you move to the new cloud provider, you will start from scratch.
      • You may not control the operating system platform or the versions of libraries and tools. So when you move the application to the new cloud, you cannot port your operational procedures.
    If you subscribe to a PaaS vendor, you will be limited to using IaaS and SaaS products that are compatible with the Platform as a Service you choose. This creates a "black box" effect. For instance, if you write a simple Python application in Google App Engine, you will not have big porting problems, as we will show in Section 2.3. But if that application starts using GAE services, moving it will get harder. When you move down to an infrastructure cloud, it becomes easier to see how you can move your application stack from one provider to another. After all, there is not much distinguishing the Linux box you get in EC2 from the Linux box you get at GoGrid. But even here, lock-in needs to be thought about, because the system behaviour, from storage persistence to networking details and on and on, is far from identical [17]. Users of PaaS services are raising the question of the portability of their applications. In some cases, porting has required major changes in software and caused project delays and even productivity losses. This is caused by two specific problems. The first problem is the lack of a consistent platform definition among PaaS providers.
The second problem is a lack of credible alternative providers for a platform: if a big company, like Microsoft or Google, creates its own PaaS solution, this can discourage the competition. Moreover, IaaS providers, such as Amazon, have been steadily adding services to their IaaS platforms, turning them into a simple form of PaaS. Since there are no
  • 23. standards for these added services, using them will lock applications to a cloud provider. The greatest portability risk for PaaS over time will not be with the formal PaaS platform, but with the evolution of IaaS services into PaaS services through the addition of features, such as Amazon's Redshift or caching services [23]. Many users of these platforms will never see themselves as PaaS users and they will not realize of lock-in until the time they will try to move an application to another provider. 2.3 Example In order to show a realistic business case, we will suppose that a company decides, in the first place, to make use of Google App Engine as its cloud service of choice. So it decides to write an application in Python, Google's most supported language, using the Django framework. Let us assume that later, for some reasons, the company wants to change cloud provider, and its choice falls over Microsoft Windows Azure. With the goal of achieving an easier porting, it decides to keep the same language and framework. By exploring this example, we will be able to point out where the lock-in lies. The application is a guestbook, which is very simple but complex enough to show some of the most important features in the Cloud system. You can access the website running on Google App Engine at the address http://draxent-project.appspot.com. When you access on this website, you find the guestbook with all the greetings written on it (Figure 5). Each one of them is composed by the username, who created it, and one row of the original message. If the row does not report all the original information, you can click on it to see the entire message and the optional photo attached on it. Above the book, you can find the menu with the "write on it" and "login" buttons. The first one makes you write on the guestbook through a specific form composed by a textarea and an optional file input field. 
If you complete and submit the form, you are actually writing in the guestbook, that is, writing to the Google database.
  • 24. The second one allows authentication through Gmail. In order to provide these features, the guestbook makes use of some cloud services available on the Google App Engine platform:
    • User authentication through the Google Accounts API. In this way, all the complexity of creating, managing, storing and authenticating users is taken care of by the cloud.
    • Database access through Django-nonrel, a project to support Django on non-relational (NoSQL) databases such as Google App Engine's Datastore. NoSQL databases are designed to be lightweight and more scalable, but they introduce some limitations; for example, they do not support JOINs.
    • File storage through Google Cloud Storage. This provides flexibility in uploading and retrieving user-uploaded content. For example, storage maintenance tasks, such as getting more space for our files, are greatly simplified: there is no need to migrate the data to a different server, because additional storage can be purchased when needed.
Figure 5: guestbook running on GAE
  • 25. We are now going to explain the steps taken and the difficulties encountered when porting the application to the Windows Azure platform. The result of the porting can be found at the address http://draxent-project.azurewebsites.net.
Figure 6: Authentication porting
  • 26. Even though both platforms support the Django framework, there is nothing like Django-nonrel available for Azure. For this reason, Django was configured to use a standard SQLite database. Furthermore, there were some small discrepancies in the directory structure of the project, due to the different Django versions used.
The first real problem occurred when porting the authentication method (Figure 6). The Google App Engine version of the guestbook relied on the simple Google Accounts API for that. Windows Azure provides a complex, flexible system to handle identification, which makes it possible to log in through many services (Google, Facebook, Yahoo, Live, etc.) [24]. Unfortunately, the feature proved to be almost impossible to set up because, apparently, the APIs have not been ported to Python yet. At the time of writing, the only feasible approach would be to implement them from scratch, using the documentation and the examples written for other languages as a guide. This could be beyond the company's possibilities, because programmers would need specific skills and proficiency in both languages involved. For that reason, it was more convenient to store the authentication details in the database and to implement the login and create-user processes manually.
Figure 7: File storage porting
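A hand-written login of the kind described above could look roughly like the following. This is only a minimal sketch under stated assumptions, not the code used in the guestbook: the function names, the PBKDF2 scheme and the iteration count are illustrative choices, and only the Python standard library is used.

```python
import hashlib
import hmac
import secrets
from typing import Optional, Tuple

# Assumed work factor for PBKDF2; a real deployment would tune this value.
ITERATIONS = 200_000


def hash_password(password: str, salt: Optional[bytes] = None) -> Tuple[bytes, bytes]:
    """Return (salt, digest) to store in the database instead of the password."""
    if salt is None:
        salt = secrets.token_bytes(16)  # fresh random salt for each user
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)
    return salt, digest


def verify_password(password: str, salt: bytes, expected: bytes) -> bool:
    """Re-derive the digest with the stored salt and compare in constant time."""
    _, digest = hash_password(password, salt)
    return hmac.compare_digest(digest, expected)


def temporary_password() -> str:
    """Random password for an account whose original credentials are unavailable."""
    return secrets.token_urlsafe(12)
```

The create-user step would store the pair returned by `hash_password`, and the login step would call `verify_password` with the stored values. Only the salt and the digest are kept, so the clear-text password never reaches the database.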
  • 27. The most obvious disadvantage is that the existing passwords could not be recovered, so they were randomly regenerated; the users could then be notified and encouraged to choose a new one.
By contrast, porting the cloud storage proved to be quite an easy task, because the APIs were relatively similar (Figure 7).
As for the configuration of the scaling policies, a direct port would not have been possible, for two reasons. On the one hand, this information is stored in a YAML file on Google App Engine, while on Azure it needs to be entered through a web interface; on the other hand, the policies provided by the two services are not directly comparable.
The second major problem was the data porting, which was divided into two steps: copying the users' files stored in the cloud, and porting the database content. The first goal was achieved with a script running on Google App Engine, using both GAE and Azure libraries. It simply lists the files in the GCS bucket and then goes through each of them to upload it to the Azure container. The second problem was more difficult to solve. Google's database is not available as a single file that can be downloaded, so the data had to be extracted by brute force and then stored in a simple format. The resulting file was then parsed by a script in the Windows Azure application, and the information was finally written to the target database.
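The two-step database migration described above (dump the data in a simple format, then parse it and write it to the target database) can be sketched as follows. This is an illustration under assumptions rather than the scripts actually used: the field names `author`, `content` and `photo`, the table name `greeting` and the choice of one JSON object per line are hypothetical, and the source entities are plain dictionaries standing in for whatever the Datastore query returns.

```python
import json
import sqlite3
from typing import Iterable, Iterator


def export_greetings(entities: Iterable[dict]) -> Iterator[str]:
    """Source side: serialise each entity to one JSON line (the 'simple format')."""
    for entity in entities:
        record = {
            "author": entity["author"],
            "content": entity["content"],
            "photo": entity.get("photo"),  # optional attachment name, may be None
        }
        yield json.dumps(record, sort_keys=True)


def import_greetings(lines: Iterable[str], conn: sqlite3.Connection) -> int:
    """Target side: parse the exported lines and insert them into SQLite."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS greeting ("
        "id INTEGER PRIMARY KEY, author TEXT NOT NULL, "
        "content TEXT NOT NULL, photo TEXT)"
    )
    rows = [json.loads(line) for line in lines if line.strip()]
    with conn:  # commit all rows together, or roll back on error
        conn.executemany(
            "INSERT INTO greeting (author, content, photo) "
            "VALUES (:author, :content, :photo)",
            rows,
        )
    return len(rows)
```

Keeping the intermediate format provider-neutral means the export half depends only on the old platform and the import half only on the new one, so neither script needs both vendors' libraries.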
  • 28. CONCLUSIONS
To conclude this report, we give a brief overview of possible solutions to lock-in problems. As we said previously, cloud computing is still too immature to have reached the point where customers demand vendor independence. For this reason, the problem is not getting smaller. On the contrary, new proprietary features are steadily being added to IaaS and PaaS offerings, and new SaaS products are created that are not compatible with all cloud providers.
Vendor lock-in may therefore be unavoidable at this point, but what companies need to do is to understand up front what their exit strategy will be. In practice, companies should include this cost in the total during the initial cost analysis. By exit strategy we mean that a company should choose its cloud provider based on additional factors and invest in protective measures against lock-in [25]:
    • Make sure that your application can be easily ported to other clouds. In this way, you can move it if there is a service outage.
    • If your application is highly customized, you should have a different cloud provider as a backup. Thanks to this, you will suffer less from the lock-in problem, and you can switch between the two cloud solutions depending on convenience.
    • You should know the PaaS you have chosen really well. To do that, you should ask where your PaaS is running and how the provider manages the risk of failure of a large cloud.
    • In addition to the point above, you should also ask about redundancy and system architecture, and evaluate all the information with the help of network engineers and system architects.
    • When you write your application, you should choose code that is easy and fast to modify. Certain flavours and types of frameworks and Web scripting environments are more difficult to change. You should also choose languages and frameworks that are supported by as many cloud providers as possible.
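One way to act on the first recommendation is to keep provider-specific calls behind a small interface that the application owns. The sketch below is a hypothetical illustration: `BlobStore`, `InMemoryStore`, `save_photo` and `copy_all` are invented names, and the in-memory class stands in for adapters that would wrap a real provider SDK. No vendor API is shown, because the exact calls differ between providers and versions.

```python
from typing import Dict, Iterable, Protocol


class BlobStore(Protocol):
    """The only storage operations the application is allowed to rely on."""

    def put(self, name: str, data: bytes) -> None: ...
    def get(self, name: str) -> bytes: ...
    def names(self) -> Iterable[str]: ...


class InMemoryStore:
    """Stand-in backend; a real adapter would call one provider's SDK here."""

    def __init__(self) -> None:
        self._blobs: Dict[str, bytes] = {}

    def put(self, name: str, data: bytes) -> None:
        self._blobs[name] = data

    def get(self, name: str) -> bytes:
        return self._blobs[name]

    def names(self) -> Iterable[str]:
        return sorted(self._blobs)


def save_photo(store: BlobStore, greeting_id: int, data: bytes) -> str:
    """Application code: depends on the interface, not on a vendor."""
    name = f"photos/{greeting_id}"
    store.put(name, data)
    return name


def copy_all(source: BlobStore, target: BlobStore) -> int:
    """Migration helper: list every blob in the source and upload it to the target."""
    count = 0
    for name in source.names():
        target.put(name, source.get(name))
        count += 1
    return count
```

With this layout, changing provider means writing one new adapter class and running `copy_all`, which follows the same list-then-upload approach as the file-copy script in section 2.3. The application code itself stays unchanged.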
  • 29. Another way in which a company can protect itself from lock-in is based on middleware solutions, such as mOSAIC [26] and CloudBees [27]. In these approaches a middleware layer behaves as a broker between the application and the cloud infrastructures, providing an abstract interface for developers and isolating them from the specific requirements of each cloud vendor [28]. This looks like a solution to the problem, because a client that wants to change from one cloud provider to another can delegate the task to the middleware. Furthermore, this solution allows applications to use features of different cloud providers (multi-cloud). However, middleware solutions are often quite complex and heavy, and they have to be deployed together with the application, penalizing the deployment and performance of the software components attached to them. Moreover, the source code of middleware-dependent components will be tightly coupled to the specification of the middleware, thereby moving the lock-in effect from vendors to middleware [28].
Several PaaS development groups are working to establish a set of standards and common APIs to act as the middleware between IaaS and SaaS installations. These initiatives include [29]:
Cloud Application Management for Platforms (CAMP)
CAMP reduces the effort needed to move applications between clouds, offers cloud providers and consumers a REST-based approach to application management, and provides a common development vocabulary and API that can work across multiple clouds without excessive adaptation. It thus lays a common basis for developing multi-cloud management tools [30].
OpenStack
OpenStack is a global collaboration of developers and cloud computing technologists producing the ubiquitous open source cloud computing platform for public and private clouds. OpenStack is a massively scalable cloud operating system that controls large pools of compute, storage, and networking resources throughout a datacenter [31]. It is free open source software released under the terms of the Apache License. Anyone can run it, build on it, or submit changes back to the project.
  • 30. According to Openstack.org, this approach is the only way to remove the fear of proprietary lock-in for cloud customers and create a large ecosystem that spans cloud providers.
Cloud Foundry
VMware has entered the cloud game by offering Cloud Foundry, an open source Platform-as-a-Service released under the terms of the Apache License. It gives developers a way to create and deploy applications in the cloud without being locked in to a proprietary platform. It supports vSphere, vCloud, OpenStack, and Amazon AWS as underlying infrastructures as a service [32].
We conclude this report by observing that the enormous potential of cloud computing risks being seriously undermined by the vendor lock-in problem. We do hope that some of the possible solutions that we have briefly mentioned in this chapter will succeed in overcoming it.
  • 31. ACKNOWLEDGEMENTS
I wish to thank Prof. Brogi, supervisor of this thesis, for his availability and for having stimulated me with this topic, which will definitely be useful for my future career.
Special thanks go to my brother, a source of inspiration for me. He has been my spiritual guide throughout my life, especially on this important occasion.
To my parents, who have never held me back in any field or in any of my choices. In particular, they allowed me to follow this passion for computer science, to change high school and to come to Pisa.
To my sister, who has filled me with words of support in all the dark times.
Then I thank Andrea: it is thanks to his help that I can say I am satisfied with my work. Luca, whom I consider a brother, and who is my backbone in withstanding and overcoming the difficulties of life. My girlfriend, the only source of relaxation in this hard period, my island of peace and support; without her I would have gone mad. Roberto, a great friend but also a kind of idol, who is a great source of inspiration for me. Prof. Pollastri, for making me discover computer science through his legendary Pascal lessons. Luca's father, for the phrase: “the only limit of computer science is your imagination”. Alessandra, the first to hear my thesis.
And lastly, all my friends, who have been understanding even though I have completely neglected them in recent months.
  • 32. RINGRAZIAMENTI
I wish to thank Prof. Brogi, supervisor of this thesis, for his availability and for having stimulated me with this topic, which will surely be useful for my future career.
Special thanks go to my brother, a source of inspiration for me, for having guided me throughout my life and above all on this important occasion.
To my parents, who have never held me back in any field or in any of my choices. In particular, they allowed me to follow this passion for computer science, to change high school and to come to Pisa.
To my sister, who filled me with words of comfort in all the difficult moments.
Then I thank Andrea: it is also thanks to his help that I can say I am satisfied with my work. Luca, whom I consider a brother and who is my supporting pillar in enduring and overcoming the difficulties of life. My girlfriend, my only source of relaxation in this hard period, my island of calm and comfort; without her I would have gone mad. Roberto, a great friend but also a sort of idol, who is a great source of inspiration for me. Prof. Pollastri, for having introduced me to computer science through his legendary Pascal lessons. Luca's father, for having given me the phrase: “the only limit of computer science is one's own imagination”. Alessandra, the first to sit through my thesis.
And finally all my friends, who understood me even though I completely neglected them in these months.
  • 33. REFERENCES
[1] What is Cloud Computing? Last visited on http://www.youtube.com/watch?v=ae_DKNwK_ms
[2] GoogleDevelopers. Campfire One: Introducing Google App Engine. Last visited on http://www.youtube.com/watch?v=3Ztr-HhWX1c
[3] M. Armbrust et al. A view of cloud computing. Communications of the ACM Vol. 53 No. 4, April 2010.
[4] Cloud Computing Explained. Last visited on http://www.youtube.com/watch?v=QJncFirhjPg
[5] B. Sosinsky. Cloud Computing Bible, chapter 1. Wiley Publishing, Inc. 2011.
[6] R. Buyya, J. Broberg, A. M. Goscinski. Cloud Computing: Principles and Paradigms, chapter 1. Pearson-Prentice Hall. 2010.
[7] P. Mell, T. Grance. The NIST Definition of Cloud Computing.
[8] I. Jansch, V. Chin. PHP Development in the Cloud.
[9] Eli the Computer Guy. Introduction to Cloud Computing. Last visited on http://www.youtube.com/watch?v=QYzJl0Zrc4M
[10] D. Bader, R. Pennington. Cluster Computing: Applications. Georgia Tech College of Computing.
[11] What is grid computing? Gridcafe. E-sciencecity.org.
[12] G.J. Popek, R.P. Goldberg. Formal Requirements for Virtualizable Third Generation Architectures. Communications of the ACM Vol. 17 No. 7, July 1974.
[13] D. Catteddu, G. Hogben. Benefits, risks and recommendations for information security, chapter 3. ENISA. November 2009.
[14] M. Rouse. Vendor lock-in. Techtarget.com. May 2012.
[15] M. Garnaat. Cloud Lock-In. Not your father's lock-in. Elastician.com. April 2009.
[16] J. McKendrick. Cloud Computing's Vendor Lock-In Problem: Why the Industry is Taking a Step Backward. Forbes.com. November 2011.
[17] T. Von Eicken. The Skinny on Cloud Lock-In. RightScale.com. February 2009.
[18] R. Dilletu. Update: Amazon Web Services Down In North Virginia - Reddit, Pinterest, Airbnb, Foursquare, Minecraft And Others Affected. Techcrunch.com. October 2012.
[19] M. Hinkle. Three cloud lock-in considerations. Zenoss Blog. June 2010.
[20] L. Monni. Il Lock-In nei servizi cloud. CloudUp, CloudTalk.
  • 34. [21] E. Moyle. Cloud computing vendor lock-in: Avoiding security pitfalls. Techtarget.com. June 2012.
[22] R. Shaw. Computer Aided Marketing and Selling. Butterworth-Heinemann Newton. 1991.
[23] T. Nolle. Application portability in PaaS: Problems and solutions. Techtarget.com. March 2013.
[24] Microsoft.com. Adding Sign-On to Your Web Application Using Windows Azure AD.
[25] A. Salkever. 5 ways to protect against vendor lock-in in the cloud. Gigaom.com. September 2011.
[26] E.M. Maximilien et al. Toward cloud-agnostic middlewares. In OOPSLA '09, pages 619–626. 2009.
[27] W. Tsai et al. Service-Oriented Cloud Computing Architecture. In ITNG '10, pages 684–689. 2010.
[28] J. Miranda et al. Identifying Adaptation Needs to Avoid the Vendor Lock-in Effect in the Deployment of Cloud SBAs. WAS4FI-Mashups '12, pages 12–19. September 2012.
[29] M. Szynaka. Ask the Expert: Is PaaS vendor lock-in unavoidable? Techtarget.com. April 2013.
[30] C. Redwood Shores. Leading Technology Vendors Announce New Specification Designed to Ease Management of Applications Across Public and Private Clouds. Oracle.com. August 2012.
[31] R. Sean et al. OpenStack Training Guide. Introduction to OpenStack, chapters 2, 3. OpenStack Foundation. October 2013.
[32] S. Higginbotham. VMware Launches Open-Source Cloud. Gigaom.com. April 2011.