
Achieving QoS on a multi-tenant cloud platform is still a difficult task, and companies follow different approaches to solve it. In this document I outline a simple architecture for offering different QoS levels to different tenants in a multi-tenant cloud environment, based on my experiments with containers, Docker and cgroups on OpenStack.

Published in: Engineering, Technology


  3. Table of Contents
     Introduction (4)
     Problem statement (4)
     Solution overview (5)
     Goals (5)
     Typical implementation in production environment (6)
     Linux Containers (8)
     Components of LXC (8)
     Control group (cgroup) (9)
     Subsystems (10)
     blkio Subsystem (10)
     CPU Subsystem (10)
     Memory Subsystem (10)
     Python Controller (11)
     Puppet (12)
     Docker (13)
     Other options (13)
     Summary (14)
     References (14)
  4. Introduction

     Achieving QoS on a multi-tenant cloud platform is still a difficult task, and companies follow different approaches to solve it. In this document I outline a simple architecture for offering different QoS levels to different tenants in a multi-tenant cloud environment, based on my experiments.

     Problem statement

     OpenStack steps into platform as a service by introducing a new component, Trove, a Database as a Service (DBaaS) offering, in its upcoming Icehouse release. However, OpenStack announced that Trove will operate as a single-tenant service, which means that a new VM is created for each database instance. This is costly for cloud service providers, and resources may not be used efficiently.

     Many big cloud service providers such as Google and Amazon offer DBaaS as a multi-tenant service. In this model many DB instances run in a single virtual machine, which reduces the cost of running extra virtual machines. But it also brings a few problems, such as QoS, security and isolation.

     The QoS factors are CPU, memory, IOPS (input/output operations per second) and so on. Because more than one DB instance runs on a single machine, in the worst case one DB instance may consume a large share of the resources and badly affect the other DB instances. We need to guarantee the QoS stated in the SLA.

     Sharing a machine also raises security considerations: when one customer's database is affected, the other instances must not be.

     Metering is harder as well. Single-tenant consumers are charged based on their usage (number of I/O operations, total space, CPU, memory and so on). In a multi-tenant setup it is hard to estimate usage per customer, because more than one DB instance runs in a single virtual machine.

     From another perspective, the existing solution of creating one VM per customer has the drawback of running a separate operating system for each customer. Each extra operating system is additional load for the service provider, as it needs a lot of disk space and memory.
  5. Solution overview

     In my proposed solution, I use Linux containers running inside virtual machines to isolate DB instances. Each database instance runs inside its own container. This isolates the instances from one another, lets us control the resources of each instance, and makes metering straightforward. With the cgroup feature we can control the IOPS of a container and thereby offer different service levels (IOPS) to different tenants.

     [Figure: existing offering in OpenStack Trove compared with the proposed solution]

     Goals
     • Tenant-based QoS
     • Multi-tenancy
     • True resource isolation for DB instances
     • Accurate metering
     • Automation
  6. Typical implementation in production environment

     This is a typical implementation in a production environment. The user requests a database instance through the dashboard. Once the request is made, the Python controller fetches the resources specified for the chosen flavor in Nova and then takes stock of the existing containers. If any virtual machine has enough free capacity, the container is created there. Otherwise a new Nova virtual machine is created, and the container is created in that virtual machine with the user-specified parameters (CPU, RAM and IOPS).
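The placement decision described above can be sketched as a first-fit search. This is a minimal illustration, not the controller's actual code: the `ContainerSpec` and `VirtualMachine` classes, their fields, and the `new_vm_factory` callback (standing in for the Nova API call that boots a VM) are all hypothetical names introduced here.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class ContainerSpec:
    """Resources requested for one database container (hypothetical shape)."""
    cpu_shares: int  # relative weight, so it is not checked against capacity
    ram_mb: int
    iops: int


@dataclass
class VirtualMachine:
    """Capacity of one Nova VM plus the containers already placed on it."""
    name: str
    ram_mb: int
    iops: int
    containers: List[ContainerSpec] = field(default_factory=list)

    def fits(self, spec: ContainerSpec) -> bool:
        # Free capacity is total capacity minus what placed containers reserve.
        free_ram = self.ram_mb - sum(c.ram_mb for c in self.containers)
        free_iops = self.iops - sum(c.iops for c in self.containers)
        return spec.ram_mb <= free_ram and spec.iops <= free_iops


def place_container(
    vms: List[VirtualMachine],
    spec: ContainerSpec,
    new_vm_factory: Callable[[], VirtualMachine],
) -> VirtualMachine:
    """Reuse the first VM with enough room; otherwise create a new VM."""
    for vm in vms:
        if vm.fits(spec):
            vm.containers.append(spec)
            return vm
    # No existing VM has room: ask the (assumed) factory for a fresh one.
    vm = new_vm_factory()
    if not vm.fits(spec):
        raise ValueError("requested container is larger than a whole VM")
    vm.containers.append(spec)
    vms.append(vm)
    return vm
```

First-fit is only one possible policy; a best-fit or tenant-aware policy could be swapped in by changing the loop.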
  7. Each time a virtual machine is created, it is discovered by Puppet and the container software (LXC or Docker) is installed. Each time a container is created, MySQL is installed in it. After the containers are created and provisioned, the user is given access to the database (IP address, MySQL username and password).

     The above is a modular approach to provisioning servers. For smaller companies, the architecture can be simplified by using pre-built Vagrant boxes or golden images.

     The following is a brief description of each component mentioned above.
  8. Linux Containers

     Linux containers provide lightweight operating-system-level virtualization, which isolates processes and resources in a simpler way than full-scale virtual machines. LXC works in a way similar to virtualization, with the difference that it does not need a separate kernel instance. It allows us to create many sandbox environments, each isolated from the host and from the other containers.

     Components of LXC
     • Namespaces: used to provide process isolation
     • cgroups: used for system management and resource control
     • SELinux: ensures isolation between the host and the containers, and between individual containers
     • Libvirt: toolbox to manage containers

     Since QoS is our primary objective, we are going to focus on control groups.
  9. Control group (cgroup)

     Control groups are a kernel feature for allocating and limiting resources such as CPU, system memory and network bandwidth among user-defined groups of tasks. For example, we can prevent a MySQL instance from using all the memory. In the same way, we can guarantee that the MySQL instance gets the resources specified for it.

     In this architecture, I use the cgroup feature on Linux containers to isolate DB instances and guarantee the minimum QoS for the customer. The limits for a particular container are defined in that container's configuration file, so we can allocate different resources to different containers based on customer requirements. In our scenario each container runs as a process on the host, and the processes inside the container run as its sub-processes.
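As an illustration of per-container limits in the configuration file, the sketch below generates LXC configuration lines using the `lxc.cgroup.<subsystem>.<parameter>` key form used with cgroup v1. The tier names, their values, and the `8:0` device number are assumptions made up for this example, not figures from the experiments.

```python
def lxc_cgroup_config(memory_limit_bytes, cpu_shares, device, read_iops, write_iops):
    """Return LXC config lines (cgroup v1 keys) limiting one container.

    `device` is the block device as "major:minor", e.g. "8:0".
    """
    return [
        "lxc.cgroup.memory.limit_in_bytes = %d" % memory_limit_bytes,
        "lxc.cgroup.cpu.shares = %d" % cpu_shares,
        "lxc.cgroup.blkio.throttle.read_iops_device = %s %d" % (device, read_iops),
        "lxc.cgroup.blkio.throttle.write_iops_device = %s %d" % (device, write_iops),
    ]


# Hypothetical service tiers; a real deployment would take these from the SLA.
TIERS = {
    "gold": dict(memory_limit_bytes=2 * 1024**3, cpu_shares=1024,
                 read_iops=1000, write_iops=500),
    "silver": dict(memory_limit_bytes=512 * 1024**2, cpu_shares=512,
                   read_iops=200, write_iops=100),
}

if __name__ == "__main__":
    for line in lxc_cgroup_config(device="8:0", **TIERS["silver"]):
        print(line)
```

The generated lines would be appended to the container's config file before it is started; newer LXC releases on cgroup v2 use different key names, so treat the keys as specific to the cgroup v1 setup discussed here.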
  10. Subsystems

     Subsystems are kernel modules that are aware of cgroups. They are resource controllers that allocate varying levels of system resources to different cgroups. The following are the cgroup subsystems most relevant here.

     blkio Subsystem

     The block I/O subsystem controls and monitors access to I/O on block devices by tasks in cgroups. It offers features such as proportional weight division and I/O throttling (an upper limit).

     Common parameters:
     • blkio.throttle.read_iops_device - specifies the upper limit on the number of read operations per second on a device
     • blkio.throttle.read_bps_device - specifies the upper limit on the number of bytes per second read from a device
     • blkio.throttle.write_bps_device - specifies the upper limit on the number of bytes per second written to a device

     CPU Subsystem

     The cpu subsystem schedules CPU access for cgroups.

     Common parameters:
     • cpu.shares - contains an integer value that specifies a relative share of CPU time available to the tasks in a cgroup
     • cpu.rt_period_us - applies to real-time scheduling tasks; specifies a period of time in microseconds (µs, written here as "us") for how regularly a cgroup's access to CPU resources should be reallocated

     Memory Subsystem

     The memory subsystem generates automatic reports on memory resources used by the tasks in a cgroup, and sets limits on memory use by those tasks.

     Common parameters:
     • memory.usage_in_bytes - reports the total current memory usage by processes in the cgroup (in bytes)
     • memory.max_usage_in_bytes - reports the maximum memory used by processes in the cgroup (in bytes)
     • memory.limit_in_bytes - sets the maximum amount of user memory

     There are also other subsystems, such as cpuacct, cpuset, devices and freezer. These can be used in our scenario for more advanced configurations.
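The parameters above are plain files in the cgroup v1 filesystem, so both throttling and metering come down to writing and reading them. The following is a small sketch under that assumption; the helper names are hypothetical, and the directory arguments are whatever paths the blkio and memory hierarchies are mounted at for the container (they are usually two different directories).

```python
from pathlib import Path


def set_read_iops_limit(blkio_cgroup_dir, major, minor, iops):
    """Write a "major:minor iops" rule to blkio.throttle.read_iops_device.

    Returns the rule string that was written.
    """
    rule = "%d:%d %d" % (major, minor, iops)
    Path(blkio_cgroup_dir, "blkio.throttle.read_iops_device").write_text(rule + "\n")
    return rule


def read_memory_usage(memory_cgroup_dir):
    """Return (current_bytes, peak_bytes) from the memory subsystem files."""
    current = int(Path(memory_cgroup_dir, "memory.usage_in_bytes").read_text())
    peak = int(Path(memory_cgroup_dir, "memory.max_usage_in_bytes").read_text())
    return current, peak
```

Sampling `read_memory_usage` periodically for each container is one simple way to collect per-tenant usage figures for billing; writing the limit requires root privileges on the host.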
  11. Python Controller

     In a stock OpenStack environment, a new VM is created when a user requests an instance. In our case we need to provision containers instead, so we have to modify the normal OpenStack workflow. One popular way to do this is through the REST API. Since I am a Python guy, I do it through the Python APIs provided by OpenStack. The details of all containers created by users are saved in a local MySQL database.

     In this scenario, the user is shown a dashboard or a form for database provisioning. When the user requests an instance, the Python controller takes control. It queries OpenStack for the details of the flavor used to build the Nova VMs. Then it takes stock of the containers already provisioned, using the local database. If an existing VM has the resources needed for the container, the container is created in that VM. If no VM has enough space, a new Nova VM is created through API calls and the process continues there.

     The following is sample Python code for creating an instance. [Code listing not reproduced in this transcript.]

     Initially we can set the resource-level options for any particular flavor:

     nova-manage flavor set_key --name m1.small --key quota:disk_read_bytes_sec --value 10240000
     nova-manage flavor set_key --name m1.small --key quota:disk_write_bytes_sec --value 10240000
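When the controller manages several flavors, the two `nova-manage flavor set_key` commands above can be generated rather than typed by hand. The sketch below only builds the argument lists, mirroring the command form shown in the text; the function name is hypothetical, and actually running the commands (for example with `subprocess.run`) is left to the caller.

```python
import shlex


def flavor_quota_commands(flavor, read_bytes_sec, write_bytes_sec):
    """Build nova-manage argument lists that set disk quotas on a flavor."""
    quotas = [
        ("quota:disk_read_bytes_sec", read_bytes_sec),
        ("quota:disk_write_bytes_sec", write_bytes_sec),
    ]
    return [
        ["nova-manage", "flavor", "set_key",
         "--name", flavor, "--key", key, "--value", str(value)]
        for key, value in quotas
    ]


if __name__ == "__main__":
    # Print the commands for review instead of executing them.
    for argv in flavor_quota_commands("m1.small", 10240000, 10240000):
        print(" ".join(shlex.quote(arg) for arg in argv))
```

Keeping the commands as argument lists avoids shell-quoting problems if flavor names ever contain unusual characters.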
  12. Puppet

     OpenStack can provide any number of machines on demand. But to get all those machines into production (installing the required software, such as LXC or Docker in our scenario), we need some automation. There are various automation tools for change and configuration management. In this scenario I used Puppet.

     Puppet manages our servers. In a Puppet environment, we describe the desired machine state in declarative code. Puppet clients connect to the server and ensure that they are in the state described by the manifest file on the server. In our scenario we define manifests for installing LXC or Docker. Once the container software is installed and a container is created, we bring the container under Puppet's control to install the software (MySQL). A Puppet manifest for MySQL is available on GitHub.
  13. Docker

     Docker is an open-source, developer-friendly abstraction layer on top of Linux containers (LXC). Docker gives us a simple and meaningful layer for working with containers in a cloud environment. With Docker we can build containers, use them and change them as needed, push them to the Docker repository, and pull them again at any time and as often as we like. This means a lot in the PaaS market.

     At a high level, Docker automates the deployment of applications as highly portable, self-sufficient containers that are independent of hardware, language, framework, packaging system and hosting provider.

     Docker also provides a driver for OpenStack that plugs into Nova and makes it possible to work with containers alongside Nova virtual machines. Since most OpenStack production environments need to instantiate several different operating systems, we use a workaround instead: in this scenario we run Docker or LXC on top of a virtual machine.

     Docker, together with Puppet or Chef, can be very useful for Platform as a Service providers. They help automate the provisioning of the platforms developers need, which makes the operations team's work much easier.

     Other options

     The above method is one way of creating a multi-tenant cloud environment, but there are many other ways to achieve it.

     Rackspace uses OpenVZ to build its cloud platforms. It uses OpenVZ to contain customers' databases and for resource isolation. OpenVZ has several advantages over LXC. Resource allocation is simpler in OpenVZ (for example, guaranteed RAM and burstable RAM are specified using simple commands). Live migration is also easier in OpenVZ than in LXC.

     Oracle follows an interesting architecture in its DBaaS offering. It created a customized "Container Database". All customer databases are in Pluggable Database (PDB) format; they can be plugged into the container database and run there.
  14. Summary

     With this architecture, tenant-based QoS can be achieved on a multi-tenant cloud platform. I have not covered some other features and drawbacks, such as migration, scale-up and high availability. Those drawbacks can be addressed with workarounds in the architecture.

     References
     1) Linux Plumbers Conference 2013, Rackspace session
     5) mtd.pdf