This is the presentation for my final year project at NIT Allahabad (2013-14). The purpose of the project is to design a scheduling algorithm for a cloud environment with proper resource management.
1. Distributed Services Scheduling and
Cloud Provisioning
Project Mentor - Mr. Shashank Srivastava
Department of Computer Science and Engineering
Motilal Nehru National Institute of Technology Allahabad, Allahabad
2. Project Members
Name Registration Number
Prashant Mishra 20105094
Nishant Narang 20104085
Sonu Goel 20103030
Trishala Saini 20104119
Arjit Agarwal 20094044
3. Introduction to Cloud Computing
In computer science, cloud computing is a synonym for distributed computing over a network: the ability to run a program or application on many connected computers at the same time. It refers to network-based services which appear to be provided by real server hardware but are in fact served by virtual hardware, simulated by software running on one or more real machines.
5. SaaS
Cloud Service Models
PaaS
IaaS
INFRASTRUCTURE AS A SERVICE
IaaS is the delivery of technology infrastructure as an on-demand, scalable service.
It offers resources like virtual machine disk image libraries, block or file-based storage, firewalls, load balancers, virtual LANs, software bundles, etc.
Cloud providers typically charge customers for IaaS services based on the amount of resources allocated and consumed.
6. SaaS
Cloud Service Models
PaaS
IaaS
PLATFORM AS A SERVICE
PaaS providers make a computational platform available to the client, typically including an OS, a programming language, an execution environment, etc.
Clients pay only a nominal fee for using the cloud services, so the cost of purchasing the underlying hardware is saved.
7. SaaS
Cloud Service Models
PaaS
IaaS
SOFTWARE AS A SERVICE
SaaS allows clients to use software installed on the cloud via an access client or browser (web service).
Remote Desktop Virtualization is a common example of SaaS, e.g. VNC Viewer.
It is usually priced on a pay-per-use basis.
8. Motivation for the Project
To allow users to access cloud services from "anywhere, anytime", we have to make deployment of cloud applications easy.
The average utilisation of CPU and RAM in a typical user's system is below 10%, so these end users can share their resources with the cloud, benefiting others and being benefited in return.
Schedule applications with efficient usage of resources and user preferences.
Parallel handling of requests will lead to faster scheduling of incoming requests and better utilisation of available resources.
The advantages of parallel computing and virtualization are very cost-effective and may lead to optimal utilisation of the available resources.
9. Proposed Framework
The proposed framework architecture has the following components:
User: the customer who uses cloud services for deploying and executing his apps.
Controlling Master Node: receives requests from the clients; Virtual Machines (VMs) register with it to share their resources. The scheduling algorithm runs on this node.
Server Database: stores the requests received from end users, their details, and the details of the application or commands to be executed. It also maintains the list of VMs registered and online to serve a client's request.
Scheduler: integrated into the master node; runs the "dynamic priority-based weighted queue scheduling algorithm".
Working VM Nodes: connected to the master node, they execute client requests as dispatched by the scheduler. These VMs provide their remote desktop to clients.
10. Architecture of Framework Proposed by Us
[Diagram: User Machines send client requests to the Controlling Master Node, which hosts the Scheduler and consults the Server Database. Working VM Nodes share their resources with the master node, and the serving VM initiates a TCP connection with the client for Remote Desktop Virtualization.]
11. Implementation of the Framework
The proposed framework is implemented by three applications, all object-oriented and completely modular:
Client Framework
This application is meant to be installed on client machines. It enables users to connect to the cloud using known IP addresses and the port on which the cloud application is running.
It allows users to request a remote desktop of a VM running on the cloud, constrained by a few requirements:
minimum RAM required, minimum hard disk space required, operating system (Windows XP/Vista/7/8, Mac OS X, Ubuntu, Linux distributions, etc.), duration, and priority (higher priority has higher charges per unit time).
12. Implementation of the Framework
The Client Framework has two main packages:
Client FrontEnd: provides the GUI for the client to input the various specifications and connect to the server.
Client Remote Interface: handles the remote desktop that is tunneled to it by the VM. It also records all Mouse Click and Key Typed events and sends them to the remote VM.
Client FrontEnd
13. Implementation of the Framework
Server Framework
This application needs to run on the Controlling Master Node.
The main tasks of this framework include:
- Listening for client and VM requests.
- Invoking a client handler thread and a VM handler thread.
- On receiving a client request, invoking a scheduler client-enqueue thread.
- On receiving a VM register request, making a new entry in the database by invoking the Database Handler object.
- Starting a NotificationReceiver thread to periodically update the current load (in terms of CPU and RAM usage) on the registered VMs.
- The Scheduler Dispatcher thread dequeues the appropriate request, selects a corresponding VM, and dispatches the job to that VM.
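The listener's split into a client handler thread and a VM handler thread can be sketched as below. This is a simplified model, with in-process queues standing in for network sockets and a dict standing in for the Server Database; all names are illustrative:

```python
import threading
import queue

# Stand-ins for the real transport and database (assumptions, not the
# framework's actual classes).
client_requests = queue.Queue()
vm_registrations = queue.Queue()
vm_table = {}    # stands in for the Server Database's VM list
scheduled = []   # jobs handed to the scheduler's enqueue side

def client_handler():
    # On each client request, hand the job to the scheduler enqueue path.
    while True:
        job = client_requests.get()
        if job is None:          # sentinel: stop the handler
            break
        scheduled.append(job)

def vm_handler():
    # On each VM register request, make a new entry in the database.
    while True:
        reg = vm_registrations.get()
        if reg is None:
            break
        vm_table[reg["vm_id"]] = reg

threads = [threading.Thread(target=client_handler),
           threading.Thread(target=vm_handler)]
for t in threads:
    t.start()

client_requests.put({"job_id": 1})
vm_registrations.put({"vm_id": "vm-1", "ram_mb": 4096})
client_requests.put(None)
vm_registrations.put(None)
for t in threads:
    t.join()
```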
14. Implementation of the Framework
Server Framework has four main packages :
Server FrontEnd: provides the GUI to start the master node, view registered VMs and log details.
Server Request Handler: handles the client requests and starts a client request thread which extracts the requirements specified by the user.
Server DB Handler: handles the VM and client request database.
Server Scheduler: selects an appropriate VM and dispatches the client request to it.
Server FrontEnd
15. Implementation of the Framework
VM Framework
This application is to be installed on the VMs running on the worker nodes, spawned by hypervisors like VMware or Oracle VirtualBox. With the help of this framework, VMs can share their resources on the cloud.
The main tasks of this framework include:
- Fetching system information and sending it to the controlling node at periodic intervals via its Notifier thread.
- Listening for the dispatcher's instructions for servicing client requests.
- Invoking a Remote Desktop Sender thread when a request is dispatched to it.
- Sending an acknowledgement to the client node.
- Sending machine snapshots to the client.
- Receiving the Mouse Click and Key Typed events from the client end and performing the corresponding operations on the VM whose remote desktop is assigned to the client.
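The periodic Notifier task can be sketched as a simple loop. This is a minimal model: `send()` is a stand-in for the real network call to the controlling node, and the metrics are placeholders rather than real system probes:

```python
import threading
import time
import random

reports = []  # collected by the stand-in transport below

def send(report):
    # In the real framework this would be a socket write to the master node.
    reports.append(report)

def notifier(interval_s=0.01, rounds=3):
    # Periodically gather load information and report it to the master node.
    for _ in range(rounds):
        load = {"cpu_pct": random.uniform(0, 100),       # placeholder metric
                "free_ram_mb": random.randint(512, 4096)}  # placeholder metric
        send(load)
        time.sleep(interval_s)

t = threading.Thread(target=notifier)
t.start()
t.join()
```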
16. Implementation of the Framework
VM Framework has two main packages :
VM FrontEnd: provides a GUI to monitor the VM, input the various specifications, and connect to the server to periodically update load-related data in the server database. On receiving a client request, it tries to send an acknowledgement to the client.
VM Remote Desktop Sender Interface: handles the remote desktop that is tunneled to the client. It also receives all Mouse Click and Key Typed events from the client and performs the corresponding actions.
VM FrontEnd
18. Proposed Scheduling Algorithm
The framework was initially embedded with the traditional FCFS job scheduling algorithm. But since the cloud infrastructure is "on-demand, pay-per-use", we cannot rate all jobs alike, so FCFS failed to serve the purpose.
This led us to a priority-based job scheduling algorithm, where the priority can be decided on the following basis:
- Jobs with a higher cost per unit time are assigned higher priority than jobs with a lower cost per unit time.
- Deadline-constrained jobs are given higher priority than jobs whose time limits are not constrained.
A simple priority-based job scheduling algorithm, however, suffers from a flaw called STARVATION. To avoid this, we have used priority-based scheduling with weighted queues.
19. Proposed Scheduling Algorithm
The proposed scheduling algorithm employs three weighted queues denoting three distinct levels of priority:
- High Priority Job Queue, with priority 1
- Normal Priority Job Queue, with priority 2
- Low Priority Job Queue, with priority 3
The Scheduler starts by invoking two threads: the Job_Enqueue thread and the Job_Dequeue thread.
In our framework, the scheduler is encased in and implemented at the master node.
20. Working of the Scheduler
[Diagram: the Scheduler feeds jobs J1-J4 through the Job_Enqueue thread into the High Priority Queue (1), Normal Priority Queue (2) and Low Priority Queue (3), which are drained by the Job_Dequeue thread.]
1. Client job requests arrive at the Scheduler.
2. The Scheduler starts the Job_Enqueue thread.
3. The Job_Enqueue thread computes the priority of job J1 (let it be Normal).
4. It queues the job into the corresponding queue.
5. The Job_Dequeue thread is always running, and stops only if all three queues are empty.
6. It runs over the three queues in round-robin fashion (RRF) and dequeues 3, 2 and 1 jobs from the queues with priority 1, 2 and 3 respectively.
7. The Helper Node routine fetches from the Server Database the IDs of the unallocated VMs that are eligible for the job.
8. The LoadBalancer routine checks whether the eligible VMs' load (free RAM, CPU usage) is present in the cache; if not, it queries the database for the load data.
9. Finally, the LoadBalancer allocates job J1 to the most under-utilised VM and marks it as allocated in the server database.
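The weighted round-robin dequeue of step 6 can be sketched as follows: per pass, take up to 3 jobs from the high-priority queue, 2 from the normal queue and 1 from the low queue (the weights from the slide; the code itself is our own minimal sketch):

```python
from collections import deque

# Three weighted queues keyed by priority; weights per the slide: the
# priority-1 queue is drained 3x as fast as the priority-3 queue.
queues = {1: deque(), 2: deque(), 3: deque()}
weights = {1: 3, 2: 2, 3: 1}

def dequeue_pass():
    """One round-robin pass over the queues; returns the dequeued batch."""
    batch = []
    for prio in (1, 2, 3):
        for _ in range(weights[prio]):
            if queues[prio]:
                batch.append(queues[prio].popleft())
    return batch

# Example backlog: 4 high, 3 normal, 2 low priority jobs.
queues[1].extend(["H1", "H2", "H3", "H4"])
queues[2].extend(["N1", "N2", "N3"])
queues[3].extend(["L1", "L2"])
```

Every pass serves each non-empty queue at least once, so low-priority jobs keep making progress and starvation is avoided.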
21. Shortcomings of Previous Model
Since there is a large number of incoming requests, a single cloud master node cannot handle all the requests simultaneously without degrading the net quality of service.
Since the number of virtual machines can be very large, it is not feasible to store such huge data and query the database frequently.
The number of tasks in the various weighted queues differs at different instants in time; hence we need to alter the dequeue rates to adapt to the current demand.
Delivery of service can be improved and the job-drop ratio can be reduced.
The response time of the cloud service can be improved further.
To overcome these shortcomings, we proposed a revised framework.
22. Revised Framework
A distributed master node in place of the centralised master node.
A hierarchical structure of nodes and clusters, distributing the computational complexity.
The dequeue rates of the three queues change dynamically, while maintaining the priority order, to meet deadline constraints.
We incorporate a system that keeps track of the jobs enqueued and tries to minimise the number of jobs that go past their deadlines, improving the reliability of the service.
The caching mechanism has been improved to further reduce the number of database queries.
Nodes are made self-aware, i.e., they are given a choice whether to accept a job request or forward it to a less loaded cluster.
23. Architecture of the Revised Framework
[Diagram: a Distributed Interface to Receive Client Requests feeds jobs (J1-J4, plus an Urgent path) to the Scheduler, which runs the Job_Enqueue and Job_Dequeue threads over three weighted queues (High, Medium, Low), assisted by a dequeue_rate thread and an update_priority thread. The Helper Node routine consults per-OS caches (Windows Cache, Linux Cache), each maintained by an Update Cache routine, and forwards jobs to a Cluster Gateway (serving as load balancer) for clusters A and B, each containing sub-clusters.]
24. Working of the Revised Scheduler (enqueue side)
1. Client job requests arrive at the Distributed Interface.
2. The Scheduler receives requests from the distributed interface.
3.1. If the urgent flag is set, the job is immediately sent for resource allocation.
3.2. Otherwise, the Scheduler starts the Job_Enqueue thread.
4. The Job_Enqueue thread computes the priority of job J3 (let it be High) and queues it in the respective queue.
4.1. The dequeue_rate thread periodically computes the job dequeue rates of the three queues, based on their priority ratio and the number of jobs present in each queue.
4.2. Based on the dequeue rate, the update_priority thread checks whether the last queued job in a queue can be dequeued within its deadline. If not, it updates the priority of the job and moves it to a higher queue.
5. The Job_Dequeue thread is always running, and stops only if the three queues are empty.
6. It runs over the three queues in round-robin fashion (RRF) and dequeues d1, d2 and d3 jobs from the queues with priority P1, P2 and P3 (P1 > P2 > P3) respectively.
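Steps 4.1-4.2 above can be sketched as follows. This is a simplified model under stated assumptions: the rate formula and the drain-time estimate are our own illustrations of "dequeue rate based on priority ratio and queue length", not the project's exact formulas:

```python
from collections import deque

# Assumed priority weights for P1 > P2 > P3 (illustrative, not from the slides).
PRIORITY_WEIGHT = {1: 3, 2: 2, 3: 1}

def dequeue_rate(prio, queue_len, base_rate=1.0):
    # Jobs per scheduling tick: proportional to priority weight and backlog.
    return base_rate * PRIORITY_WEIGHT[prio] * max(queue_len, 1)

def maybe_promote(queues, now, prio):
    """Move the last job of queues[prio] up one level if it would miss its
    deadline at the current dequeue rate (a simplified deadline check)."""
    q = queues[prio]
    if prio == 1 or not q:
        return False  # already in the highest queue, or nothing queued
    rate = dequeue_rate(prio, len(q))
    wait = len(q) / rate                 # estimated ticks until q is drained
    job = q[-1]                          # last queued job (step 4.2)
    if now + wait > job["deadline"]:
        q.pop()
        queues[prio - 1].append(job)     # promote to the next higher queue
        return True
    return False

queues = {1: deque(), 2: deque(), 3: deque([{"id": "J9", "deadline": 0.1}])}
promoted = maybe_promote(queues, now=0.0, prio=3)
```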
25. Working of the Revised Scheduler (dispatch side)
7. Jobs arrive at the Helper Node routine either from the queues or directly from the Scheduler (urgent jobs).
8. The Helper Node routine checks the OS bit in the job's specification header and queries the corresponding cache for the IDs of the clusters eligible for the job.
8.1. If it does not get any entry corresponding to the specification, it sends a query to the corresponding gateway.
8.2. The cluster returns an ID, and the Update Cache routine writes it back to the cache.
9. The job is sent to the sub-cluster ID obtained from the cache or gateway, with a Load bit set to 1 if the answer came directly from the cache, and 0 if it came from the gateway.
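The cache lookup with gateway fallback in steps 8-9 can be sketched as below. The function and variable names are our own, and `gateway_query` is a stand-in for the real network round trip to the cluster gateway:

```python
# Per-OS caches mapping a job specification to an eligible sub-cluster ID.
caches = {"windows": {}, "linux": {}}

def gateway_query(os_name, spec):
    # Stand-in for the real gateway round trip; returns an illustrative ID.
    return "subcluster-7"

def resolve_subcluster(os_name, spec):
    """Return (subcluster_id, load_bit): load_bit 1 means the answer came
    from the cache, 0 means it came from the gateway (per the slide)."""
    cache = caches[os_name]
    if spec in cache:
        return cache[spec], 1
    sub_id = gateway_query(os_name, spec)
    cache[spec] = sub_id   # the Update Cache routine writes the ID back
    return sub_id, 0

first = resolve_subcluster("linux", ("2GB", "20GB"))   # miss -> gateway
second = resolve_subcluster("linux", ("2GB", "20GB"))  # hit -> cache
```

The Load bit lets the receiving sub-cluster distinguish possibly stale cache routing (bit 1) from a fresh gateway decision (bit 0), as slide 26 describes.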
26. Working of the Revised Scheduler (at the sub-cluster)
10. If the Load bit is 0, the sub-cluster processes the request.
10.1. If the Load bit is 1 and the load on the sub-cluster exceeds threshold_Load, the sub-cluster queries its Gateway to find a more appropriate cluster. If one exists, the job is sent to the new sub-cluster with the Load bit set to 0, so that it does not query again.
10.1.1. The Update Cache routine updates the cache with the new sub-cluster ID.
Cluster_load_notifier Thread: runs periodically to report the current average load on the sub-cluster to its gateway.
VM_notifier Thread: runs periodically on the VMs to notify their core node of their current status and load.
The Gateway maintains a priority data structure ordering the sub-clusters by their current load, so that it can answer any request query in O(1) time. It also maintains a Job-ID to Cluster-ID mapping to check the status of jobs.
Internally, the VMs register their specifications with the host machines, and the hosts register with the sub-cluster node. Finally, jobs are allocated to the VMs to serve the client requests.
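The gateway's load-ordered priority structure can be sketched with a heap, whose root (the least-loaded sub-cluster) is readable in O(1); stale entries left by Cluster_load_notifier updates are discarded lazily. This is a minimal sketch, not the project's actual data structure:

```python
import heapq

heap = []     # (load, subcluster_id) pairs, least-loaded at the root
current = {}  # latest load reported by each sub-cluster's notifier

def report_load(sub_id, load):
    # Called on each Cluster_load_notifier update; old heap entries for
    # this sub-cluster become stale and are skipped on lookup.
    current[sub_id] = load
    heapq.heappush(heap, (load, sub_id))

def least_loaded():
    """Return the ID of the least-loaded sub-cluster, or None if empty."""
    while heap:
        load, sub_id = heap[0]
        if current.get(sub_id) == load:  # root entry is still fresh
            return sub_id
        heapq.heappop(heap)              # discard the stale entry
    return None

report_load("A1", 0.7)
report_load("A2", 0.3)
report_load("A2", 0.9)  # A2's earlier 0.3 entry is now stale
```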