DISTRIBUTED LOAD BALANCING
AND MULTIPLE DATA CENTERS
ANALYSIS
Presented by:
Sowmya C
1
CLOUD
Cloud computing is the delivery of computing services
over the Internet.
Characteristics of cloud
• On demand services.
• Broad network access.
• Reliability.
• Resource pooling.
• Rapid elasticity.
• Measured service.
2
BIG DATA
Big data is similar to small data but bigger.
Because the data is bigger, it requires different
approaches:
• Techniques, tools, and architecture.
Big data comes from sensor devices, video,
audio, networks, social media, transactional
applications.
3
WHY BIG DATA?
Big data enables:
• Increased storage capacity
• Increased processing power
• Better business decisions
• Examination of large amounts of data
• Effective marketing
4
PROBLEM STATEMENT
• Load balancing is the main challenge in cloud
computing. Centralized systems are subject to a
single point of failure, so the dynamic local
workload must be distributed across all the
nodes.
• Data centers produce a huge volume of data, so an
efficient technology is needed to analyse
it.
5
OBJECTIVE
• Achieve load balancing in data centers using a
distributed load balancing system to increase
performance and resource utilization.
• Analyse the data using an efficient tool called Hadoop.
6
Load balancing in data centers
Load balancing is the process of improving the performance of the
system by shifting workload among the processors.
Data centers are locations that house a group of servers.
7
Types of load balancing
Static load balancing
• The decision to shift load does not depend on the current
state of the system.
• Algorithms are non-preemptive.
• Round robin.
• Central manager.
• Threshold algorithm.
• Randomized algorithm.
Dynamic load balancing
• The current state of the system is used to
make load balancing decisions.
• Dynamic load balancing algorithms
are preemptive.
• Types of dynamic load balancing:
• Local queue algorithm.
• Central queue algorithm.
8
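The difference between the two families can be sketched in a few lines. This is an illustrative sketch only, not the presenter's implementation: the server names, task costs, and the "least loaded" rule used as the dynamic policy are assumptions chosen for the example.

```python
from itertools import cycle


def round_robin(servers):
    """Static policy: cycle through servers, ignoring their current load."""
    order = cycle(servers)
    return lambda loads: next(order)


def least_loaded(loads):
    """Dynamic policy (assumed for illustration): pick the server with the smallest current load."""
    return min(loads, key=loads.get)


def assign(tasks, loads, choose):
    """Add each task cost to the server returned by `choose`; return the final loads."""
    loads = dict(loads)  # copy so the caller's dict is untouched
    for cost in tasks:
        server = choose(loads)
        loads[server] += cost
    return loads


initial = {"S1": 0, "S2": 0, "S3": 0}  # hypothetical servers
tasks = [5, 1, 1, 5, 1, 1]             # hypothetical task costs

static_result = assign(tasks, initial, round_robin(list(initial)))
dynamic_result = assign(tasks, initial, least_loaded)
print(static_result)   # {'S1': 10, 'S2': 2, 'S3': 2}
print(dynamic_result)  # {'S1': 5, 'S2': 6, 'S3': 3}
```

With this input the static policy sends both heavy tasks to S1, while the dynamic policy reads the current loads and spreads them more evenly.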
CENTRALIZED LOAD BALANCING
Limitations of centralized load
balancing
• Only suitable for WANs
where traffic is predictable
and stable.
• Example: Google's inter-datacenter
traffic engineering algorithm needs
to run just 550 times per day.
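To put the 550 runs per day in perspective, the average gap between runs can be worked out directly. The calculation assumes the runs are evenly spaced over 24 hours, which the slide does not state.

```python
SECONDS_PER_DAY = 24 * 60 * 60
runs_per_day = 550  # figure quoted on the slide

# Average gap between runs, assuming they are evenly spaced (an assumption).
interval_seconds = SECONDS_PER_DAY / runs_per_day
print(round(interval_seconds))  # 157 seconds, roughly 2.6 minutes
```

A central controller that recomputes only every few minutes is adequate when traffic is stable, but it cannot react to load that changes faster than that.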
Existing system architecture
[Figure: Main Controller, c1]
9
DISTRIBUTED DATA CENTERS
Need for distributed systems:
• High system speed
• High performance
• Huge processing power
Proposed system architecture
[Figure: Distributed load balancer; servers S1, S2, S3, S4; applications App.A, App.B, App.c; Network; computer1, computer2, computer3, computer4]
10
DISTRIBUTED LOAD BALANCING SYSTEM
A distributed system can be defined as a collection of computing
and communication resources located in distributed data centers
and shared by several end users.
Advantages of distributed systems
• High performance
• Distribution
• Transparency
• Reliability
• Incremental growth
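One way to balance load without a central controller is for each dispatcher to probe a small random sample of servers and send the request to the less loaded one (often called "power of two choices"). The slides do not name a specific algorithm, so this technique, the server names, and the shared `loads` dictionary standing in for load probes are all assumptions made for illustration.

```python
import random


def pick_server(loads, rng):
    """Sample two distinct servers and return the less loaded one."""
    a, b = rng.sample(sorted(loads), 2)
    return a if loads[a] <= loads[b] else b


def simulate(num_requests, servers, seed=0):
    """Dispatch unit-cost requests one by one; no node sees the global state."""
    rng = random.Random(seed)
    loads = {s: 0 for s in servers}
    for _ in range(num_requests):
        loads[pick_server(loads, rng)] += 1
    return loads


loads = simulate(1000, ["S1", "S2", "S3", "S4"])  # hypothetical servers
print(loads)
```

Each decision needs only two load readings, so there is no single controller whose failure stops the system, and servers can be added incrementally by extending the list.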
11
[Figure: Need for big data]
12
HADOOP
Open-source framework for distributed data storage and processing.
Massively scalable and automatically parallelizable.
Core components:
• Hadoop Common
• Hadoop Distributed File System (HDFS)
• MapReduce
• YARN
13
MAPREDUCE ABSTRACTION
• Map: Map returns information.
• Combine: Reduce accepts information.
• Reduce: Reduce applies a user-defined function to reduce data.
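The three stages can be illustrated with a word count written in plain Python. This is a sketch of the programming model only and does not use the Hadoop API; the function names and input lines are hypothetical, and the grouping step here stands in for the slide's "combine" stage.

```python
from collections import defaultdict


def map_phase(line):
    """Map: emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def combine(pairs):
    """Combine: group all values that share the same key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce: apply a user-defined function (here, sum) to each group."""
    return key, sum(values)


lines = ["load balancing in data centers", "data centers and big data"]
pairs = [pair for line in lines for pair in map_phase(line)]
counts = dict(reduce_phase(k, v) for k, v in combine(pairs).items())
print(counts["data"])  # 3
```

Because each line is mapped independently and each key is reduced independently, both phases can run in parallel across many machines.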
14
CONCLUSION
We can achieve high throughput and resource
utilization.
We can reach high user satisfaction.
15