This presentation covers how data is processed, stored, and analyzed in the data center. It includes different approaches drawn from research journals available on the internet, and also covers related topics such as multiprocessors and virtual machines in the data center.
2. What is a Data Center?
• A large group of networked computer servers, typically used by organizations for the remote storage, processing, or distribution of large amounts of data.
• It houses not only servers but also backup power supplies, communication connections, air conditioning, fire suppression systems, etc.
• “A data center is a factory that transforms and stores bits”
3. A few glimpses of Data Center of a few organizations …
Rackspace – Richardson, TX    Facebook – Lulea, Sweden
Google – Douglas County, Georgia    Amazon – Virginia, outside Washington D.C.
5. Data Center Workload
• The amount of processing that the computer has been given to do at a given time.
• Workload – in the form of web requests, data analysis, multimedia rendering, or other applications – is placed in the data center.
Ref: http://searchdatacenter.techtarget.com/definition/workload
6. Classification of Workloads Based on Time Criticality
• Critical workloads: “cannot tolerate even a few minutes of downtime”
• Non-critical workloads: can tolerate a wide range of outage times
7. Ways to Improve Data Protection
• Prevent downtime by reducing resource contention:
Managers accommodate drastically changing demands by allowing easy creation of additional workloads without changing or customizing applications.
• Replicate workloads into the cloud to create asymmetric “hot backups”:
Clone the complete workload stack and import it into a public/private cloud.
• Use dissimilar infrastructure for off-premises redundancy:
Workloads are replicated off-site to different cloud providers.
• Reserve failover and failback for critical workloads only:
Automate the switching of users or processes from production to recovery instances.
8. Characterizing Data Analysis Workloads in Data Centers
• Data analysis is important for improving the future performance of data centers.
• Data center workloads fall into two categories: services workloads (web search, media streaming) and data analysis workloads (business intelligence, machine learning).
• We concentrate on internet services workloads here.
• Data analysis workloads are diverse in speedup performance and micro-architectural characteristics, so many applications need to be analyzed.
• Three important application domains in internet services are: 1) search engines, 2) social networks, 3) electronic commerce.
9. Workload Requirements
1) Cover the most important application domains
2) Data is distributed and cannot be processed on a single node
3) Consider recently used data
11. DCBench
• Benchmarks are used to evaluate the benefits of new designs and systems.
• DCBench is a benchmark suite for data center computing, released under an open source license.
• Includes online and offline workloads.
• Includes different programming models, e.g., MPI versus MapReduce.
• Helpful for performing architecture and small- to medium-scale system research for data center computing.
15. Work-Seeks-Bandwidth
• Chip designers prefer placing components that interact often (e.g., CPU and L1 cache, multiple CPU cores) close together to get high-bandwidth interconnections cheaply.
• Likewise, jobs that rely on heavy traffic exchanges with each other are placed in areas of the data center where high network bandwidth is available.
16. Contd.
This translates to the engineering decision of placing such jobs within the same server, within servers on the same rack, or within servers in the same VLAN, and so on in decreasing order of preference – hence the work-seeks-bandwidth pattern.
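The placement preference above can be sketched as a locality ranking. This is an illustrative model, not the paper's scheduler; the slot fields and scores are assumptions.

```python
# Work-seeks-bandwidth sketch: prefer the same server, then the same rack,
# then the same VLAN, since available bandwidth decreases at each step.
# Slot fields (server/rack/vlan) are hypothetical names for illustration.

PREFERENCE = ["same_server", "same_rack", "same_vlan", "cross_vlan"]

def locality(a, b):
    """Return the tightest locality level shared by two job slots."""
    if a["server"] == b["server"]:
        return "same_server"
    if a["rack"] == b["rack"]:
        return "same_rack"
    if a["vlan"] == b["vlan"]:
        return "same_vlan"
    return "cross_vlan"

def best_slot(job_slot, candidates):
    """Pick the candidate slot closest, in network terms, to job_slot."""
    return min(candidates, key=lambda c: PREFERENCE.index(locality(job_slot, c)))

a = {"server": "s1", "rack": "r1", "vlan": "v1"}
slots = [
    {"server": "s9", "rack": "r3", "vlan": "v2"},  # cross-VLAN
    {"server": "s2", "rack": "r1", "vlan": "v1"},  # same rack
]
print(locality(a, best_slot(a, slots)))  # same_rack
```

A real scheduler would also weigh free capacity and failure domains; this only captures the bandwidth-seeking preference order.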
17. Scatter-Gather Pattern
• Data is partitioned into small chunks, each of which is worked on by a different server, and the resulting answers are later aggregated.
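The scatter-gather pattern can be sketched in a few lines. Here threads stand in for servers, and the per-chunk work is a simple partial sum; both are illustrative stand-ins.

```python
# Minimal scatter-gather sketch: partition the data into chunks, let workers
# handle each chunk independently, then aggregate the partial results.
from concurrent.futures import ThreadPoolExecutor

def scatter(data, n_chunks):
    """Split data into roughly equal chunks."""
    size = (len(data) + n_chunks - 1) // n_chunks
    return [data[i:i + size] for i in range(0, len(data), size)]

def work(chunk):
    """Per-server work: here, just a partial sum."""
    return sum(chunk)

def gather(partials):
    """Aggregate the partial answers."""
    return sum(partials)

data = list(range(100))
with ThreadPoolExecutor(max_workers=4) as ex:
    partials = list(ex.map(work, scatter(data, 4)))
print(gather(partials))  # 4950
```

In a data center the "workers" are servers reached over the network, which is why this pattern generates the fan-out/fan-in traffic discussed in the surrounding slides.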
18. Congestion
• Periods of low network utilization indicate either:
an application that demands more of other resources (CPU, disk) than the network, or
an application that can be rewritten to make better use of available bandwidth.
19. Evacuation event (congestion)
• When a server repeatedly experiences problems, the automated management system in the cluster evacuates all the usable blocks on that server before alerting a human that the server is ready to be re-imaged.
20. Read Failure
• When a job does not make any progress (e.g., it is unable to find its input data or to connect to a machine), it is killed.
21. Contd.
• To attribute network traffic to the applications that generate it, the network event logs were merged with application-level logs describing which job and phase (e.g., map, reduce) were active at that time. The results showed that jobs in the reduce phase are responsible for a fair amount of the network traffic.
• Note that in the reduce phase of a map-reduce job, the data in each partition, which is present at multiple servers in the cluster (e.g., all personnel records that start with ‘A‘), has to be pulled to the server that handles the reduce for that partition.
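The log merge described above amounts to a time-based join. The record shapes below are assumptions for illustration, not the actual log formats used in the study.

```python
# Hypothetical sketch of traffic attribution: each network event is credited
# to whichever job phase was active at that event's timestamp.
from collections import defaultdict

app_log = [  # (job, phase, start_time, end_time)
    ("job1", "map", 0, 50),
    ("job1", "reduce", 50, 100),
]
net_log = [  # (timestamp, bytes transferred)
    (10, 200), (60, 5000), (80, 7000),
]

def attribute(net_log, app_log):
    """Sum network bytes per (job, phase) active at each event's timestamp."""
    traffic = defaultdict(int)
    for ts, nbytes in net_log:
        for job, phase, start, end in app_log:
            if start <= ts < end:
                traffic[(job, phase)] += nbytes
    return dict(traffic)

print(attribute(net_log, app_log))
# {('job1', 'map'): 200, ('job1', 'reduce'): 12000}
```

With made-up numbers like these, the reduce phase dominates the traffic, mirroring the observation on the slide.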
22. Monitoring Data Center Workload
• For coordinated monitoring and control of data centers, the most common approaches are based on Monitor, Analyze, Plan, and Execute (MAPE) control loops.
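One iteration of a MAPE loop can be skeletoned as below. The sensor readings, threshold, and "throttle" action are hypothetical placeholders; a real system would plug its own components into the four stages.

```python
# Skeleton of one Monitor-Analyze-Plan-Execute (MAPE) iteration.
def mape_step(read_sensors, threshold, actuate):
    readings = read_sensors()                                 # Monitor
    hot = [k for k, v in readings.items() if v > threshold]   # Analyze
    plan = {k: "throttle" for k in hot}                       # Plan
    for target, action in plan.items():                       # Execute
        actuate(target, action)
    return plan

actions = []
plan = mape_step(
    read_sensors=lambda: {"rack1": 78, "rack2": 92},  # e.g., temperature
    threshold=85,
    actuate=lambda t, a: actions.append((t, a)),
)
print(plan)  # {'rack2': 'throttle'}
```

In production the loop runs continuously, and the "Analyze" and "Plan" stages are where the policy engine of the following slide does its work.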
23. Modern Data Center Operation
• Workload, in the form of web requests, data analysis, etc., is placed in the data center.
• An instrumentation infrastructure logs sensor readings.
• The results are fed into a policy engine that creates a plan to utilize resources.
• External interfaces, or actuators, implement the plan.
24. Workload Monitoring Using Splice
• Splice aggregates sensor and performance data in a relational database.
• It gathers data from many sources through different interfaces with different formats.
• Splice uses a change-of-value filter that retains only those values that differ significantly from the previously logged values.
• This reduces the data volume with minimal loss of information.
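A change-of-value filter of the kind described above can be sketched as follows; the tolerance value and sample readings are made up for illustration.

```python
# Change-of-value filtering sketch: a reading is logged only if it differs
# from the last *kept* value by more than a tolerance, trading a small loss
# of information for far less stored data.
def cov_filter(readings, tolerance):
    kept = []
    for value in readings:
        if not kept or abs(value - kept[-1]) > tolerance:
            kept.append(value)
    return kept

samples = [20.0, 20.1, 20.2, 23.5, 23.6, 20.0]  # e.g., temperature readings
print(cov_filter(samples, tolerance=1.0))  # [20.0, 23.5, 20.0]
```

Six raw readings collapse to three logged values, yet every significant change in the signal is preserved.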
26. Implementation
• Splice uses change of value filter that retains only those values that
differ significantly from the previously logged values.
• It reduces minimal loss of information.
26
27. Analysis
• Data analysis falls into two main classes: attribute behavior and correlation.
• Attribute behavior describes the values of the observed readings and how those values change over time.
• Data correlation methods determine the strength of the correlations among attributes that affect each other.
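As an illustration of the correlation class of analysis, Pearson's correlation coefficient can be computed by hand over two attribute series. The attribute names and values below are invented for the example.

```python
# Correlation-strength sketch: Pearson's r between two monitored attributes.
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length series."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

cpu_temp = [40, 45, 50, 55, 60]          # hypothetical sensor attribute
fan_rpm  = [1000, 1200, 1400, 1600, 1800]  # hypothetical actuator attribute
print(round(pearson(cpu_temp, fan_rpm), 3))  # 1.0
```

An r close to ±1 flags strongly coupled attributes, which is exactly what the correlation methods on this slide look for.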
28. Virtualization in Data Centers
• Virtualization is a combination of software and hardware features that creates
virtual CPUs (vCPU) or virtual systems-on-chip (vSoC).
• Virtualization provides the required level of isolation and partitioning of resources.
• Each VM is protected from interference from another VM.
Reference: Multicore Processing: Virtualization and Data Center By: Syed Shah, Nikolay Guenov
29. Why Virtualization
• Reduces power consumption and building space, provides high availability for critical applications, and streamlines application deployment and migration.
• Supports multiple operating systems and consolidation of services on a single server by defining multiple VMs.
• Since multiple VMs can run on a single server, the advantages are reduced server inventory and better server utilization.
31. Multi Core Processing
• A multi-core processor is a single computing component with two or more
independent actual processing units (called "cores"), which are the units that read
and execute program instructions.
32. Virtualization and Multicore Processing
• With multicore SoCs, given enough processing capacity and virtualization, control
plane applications and data plane applications can be run without one affecting the
other.
• Data or control traffic that is relevant to the customized application and operating
system (OS) can be directed to the appropriate virtualized core without impacting
or compromising the rest of the system.
33. Control and Data Plane Application Consolidation in Virtualized Multicore SoC
34. • Functions that were previously implemented on different boards now can be
consolidated onto a single card and a single multicore SoC.
35. Data Center Reliability
Network reliability involves:
• Characterizing the most failure-prone network elements
• Estimating the impact of failures
• Analyzing the effectiveness of network redundancy
Reference: Understanding Network Failures in Data Centers: Measurement,
Analysis, and Implications By: Phillipa Gill, Navendu Jain, Microsoft Research
36. Key Observations
• Data center networks are reliable
• Low-cost, commodity switches are highly reliable
• Load balancers experience a high number of software faults
• Failures potentially cause loss of a large number of small packets.
• Network redundancy helps, but it is not entirely effective
37. Reasons to Change from Traditional Studies
Significant changes in computing power, network bandwidth, and network file system usage, while:
• network file system workload studies are limited
• no CIFS protocol studies exist
• only limited file system workloads have been examined
Reference: Measurement and Analysis of Large-Scale Network File System Workloads by Andrew W. Leung, Shankar
Pasupathy, Garth Goodson, Ethan L. Miller
38. Analysis: File Access Patterns
• Access patterns: read-only, write-only, and read-and-write
39. Analysis: Sequential Access
• Sequentiality: entire-file versus partial-file sequential access
40. File Lifetime
• In CIFS, files can be deleted either through an explicit delete request, which frees the entire file and its name, or through truncation, which frees only the data.
• CIFS users begin a connection to the file server by creating an authenticated user session and end it by eventually logging off.
41. Architecture: Load Balancer
• Inside the data center, requests are spread among a pool of front-end servers that process them. This spreading is typically performed by a specialized load balancer.
• The IP address to which requests are sent is called a virtual IP address (VIP).
• The IP addresses of the servers over which the requests are spread are known as direct IP addresses (DIPs).
Reference: Towards a Next Generation Data Center Architecture: Scalability and
Commoditization By Albert Greenberg, David A. Maltz Microsoft Research, WA, USA
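The VIP-to-DIP spreading can be sketched with hash-based selection. The paper does not prescribe this particular policy; hashing on a flow key is one common approach, and the addresses below are examples.

```python
# Load-spreading sketch: requests arrive at a single virtual IP (VIP) and
# are hashed onto the direct IPs (DIPs) of the front-end pool, so the same
# client flow consistently reaches the same server.
import hashlib

DIPS = ["10.0.0.1", "10.0.0.2", "10.0.0.3", "10.0.0.4"]

def pick_dip(flow_key, dips):
    """Deterministically map a flow (e.g., client IP and port) to one DIP."""
    digest = hashlib.sha256(flow_key.encode()).digest()
    return dips[int.from_bytes(digest[:4], "big") % len(dips)]

dip = pick_dip("192.0.2.7:4312", DIPS)
print(dip)
assert dip == pick_dip("192.0.2.7:4312", DIPS)  # stable per flow
```

Determinism matters here: stateful sessions break if successive packets of one flow land on different servers.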
42. Challenges and Requirements
Challenges
• Fragmentation of resources
• Poor server to server connectivity
• Proprietary hardware that scales up, not out
Requirements:
• Placement anywhere
• Server to server bandwidth
• Commodity hardware that scales out
• Support 100,000 servers
43. Load Balancing
• Load spreading: requests are spread evenly over a pool of servers.
• Load balancing: load balancers are placed in front of the actual servers.
45. A Few Real-Time Scenarios
Why build a data center in Virginia when there is one in California?
• Reduce the time to send a page to users on the East Coast.
• California was running out of space; Virginia had lots of room to grow.
• Restricting to one data center meant that in the event of disaster (earthquake, power failure, Godzilla) Facebook could be unusable for an extended amount of time.
46. The hardware and network were set up soon... but how to handle cache consistency?
[Diagram: replication from the master DB to slave DBs]
47. Facebook’s Scheduling with Corona
• With Facebook’s user base expanding at an enormous rate, a new scheduling framework called Corona was developed.
• Initially, a MapReduce implementation from Apache Hadoop served as the foundation of the infrastructure, but over the years this system developed several issues:
scheduling overhead
a pull-based scheduling model
a static slot-based resource management model
48. Facebook’s Solution
• Corona introduces a cluster manager whose only purpose is to track the nodes in the cluster and the amount of free resources.
• Corona uses push-based scheduling, which reduces scheduling latency.
• The separation of duties allows Corona to manage many more jobs and achieve better cluster utilization.
• The cluster manager also implements fair-share scheduling.
49. Future of Corona
• New features such as:
resource-based scheduling rather than the slot-based model
online upgrades to the cluster manager
expansion of the user base by scheduling applications such as Peregrine
50. Characterizing Backend Workload (at Google)
Ref: Towards Characterizing Cloud Backend Workloads: Insights from Google Compute Clusters (Asit K. Mishra, Joseph L. Hellerstein, Walfredo Cirne, Chita R. Das)
51. Pre-requisites
• Capacity planning, to determine which machine resources must grow and by how much
• Task scheduling, to achieve high machine utilization and to meet service level objectives
• Both require a good understanding of task resource consumption, i.e., CPU and memory usage.
52. The Approaches
1. Make each task its own workload:
scales poorly, since tens of thousands of tasks execute daily on Google compute clusters.
2. View all tasks as belonging to one single workload:
results in large variances in predicted resource consumption.
53. The Proposed Methodology
• Identify the workload dimensions
• Construct task classes using an off-the-shelf algorithm such as k-means
• Determine the break points for qualitative coordinates within the workload dimensions
• Merge adjacent task classes to reduce the number of workloads
54. Based on the Observations That
• The duration of task executions is bimodal: tasks have either a short or a long duration
• Most tasks have short durations
• Most resources are consumed by a few long-duration tasks with large demands for CPU and memory
55. Objective
• Construct a small number of task classes such that tasks within each class have similar resource usage.
• Qualitative coordinates are used to distinguish workloads: small (s), medium (m), large (l).
57. First Step
• Identify the workload dimensions.
• For example, in the analysis of the Google cloud backend, the workload dimensions are task duration, average core usage, and average memory usage.
58. Second Step
• Construct preliminary task classes that have fairly homogeneous resource usage. This is done by using the workload dimensions as a feature vector and applying an off-the-shelf clustering algorithm such as k-means.
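This clustering step can be sketched with a minimal k-means over (duration, cores, memory) feature vectors. The task data is made up, and the deterministic initialization is a simplification for the example; the paper simply assumes an off-the-shelf algorithm.

```python
# Toy version of the second step: each task is a feature vector of
# (duration_hours, avg_cores, avg_mem_gb), clustered with a minimal k-means.
def kmeans(points, k, iters=10):
    # Deterministic init for the example: spread initial centers across the data.
    centers = [points[i * (len(points) - 1) // (k - 1)] for i in range(k)]
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(
                range(k),
                key=lambda j: sum((a - b) ** 2 for a, b in zip(p, centers[j])),
            )
            clusters[nearest].append(p)
        centers = [
            tuple(sum(dim) / len(c) for dim in zip(*c)) if c else centers[i]
            for i, c in enumerate(clusters)
        ]
    return centers, clusters

# Two short/small tasks and two long/large tasks (illustrative values).
tasks = [(0.1, 0.2, 0.5), (0.2, 0.3, 0.4), (30.0, 4.0, 16.0), (28.0, 3.5, 12.0)]
centers, clusters = kmeans(tasks, k=2)
print([len(c) for c in clusters])  # [2, 2]
```

The two recovered clusters correspond to the bimodal short-duration/long-duration behavior noted earlier in the slides.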
59. Third Step
• Determine the break points for the qualitative coordinates of the workload dimensions. This has two considerations. First, break points must be consistent across workloads; for example, the qualitative coordinate small for duration must have the same break point (e.g., 2 hours) for all workloads. Second, the result should produce low within-class variability.
60. Fourth Step
• Merge classes to form the final set of task classes; these classes define the workloads. This involves combining “adjacent” preliminary task classes, where adjacency is based on the qualitative coordinates of the class. For example, in the Google data, duration has qualitative coordinates small and large, while cores and memory have small, medium, and large; thus the workload smm is adjacent to sms and sml in the third dimension. Two preliminary classes are merged if the CV (coefficient of variation) of the merged class does not differ much from the CVs of each of the preliminary classes. Merged classes are denoted by the wildcard “*”; for example, merging the classes sms, smm, and sml yields the class sm*.
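The merge test above can be sketched as follows. The usage samples and the 1.5x tolerance on the CV are illustrative assumptions; the paper only says the merged CV must not differ much from the per-class CVs.

```python
# Fourth-step sketch: merge adjacent classes (same coordinates except one
# dimension) when the coefficient of variation (CV = std / mean) of the
# merged usage data stays close to the per-class CVs.
from statistics import mean, pstdev

def cv(values):
    """Coefficient of variation of a sample."""
    return pstdev(values) / mean(values)

classes = {  # qualitative label -> per-task CPU usage samples (made up)
    "sms": [1.0, 1.1, 0.9],
    "smm": [1.0, 1.2, 1.1],
    "sml": [1.1, 0.9, 1.0],
}

merged = [v for samples in classes.values() for v in samples]
# Hypothetical tolerance: merged CV within 1.5x the worst per-class CV.
if cv(merged) <= 1.5 * max(cv(s) for s in classes.values()):
    classes = {"sm*": merged}  # wildcard over the third dimension

print(list(classes))  # ['sm*']
```

Because the three classes have near-identical usage distributions, merging barely changes the CV and the wildcard class sm* replaces them, exactly as in the slide's example.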