Auto-scaling Techniques for Elastic Data Stream Processing (Zbigniew Jerzak)
An elastic data stream processing system can handle changes in workload by dynamically scaling out and scaling in. This allows it to absorb unexpected load spikes without constant overprovisioning. One of the major challenges for an elastic system is to find the right point in time to scale in or scale out; finding such a point is difficult because it depends on constantly changing workload and system characteristics. In this paper we investigate the application of different auto-scaling techniques to this problem. Specifically: (1) we formulate basic requirements for an auto-scaling technique used in an elastic data stream processing system, (2) we use these requirements to select the most suitable auto-scaling techniques, and (3) we evaluate the selected auto-scaling techniques using real-world data. Our experiments show that the auto-scaling techniques used in existing elastic data stream processing systems perform worse than the strategies used in our work.
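For orientation, here is a minimal sketch of the reactive, threshold-based scaling policy that such systems commonly use as a baseline; the thresholds, window size, and cooldown are illustrative assumptions, not values from the paper.

```python
# Minimal reactive auto-scaler sketch: scale out when average utilization
# stays above a high watermark, scale in when it stays below a low watermark.
# All numeric parameters here are illustrative assumptions.
from collections import deque

class ReactiveScaler:
    def __init__(self, high=0.8, low=0.3, window=5, cooldown=3):
        self.high, self.low = high, low
        self.samples = deque(maxlen=window)    # recent utilization samples
        self.cooldown, self.wait = cooldown, 0

    def decide(self, utilization):
        """Return +1 (scale out), -1 (scale in), or 0 (hold)."""
        self.samples.append(utilization)
        if self.wait > 0:                      # suppress oscillation after a change
            self.wait -= 1
            return 0
        if len(self.samples) < self.samples.maxlen:
            return 0                           # not enough history yet
        avg = sum(self.samples) / len(self.samples)
        if avg > self.high:
            self.wait = self.cooldown
            return +1
        if avg < self.low:
            self.wait = self.cooldown
            return -1
        return 0
```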
This document proposes a framework for intelligently placing datacenters to optimize response time, availability, costs and emissions. It defines relevant parameters like costs, response time, consistency delay and availability. It formulates the placement problem as an optimization problem aiming to minimize costs while meeting constraints. The problem is solved using simulated annealing and linear programming. A tool is developed to automatically select datacenter locations based on this approach. Experimental results demonstrate millions of dollars can be saved through optimized placement.
This document proposes a new Ranking Chaos Optimization (RCO) algorithm to solve the dual scheduling problem of cloud services and computing resources (DS-CSCR) in private clouds. It introduces the DS-CSCR concept and models the characteristics of cloud services and computing resources. The RCO algorithm uses ranking selection, individual chaos, and dynamic heuristic operators. Experimental results show RCO has better searching ability, time complexity, and stability compared to other algorithms for solving DS-CSCR. Future work is needed to study additional quality of service properties and improve RCO for other optimization problems.
Energy proportionality is key to reducing the Total Cost of Ownership (TCO) of Warehouse Scale Computer (WSC) systems, yet it is difficult to achieve in practice. Typical WSC hardware usually does not follow this principle. Furthermore, critical services (e.g. billing) require all servers to remain up regardless of the current traffic intensity. These two issues make existing power management techniques ineffective at reducing energy use at WSC scale. We present the Hybrid Performance-aware Power-capping Orchestrator (HyPPO), a distributed Observe Decide Act (ODA) control loop for optimizing the energy proportionality of distributed containerized infrastructures. This first version of HyPPO uses Kubernetes resource metrics (e.g. milli-CPU consumption) to dynamically adjust node power consumption while respecting the Service Level Agreements (SLAs) defined by the containerized application owners.
Task Scheduling using Tabu Search algorithm in Cloud Computing Environment us... (AzarulIkhwan)
1. The document proposes using the Tabu Search algorithm for task scheduling in cloud computing environments, using the CloudSim simulator. It aims to maximize throughput and minimize turnaround time compared to traditional algorithms like FCFS.
2. The methodology section describes how the CloudSim simulator works and the components involved in task scheduling. It also gives an overview of how the Tabu Search algorithm guides the search process to avoid getting stuck at local optima (a generic sketch of this loop follows the list).
3. The expected result is that the Tabu Search algorithm will provide higher throughput and lower turnaround times for cloud tasks than FCFS, as Tabu Search is designed to escape local optima and find better solutions.
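As promised above, here is a generic tabu-search loop for task-to-VM assignment under a makespan objective; the move set, neighborhood size, and tenure are assumptions for illustration, not the paper's implementation.

```python
# Generic tabu search for assigning tasks to VMs, minimizing makespan.
# A sketch of the technique the document names, not its exact algorithm.
import random

def makespan(assign, task_len, vm_speed):
    load = [0.0] * len(vm_speed)
    for t, v in enumerate(assign):
        load[v] += task_len[t] / vm_speed[v]
    return max(load)

def tabu_search(task_len, vm_speed, iters=500, tenure=7):
    n, m = len(task_len), len(vm_speed)
    current = [random.randrange(m) for _ in range(n)]
    best, best_cost = current[:], makespan(current, task_len, vm_speed)
    tabu = {}                                  # move -> iteration until which it is tabu
    for it in range(iters):
        candidates = []
        for _ in range(20):                    # sample a neighborhood of single-task moves
            t, v = random.randrange(n), random.randrange(m)
            if current[t] == v:
                continue
            neighbor = current[:]
            neighbor[t] = v
            candidates.append((makespan(neighbor, task_len, vm_speed), t, v, neighbor))
        for cost, t, v, neighbor in sorted(candidates):
            # aspiration: a tabu move is allowed if it beats the global best
            if tabu.get((t, v), -1) < it or cost < best_cost:
                tabu[(t, current[t])] = it + tenure   # forbid reversing the move
                current = neighbor
                if cost < best_cost:
                    best, best_cost = neighbor[:], cost
                break
    return best, best_cost
```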
This document provides an overview of task scheduling algorithms for load balancing in cloud computing. It begins with introductions to cloud computing and load balancing. It then surveys several existing task scheduling algorithms, including Min-Min, Max-Min, Resource Awareness Scheduling Algorithm, QoS Guided Min-Min, and others. It discusses the goals, workings, results and problems of each algorithm. It identifies the need for an optimized task scheduling algorithm. It also discusses tools like CloudSim that can be used to simulate scheduling algorithms and evaluate performance.
CoolDC'16: Seeing into a Public Cloud: Monitoring the Massachusetts Open Cloud (Ata Turk)
The document describes a monitoring system for the Massachusetts Open Cloud (MOC) public cloud. It has four phases: 1) collecting data from different cloud layers, 2) storing and consolidating the data, 3) hosting services to access the data, and 4) implementing applications that use the monitoring data. Two case studies are presented that use the monitoring data to reduce datacenter power costs: peak shaving and participating in regulation service reserve programs. The monitoring system allows for more optimization and innovation in public clouds by exposing detailed performance and resource usage data.
Genetic Algorithm for task scheduling in Cloud Computing Environment (Swapnil Shahade)
This document proposes a modified genetic algorithm to schedule tasks in cloud computing environments. It begins with an introduction and background on cloud computing and task scheduling. It then describes the standard genetic algorithm approach and introduces the modified genetic algorithm which uses Longest Cloudlet to Fastest Processor and Smallest Cloudlet to Fastest Processor scheduling algorithms to generate the initial population. The implementation and results show that the modified genetic algorithm reduces makespan and cost compared to the standard genetic algorithm.
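To illustrate the seeding idea the summary describes, here is a hedged sketch of building a GA's initial population from LCFP and SCFP orderings; the round-robin assignment within each ordering and the population size are assumptions.

```python
# Sketch: seed a genetic algorithm's initial population with LCFP/SCFP
# heuristics, as described above; the GA's crossover/mutation operators are
# the usual ones and are omitted. Round-robin dispatch is an assumption.
import random

def lcfp(task_len, vm_speed):
    """Longest Cloudlet to Fastest Processor: big tasks go to fast VMs."""
    order = sorted(range(len(task_len)), key=lambda t: -task_len[t])
    fast = sorted(range(len(vm_speed)), key=lambda v: -vm_speed[v])
    assign = [0] * len(task_len)
    for i, t in enumerate(order):
        assign[t] = fast[i % len(fast)]        # round-robin, fastest VM first
    return assign

def scfp(task_len, vm_speed):
    """Smallest Cloudlet to Fastest Processor: small tasks go to fast VMs."""
    order = sorted(range(len(task_len)), key=lambda t: task_len[t])
    fast = sorted(range(len(vm_speed)), key=lambda v: -vm_speed[v])
    assign = [0] * len(task_len)
    for i, t in enumerate(order):
        assign[t] = fast[i % len(fast)]
    return assign

def initial_population(task_len, vm_speed, size=20):
    pop = [lcfp(task_len, vm_speed), scfp(task_len, vm_speed)]
    while len(pop) < size:                     # fill the rest randomly
        pop.append([random.randrange(len(vm_speed)) for _ in task_len])
    return pop
```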
Intelligent Placement of Datacenters for Internet Services (Maria Stylianou)
Course: Execution Environments for Distributed Computing 6th Presentation (10-15min):
Intelligent Placement of Datacenters for Internet Services
Source: http://ieeexplore.ieee.org/xpls/abs_all.jsp?arnumber=5961695
Task Scheduling methodology in cloud computing (Qutub-ud-Din)
This document outlines a proposed methodology for developing efficient task scheduling strategies in cloud computing. It begins with introductions to cloud computing and task scheduling. It then reviews several relevant existing task scheduling algorithms from literature that focus on objectives like reducing costs, minimizing completion time, and maximizing resource utilization. The problem statement indicates the goals are to reduce costs, minimize completion time, and maximize resource allocation. An overview of the proposed methodology's flow is then provided, followed by references.
This document discusses automatic energy-aware scheduling for distributed computing. It summarizes the Green500 list which ranks supercomputers by energy efficiency. Server virtualization can improve efficiency by consolidating workloads. Automatic scheduling that places applications dynamically based on power usage could address underutilization. Current solutions include VMturbo's intelligent workload management and using machine learning to model scheduling. The conclusion is that automatic energy-based scheduling should be more widely adopted to further improve supercomputer efficiency.
REVIEW PAPER on Scheduling in Cloud Computing (Jaya Gautam)
This document reviews scheduling algorithms for workflow applications in cloud computing. It discusses characteristics of cloud computing, deployment and service models, and the importance of scheduling in cloud computing. The document analyzes several scheduling algorithms proposed in literature that consider parameters like makespan, cost, load balancing, and priority. It finds that algorithms like Max-Min, Min-Min, and HEFT perform better than traditional algorithms in optimizing these parameters for workflow scheduling in cloud environments.
IRJET- Time and Resource Efficient Task Scheduling in Cloud Computing Environ... (IRJET Journal)
This document summarizes a research paper that proposes a Task Based Allocation (TBA) algorithm to efficiently schedule tasks in a cloud computing environment. The algorithm aims to minimize makespan (completion time of all tasks) and maximize resource utilization. It first generates an Expected Time to Complete (ETC) matrix that estimates the time each task will take on different virtual machines. It then sorts tasks by length and allocates each task to the VM that minimizes its completion time, updating the VM wait times. The algorithm is evaluated using CloudSim simulation and is shown to reduce makespan, execution time and costs compared to random and first-come, first-served scheduling approaches.
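The summary gives enough of the TBA algorithm's shape to sketch it; where it is silent (the sort direction, data layout), the choices below are assumptions.

```python
# Sketch of the Task Based Allocation idea summarized above: build an
# Expected Time to Complete (ETC) matrix, sort tasks by length, and greedily
# place each task on the VM with the earliest finish time. Illustrative only.
def tba_schedule(task_len, vm_mips):
    # ETC[t][v]: estimated runtime of task t on VM v
    etc = [[tl / mips for mips in vm_mips] for tl in task_len]
    ready = [0.0] * len(vm_mips)               # per-VM accumulated wait time
    plan = {}
    for t in sorted(range(len(task_len)), key=lambda t: -task_len[t]):
        # pick the VM that minimizes this task's completion time
        v = min(range(len(vm_mips)), key=lambda v: ready[v] + etc[t][v])
        plan[t] = v
        ready[v] += etc[t][v]                  # update the chosen VM's wait time
    return plan, max(ready)                    # assignment and resulting makespan

# Example: plan, ms = tba_schedule([4000, 1000, 2500], [500, 250])
```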
Reza Rahimi is a principal staff algorithm and software architect at Huawei who has done research on self-tuning and managing services. His PhD topic was on QoS-aware resource management in mobile cloud computing. He has since worked on topics like intelligent cloud management and optimization, mobile cloud computing, and low complexity secure code for big data in cloud storage.
dynamic resource allocation using virtual machines for cloud computing enviro... (Kumar Goud)
Abstract—Cloud computing allows business customers to scale their resource usage up and down based on need. We present a system that uses virtualization technology to allocate data center resources dynamically based on application demands and to support green computing by optimizing the number of servers in use. We introduce the concept of "skewness" to measure the unevenness in the multidimensional resource utilization of a server. By minimizing skewness, we can combine different types of workloads and improve the overall utilization of server resources. We develop a set of heuristics that effectively prevent overload in the system while saving energy. Many of the touted gains in the cloud model come from resource multiplexing through virtualization technology. Trace-driven simulation and experiment results demonstrate that our algorithm achieves good performance.
Index Terms—Cloud computing, resource management, virtualization, green computing.
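The skewness metric is only named above; a plausible reading, following the commonly cited definition (the root of the summed squared deviations of each resource's utilization ratio from their mean), is sketched below and should be treated as an illustration rather than the paper's exact formula.

```python
# Hedged sketch of the "skewness" idea: measure how uneven a server's
# utilization is across resource dimensions (CPU, memory, I/O, ...).
# Uses skewness = sqrt(sum_i (r_i / r_mean - 1)^2), a common definition;
# treat it as illustrative, not necessarily the paper's exact metric.
import math

def skewness(utilizations):
    mean = sum(utilizations) / len(utilizations)
    if mean == 0:
        return 0.0
    return math.sqrt(sum((u / mean - 1) ** 2 for u in utilizations))

# A balanced server scores low; a server hot on one dimension scores high:
# skewness([0.6, 0.6, 0.6]) == 0.0, skewness([0.9, 0.2, 0.2]) ~= 1.32
```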
UnaCloud is an opportunistic cloud infrastructure (IaaS) that provides on-demand computing capabilities using commodity desktops. Although UnaCloud tries to maximize the use of idle resources to deploy virtual machines on them, it does not use energy-efficient resource allocation algorithms. In this paper, we design and implement different energy-aware techniques to operate in an energy-efficient way while guaranteeing performance to the users. Performance tests with different algorithms and scenarios, using real trace workloads from UnaCloud, show how different policies can change the energy consumption patterns and reduce the energy consumption in opportunistic cloud infrastructures. The results show that some algorithms can reduce energy consumption by up to 30% beyond the savings already obtained by the opportunistic environment.
An optimized scientific workflow scheduling in cloud computing (DIGVIJAY SHINDE)
The document discusses optimizing scientific workflow scheduling in cloud computing. It begins with definitions of workflow and cloud computing: a workflow is a group of repeatable dependent tasks, while cloud computing provides applications and hardware resources over the Internet. There are three cloud service models: SaaS, PaaS, and IaaS. The document explores how to efficiently schedule workflows in the cloud to reduce makespan, cost, and energy consumption. It reviews different scheduling algorithms, such as FCFS and genetic algorithms, and discusses optimization objectives like time and cost. The document provides a literature review comparing various workflow scheduling methods and algorithms, and concludes by discussing open issues and directions for future work in optimizing workflow scheduling for cloud computing.
Scheduling of Heterogeneous Tasks in Cloud Computing using Multi Queue (MQ) A... (IRJET Journal)
This document proposes a Multi Queue (MQ) task scheduling algorithm for heterogeneous tasks in cloud computing. It aims to improve upon the Round Robin and Weighted Round Robin algorithms by overcoming their drawbacks. The MQ algorithm splits tasks and resources into separate queues based on size/length and speed. Small tasks are scheduled on slower resources and large tasks on faster resources. The document compares the performance of MQ to Round Robin and Weighted Round Robin algorithms based on makespan, average resource utilization, and load balancing level using CloudSim simulations. The results show that MQ scheduling performs better than the other algorithms in most cases in terms of these metrics.
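Below is a sketch of the multi-queue pairing the summary describes; the two-way split point and the round-robin dispatch within each queue are assumptions.

```python
# Sketch of the Multi Queue idea: partition tasks by length and VMs by speed,
# then pair small tasks with slower VMs and large tasks with faster VMs.
# The halfway split and round-robin dispatch are illustrative assumptions.
def mq_schedule(task_len, vm_speed):
    tasks = sorted(range(len(task_len)), key=lambda t: task_len[t])
    vms = sorted(range(len(vm_speed)), key=lambda v: vm_speed[v])
    half_t, half_v = len(tasks) // 2, len(vms) // 2
    small, large = tasks[:half_t], tasks[half_t:]
    slow, fast = vms[:half_v] or vms, vms[half_v:] or vms
    plan = {}
    for i, t in enumerate(small):              # small tasks -> slower VMs
        plan[t] = slow[i % len(slow)]
    for i, t in enumerate(large):              # large tasks -> faster VMs
        plan[t] = fast[i % len(fast)]
    return plan
```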
Application of selective algorithm for effective resource provisioning in clo... (ijccsa)
The modern-day continued demand for resource-hungry services and applications in the IT sector has led to the development of cloud computing. A cloud computing environment involves high-cost infrastructure on one hand and needs large-scale computational resources on the other. These resources need to be provisioned (allocated and scheduled) to end users in the most efficient manner so that the tremendous capabilities of the cloud are utilized effectively and efficiently. In this paper we discuss a selective algorithm for allocating cloud resources to end users on an on-demand basis. This algorithm is based on the min-min and max-min algorithms, two conventional task scheduling algorithms. The selective algorithm uses certain heuristics to select between the two so that the overall makespan of tasks on the machines is minimized. The tasks are scheduled on machines in either a space-shared or time-shared manner. We evaluate our provisioning heuristics using a cloud simulator called CloudSim, and we also compare our approach to the statistics obtained when resources were provisioned in a First-Come-First-Served (FCFS) manner. The experimental results show that the overall makespan of tasks on a given set of VMs decreases significantly in different scenarios.
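The abstract names min-min, max-min, and a heuristic switch between them; the sketch below implements both greedy rules and an illustrative switching heuristic (the paper's actual selection rule is not given here, so the median/mean test is purely an assumption).

```python
# Sketch of the selective idea: run min-min or max-min per batch, chosen by
# a simple heuristic on the task-length distribution (an assumption standing
# in for the paper's exact rule).
def greedy_schedule(task_len, vm_speed, pick_largest):
    ready = [0.0] * len(vm_speed)
    pending = set(range(len(task_len)))
    while pending:
        # best VM (earliest completion) for each pending task
        best = {t: min(range(len(vm_speed)),
                       key=lambda v: ready[v] + task_len[t] / vm_speed[v])
                for t in pending}
        key = lambda t: ready[best[t]] + task_len[t] / vm_speed[best[t]]
        # min-min picks the quickest-finishing task; max-min the slowest
        t = max(pending, key=key) if pick_largest else min(pending, key=key)
        ready[best[t]] += task_len[t] / vm_speed[best[t]]
        pending.remove(t)
    return max(ready)                          # makespan

def selective_makespan(task_len, vm_speed):
    mean = sum(task_len) / len(task_len)
    median = sorted(task_len)[len(task_len) // 2]
    pick_largest = median < mean               # long tail of large tasks -> max-min
    return greedy_schedule(task_len, vm_speed, pick_largest)
```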
This document discusses green computing and proposes a simulator for evaluating green scheduling algorithms. It begins with background on green computing and why it is important. It then outlines the key components of the simulator, including: a computation model using DAGs, an energy consumption model based on CPU throttling levels, and an abstraction for energy-aware schedulers. The document describes classes for modeling cores, throttling levels, and the overall simulation framework, which is designed to be extensible to different scheduling algorithms, core types, and energy models. The goal is to simulate and evaluate scheduling heuristics to minimize energy consumption while meeting performance targets.
Detecting Lateral Movement with a Compute-Intense Graph Kernel (Data Works MD)
Cybersecurity Analytics on a D-Wave Quantum Computer
Effective cybersecurity analysis requires frequent exploration of graphs of many types and sizes, whose computational cost can be overwhelming if the analytics are not carefully chosen. After briefly introducing the D-Wave quantum computing system, we describe an analytic for finding "lateral movement" in an enterprise network, i.e., an intruder or insider threat hopping from system to system to gain access to more information. This analytic depends on maximum independent set, an NP-hard graph kernel whose computational cost grows exponentially with the size of the graph and so has not been widely used in cyber analysis. The growing strength of D-Wave's quantum computers on such NP-hard problems will enable new analytics. We discuss practicalities of the current implementation and implications of this approach.
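Quantum annealers such as D-Wave's take problems in QUBO/Ising form; one standard QUBO encoding of maximum independent set (a common textbook choice, not necessarily the talk's exact formulation) is:

```latex
% Reward chosen vertices, penalize choosing both endpoints of any edge:
\min_{x \in \{0,1\}^{|V|}} \;\; -\sum_{i \in V} x_i \;+\; P \sum_{(i,j) \in E} x_i x_j ,
\qquad P > 1 .
```

With any penalty P > 1, every optimum is an independent set, and the minimum energy equals minus the maximum independent set size.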
Steve Reinhardt has built hardware/software systems that deliver new levels of performance usable via conceptually simple interfaces, including Cray Research’s T3E distributed-memory systems, ISC’s Star-P parallel-MATLAB software, and YarcData/Cray’s Urika graph-analytic systems. He now leads D-Wave’s efforts working with customers to map early applications to D-Wave systems.
Hybrid Task Scheduling Approach using Gravitational and ACO Search Algorithm (IRJET Journal)
The document proposes a hybrid task scheduling approach for cloud computing called ACGSA that combines ant colony optimization and gravitational search algorithms. It describes using the Cloudsim simulator to test the performance of ACGSA and comparing it to ant colony optimization. The results show that ACGSA achieves better performance than the basic ant colony approach on relevant parameters like task scheduling time and resource utilization.
Learning Software Performance Models for Dynamic and Uncertain Environments (Pooyan Jamshidi)
This document provides background on Pooyan Jamshidi's research related to learning software performance models for dynamic and uncertain environments. It summarizes his past work developing techniques for modeling and optimizing performance across different systems and environments, including using transfer learning to reuse performance data from related sources to build more accurate models with fewer measurements. It also outlines opportunities for using transfer learning to adapt performance models to new environments and systems.
Empirical studies have revealed that a significant amount of energy is lost unnecessarily in network architectures, protocols, routers and various other network devices. Thus there is a need for techniques to achieve green networking in computer architecture, which can lead to energy savings. Green networking is an emerging phenomenon in the computer industry because of its economic and environmental benefits. Saving energy leads to cost cutting and lower emission of greenhouse gases, which are one of the major threats to the environment. 'Greening', as the name suggests, is the process of constructing network architecture in such a way as to avoid unnecessary loss of power and energy due to its various components. It can be implemented using various techniques, four of which are covered in this review paper: Adaptive Link Rate (ALR), Dynamic Voltage and Frequency Scaling (DVFS), interface proxying, and energy-aware applications and software.
This document summarizes and compares various scheduling algorithms used in cloud computing environments. It begins with an introduction to cloud computing and the need for scheduling algorithms in cloud environments. It then describes several existing scheduling algorithms, including compromised-time-cost scheduling, particle swarm optimization-based heuristic, improved cost-based algorithm, resource-aware scheduling, innovative transaction intensive cost-constraint scheduling, scalable heterogeneous earliest-finish-time algorithm, and multiple QoS constrained scheduling strategy of multi-workflows. These algorithms aim to optimize metrics such as execution time, cost, deadline, load balancing, and quality of service. The document concludes by comparing the different scheduling strategies.
Adaptive Digital Filter Design for Linear Noise Cancellation Using Neural Net... (iosrjce)
This document discusses using neural networks for adaptive digital filter design to cancel linear noise. It begins by introducing adaptive filters and their use in noise cancellation applications. An adaptive noise cancellation system structure is shown, using an adaptive filter to estimate noise from a reference input and subtract it from the noisy primary input. Neural networks can be used for adaptive filtering, with the exact radial basis function (RBF) network presented as a suitable architecture. Simulation results show that the RBF network achieves much lower error than a linear layer function by producing an output signal close to the desired target. The paper concludes that the RBF network is well suited to this application as it minimizes the error between the output and target signals, effectively canceling linear noise.
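The paper's filter is an RBF network; as a simpler, well-known instance of the same cancellation structure (estimate the noise from the reference input, subtract it from the primary), here is a least-mean-squares (LMS) sketch, plainly a substitution for illustration rather than the paper's method.

```python
# LMS adaptive noise canceller: the error signal (primary minus the filter's
# noise estimate) is both the cleaned output and the adaptation signal.
import numpy as np

def lms_cancel(primary, reference, taps=8, mu=0.01):
    primary = np.asarray(primary, dtype=float)
    reference = np.asarray(reference, dtype=float)
    w = np.zeros(taps)                         # adaptive filter weights
    out = np.zeros(len(primary))               # error signal = cleaned output
    for n in range(taps, len(primary)):
        x = reference[n - taps:n][::-1]        # most recent reference samples
        y = w @ x                              # filter's estimate of the noise
        out[n] = primary[n] - y                # subtract estimated noise
        w += 2 * mu * out[n] * x               # LMS weight update
    return out

# Example: clean = lms_cancel(signal_plus_noise, noise_source)
```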
Quality of Service based Task Scheduling Algorithms in Cloud Computing (IJECEIAES)
In cloud computing, resources are treated as services, so efficient utilization of resources is achieved through task scheduling and load balancing. Quality of service is an important factor in measuring the trustworthiness of the cloud, and using quality of service in task scheduling helps address security concerns in cloud computing. This paper studies quality-of-service-based task scheduling algorithms and the parameters used for scheduling. By comparing the results, the efficiency of each algorithm is measured and its limitations are given. The efficiency of quality-of-service-based task scheduling algorithms can be improved by considering the arrival time of a task, the time taken by the task to execute on the resource, and the communication cost.
REAL-TIME ADAPTIVE ENERGY-SCHEDULING ALGORITHM FOR VIRTUALIZED CLOUD COMPUTING (ijdpsjournal)
Cloud computing has become an ideal computing paradigm for scientific and commercial applications. The increased availability of cloud models and allied developing models creates an easier cloud computing environment. Energy consumption and effective energy management are two important challenges in virtualized computing platforms. Energy consumption can be minimized by allocating computationally intensive tasks to a resource at a suitable frequency. An optimal Dynamic Voltage and Frequency Scaling (DVFS) based strategy of task allocation can minimize the overall consumption of energy and meet the required QoS. However, such strategies do not control the internal and external switching of server frequencies, which degrades performance. In this paper, we propose the Real-Time Adaptive Energy-Scheduling (RTAES) algorithm, which exploits the reconfiguration capability of Cloud Computing Virtualized Data Centers (CCVDCs) for computationally intensive applications. The RTAES algorithm minimizes the consumption of energy and time during computation, reconfiguration and communication. Our proposed model confirms its effectiveness in implementation, scalability, power consumption and execution time with respect to other existing approaches.
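The energy argument behind DVFS approaches like this one rests on the textbook CMOS dynamic-power relation (a general model, not a formula from the paper):

```latex
% With supply voltage V roughly tracking clock frequency f, and W the
% amount of work (cycles) a task needs:
P_{\mathrm{dyn}} = C_{\mathrm{eff}}\, V^{2} f ,
\qquad V \propto f \;\Rightarrow\; P_{\mathrm{dyn}} \propto f^{3} ,
\qquad E = P_{\mathrm{dyn}} \cdot \frac{W}{f} \;\propto\; f^{2} W .
```

So running a task at the lowest frequency that still meets its deadline reduces the energy per unit of work roughly quadratically, which is why frequency selection is the lever these schedulers pull.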
The document discusses underdevelopment in Africa, current means of survival, and goals for 2030, including targets for investment, technology, banking development and equality, with the aim of reducing forced labor.
The theme of the Finnish Taido Federation's Dan camp in 2012 was reaction ability. This is part of the slides from Saturday's lecture, together with a collection of technique durations measured by hand timing.
Architecting a Cloud-Scale Identity Fabric (Arinto Murdopo)
This document discusses architecting an identity fabric for cloud-scale computing. It argues that a new approach is needed for identity management in cloud environments due to issues around cross-cutting nature, organizational impacts, and lack of management skills. The key components of a cloud-scale identity fabric are discussed, including access control, authentication, user management, auditing, and meeting requirements of cloud platforms. Building identity as a distributed fabric can significantly reduce management costs and complexity compared to traditional centralized models. Identity must integrate and abstract to provide infrastructure as a service for applications and users in cloud environments.
Arviointi ja palaute (Assessment and Feedback) is a level II course in the Finnish Taido Federation's coach and instructor training system, covering assessment methods and the psychology of giving feedback in both theory and practice.
Kelvin Glen has over 22 years of experience in fundraising and corporate social investment in South Africa. He has managed CSR programs for many corporations and non-profits. Kelvin is a social worker by trade and holds an MBA. He currently serves on the boards of several non-profits and runs two social enterprises. The document goes on to define corporate social investment and the responsibilities of businesses to contribute to economic and social development. It describes the Netcare Foundation, which manages Netcare's corporate social investment programs for the benefit of South African citizens. Kelvin encourages Netcare to strengthen employee volunteer programs and leverage its brand to have greater social impact.
The document discusses the benefits of exercise for mental health. Regular physical activity can help reduce anxiety and depression and improve mood and cognitive functioning. Exercise causes chemical changes in the brain that may help protect against mental illness and improve symptoms.
The counting system for small animals in Japanese (CheyanneStotlar)
This is a PowerPoint on the counting system the Japanese use for counting small animals; in this case it describes the counting system for small fish.
The document provides links to three YouTube videos about natural gas drilling and how it can contaminate drinking water sources. The titles and URLs of the videos are listed but no other context or descriptions are provided.
An Integer Programming Representation for Data Center Power-Aware Management ... (Arinto Murdopo)
This document describes research on scheduling jobs in data center grids. It presents an integer linear programming (ILP) model to optimize revenue, power usage, and quality of service. It also describes a heuristic algorithm as an alternative to solve larger problems. The authors test the ILP and heuristic on generated data sets of various sizes. Results show the heuristic achieves near-optimal solutions faster than the ILP for large problems. Lower values of a parameter alpha and random node selection produced the best heuristic results.
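The summary does not reproduce the model itself; the block below gives a generic shape such a placement ILP often takes (binary job-to-node variables, a revenue-minus-power objective, capacity constraints), offered as an assumption about structure rather than the authors' formulation.

```latex
% x_{jn} = 1 iff job j runs on node n; r_j is job revenue, p_j its power
% draw, c_n the node's energy price, u_j its resource demand, U_n the
% node's capacity. A generic sketch, not the paper's exact model.
\max_{x \in \{0,1\}^{J \times N}}
  \sum_{j,n} r_j \, x_{jn} \;-\; \sum_{n} c_n \sum_{j} p_j \, x_{jn}
\quad \text{s.t.} \quad
\sum_{n} x_{jn} \le 1 \;\; \forall j ,
\qquad
\sum_{j} u_j \, x_{jn} \le U_n \;\; \forall n .
```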
The document is a lesson on the eight parts of speech taught by Miss Lawson to 9th grade students. It defines each part of speech and provides examples to identify them, including nouns, pronouns, verbs, adjectives, adverbs, prepositions, conjunctions, and interjections. Students are instructed to identify the parts of speech in sentences for each category and email their answers to the teacher for a grade.
This report discusses quantum cryptography and potential attacks on quantum key distribution systems. It provides background on quantum cryptography and describes the BB84 quantum key distribution protocol. It then analyzes several potential attacks on quantum key distribution systems, including photon number attacks, spectral attacks, and random number attacks that are relatively easy to solve. It focuses on the more challenging "faked-state" attack, providing details on how an attacker could implement this attack in practice using superconducting nanowire single-photon detectors. The report evaluates the security of quantum key distribution against these attacks.
Slides for sharing session with PPI Stockholm. The topic is about Distributed Computing, covering what it is, why it is important in our daily life and how we can utilize it in Indonesia.
The document discusses intelligent placement of datacenters for internet services. It aims to minimize costs and environmental impacts by developing a framework to model datacenter characteristics, costs, incentives and select optimal locations. The approach uses simulated annealing combined with linear programming to evaluate solutions and optimize total costs, subject to constraints like response time and availability. Evaluating various locations shows smart placement can save millions. Future work includes testing with real service data and incentives from other regions.
The document discusses a framework for intelligently placing datacenters to minimize costs while meeting performance objectives. It considers factors like location costs, power availability, latency, and environmental impact. The framework models it as an optimization problem. It evaluates solutions like heuristics and simulated annealing and finds heuristics provide good results within a few days. Case studies show tradeoffs between latency, availability, consistency and costs. Intelligently placing datacenters can lower costs significantly.
The document discusses factors involved in optimally placing datacenters for internet services. It introduces parameters like cost, response time, consistency delay, and availability that must be considered. Several frameworks are proposed, including linear programming models and heuristic algorithms, to determine the best locations and sizes for datacenters given constraints. The placement tool developed allows users to specify requirements and obtain a solution. Tradeoffs between factors like latency, availability, and green initiatives are also explored.
40 Powers of 10 - Simulating the Universe with the DiRAC HPC Facility (inside-BigData.com)
In this deck from the Swiss HPC Conference, Mark Wilkinson presents: 40 Powers of 10 - Simulating the Universe with the DiRAC HPC Facility.
"DiRAC is the integrated supercomputing facility for theoretical modeling and HPC-based research in particle physics, and astrophysics, cosmology, and nuclear physics, all areas in which the UK is world-leading. DiRAC provides a variety of compute resources, matching machine architecture to the algorithm design and requirements of the research problems to be solved. As a single federated Facility, DiRAC allows more effective and efficient use of computing resources, supporting the delivery of the science programs across the STFC research communities. It provides a common training and consultation framework and, crucially, provides critical mass and a coordinating structure for both small- and large-scale cross-discipline science projects, the technical support needed to run and develop a distributed HPC service, and a pool of expertise to support knowledge transfer and industrial partnership projects. The on-going development and sharing of best-practice for the delivery of productive, national HPC services with DiRAC enables STFC researchers to produce world-leading science across the entire STFC science theory program."
Watch the video: https://wp.me/p3RLHQ-k94
Learn more: https://dirac.ac.uk/
and
http://hpcadvisorycouncil.com/events/2019/swiss-workshop/agenda.php
Sign up for our insideHPC Newsletter: http://insidehpc.com/newsletter
This document presents a framework for intelligently placing datacenters for internet services. It discusses parameters like costs, response time, and emissions that are considered. The framework formulates the placement problem by taking inputs like user numbers, servers, and existing datacenters. It evaluates solutions like linear programming and simulated annealing to find an optimal placement configuration with minimum cost. A placement tool is developed that considers location-dependent data. The tool is used to evaluate placements and tradeoffs around latency, availability, consistency, emissions and energy efficiency. The document concludes that the framework and tool can automatically place datacenters by optimizing multiple objectives and parameters.
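Several of the placement summaries above name simulated annealing as the solver; here is a minimal annealing skeleton for choosing k datacenter sites, with a placeholder cost function and move set (both assumptions, since the real cost model spans latency, availability, and emissions).

```python
# Minimal simulated annealing for datacenter placement: pick k of the
# candidate sites minimizing cost(placement). Cost function, move set,
# and cooling schedule are illustrative placeholders.
import math, random

def anneal(sites, k, cost, iters=10000, t0=1.0, alpha=0.999):
    current = random.sample(sites, k)          # assumes k < len(sites)
    cur_c = cost(current)
    best, best_c, t = current[:], cur_c, t0
    for _ in range(iters):
        # move: swap one chosen site for one unchosen site
        neighbor = current[:]
        neighbor[random.randrange(k)] = random.choice(
            [s for s in sites if s not in current])
        c = cost(neighbor)
        # accept improvements always, worse solutions with Boltzmann probability
        if c < cur_c or random.random() < math.exp((cur_c - c) / t):
            current, cur_c = neighbor, c
            if c < best_c:
                best, best_c = current[:], c
        t *= alpha                             # cool down
    return best, best_c
```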
1) The document discusses quality of service (QoS)-aware data replication for data-intensive applications in cloud computing systems. It aims to minimize data replication cost and number of QoS violated replicas.
2) It presents a mathematical model and algorithm to optimally place QoS-satisfied and QoS-violated data replicas. The algorithm uses minimum-cost maximum flow to obtain the optimal placement (a small illustration follows this list).
3) The algorithm takes as input a set of requested nodes and outputs the optimal placement for QoS-satisfied and QoS-violated replicas by modeling the problem as a network flow graph and applying existing polynomial-time algorithms.
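As noted in item 2, the placement is computed via minimum-cost maximum flow; below is a tiny networkx illustration of one plausible encoding (source to requests to candidate nodes to sink, with edge weights as placement costs), not the paper's exact construction.

```python
# Replica placement as min-cost max-flow: each request must be routed to one
# candidate node; node capacities bound replicas; edge weights carry cost.
# The graph encoding is an assumption about one plausible model.
import networkx as nx

G = nx.DiGraph()
requests = ["r1", "r2", "r3"]
nodes = {"n1": (2, 1), "n2": (1, 3)}           # node -> (replica capacity, cost)

for r in requests:
    G.add_edge("s", r, capacity=1, weight=0)   # each request needs one replica
for n, (cap, cost) in nodes.items():
    for r in requests:
        G.add_edge(r, n, capacity=1, weight=cost)
    G.add_edge(n, "t", capacity=cap, weight=0) # node capacity limits replicas

flow = nx.max_flow_min_cost(G, "s", "t")       # optimal assignment in poly time
placement = {r: n for r in requests for n, f in flow[r].items() if f}
print(placement)                               # e.g. {'r1': 'n1', 'r2': 'n1', 'r3': 'n2'}
```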
Grid optical network service architecture for data intensive applications (Tal Lavian Ph.D.)
Integrated SW system provides the "glue": a dynamic optical network as a fundamental Grid service in data-intensive Grid applications, to be scheduled, managed and coordinated to support collaborative operations.
From super-computer to super-network: in the past, computer processors were the fastest part and peripherals were the bottleneck; in the future, optical networks will be the fastest part, and computers, processors, storage, visualization and instrumentation will be the slower "peripherals".
eScience cyber-infrastructure focuses on computation, storage, data, analysis and workflow. The network is vital for better eScience.
1) Scaling up data center networks (DCNs) requires new switching technologies as hyperscale DCNs continue growing dramatically in size and traffic.
2) Optical switching technologies such as optical time-slot switching show potential for deployments in hybrid optical/electrical DCNs by providing higher switching capacity and bandwidth than electrical switches alone.
3) The University of Bristol has explored optical time-slot switching and its scheduling algorithms, demonstrating SDN control of prototype optical switches for DCN virtualization.
The document summarizes research done at the Barcelona Supercomputing Center on evaluating Hadoop platforms as a service (PaaS) compared to infrastructure as a service (IaaS). Key findings include:
- Provider (Azure HDInsight, Rackspace CBD, etc.) did not significantly impact performance of wordcount and terasort benchmarks.
- Data size and number of datanodes were more important factors, with diminishing returns on performance from adding more nodes.
- PaaS can save on maintenance costs compared to IaaS but may be more expensive depending on workload and VM size needed. Tuning may still be required with PaaS.
An Architecture for Data Intensive Service Enabled by Next Generation Optical... (Tal Lavian Ph.D.)
DWDM-RAM: an architecture for data-intensive Grids enabled by next-generation dynamic optical networks, incorporating new methods for lightpath provisioning. It is designed to meet the networking challenges of extremely large scale Grid applications; traditional network infrastructure cannot meet these demands, especially the requirements of intensive data flows. DWDM-RAM components include: data management services, intelligent middleware, dynamic lightpath provisioning, state-of-the-art photonic technologies, and a wide-area photonic testbed implementation.
This document discusses using SDN and OpenFlow to configure Data Center Bridging (DCB) primitives for improved quality of service. It summarizes the key DCB primitives, including Priority Flow Control (PFC), Enhanced Transmission Selection (ETS), and Congestion Notification (CN). It then outlines some problems with the traditional static configuration of DCB. The document proposes that SDN can centrally program DCB primitives across multiple switches, eliminating the need for the DCB Exchange Protocol (DCBX) and improving flexibility, scalability, and interoperability. A demo is shown applying different DCB configuration profiles using SDN to test bandwidth allocation.
This is the 2nd defense of my Ph.D. double degree.
More details - https://kkpradeeban.blogspot.com/2019/08/my-phd-defense-software-defined-systems.html
This document summarizes a presentation on analyzing network traffic characteristics of data centers. Some key findings include:
- 75% of traffic stays within a single rack, showing applications are not uniformly placed;
- Half of all packets are small (<200B), indicating keep-alive traffic is important for applications;
- At most 25% of core network links are highly utilized, suggesting better routing could reduce utilization;
- Assumptions about needing more bandwidth between network switches (bisection) or that traffic is unpredictable may not always hold true.
The document discusses network design concepts for building a resilient network. It emphasizes the importance of considering redundancy at multiple levels, from the physical infrastructure to network protocols. Well-designed networks are modular, have clearly defined functional layers, and incorporate redundancy through techniques like load balancing and diverse circuit paths. Hierarchical network designs with logical areas can also improve convergence times during failures.
The document discusses Mininet, an open source network emulator used for testing SDN ideas. It provides an overview of Mininet 1.0 and its functional fidelity before describing plans for Mininet 2.0 to improve performance fidelity through techniques like resource isolation, network invariants, and reproducible experiments. The document uses the example of DCTCP traffic to demonstrate how network invariants can validate emulator results.
Health Canada consolidated servers across data centers in the National Capital Region using virtualization technologies. This reduced physical server space by 45% and energy consumption by 60%, saving over $2.4 million. The project also virtualized regional servers. Future plans include further expanding virtualization to regional sites to reduce physical servers from 171 to 48, lowering costs and administration. The project created a shared test environment and platform to support further IT improvements.
We demonstrated how, at Criteo, we introduced on our Mesos clusters:
* network isolation between our containers
* a network bandwidth custom resource, patching all our frameworks (Marathon and Aurora)
This talk has been presented at MesosCon18 in SF.
Network-aware Data Management for Large Scale Distributed Applications, IBM R... - balmanme
The document discusses network-aware data management for large-scale distributed applications. It outlines a presentation covering the performance of VSAN and VVOL storage in virtualized environments, the PetaShare distributed storage system and the Stork data scheduler, data streaming in high-bandwidth networks, and related topics such as network reservations and scheduling. The presenter's background in data transfer scheduling, distributed storage, and high-performance computing networks is also briefly summarized.
Improving Efficiency of Machine Learning Algorithms using HPCC Systems - HPCC Systems
1) The document discusses improving the efficiency of machine learning algorithms using the HPCC Systems platform through parallelization.
2) It describes the HPCC Systems architecture and its advantages for distributed machine learning.
3) A parallel DBSCAN algorithm is implemented on the HPCC platform which shows improved performance over the serial algorithm, with execution times decreasing as more nodes are used.
Virtualization in 4-4 1-4 Data Center Network - Ankita Mahajan
4-4 1-4 delivers strong performance guarantees in a traditional (non-virtualized) setting, due to location-based static IP address allocation to all network elements.
Distributed Decision Tree Learning for Mining Big Data Streams - Arinto Murdopo
This document presents a distributed decision tree learning algorithm called Vertical Hoeffding Tree (VHT) for mining big data streams. It summarizes the contributions of the master's thesis, which include: (1) Developing the SAMOA framework for distributed streaming machine learning, (2) Integrating SAMOA with the Storm distributed stream processing engine, and (3) Implementing the VHT algorithm to improve scalability over the standard Hoeffding Tree algorithm when dealing with high-dimensional data streams. The evaluation shows that VHT achieves similar accuracy to Hoeffding Tree but higher throughput, especially on datasets with many attributes.
Distributed Decision Tree Learning for Mining Big Data Streams - Arinto Murdopo
The document presents a master's thesis that proposes and develops Scalable Advanced Massive Online Analysis (SAMOA), a distributed streaming machine learning framework. SAMOA aims to address the big data challenges of volume, velocity, and variety by providing flexible APIs for developing machine learning algorithms, and integrating with Storm, a stream processing engine, to inherit its scalability. The thesis describes SAMOA's modular components, its integration with Storm, and evaluates a distributed online classification algorithm implemented on SAMOA and Storm to demonstrate its features.
Next Generation Hadoop: High Availability for YARN - Arinto Murdopo
The document proposes a new architecture for YARN to solve its availability limitation of single-point-of-failure in the resource manager. The key aspects of the proposed architecture are:
1. It utilizes a stateless failure model where all necessary states and information used by the resource manager are stored in a persistent storage.
2. MySQL Cluster (NDB) is proposed as the storage technology due to its high availability, linear scalability, and high throughput of up to 1.8 million writes per second.
3. A proof-of-concept implementation was done using NDB to store application states and their corresponding attempts. Evaluations showed the architecture is able to increase YARN's availability and NDB
Project presentation for High Availability in YARN project. We propose to use MySQL Cluster (NDB) to tackle High Availability issue in YARN. We also developed benchmark framework to investigate whether MySQL Cluster (NDB) is better than Apache's proposed storage (ZooKeeper and HDFS)
Full project report will be uploaded after I finish it.
An Integer Programming Representation for Data Center Power-Aware Management ... - Arinto Murdopo
This document describes an integer linear programming (ILP) model and heuristic approach for scheduling jobs in a data center while maximizing benefits related to power costs, revenue, migration costs, and quality of service. The ILP formulation is implemented in CPLEX and a greedy randomized adaptive search procedure (GRASP) metaheuristic is designed to find near-optimal solutions more efficiently. Two variants of the GRASP heuristic are tested on generated problem instances and results are compared to the optimal ILP solutions in terms of solution quality and runtime.
Quantum Cryptography and Possible Attacks (slides) - Arinto Murdopo
Quantum cryptography uses principles of quantum mechanics to securely distribute encryption keys. The BB84 protocol is a seminal quantum key distribution protocol that works as follows: Alice sends Bob polarized photons encoded with random key bits and basis choices. Bob measures the photons in randomly chosen bases. They communicate to discard mismatched bases, leaving a shared raw key. They test for errors introduced by eavesdropping and apply privacy amplification to distill a final secure key. However, practical quantum cryptography is vulnerable to attacks like the faked-state attack, where an eavesdropper, Eve, blinds Bob's detectors and forces his measurement outcomes to match her own. If successful, Eve can learn almost all of the raw key without introducing errors.
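The basis-sifting step in that description is easy to see in code. Below is a minimal, idealized BB84 sketch, with no eavesdropper and no channel noise; the key length and basis symbols are illustrative:

```python
# Minimal BB84 sketch: encoding, measurement, and basis sifting only.
import random

n = 32
alice_bits  = [random.randint(0, 1) for _ in range(n)]
alice_bases = [random.choice("+x") for _ in range(n)]  # rectilinear / diagonal
bob_bases   = [random.choice("+x") for _ in range(n)]

# Bob reads the correct bit when his basis matches Alice's;
# otherwise his outcome is random.
bob_bits = [bit if ab == bb else random.randint(0, 1)
            for bit, ab, bb in zip(alice_bits, alice_bases, bob_bases)]

# Public discussion: both sides discard positions where bases differ.
kept = [i for i in range(n) if alice_bases[i] == bob_bases[i]]
key_alice = [alice_bits[i] for i in kept]
key_bob   = [bob_bits[i] for i in kept]

assert key_alice == key_bob  # no Eve, no noise -> sifted keys agree
print(f"sifted key: {len(key_alice)} of {n} bits kept")
```

On average half the positions survive sifting; the error testing and privacy amplification the summary mentions would then run on the sifted key.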
Parallelization of Smith-Waterman Algorithm using MPI - Arinto Murdopo
The document describes parallelizing the Smith-Waterman algorithm for sequence alignment using MPI. It explores different parallelization techniques including blocking and blocking with interleaving. It presents solutions using scatter-gather and send-receive approaches. Performance is evaluated on an Altix cluster for various problem sizes, numbers of processors, and blocking/interleaving parameters to determine optimal configuration. Code was modified to improve load balancing and further optimize performance.
The document describes Megastore, a scalable and highly available storage system for interactive services. Megastore provides ACID semantics within entity groups and uses a modified Paxos algorithm for synchronous replication across datacenters. It scales through data partitioning and ensures availability by replicating write-ahead logs within entity groups. The system aims to combine the usability of relational databases with the scalability of NoSQL systems.
This document analyzes the scalability of Apache Flume by conducting experiments with different Flume configurations and load levels. Two experiment setups are used: one with a one-to-one relationship between Flume nodes and load generators, and another using a cascading setup to aggregate events from two scale nodes into a collector node. The results show that doubling the channel capacity does not necessarily double the maximum event rate, and that a cascading setup can improve scalability over a non-cascading setup.
Large Scale Distributed Storage Systems in Volunteer Computing (slides) - Arinto Murdopo
This document discusses using decentralized storage systems (DSS) with volunteer computing (VC). It outlines the problem, defines VC and DSS, and reviews several DSS approaches. Key criteria for DSS like availability, scalability, and consistency are examined. The document then analyzes characteristics of DSS that could integrate with VC, challenges of providing incentives, and security issues. It concludes that balancing functionality with complexity is important when integrating DSS with VC systems.
Large-Scale Decentralized Storage Systems for Volunteer Computing Systems - Arinto Murdopo
This document provides a survey of existing decentralized storage systems and their suitability for use in volunteer computing systems. It discusses several decentralized storage systems including Farsite, Ivy, Overnet/Kademlia, PAST, PASTIS, Voldemort, OceanStore, Glacier, Total Recall, Cassandra, Riak, Dynamo, and Attic. It evaluates each system based on availability, scalability, eventual consistency, performance, and security. The document proposes that the most suitable state-of-the-art decentralized storage system for volunteer computing would combine the best properties of these existing systems.
The document discusses the rise of network virtualization and software-defined networking (SDN). It describes early research projects on campus networks like Ethane in 2007 that led to the OpenFlow specification in 2008. This allowed experiments with new network protocols. The document outlines the founding of Nicira in 2007 and the company's product launch of the Network Virtualization Platform in 2012, which used SDN to virtualize networks.
Distributed Storage System for Volunteer Computing - Arinto Murdopo
This document presents a preliminary proposal for a distributed storage system for volunteer computing. It discusses using volunteer computing resources to create a distributed storage system with no single point of failure that provides data availability and integrity. Some challenges include that distributed storage systems for volunteer computing do not currently exist. The proposal reviews existing peer-to-peer distributed storage systems and surveys which may be suitable. The objectives are to evaluate systems based on security, availability, and reliability, and possibly experiment with suitable systems in a distributed storage testbed.
This document outlines Apache Flume, a distributed system for collecting large amounts of log data from various sources and transporting it to a centralized data store such as Hadoop. It describes the key components of Flume including agents, sources, sinks and flows. It explains how Flume provides reliable, scalable, extensible and manageable log aggregation capabilities through its node-based architecture and horizontal scalability. An example use case of using Flume for near real-time log aggregation is also briefly mentioned.
File sharing networks can expose sensitive personal and financial information due to misconfigured sharing settings, confusing interfaces, and malware distribution. An experiment sharing documents with credit card and calling card details showed the information was quickly downloaded and redistributed, with the funds drained within a week. A separate experiment sharing mock business documents saw them downloaded 12 times within a week, illustrating the risk of unintended secondary disclosures on these networks. The growing usage and "set and forget" tendencies of some users increases the risk of privacy leaks and losses over such file sharing networks.
Intelligent Placement of Datacenter for Internet Services
1. EEDC 34330 - Execution Environments for Distributed Computing
Intelligent Placement of Datacenter for Internet Services
Master in Computer Architecture, Networks and Systems - CANS
Homework number: 6
by Arinto Murdopo – arinto@gmail.com
2. Problem Statement
Where should the data center be placed? (the slide asks "where?" in many languages)
Key concerns: response time, availability, cost, environmental concerns.
3. Proposed Solution
- Framework
- Solve optimization problem
- Characterization
Produce a tool to compare efficiency and accuracy.
4. Framework
Efficiently select data center locations: minimize cost, subject to response time, consistency, and availability constraints.
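Read as an optimization problem, the slide suggests a formulation along the following lines; the notation here is assumed, with the constraint bounds reusing the tool inputs (MAXLAT, MAXDELAY, MINAVAIL, MaxS) listed on the Placement Tool slide:

\[
\min_{x}\ \mathrm{Cost}(x)
\quad \text{s.t.} \quad
\mathrm{RT}(x) \le \mathrm{MAXLAT},\quad
\mathrm{CD}(x) \le \mathrm{MAXDELAY},\quad
\mathrm{Avail}(x) \ge \mathrm{MINAVAIL},
\]

where \(x\) encodes the chosen locations and per-site server counts (totaling at most MaxS), \(\mathrm{RT}\) is response time, \(\mathrm{CD}\) is consistency delay, and \(\mathrm{Avail}\) is availability.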
5. Solve Optimization Problem
Problem formulation, then six approaches (see the annealing sketch below):
• Simple Linear Programming (LP0)
• Pre-set Linear Programming (LP1)
• Brute force (Brute)
• Heuristic based on LP (Heuristic)
• Simulated Annealing plus LP1 (SA+LP1)
• Optimized SA + LP1 (OSA+LP1)
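As a rough illustration of the SA+LP1 idea, the sketch below anneals over candidate site sets while a stubbed `lp_assign` stands in for the LP that would assign servers to the chosen sites; the site names, cost model, and cooling schedule are all illustrative, not the paper's implementation:

```python
# Sketch: simulated annealing over candidate datacenter sites.
import math
import random

SITES = ["Seattle", "St. Louis", "Oklahoma City", "Denver", "Atlanta"]

def lp_assign(chosen):
    """Stand-in for the LP step: assign servers to `chosen` and return the
    cost. Toy model: a fixed cost per open site plus a shrinking penalty."""
    if not chosen:
        return float("inf")
    return 10.0 * len(chosen) + 100.0 / len(chosen)

def neighbor(chosen):
    """Move to a neighboring placement by toggling one random site."""
    nxt = set(chosen)
    nxt.symmetric_difference_update({random.choice(SITES)})
    return nxt

def anneal(steps=5000, temp=10.0, alpha=0.999):
    cur = {random.choice(SITES)}
    cur_cost = lp_assign(cur)
    best, best_cost = set(cur), cur_cost
    for _ in range(steps):
        cand = neighbor(cur)
        cand_cost = lp_assign(cand)
        # Always accept improvements; accept worse moves with a
        # temperature-dependent (Boltzmann) probability.
        if cand_cost < cur_cost or random.random() < math.exp((cur_cost - cand_cost) / temp):
            cur, cur_cost = cand, cand_cost
            if cand_cost < best_cost:
                best, best_cost = set(cand), cand_cost
        temp *= alpha   # geometric cooling
    return best, best_cost

print(anneal())
```

The pattern matters more than the numbers: SA explores the combinatorial choice of sites, and the inner LP prices each choice, which is the division of labor the SA+LP1 and OSA+LP1 variants exploit.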
6. Placement Tool
Available inputs (bundled into a configuration sketch below):
- MaxS (maximum number of servers)
- 1/ratioServerUser
- MAXLAT (maximum latency)
- MAXDELAY (maximum consistency delay)
- MINAVAIL (minimum availability)
- area of interest
- granularity
- existing data centers
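For illustration, these inputs could be carried in one configuration object; the field names follow the slide, the default values follow the sample run on the Sample Output slide, and the structure itself is an assumption:

```python
# Sketch: bundling the placement tool's inputs into one config object.
from dataclasses import dataclass, field

@dataclass
class PlacementConfig:
    max_servers: int = 60_000                # MaxS
    inv_ratio_server_user: float = 1.0       # 1/ratioServerUser (illustrative)
    max_latency_ms: float = 60.0             # MAXLAT
    max_consistency_delay_ms: float = 85.0   # MAXDELAY
    min_availability: float = 0.99999        # MINAVAIL ("5 nines")
    area_of_interest: str = "US"             # region to search
    granularity: float = 1.0                 # search-grid granularity (assumed unit)
    existing_datacenters: list = field(default_factory=list)

print(PlacementConfig())
```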
7. Placement Tool
Location-dependent data:
- Network backbones: latency data from backbone ISPs
- Power plants, transmission lines, and CO2 emissions: obtained from the DOE
- Electricity, land, water, and temperature: also obtained from the DOE
- Missing data are filled in from neighboring locations
8. Placement Tool
Datacenter characteristics:
- Cooling: CRACs and water chillers
- Connection: $500K per mile of transmission line and $480K per mile of fiber, amortized over 12 years
- Building: cost depends on the maximum power
- Land: 6K square feet per megawatt
9. Placement Tool
Datacenter characteristics (continued; see the cost sketch below):
- Water: 24K gallons of water per MW per day
- Servers: each server costs $2,000 (4-year amortization); each interconnect switch costs $20K (4-year amortization)
- Staff: $0.05 per watt per month, plus a $100K per year salary per 1K servers
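To make the amortization arithmetic concrete, here is a back-of-the-envelope monthly cost using only the figures from these two slides; the decomposition into these components and the example sizes are assumptions:

```python
# Sketch: monthly site cost from the slides' unit costs (toy example).
MONTHS_PER_YEAR = 12

def monthly_cost(servers, switches, fiber_miles, transmission_miles, watts):
    # Connection: $500K/mile transmission line, $480K/mile fiber,
    # both amortized over 12 years.
    connection = (500_000 * transmission_miles
                  + 480_000 * fiber_miles) / (12 * MONTHS_PER_YEAR)
    # Servers ($2,000 each) and switches ($20K each), 4-year amortization.
    hardware = (2_000 * servers + 20_000 * switches) / (4 * MONTHS_PER_YEAR)
    # Staff: $0.05 per watt per month, plus $100K/year per 1K servers.
    staff = 0.05 * watts + (100_000 / MONTHS_PER_YEAR) * (servers / 1_000)
    return connection + hardware + staff

# Example: the St. Louis site from the sample output (22,712 servers),
# with assumed switch count, route miles, and power draw.
print(f"${monthly_cost(22_712, 500, 20, 10, 5_000_000):,.0f} per month")
```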
14. Sample Output
Specifications:
1. 60K servers
2. Latency <= 60 ms
3. Consistency delay <= 85 ms
4. Minimum availability = 5 nines
Results (three locations):
1. Seattle (A, 1,789 servers)
2. St. Louis (B, 22,712 servers)
3. Oklahoma City (C, 5,501 servers)
15. Evaluation of Chosen Approach
Based on this specification:
1. 60K servers
2. Latency <= 60 ms
3. Consistency delay <= 85 ms
4. Minimum availability = 5 nines
21. Exploring Placement Tradeoffs
Availability:
- It is usually cheaper to build networks out of less redundant datacenters
- Tier II data centers are the best option
23. Exploring Placement Tradeoffs
Green datacenters:
- A green network is less than $100K per month more expensive than the cost-optimal network when the maximum latency can be relatively high (> 70 ms)
25. Conclusions
• Proposed and implemented an optimization framework for automatic data center placement for Internet Services
• Characterized US regions
• Evaluated solutions based on the framework