Optimization Models for on-demand GPUs in the Cloud


This document discusses optimization models for scheduling deep learning (DL) jobs on on-demand GPUs in the cloud. It aims to jointly plan VM capacity and schedule DL training jobs so as to minimize costs. In preliminary simulation results with 3 nodes and 32 jobs, the proposed model reduces total costs by 80% to 92% compared with FIFO, EDF, and priority scheduling. The performance models used to predict GPU-based deep learning applications are described in a referenced paper. The work is co-funded by the European Commission Horizon 2020 program.

Co-funded by the European Commission
Horizon 2020 - Grant #777154
Optimization Models for on-demand
GPUs in the Cloud
Arezoo Jahani, Marco Lattuada, Michele Ciavotta,
Danilo Ardagna, Edoardo Amaldi, Li Zhang
atmosphere-eubrazil.eu @AtmosphereEUBR

Motivations & Goal
• Deep learning is widely used in commonplace activities
• Model learning greatly benefits from GPUs
• GPU performance is 5 to 40x better than CPU performance, but GPU-based VMs are characterized by high costs
Goal: online joint capacity planning of on-demand VMs and scheduling of DL training jobs

Performance models described in:
E. Gianniti, L. Zhang, D. Ardagna. Performance Prediction of GPU-based Deep
Learning Application. Closer 2019 Proceedings. 279-286. Crete, Greece.
Reference system

Preliminary results – 1 node, 4 jobs
Total costs:
Proposed Model: $3,261
FIFO: $14,061
EDF: $10,196
Priority: $18,245
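
The slides do not state how savings are derived from total costs. As a minimal sketch, assuming savings are defined as 1 minus the ratio of the proposed model's cost to the baseline's cost, the 1-node totals above can be turned into percentages as follows. The function names are illustrative and do not come from the presentation.

```python
# Total costs in USD, copied from the 1-node, 4-job results above.
TOTAL_COSTS = {
    "Proposed Model": 3261,
    "FIFO": 14061,
    "EDF": 10196,
    "Priority": 18245,
}


def relative_savings(proposed: float, baseline: float) -> float:
    """Fraction of the baseline cost saved by the proposed model.

    Assumption: savings = 1 - proposed / baseline (not stated in the slides).
    """
    return 1.0 - proposed / baseline


def savings_table(costs: dict, reference: str = "Proposed Model") -> dict:
    """Percentage savings of the reference entry against every other entry."""
    proposed = costs[reference]
    return {
        name: round(100 * relative_savings(proposed, cost), 1)
        for name, cost in costs.items()
        if name != reference
    }


if __name__ == "__main__":
    for policy, pct in savings_table(TOTAL_COSTS).items():
        print(f"{policy}: {pct:.1f}% saving")
```

Under this assumed definition, the 1-node case gives savings of roughly 77% over FIFO, 68% over EDF, and 82% over Priority.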

Preliminary results – 3 nodes, 32 jobs
Savings of the proposed model over each baseline:
FIFO: 91%
EDF: 80%
Priority: 92%
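
The slides name the three baseline policies but do not describe how they were implemented. The sketch below shows only the textbook ordering rule behind each one, not the authors' simulator. The `Job` fields, the sample values, and the convention that a larger `priority` value is more urgent are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Job:
    name: str
    submit_time: float  # hypothetical: hours since the start of the simulation
    deadline: float     # hypothetical: hours since the start of the simulation
    priority: int       # assumption: a larger value means more urgent


def fifo_order(jobs):
    """First-in, first-out: the earliest submitted job runs first."""
    return sorted(jobs, key=lambda j: j.submit_time)


def edf_order(jobs):
    """Earliest deadline first: the job with the closest deadline runs first."""
    return sorted(jobs, key=lambda j: j.deadline)


def priority_order(jobs):
    """Static priority: highest priority first, ties broken by submission time."""
    return sorted(jobs, key=lambda j: (-j.priority, j.submit_time))


# Invented sample queue, for illustration only.
JOBS = [
    Job("A", submit_time=0.0, deadline=10.0, priority=1),
    Job("B", submit_time=1.0, deadline=4.0, priority=2),
    Job("C", submit_time=2.0, deadline=6.0, priority=3),
]

if __name__ == "__main__":
    policies = (("FIFO", fifo_order), ("EDF", edf_order), ("Priority", priority_order))
    for label, policy in policies:
        print(label, [j.name for j in policy(JOBS)])
```

None of these rules looks at VM cost when ordering jobs, which is the main difference from a model that plans capacity and scheduling jointly.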
